Take your first step in data manipulation with dplyr. This tutorial presents the basic functions: select(), filter(), mutate(), transmute(), group_by() and summarise().
Welcome, friend :)
This is another tutorial about Spark using the sparklyr package. This time, I am going to show you how to tune your model parameters. It’s not difficult, but there are some details that I have to tell you about. If you are not yet confident about training your models in Spark, check my previous post and come back here later :)
Let’s get to action…
Pipeline
First of all, you need to create a pipeline.
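As a rough idea of what creating a pipeline can look like, here is a minimal sketch. It assumes a local Spark installation is available; the connection name sc, the formula and the choice of a random forest classifier are only illustrative, not necessarily the ones used in the rest of the post.

```r
library(sparklyr)

# Assumption: Spark is installed locally (for example via spark_install())
sc <- spark_connect(master = "local")

# An unfitted pipeline with two stages:
# 1. ft_r_formula() turns an R formula into feature and label columns
# 2. ml_random_forest_classifier() is the estimator to be tuned later
pipeline <- ml_pipeline(sc) %>%
  ft_r_formula(Species ~ .) %>%
  ml_random_forest_classifier()

# Printing the pipeline lists its stages and their default parameters
pipeline

spark_disconnect(sc)
```

Nothing is trained at this point: the pipeline only describes the steps, so it can later be handed to a tuning function such as ml_cross_validator().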
Welcome, friend :)
In this tutorial, I am going to show you how to perform supervised learning in R using the sparklyr package. The models that I am going to use are:
Linear Regression
Naive Bayes
Decision Tree
Random Forest
Logistic Regression
Multilayer Perceptron
Gradient Boosted Tree
Support Vector Machine
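To give a feel for how these models are called, here is a minimal sketch for the first one, linear regression. It assumes a local Spark installation; the mtcars data set, the table name mtcars_spark and the formula are placeholders chosen only for illustration.

```r
library(sparklyr)

# Assumption: Spark is installed locally (for example via spark_install())
sc <- spark_connect(master = "local")

# Copy a small built-in data set to Spark so there is something to train on
cars_spark <- sdf_copy_to(sc, mtcars, "mtcars_spark", overwrite = TRUE)

# Fit a linear regression of fuel consumption on weight and cylinders
fit <- ml_linear_regression(cars_spark, mpg ~ wt + cyl)

# Coefficients and basic fit statistics
summary(fit)

spark_disconnect(sc)
```

The other models in sparklyr follow the same pattern of a Spark table plus a formula, for example ml_decision_tree() or ml_random_forest().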
If you don’t know how to connect Spark to R, don’t worry…check this out. If you have any questions or suggestions, don’t hesitate to contact me on samuelmacedo@recife.
Welcome, friend :)
In this tutorial, I am going to show you how to connect Spark with R on your local machine. This will be a very brief tutorial, but you will need it to follow the next tutorials about feature transformation, supervised and unsupervised learning. If you have any questions, don’t hesitate to contact me on samuelmacedo@recife.ifpe.edu.br.
Let’s get to action… For this task, I use the sparklyr package.
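A minimal sketch of a local connection with sparklyr could look like the one below. It assumes Spark has already been downloaded with spark_install() (commented out here because it only needs to run once); the object name sc is just a convention.

```r
library(sparklyr)

# One-time setup: downloads a local copy of Spark (uncomment on first use)
# spark_install()

# Open a connection to a Spark instance running on this machine
sc <- spark_connect(master = "local")

# Check that the connection is alive
is_open <- spark_connection_is_open(sc)
is_open

# Close the connection when you are done
spark_disconnect(sc)
```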
Welcome, friend :)
In this tutorial, I am going to present the basic functions in the dplyr package: select(), filter(), mutate(), transmute(), group_by() and summarise(). If you have any questions, don’t hesitate to contact me on samuelmacedo@recife.ifpe.edu.br.
Let’s get to action… First of all, to install and load dplyr, please use the commands below:
install.packages("dplyr")
library(dplyr)
Before I start: as_tibble()
A tibble is a modern reimagining of the data.frame. You don’t need to change your data.
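As a small illustration, the sketch below converts the built-in mtcars data.frame into a tibble. The data set and the names cars_tbl and model are my own choices for the example.

```r
library(dplyr)

# mtcars ships with R as a plain data.frame
class(mtcars)

# as_tibble() returns a new tibble; mtcars itself is left untouched.
# Row names are dropped by default, so keep them in a column called "model".
cars_tbl <- as_tibble(mtcars, rownames = "model")

# A tibble is still a data.frame, with extra classes on top
class(cars_tbl)

# Printing shows only the first rows, plus the type of each column
cars_tbl
```

Because a tibble still inherits from data.frame, code written for data frames generally keeps working with it.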