dplyr: basic functions

April 26, 2018 in tutorials

Take your first step in data manipulation with dplyr. This tutorial introduces the basic functions: select(), filter(), mutate(), transmute(), group_by() and summarise().

Samuel Macêdo

Data scientist and R developer

samuelmacedo@recife.ifpe.edu.br

sparklyr: tuning your hyperparameters

Jun 6, 2018

Welcome, friend :) This is another tutorial about Spark using the sparklyr package. In this one, I am going to show you how to tune your model's hyperparameters. It’s not so difficult, but there are some details that I have to tell you. If you are not yet confident about training your models in Spark, check my previous post and come back here later :) Let’s get to action… Pipeline: first of all, you need to create a pipeline.

sparklyr: supervised learning

May 5, 2018

Welcome, friend :) In this tutorial, I am going to show you how to perform supervised learning in R using the sparklyr package. The models that I am going to use are: Linear Regression, Naive Bayes, Decision Tree, Random Forest, Logistic Regression, Multilayer Perceptron, Gradient Boosted Tree and Support Vector Machine. If you don’t know how to connect Spark to R, don’t worry… check this out. If you have any questions or suggestions, don’t hesitate to contact me at samuelmacedo@recife.

sparklyr: connecting spark in local mode

May 5, 2018

Welcome, friend :) In this tutorial, I am going to show you how to connect Spark with R on your local machine. This will be a very brief tutorial, but you will need it to follow the next tutorials about feature transformation, supervised and unsupervised learning. If you have any questions, don’t hesitate to contact me at samuelmacedo@recife.ifpe.edu.br. Let’s get to action… To do this task, I use the sparklyr package.

dplyr: basic functions

Apr 4, 2018

Welcome, friend :) In this tutorial, I am going to show you the basic functions in the dplyr package: select(), filter(), mutate(), transmute(), group_by() and summarise(). If you have any questions, don’t hesitate to contact me at samuelmacedo@recife.ifpe.edu.br. Let’s get to action… First of all, to install and load dplyr, please use the commands below:

    install.packages("dplyr")
    library(dplyr)

Before I start: as_tibble(). A tibble is a modern reimagining of the data.frame. You don’t need to change your data.
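
To give a rough idea of what these six functions do, here is a minimal sketch. It assumes the built-in mtcars data set, which is only an illustrative choice and not necessarily the data used in the full tutorial; the object names (cars, cars_small and so on) are also made up for this example.

```r
library(dplyr)

# Assumption: mtcars (shipped with base R) stands in for your own data.
# as_tibble() converts the data.frame to a tibble without changing its values.
cars <- as_tibble(mtcars)

# select(): keep only the columns you name.
cars_small <- select(cars, mpg, cyl, hp)

# filter(): keep only the rows that satisfy a condition.
efficient <- filter(cars_small, mpg > 25)

# mutate(): add a new column and keep the existing ones.
with_ratio <- mutate(cars_small, hp_per_cyl = hp / cyl)

# transmute(): compute a new column and drop everything else.
ratio_only <- transmute(cars_small, hp_per_cyl = hp / cyl)

# group_by() + summarise(): collapse each group into one summary row.
by_cyl <- cars_small %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())
```

The difference between mutate() and transmute() is the easiest one to miss: with_ratio has the three original columns plus hp_per_cyl, while ratio_only has hp_per_cyl alone.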