3+ Hours of Video Instruction
Advanced R LiveLessons, Part I teaches R programmers techniques for dealing with large data, both in memory and in databases. It also covers advanced machine learning, network analysis, web graphics, and documents and presentations.
In this video training, Jared starts with reading XML data and some common data manipulation operations using various base R functions and packages like plyr, comparing the speed of in-memory calculations. He then demonstrates more advanced techniques for accomplishing the same tasks, such as data.table, dplyr, Rcpp, and parallel computation, for increased speed. He then turns to some other useful advanced concepts and techniques.
About the Instructor
Jared P. Lander is the CEO of Lander Analytics, the organizer of the New York Open Statistical Programming Meetup (formerly the R Meetup), and an adjunct professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. He specializes in data management, multilevel models, machine learning, generalized linear models, visualization, and statistical computing. He is the author of R for Everyone, a book about R programming geared toward data scientists and non-statisticians alike. Very active in the data community, Jared is a frequent speaker at conferences, universities, and meetups around the world. He is a member of the 2014 Strata New York selection committee.
What You Will Learn
Who Should Take This Course
Table of Contents
Lesson 1: Reading XML Data
Reading data is the first step to analyzing it, and with so much data in XML format, it is important to be able to load XML data directly into R. In Lesson 1 you learn how to use the XML package, XPath, and helper functions to easily parse XML and HTML data.
Lesson 2: Faster Group Operations
Data munging often takes up 80% of a data scientist’s time, so efficiently processing data is incredibly important. Lesson 2 covers a number of functions for aggregating data and discusses their relative speeds, in terms of both computing time and human time.
Lesson 3: Rcpp for Faster Code
Using Rcpp, you can efficiently and easily integrate C++ with R for even more performant code. Lesson 3 provides a look at building R packages with both R and C++ code.
Lesson 4: Advanced Machine Learning
Two hot topics in modern data science are recommendation engines and text mining. Both are handled easily in R using the recommenderlab and RTextTools packages, as seen in Lesson 4.
Lesson 5: Network Analysis
Network analysis uses graph theory to identify relationships and key players in groups. Lesson 5 focuses on the igraph package for working with network data.
Lesson 6: Web Graphics
Visualizations have always been important, and modern web technology has opened new frontiers for displaying data. Lesson 6 gives you a first look at ggvis and rCharts for easily creating web graphics using simple R code.
Lesson 7: Easier Presentations and Documents with RMarkdown
RStudio has made great advancements in creating documents and presentations, making the whole process easier than it was even just a few months ago. Lesson 7 discusses the simple steps for generating HTML, PDF, and Word documents as well as HTML presentations.
About LiveLessons Video Training
The LiveLessons Video Training series publishes hundreds of hands-on, expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. This professional and personal technology video series features world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, IBM Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include: IT Certification, Programming, Web Development, Mobile Development, Home and Office Technologies, Business and Management, and more. View all LiveLessons on InformIT at: http://www.informit.com/livelessons.