Courses of Study 2022-2023 
    
    Apr 19, 2024  
[ARCHIVED CATALOG]


STSCI 5040 - R Programming for Data Science


     
Fall. 4 credits. Student option grading.

Prerequisite: Introductory statistics course. Co-meets with STSCI 3040.

J. Entner.

Statistics courses usually use clean and well-behaved data, which leaves many students unprepared for the messiness and chaos of data in the real world. This course aims to prepare students for dealing with data using the R programming language. The introduction will give an overview of basic R syntax; foundational R programming concepts such as data types, vector arithmetic, and indexing; and importing data into R from different file formats. The data wrangling topics include how to tidy data using the tidyverse to better facilitate analysis, string processing with regular expressions, working with dates and times, web scraping, and text mining. Data visualization topics will cover visualization principles, the use of ggplot2 to create custom plots, and how to communicate data-driven findings.

Outcome 1: Learn basic R syntax; foundational R programming concepts such as data types, vector arithmetic, and indexing; and how to import data into R from different file formats.

Outcome 2: Learn data wrangling techniques, including how to tidy data using the tidyverse.

Outcome 3: Produce professional and informative data visualizations.

Outcome 4: Use R Markdown to create reports to document data analysis and communicate findings.


