Courses of Study 2014-2015 [ARCHIVED CATALOG]


CS 5304 - Data Science in the Wild

(crosslisted) INFO 5304  
     


Spring. 3 credits. Letter grades only.

Enrollment limited to: students enrolled at Cornell Tech. Offered at Cornell Tech, New York City.

Y. Kanza.

Companies and organizations collect massive amounts of data, and the task of a data scientist is to extract actionable knowledge from that data: for scientific needs, to improve public health, to promote businesses, for social studies, and for various other purposes. This course focuses on the practical aspects of the field and aims to provide a comprehensive set of tools for extracting knowledge from data.

The course will cover the topics needed to solve data-science problems, which include problem formulation (business understanding), data preparation (collection, sampling, integration, cleaning), data modeling (characterization, model selection, and analysis), implementation (large-scale data processing, feedback loops, QA) and communication (data presentation, visualization). Advanced topics such as causal inference and processing streaming data will be presented.

Throughout the course, students will carry out a data-science mission covering all the required steps, from problem formulation to result presentation.

 


