Data Science and Artificial Intelligence

Data Analysis and Data Quality

Integrated course, 5.00 ECTS

Course content

The course begins by presenting Data Analysis & Cleaning as a process step in the process model of Computational Intelligence. Following a presentation of current data models and structures, it gives an introduction to Exploratory Data Analysis, including the standard graphical methods of this area. It then introduces the central concept of data quality and the metrics used to measure aspects of data quality. Finally, the individual phases of Data Cleaning (screening, diagnosis, treatment, ...) and its methods are shown: the detection of typical errors such as missing values, outliers, and obvious inconsistencies by means of data mining, and their correction through simple transformation, deductive correction, and missing data imputation. In the tutorial, Data Cleaning is demonstrated using R and the Hadoop toolchain.
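The screening, diagnosis, and treatment phases named above can be illustrated with a minimal sketch. The course tutorial itself uses R and the Hadoop toolchain; the Python below is only an assumption-laden illustration, not course material. The `ages` data, the function names, the choice of completeness as the quality metric, Tukey's 1.5 × IQR rule for outliers, and median imputation are all hypothetical choices made for this example.

```python
from statistics import median, quantiles

def completeness(values):
    """Screening: share of non-missing entries, a simple data quality metric."""
    return sum(v is not None for v in values) / len(values)

def iqr_outliers(values, k=1.5):
    """Diagnosis: observed values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in observed if v < low or v > high]

def impute_median(values):
    """Treatment: replace missing entries with the median of the observed ones."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

# Hypothetical column with two missing values and one implausible age.
ages = [23, 25, None, 27, 24, 26, 250, None, 22, 28]

print(completeness(ages))   # 0.8
outliers = iqr_outliers(ages)
print(outliers)             # [250]

# Mark flagged outliers as missing first, so they do not distort the imputed value.
screened = [None if v in outliers else v for v in ages]
cleaned = impute_median(screened)
print(cleaned)
```

Treating the outlier as missing before imputation is a design choice in this sketch: it keeps the implausible value from shifting the median that fills the gaps. Whether an outlier is an error or a genuine observation is a judgment the diagnosis phase has to make case by case.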

Learning outcomes

The graduate gains detailed knowledge of data analysis and all related principles and methods.

Recommended or required reading and other learning resources / tools

Books: Gerhard Svolba: Data Quality for Analytics Using SAS; Wolf-Michael Kähler: Statistische Datenanalyse; Anasse Bari: Predictive Analytics for Dummies; Vijay Kotu: Predictive Analytics and Data Mining; Jürgen Cleve: Data Mining; EMC Education Services: Data Science and Big Data Analytics
Journals:

Mode of delivery

2 THW Lecture, 2 THW Tutorial

Prerequisites and co-requisites

Module MAT 1

Assessment methods and criteria

Lecture: final exam, Tutorial: continuous appraisal