Scripting for Data Scientists
Integrated course, 5.00 ECTS
Course content
Area 1: Programming / scripting
- Programming paradigms
- Data types
- Elementary commands
- Operators and control structures
- Functions and libraries
- Regular expressions
- Clean coding and debugging
Area 2: Data-based applications
- Import and export of data
- Elementary data handling
Area 3: Tools
- Version control systems
- Development environments
Learning outcomes
The students are able to implement simple and complex programs in at least two programming languages relevant to data science, and to import, export and perform elementary transformations on data. They are also able to discuss the advantages and disadvantages of various programming paradigms and to use version control systems and development environments efficiently.
Recommended or required reading and other learning resources / tools
Recommended literature or books:
- Ernesti, J. (2017). Python 3: The comprehensive manual: Language basics, object-oriented programming, modularization. Rheinwerk Computing, 5th edition.
- Luhmann, M. (2015). R for beginners: Introduction to statistical software for the social sciences. Beltz, 4th edition.
- McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly UK Ltd., 2nd edition.
- Preißel, R. (2019). Git: Decentralized version management in a team - basics and workflows. dpunkt.verlag GmbH, 5th edition.
- VanderPlas, J. (2016). Python Data Science Handbook: Essential Tools for working with Data. O'Reilly UK Ltd., 1st edition.
- Wickham, H. (2017). R for data science. O'Reilly UK Ltd., 1st edition.
Recommended journals or selected articles:
Relevant journals and articles will be announced in the course.
Typical software for this module:
R / RStudio, Python / Spyder / PyCharm, Git / TortoiseGit, etc.
Mode of delivery
5 ECTS Exercise
Prerequisites and co-requisites
None
Assessment methods and criteria
Examination character