Data Science Concepts / Data Science Tools

Developing Syllabus

This syllabus may change during the course of the semester with some additions and clarifications.
All substantial changes will be marked here with the date.

Important Links:
GitHub organization of the course: https://github.com/CU-F23-MDSSB-01-Concepts-Tools
Organization repository with non-public information and FAQ in Discussions: https://github.com/CU-F23-MDSSB-01-Concepts-Tools/Organization

1 Module Names

This syllabus is for two modules which require each other.

  • MDSSB-DSOC-02 Data Science Concepts (Core area, 5 credits)
  • MDSSB-MET-01 Data Science Tools (Methods area, 5 credits)

Offered in the Master program: Data Science for Society and Business (DSSB). See the module description in the DSSB Handbook.

2 Module Components

  • Data Science Concepts (5 Credits)
  • Data Science Tools in R (2.5 credits)
  • Data Science Tools in python (2.5 credits)

All courses are offered in the Fall term and have no entry requirements.
Data Science Concepts and Data Science Tools are co-requisites.

The concepts treated in the lectures are applied in exercises and homework in the Tools course.

3 Class Meeting Information

Data Science Concepts:
Monday 09:45-11:00 and 11:15-12:30
Location: East Hall Room 1 in-person
Video streaming in Teams provided for late-arriving students: Team F23_MDSSB-DSOC-02_Data Science Concepts, General Channel

Data Science Tools:
Thursday 09:45-11:00 and 11:15-12:30
Location: East Hall Room 2 in-person
Video streaming in Teams provided for late-arriving students: Teams F23_MDSSB-MET-01-A_Data Science Tools in R and F23_MDSSB-MET-01-B_Data Science Tools in python, General Channel

Note: The R and python courses take place in the same time slot. See the schedule for details.

4 Instructors

Concepts:
Jan Lorenz Email:

Tools in R:
Armin Müller Email:

Tools in python:
Peter Steiglechner Email:

5 Format and Workload

Concepts: Lectures, sometimes with aspects of Tutorials (35 hours in presence, total expected workload 125 hours)
Tools: Tutorials, sometimes with aspects of Lectures (35 hours in presence, total expected workload 125 hours)

Besides homework, both modules require a decent amount of self-study.
Homework and self-study are expected to account for 90/125 = 72% of the total workload.
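
The 72% figure follows from the numbers in this section: 125 hours total minus 35 hours in presence leaves 90 hours for homework and self-study. A minimal sketch of that arithmetic, in which the function and constant names are illustrative and not part of the course material:

```python
# Workload figures per module, taken from this syllabus.
TOTAL_HOURS = 125     # total expected workload
PRESENCE_HOURS = 35   # hours in presence


def self_study_share(total_hours: int, presence_hours: int) -> float:
    """Return the fraction of the workload spent outside class."""
    return (total_hours - presence_hours) / total_hours


share = self_study_share(TOTAL_HOURS, PRESENCE_HOURS)
print(f"{TOTAL_HOURS - PRESENCE_HOURS} of {TOTAL_HOURS} hours = {share:.0%}")
```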

Homework assignments follow the concepts treated in Concepts and are to be solved with the tools treated in Tools.

6 Intended Learning Outcomes

The module descriptions can be found in the DSSB Handbook.

6.1 From the handbook

By the end of the Concepts module, you will be able to:

  1. understand and use the mathematical foundations of statistical learning algorithms
  2. explain and classify data science problems
  3. explain and classify data-driven approaches
  4. understand the application of data science techniques to typical situations and tasks in business and societal research, including the search, retrieval, preparation, and statistical analysis of data
  5. interpret complexity analysis and performance evaluation of data science problems and algorithms

By the end of the Tools module, you will be able to:

  1. explain basic concepts of imperative and object-oriented programming
  2. write, test, and debug programs
  3. perform data handling and data manipulation tasks in R and Python
  4. apply your knowledge to implement own functions in R and Python
  5. effectively use core packages and libraries of R and Python for data analysis
  6. know about the typical applications of R and Python in data science
  7. implement and apply advanced data mining methods with appropriate tools
  8. perform a full cycle of data analysis

6.2 Main Learning Goal

Our main goal is to help you build a good basis for your increasingly independent work throughout the study program. That means you can

  • learn core concepts in data science on your own, for example
    • concepts to explore data (import, wrangle, visualize)
    • mathematics and statistics through the data science lens
    • concepts to model and draw conclusions from data (model, infer, predict)
  • create and maintain a digital working environment on your computer to do data science
  • learn to program in the data science languages R and python, and become able to acquire new skills in them independently
  • do a data science project of your own interest

7 Examination and Assessment

7.1 Concepts Module

Assessment Type: Written Examination
Duration: 120 min
Weight: 100%
Scope: All intended learning outcomes of the module.
Completion: To pass this module, the examination must be passed with at least 45%.

The exam will take place in December, after all lectures. The date will be published by the university administration later.

There are no additional achievements necessary and there are no bonus options.

7.2 Tools Module

Module achievement: 50% of the assignments correctly solved
Programming and analysis assignments will appear step by step during the courses.
To pass the module you have to solve half of them by the end of the semester.

Assessment Type: Project Report
Length: 4000 - 5000 words
Weight: 100%
Scope: All intended learning outcomes of the module.
More information on project formats and a grading rubric will be provided later.

Bonus option: Students receive a grade improvement of 0.33 points on their project grade (on the numerical grade scale specified here: https://constructor.university/sites/default/files/2023-02/Grading_Table_2023.pdf) when all assignments are solved by the end of the semester. (Note: the bonus is not necessary to reach the best grade in the module.)
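
The module achievement and the bonus option above can be read as two simple rules. The sketch below is only an illustration and not an official grade calculator. It assumes that a lower numerical grade is better and that 1.0 is the best grade, so the bonus can never push a grade below 1.0; please check the linked grading table for the actual scale. The function names are hypothetical.

```python
def module_achievement_met(solved: int, total: int) -> bool:
    """True if at least half of the assignments are correctly solved."""
    return 2 * solved >= total


def apply_bonus(project_grade: float, solved: int, total: int,
                bonus: float = 0.33, best_grade: float = 1.0) -> float:
    """Improve the project grade by `bonus` if all assignments are solved.

    Assumption: lower grades are better and `best_grade` is the best
    possible grade, so the result is capped there.
    """
    if solved == total:
        return max(best_grade, round(project_grade - bonus, 2))
    return project_grade


# Hypothetical example: 10 assignments in total, project graded 2.0.
print(module_achievement_met(5, 10))   # half solved: achievement met
print(apply_bonus(2.0, 10, 10))        # all solved: bonus applies
print(apply_bonus(2.0, 9, 10))         # one missing: no bonus
```

Whether the bonus depends on further conditions, such as a passing project grade, is not stated here and is therefore not modeled.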

All assignments and the final project must be delivered in personalized repositories in the GitHub organization https://github.com/CU-F23-MDSSB-01-Concepts-Tools.

8 Module Policies

There are no formal requirements regarding attendance and active participation. However, we rely on your engagement in many ways, including:

  • Preparation (looking at readings and material before and after class, being informed about syllabus and course material)
  • Focus (avoiding distraction during in-class and self-learning activities)
  • Presence (listening and responding during group activities)
  • Asking questions (in class, out of class, online, offline; when you get stuck, conclude by writing down a question)
  • Specificity (being as specific as possible when describing your problem or question)
  • Synthesizing (making connections between concepts from reading and discussion)
  • Persistence (you don’t need to understand everything immediately, but stay engaged and try again; confusion shows that you are paying attention)

9 Academic Integrity

All involved parties (professors and lecturers, instructors and students) are expected to abide by the word and spirit of the “Code of Academic Integrity”: https://constructor.university/student-life/student-services/university-policies/academic-policies/code-of-academic-integrity. Violations of the Code might be brought to the attention of the Academic Integrity Committee.

10 Artificial Intelligence (AI) Use Policy

This policy covers any generative AI tool, such as ChatGPT, Elicit, etc. This includes text, code, slides, artwork/graphics/video/audio and other products.

We instructors encourage using and exploring AI tools for these purposes:

  • Learning by dialog with a chatbot. AI chatbots can be very helpful for explaining concepts at your desired level and for getting a feeling for how certain topics are treated. You can ask for an easier or more detailed explanation or focus on certain aspects. Note: The capabilities are limited and you will likely receive a lot of false information! Using chatbots should remain a small part of your learning process. Rule of thumb: Spend no more than 25% of your learning time chatting. There is no way around reading textbooks, reading software documentation, learning and understanding concepts, searching for help online, and asking instructors or fellow students.
  • Have code snippets written. Tools like GitHub Copilot (free to use for students and teachers in VSCode) are heavily used and are currently changing how code is written. They can speed up writing your code. However, they do not always deliver correct solutions. There is no way around understanding for yourself what the code is doing! Do not spend endless hours asking for new code with new prompts; spend time understanding a language and the functions and objects you are using! Copilots can be a great help to get a skeleton of code and an idea of how your solution might look, but they rarely deliver the complete code. Expect that the code does not work, or that the code seems to work but the results are wrong. You are 100% accountable for the code you produce, with or without the help of a copilot!
  • Have a draft text snippet written. Data science is also about formulating and describing research questions, describing data, documenting code, interpreting results, and deriving conclusions from the results. This is all verbal text, and chatbots are good at writing text that often appears well readable. You can use this to inspire yourself and to polish and improve your texts. However, you are 100% accountable for the text you deliver. You are expected to know what your text is about and to be able to answer questions about what your text means! In text generated purely by a chatbot it is often evident that you do not. We consider such cases worse than incomplete but sensible text. Also, text written by chatbots is often very generic. In general, we value specific text more highly than generic text. Large parts of very generic text are considered worse than a shorter, more specific text. In extreme cases, a long, very generic text will be considered worse than no text at all.
  • Note: Philosophical and legal questions around the training and use of chatbots and code copilots are controversial and constantly re-examined! We encourage you to engage with such questions and become aware of the arguments and debates.

If any part of this AI policy is confusing or uncertain, please reach out to us for a conversation before submitting your work.

11 Schedule and Homework

The Schedule is on an extra page and will be updated continuously with

  • Links to slides
  • Notes on what homework you are expected to do
  • Some questions which you should be able to answer after each week. Test yourself.

Homework pages will appear successively as extra pages. See the sidebar to the left.

12 Feedback from Students

We are eager to constantly improve the quality of our teaching. We would be glad to receive your feedback at any time during the course to improve your learning experience.