Course Basic Data Science in Python

ECTS: 2

Course leader: Ira Assent

Language: English

Graduate school: Course collaboration

Course fee: 2,400.00 DKK

Status: Course is open for application

Semester: Fall 2022

Application deadline: 08/08/2022

Start date: 21/09/2022

Administrator: Thilde Møller Risgaard

The course Basic Data Science in Python is offered by the Graduate School of Natural Sciences (GSNS) and the Graduate School of Technical Sciences (GSTS), Aarhus University, in fall 2022.

 

Course parameters:
No. of contact hours/hours in total incl. preparation, assignment(s) or the like: 36 hours/60 hours in total. Of the 36 hours, 24 hours are lectures and in-class exercises, and 12 hours are consultations. The remaining time is to be spent on preparation and homework on projects.

Capacity limits: Minimum 5 and maximum 15 participants

Objectives of the course:
The aim of the course is to introduce the PhD student to basic tasks, methods and evaluation procedures in data science, using Python and its libraries and environments.

Learning outcomes and competences:
At the end of the course, the student should be able to:

  1. Identify the key assumptions and critically evaluate some data science methods and models
  2. Identify appropriate data sources, establish data quality, identify suitable data science approaches, devise experiments and draw conclusions
  3. Present (orally) and report (written) the results of those analyses.

Compulsory programme:
The course is divided into modules. For each module, exercises need to be solved and handed in. Admittance to the exam requires approval of these exercises.

Course contents:

  1. Introduction to data science as a field and its relationship with the neighboring fields of statistics, artificial intelligence, machine learning and data mining
  2. Data pre-processing: basic approaches, impact on data / results, versioning, repeatability
  3. Core data science methods
    1. Representative methods
      • k-means, EM, DBSCAN, linear / logistic regression, decision trees, neural networks, …
      • Characterization of base assumptions, strengths and weaknesses
      • Structured data analysis
      • Elements of computational learning theory
  4. Practical aspects:
    1. Setting hyperparameters, grid search, etc.
    2. Evaluation principles, verification, validation, evaluation measures, common pitfalls
    3. Pointers to social aspects: fairness, privacy, explainability

Prerequisites:
The course will use Python as a tool, but it is NOT a course on Python. It is assumed that the PhD students have Python installed on their computers and that they know the basics of Python programming as covered in the course Introduction to Python for Data Science. It is not assumed that the PhD student knows any data science techniques prior to the course.

Name of lecturer:
TBA (Data-Intensive Systems group, Computer Science, Faculty of Natural Sciences)

Type of course/teaching methods:
Lectures and in-class exercises

Literature:
Python Data Science Handbook (https://jakevdp.github.io/PythonDataScienceHandbook/) and other online resources made available during the course

Course homepage:
TBA (Brightspace)

Course assessment:
As part of the course, small projects (made available as Jupyter notebooks) are to be completed and submitted. Oral exam on course topics and projects.

Provider:
GSNS and Department of Computer Science

Special comments on this course:
None

Time:
6 course sessions with 6 hours each with lectures and in-class exercises; additional time to be spent on homework on projects

Day 1: 21 September 2022, 9-15, Meeting room 2.3
Day 2: 23 September 2022, 9-15, Meeting room 2.3
Day 3: 27 September 2022, 9-15, Richard Mortensen stuen
Day 4: 30 September 2022, 9-15, Meeting room 2.3

Place:
AU Conference centre, Fredrik Nielsens vej 4, 8000 Aarhus C

No-show fee:
Participants in our transferable skills courses who do not show up, or who cancel after the course cancellation deadline without providing a doctor’s note, may have to pay a no-show fee unless someone from the waiting list is able to take their place.

The no-show fee is DKK 1,200 (the price of one ECTS). It was introduced because many late cancellations were preventing people on the waiting lists from getting a seat on the courses.

Registration:

  • Participation in the course is without cost for PhD students from Aarhus University

Under an agreement between Danish universities that came into force on 1 January 2011, participants from universities other than Aarhus University must pay DKK 1,200 per ECTS. In principle this also applies to external parties, but an exemption can be granted under specific circumstances.

Please be aware that registering for the course does not necessarily mean that you are admitted to it. After the registration deadline, you will receive an e-mail telling you whether you have been admitted or placed on the waiting list. Please note that seats are allocated on a first-come, first-served basis.
