Course: Introduction to machine learning in health sciences

ECTS: 3.1

Course leader: Peter Mondrup Rasmussen

Language: English

Graduate school: Faculty of Health

Course fee: 3,720.00 DKK

Status: Course is finished

Semester: Spring 2022

Application deadline: 7 February 2022

Start date: 7 March 2022

Administrator: Thilde Møller Risgaard

New course dates

Due to COVID-19, the course leader has postponed the course until later this spring. Please see the new course dates in the description.

The course C308/01 Introduction to machine learning in health sciences is offered by the Graduate School of Health, Aarhus University, in spring 2022.

Criteria for participation: A university degree in medicine, dentistry, or nursing, or a Master’s degree in another field, and/or status as a postgraduate research fellow (PhD students and research-year medical students).

Aim: The aim of the course is to introduce the student to machine learning techniques and enable the student to analyze complex data sets of the kind typically encountered in modern research. The students will gain an understanding of the theoretical background of supervised and unsupervised machine learning techniques, and practical experience in applying these techniques to real-world data analysis.

Learning outcomes: A student who has met the objectives of the course will be able to:

  • Describe the main steps involved in typical machine learning analyses, including data preparation, data modeling, model evaluation, and result dissemination.
  • Describe the mathematical and statistical principles of supervised and unsupervised machine learning.
  • Describe basic and advanced methods for predicting continuous and discrete outcomes (regression and classification).
  • Describe procedures for model building, model selection and model evaluation.
  • Identify relevant machine learning techniques to solve particular problems.
  • Design and implement a solution strategy to solve research-based problems.
  • Apply unsupervised and supervised machine learning techniques to their own data.
  • Disseminate the analysis results and account for the solution strategy as necessary for publication in scientific journals.

Content: Technical content:

  • Data preprocessing, feature extraction, and feature representation.
  • Summary statistics.
  • Unsupervised machine learning techniques, including dimension reduction (e.g. principal component analysis) and clustering (e.g. k-means, hierarchical clustering).
  • Supervised machine learning techniques for modelling continuous outputs (regression) and discrete outputs (classification), e.g. linear regression, logistic regression, support vector machines, neural networks.
  • Techniques for complexity control (e.g. feature selection, shrinkage methods).
  • Techniques for model selection and model evaluation (e.g. cross-validation).

Course structure:

  • Guided self-study (textbook, notes, video clips). The student will gain knowledge of the conceptual and theoretical basis of the modeling and machine learning techniques.
  • In-class activities (four days, 08:00-16:00) with a mixture of short lectures and hands-on exercises. Key concepts will be highlighted in lectures, but the focus is on student-oriented learning with conceptual/theoretical exercises and practical data analysis on the computer.
  • Data will be from the health science research domain, e.g. neuroimaging data.
  • After the last day of in-class activities, the student will work independently on a small project in which the student plans, carries out, and reports on a machine learning analysis. Students can work either on their own data or on data provided during the course.

Course material:

  • Textbook: An Introduction to Statistical Learning. James, Witten, Hastie, Tibshirani.
  • Video lectures
  • Notes and exercise material

Course evaluation:

  • The student hands in i) an assignment/exercise portfolio based on the in-class exercise work, and ii) a report disseminating the small machine learning project. Pass/fail, internal evaluation (deadline two weeks after the last in-class day).

Recommended knowledge for participation (if any):

  • Experience with programming is required, e.g. variable types (cells, structs, tables, strings), functions, control flow (if, while, for), scripts, basic plotting and visualization, and data import/export. This is covered by, for example, the AU PhD course C171/11 Introduction to data analysis for health sciences using MATLAB.
  • MATLAB is used as the programming language in lecture examples and exercises. Students with sufficient experience in other languages, e.g. R or Python, may use these in their report, but no technical support is provided.
  • Knowledge of basic statistics, e.g. linear modeling/regression, ANOVA.


Monday 7 March, Meeting room 2.2, AU Conference centre, Fredrik Nielsens vej 4, 8000 Aarhus C
Thursday 10 March, Meeting room 1.1, AU Conference centre, Fredrik Nielsens vej 4, 8000 Aarhus C
Monday 14 March, Meeting room 1.1, AU Conference centre, Fredrik Nielsens vej 4, 8000 Aarhus C
Wednesday 16 March, Meeting room 1.1, AU Conference centre, Fredrik Nielsens vej 4, 8000 Aarhus C

Participation in the course is free of charge for:

  • PhD students, Research Year students, and Research Honours Programme students from Aarhus University
  • PhD students enrolled at partner universities of the Nordoc collaboration
  • PhD students from other institutions in the open market agreement for PhD courses

Course dates:

  • 07 March 2022 08:00 - 16:00
  • 10 March 2022 08:00 - 16:00
  • 14 March 2022 08:00 - 16:00
  • 16 March 2022 08:00 - 16:00