ECTS: 4
Course leader: Peter Mondrup Rasmussen
Language: English
Graduate school: Faculty of Health
Graduate program: ClinFO
Course fee: 4,800.00 DKK
Status: Course is open for application
Semester: Spring 2025
Application deadline: 02/02/2025
Cancellation deadline: 16/02/2025
Course type: Classroom teaching
Start date: 03/03/2025
Administrator: Thilde Møller Risgaard
The course C308/04 Applied machine learning in health sciences is offered by the Graduate School of Health, Aarhus University, in spring 2025.
Criteria for participation:
A university degree in medicine, dentistry, or nursing, or a master’s degree in another field, and/or status as a postgraduate research fellow (PhD student or research-year medical student).
Requirements for participation:
- Experience with programming is required. Participants should have a good understanding of programming concepts, including variable types (e.g. strings, arrays, tables), functions (definition, parameters, return values), and control structures (e.g. for loops), as well as experience in writing and running scripts. Participants should also be able to create basic data visualizations such as bar plots, scatter plots, and histograms, and to import data from text/csv files.
- Programming code used in lecture examples and in exercises will be provided as Matlab, R, and Python code. Students are therefore welcome to work in whichever of these languages they prefer.
- Knowledge of basic statistics (e.g. linear modeling/regression, ANOVA), for instance from biostatistics courses, will be an advantage.
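As a rough self-check of the programming level described above, the following minimal Python sketch imports CSV data, defines a function with parameters and a return value, and uses a for loop. It is not an official entry test; the data, the column names, and the helper `mean_by_group` are hypothetical and chosen only for illustration.

```python
import csv
import io
import statistics

# Hypothetical CSV content; in practice this would be read from a file,
# e.g. with open("measurements.csv", newline="").
CSV_TEXT = """patient_id,group,systolic_bp
1,control,118
2,control,125
3,treated,131
4,treated,127
5,treated,122
"""


def mean_by_group(rows, group_key, value_key):
    """Return a dict mapping each group label to the mean of value_key."""
    values = {}
    for row in rows:  # control structure: for loop over table rows
        values.setdefault(row[group_key], []).append(float(row[value_key]))
    return {group: statistics.fmean(vals) for group, vals in values.items()}


# Import the data: each row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
group_means = mean_by_group(rows, "group", "systolic_bp")

for group, mean_value in group_means.items():
    print(f"{group}: mean systolic_bp = {mean_value:.1f}")
```

Participants who can read and modify a script of this kind in Matlab, R, or Python are likely to meet the programming requirement.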
Aim:
The aim of the course is to introduce the student to machine learning techniques and to enable the student to apply these methods to analyze complex data sets of the kind typically encountered in modern research. The student will gain an understanding of the theoretical background of supervised and unsupervised machine learning techniques, as well as practical experience in applying these techniques to real-world data analysis.
Learning outcomes:
A student who has met the objectives of the course will be able to:
- Describe the main steps involved in typical machine learning analyses, including data preparation, data modeling, model evaluation, and result dissemination.
- Describe the mathematical and statistical principles of supervised and unsupervised machine learning.
- Describe basic and advanced methods for predicting continuous and discrete outcomes (regression and classification).
- Describe procedures for model building, model selection, and model evaluation.
- Identify relevant machine learning techniques to solve research-based problems.
- Design and implement a solution strategy to solve research-based problems.
- Apply unsupervised and supervised machine learning techniques to their own data.
- Disseminate the analysis results and account for the solution strategy at the level of detail required for publication in scientific journals.
Workload: The full workload of the course is expected to be 100 hours.
Content:
Technical content:
- Data preprocessing, feature extraction, and feature representation.
- Unsupervised machine learning techniques, including techniques for dimension reduction (e.g. principal component analysis) and techniques for clustering (e.g. k-means, hierarchical clustering).
- Supervised machine learning techniques, including techniques for modeling continuous outputs (regression) and discrete outputs (classification) (e.g. linear regression, logistic regression, support vector machines, neural networks).
- Techniques for complexity control (e.g. feature selection, shrinkage methods), and techniques for model selection and model evaluation (e.g. cross-validation).
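To give a flavor of how complexity control and model selection fit together, the following minimal Python sketch uses k-fold cross-validation to choose the penalty of a shrinkage (ridge) regression with a single predictor. It is an illustration under stated assumptions, not course material: the synthetic data, the candidate penalties, and the helper names `fit_ridge_1d` and `cross_validated_mse` are hypothetical, and it uses only the standard library rather than a machine learning package.

```python
import random
import statistics


def fit_ridge_1d(x, y, lam):
    """Fit y ~ intercept + slope * x with an L2 penalty lam on the slope."""
    x_mean = statistics.fmean(x)
    y_mean = statistics.fmean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + lam)  # lam > 0 shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return intercept, slope


def cross_validated_mse(x, y, lam, n_folds=5):
    """Average held-out mean squared error over n_folds folds."""
    indices = list(range(len(x)))
    fold_errors = []
    for fold in range(n_folds):
        test_idx = indices[fold::n_folds]  # every n_folds-th sample is held out
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        intercept, slope = fit_ridge_1d(
            [x[i] for i in train_idx], [y[i] for i in train_idx], lam
        )
        errors = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test_idx]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)


# Synthetic data (hypothetical): y depends linearly on x plus Gaussian noise.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(60)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

# Model selection: pick the penalty with the lowest cross-validated error.
candidate_lams = [0.0, 1.0, 10.0, 100.0, 1000.0]
cv_scores = {lam: cross_validated_mse(x, y, lam) for lam in candidate_lams}
best_lam = min(cv_scores, key=cv_scores.get)

for lam, mse in cv_scores.items():
    print(f"lambda={lam:>7}: cross-validated MSE = {mse:.3f}")
print("selected lambda:", best_lam)
```

Each candidate penalty is scored only on samples that were held out during fitting, so the selected value reflects predictive performance rather than fit to the training data. The same pattern extends to the multi-feature models listed above.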
Course structure:
- Guided self-study (textbook, notes, video clips). The student will gain knowledge of the conceptual and theoretical basis of the modeling and machine learning techniques.
- In-class activities (five days, 08:00-16:00) with a mixture of short lectures, hands-on exercises, and group work. Key concepts will be highlighted in lectures, but the focus is on student-oriented learning through conceptual/theoretical exercises and computer-based practical data analysis.
- Students should expect to set aside 7 working hours in between the course days to complete in-class exercises and to prepare for the next course day.
- Data will be from the health science research domain.
Course material:
- Textbook: An Introduction to Statistical Learning. James, Witten, Hastie, Tibshirani. https://www.statlearning.com/
- Video lectures
- Notes and exercise material
Course evaluation:
- As the course progresses, the student hands in in-class exercises. Two weeks after the last course day, the student hands in i) a completed exercise portfolio based on the in-class exercise work and ii) a group report. The course is assessed pass/fail by internal evaluation.
Instructors: Peter Mondrup Rasmussen
Venue: Aarhus University, Aarhus
Participation in the course is without cost for:
- PhD students and Health Research Year students from Aarhus University
- PhD students enrolled at partner universities of the Nordoc collaboration
- PhD students from other institutions in the open market agreement for PhD courses
Course dates:
- 03 March 2025 08:00 - 16:00
- 05 March 2025 08:00 - 16:00
- 07 March 2025 08:00 - 16:00
- 11 March 2025 08:00 - 16:00
- 13 March 2025 08:00 - 16:00