Type | Advanced Lecture (6 ECTS) |
Lecturers | Prof. Dr. Jilles Vreeken and Prof. Dr. Isabel Valera |
Assistants | Osman Ali Mian (lead), Miriam Rateike, Joscha Cueppers, and to be announced |
Tutors | to be announced |
Contact | eml-ta (at) mmci.uni-saarland.de |
Lectures | Thursdays, 14–16 o'clock (sharp) via Zoom and YouTube |
Tutorials | Mondays and Tuesdays, 12–14 o'clock via Zoom |
Office Hours |
Prof. Dr. Jilles Vreeken and Prof. Dr. Isabel Valera: after each lecture; Assistants: by appointment |
Summary |
In this course we will discuss the foundations – the elements – of machine learning. In particular, we will focus on the ability, given a data set, to choose an appropriate method for analyzing it, to select appropriate parameters for the model generated by that method, and to assess the quality of the resulting model. Both theoretical and practical aspects will be covered. What we cover will be relevant for computer scientists in general as well as for other scientists involved in data analysis and modeling. (This course replaces the course Elements of Statistical Learning, and will be held in English.) |
Prerequisites |
The course is aimed at students in computer science, bioinformatics, math, and general sciences with a mathematical background. Students should know linear algebra and have good basic knowledge of statistics, for example by having successfully taken Statistics Lab or Mathematics for Computer Scientists III. |
Each problem set will cover theoretical proofs and programming exercises with roughly equal weight. In general, the deadlines are on the day indicated in the schedule at 10:00 Saarbrücken standard-time. You are free to hand in earlier. Further details will be announced in the first lecture.
As our programming language we will use R, a language for statistical computing. It is freely available for Windows, Linux and Mac. As a vectorized programming language, it is ideally suited for the problems we will encounter. There are also many freely available packages (or libraries) for a variety of classification and regression tasks, and for conveniently visualizing the results of statistical analyses.
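To illustrate what "vectorized" means in practice, here is a minimal sketch that computes squared residuals once with an explicit loop and once with a single vectorized expression. The toy data and the candidate model y = 2x are made up for illustration and are not part of the course material.

```r
# Hypothetical toy data (not from the course material)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Loop version: squared residual of the candidate model y = 2 * x, one element at a time
sq_loop <- numeric(length(x))
for (i in seq_along(x)) {
  sq_loop[i] <- (y[i] - 2 * x[i])^2
}

# Vectorized version: the same computation applied to whole vectors at once
sq_vec <- (y - 2 * x)^2

# Both versions agree; the vectorized one is shorter and usually faster
stopifnot(isTRUE(all.equal(sq_loop, sq_vec)))

# Mean squared error of the candidate model
mse <- mean(sq_vec)
print(mse)
```

Arithmetic operators such as `-`, `*` and `^` work element-wise on vectors in R, which is why the loop can be dropped entirely.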
Hand in your solutions as follows. For the theoretical exercises, you may hand in your solutions in handwritten form before the lecture, or send one PDF file with all the answers by email to eml-ta (at) mmci.uni-saarland.de. For the programming exercises, send a single email with both your R code as a .R file (it should run without errors with the command "Rscript YourCode.R") and a PDF answering the questions and showing the generated plots (if any).
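As a rough sketch of what a script that runs under "Rscript YourCode.R" might look like, the skeleton below simulates some data, fits a linear model, and writes a plot to a PDF file. The simulated data, the model, and the output file name `exercise_plot.pdf` are assumptions chosen purely for illustration; the actual exercises will specify what to compute.

```r
# YourCode.R -- hypothetical skeleton, run with: Rscript YourCode.R

# Fix the random seed so that the results are reproducible
set.seed(1)

# Simulated data (an assumption for illustration only)
x <- seq(0, 1, length.out = 50)
y <- 1 + 2 * x + rnorm(50, sd = 0.1)

# Fit a simple linear regression and print the estimated coefficients
fit <- lm(y ~ x)
print(coef(fit))

# Open a PDF device explicitly so the plot ends up in a known file
pdf("exercise_plot.pdf")
plot(x, y, main = "Simulated data with fitted line")
abline(fit)
invisible(dev.off())  # close the device so the file is written completely
```

Opening the device with `pdf()` makes the output location explicit; when a script is run non-interactively without opening a device, R writes plots to a default file named `Rplots.pdf` instead.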
There will be one tutorial per week. In the week after you have submitted an assignment, the solution will be presented in the tutorial sessions on Monday and Tuesday at 12:00, respectively. We will also help you with the current problem set. In the following week, we will return the corrected sheets to you on Monday or Wednesday, respectively. We will also recapitulate the lectures, and have some time for discussions.
R (version 3.2.3) is installed on the CIP pool computers and can be started by invoking R from the command line.
The official web site of the R project is r-project.org. You can download R for Windows, Linux and Mac from there. Additional packages, documentation and tutorials are also available for download from the official web site. Useful manuals and tutorials include:
The CRAN Contributed Documentation lists many other tutorials for R beginners and advanced programmers.
You can also check out RStudio, an open-source IDE for R.
You need a cumulative 50% of the points in the problem sets (in both theoretical and programming exercises) to be admitted to the exam.
To participate successfully, you need to register for the exam in the LSF/HISPOS system of Saarland University – this will be possible as soon as the exam date has been entered into the system (this usually happens a few weeks into the semester).
The final exam will be oral if student numbers allow, and written otherwise. The final decision on this will be made three weeks into the course. The final exam will cover all the material discussed in the lectures and the required reading. The exam dates will be announced in due time.
This course was originally developed by Thomas Lengauer, and we thank him for kindly providing his lecture materials and experience.