Topics in Algorithmic Data Analysis SS'18


Course Information

Type: Advanced Lecture (6 ECTS)
Lecturer: Dr. Jilles Vreeken
Email: tada-staff (at) mpi-inf.mpg.de
Lectures: Thursdays, 10–12 o'clock in Room E1.7 0.01.
Summary: In this advanced course we'll be investigating hot topics in data mining that the lecturer thinks are cool. This course is for those of you who are interested in Data Mining, Machine Learning, Data Science, Big Data Analytics, or, as the lecturer prefers to call it, Algorithmic Data Analysis. We'll be looking into how to discover significant and useful patterns from data, efficiently measure non-linear correlations and determine causal relations, as well as how to analyse large graphs.

Schedule

Month    Day  Topic                         Slides  Assignment              Req. Reading  Opt. Reading
April    19   Lecture 1                             1st assignment out
         26   Lecture 2                             deadline 1st, 2nd out
May      3    Jilles travelling – no class
         10   yay holiday – no class
         17   Lecture 3
         24   Lecture 4                             deadline 2nd, 3rd out
         31   yay holiday – no class
June     7    Lecture 5
         14   Lecture 6
         21   Lecture 7                             deadline 3rd, 4th out
         28   Lecture 8
July     5    Lecture 9
         12   Lecture 10
         19   Lecture 11
         26   oral exams                            deadline 4th
October  8    re-exams

Materials

All required and optional reading will be made available. You will need a username and password to access the papers outside the MMCI/MPI network. Contact the lecturer if you don't know the username or password.

In case you do not have a strong enough background in data mining, machine learning, or statistics, these books may help to get you on your way [1,2,3]. The university library kindly keeps hard copies of these books available in a so-called Semesterapparat.

Required Reading

Optional Reading

[1] Aggarwal, C.C. Data Mining: The Textbook. Springer, 2015.
[2] Cover, T.M. & Thomas, J.A. Elements of Information Theory. Wiley-Interscience, New York, 2006.
[3] Wasserman, L. All of Statistics. Springer, 2005.

Course format

The course has two hours of lectures per week. There are no weekly tutorial group meetings. Instead, you will have to write four essays based on the material covered in the lectures and on scientific articles assigned by the lecturer.

Structure and Content

In general terms, the course will consist of

  1. lectures, and
  2. assignments that include critically reading scientific articles.

At a high level, the topics we will cover will include

  1. Mining Interesting Patterns
  2. Mining Complex Correlations
  3. Mining Large Graphs

Loosely speaking, students will learn about current hot topics in exploratory data analysis, with an emphasis on statistically well-founded approaches, including those based on information-theoretic principles.

Assignments

Students will individually do one assignment per topic, four in total. For every assignment, you will have to read one or more research papers and hand in a report that critically discusses this material and answers the assignment questions. Reports should summarise the key aspects but, more importantly, should include original and critical thought that shows you have acquired a meta-level understanding of the topic; plain summaries will not suffice. All sources you've drawn from should be referenced. The expected length of a report is 3 pages, but there is no limit.

The deadlines for the reports are at 10:00 Saarbrücken standard time. You are free to hand in earlier.

Grading and Exam

The assignments will be graded on a scale of Fail, Pass, Good, and Excellent. Any assignment not handed in by the deadline is automatically considered failed and cannot be redone. You are allowed to redo one assignment that was handed in on time but graded Fail: you have to hand in the improved assignment within two weeks. Two failures mean you are not eligible for the exam, and hence fail the course.

You can earn up to three bonus points by obtaining Excellent or Good grades for the assignments. An Excellent grade gives you one bonus point, as does every pair of Good grades, up to a maximum of three bonus points. Each bonus point improves your final grade by 1/3, assuming you pass the final exam. For example, if you have two bonus points and you receive a 2.0 in the final exam, your final grade will be 1.3. You fail the course if you fail the final exam, irrespective of your possible bonus points. Failed assignments do not reduce your final grade, provided you are eligible to sit the final exam.
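
For those who like to see the arithmetic spelled out, here is a minimal, unofficial sketch of the bonus rule in Python. The function names `bonus_points` and `final_grade` are made up for illustration. The grade steps (1.0, 1.3, 1.7, ..., 4.0, with 5.0 as fail) are an assumption based on the usual German grading scale, as is the cap at 1.0; this page only states that each bonus point improves the grade by 1/3. The sketch also assumes the list of assignment grades already reflects any redone assignment.

```python
# Unofficial sketch of the bonus-point rule described above.
# ASSUMPTION: grades follow the usual German steps; the course page only
# says that each bonus point improves the final grade by 1/3.
PASSING_GRADES = [1.0, 1.3, 1.7, 2.0, 2.3, 2.7, 3.0, 3.3, 3.7, 4.0]
FAIL_GRADE = 5.0
MAX_BONUS = 3


def bonus_points(assignment_grades):
    """One point per Excellent, one per two Goods, capped at three."""
    excellent = assignment_grades.count("Excellent")
    good = assignment_grades.count("Good")
    return min(MAX_BONUS, excellent + good // 2)


def final_grade(exam_grade, assignment_grades):
    """Combine the oral exam grade with the bonus points."""
    # Two failed assignments: not eligible for the exam, course failed.
    if assignment_grades.count("Fail") >= 2:
        return FAIL_GRADE
    # A failed exam fails the course, irrespective of bonus points.
    if exam_grade not in PASSING_GRADES:
        return FAIL_GRADE
    step = PASSING_GRADES.index(exam_grade)
    # ASSUMPTION: the grade cannot improve beyond 1.0.
    improved = max(0, step - bonus_points(assignment_grades))
    return PASSING_GRADES[improved]


if __name__ == "__main__":
    # The example from the text: two bonus points and a 2.0 in the exam.
    print(final_grade(2.0, ["Excellent", "Excellent", "Pass", "Pass"]))
```

The last line reproduces the example above: two Excellent grades give two bonus points, which move a 2.0 two steps up the scale to 1.3.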

The final exam will be oral. It will cover all the material discussed in the lectures and the topics on which you did your assignments. The main exam will be on July 26th. The re-exam will be on October 8th. The exact time slot per student will be announced by email. Inform the lecturer of any potential clashes as soon as you know of them.

Prerequisites

Students should have a basic working knowledge of data analysis and statistics, e.g. from having successfully taken courses related to data mining, machine learning, and/or statistics, such as Information Retrieval and Data Mining, Machine Learning, Probabilistic Graphical Models, Statistical Learning, etc.