DM870: Data mining and machine learning

Study Board of Science

Teaching language: Danish or English depending on the teacher, but English if international students are enrolled
EKA: N340033102
Censorship: Second examiner: External
Grading: 7-point grading scale
Offered in: Odense
Offered in: Spring
Level: Master

STADS ID (UVA): N340033101
ECTS value: 10

Date of Approval: 02-11-2018


Duration: 1 semester

Version: Approved - active

Comment

NEW course spring 2019.
The course is taught jointly with: DM868, DM566: Data Mining and Machine Learning (10 ECTS)
The course cannot be chosen by students who have followed or passed DM555, DM855, or DM859.

Entry requirements

None

Academic preconditions

Students taking the course are recommended to:

  • Have knowledge of the basic concepts of discrete methods for computer science
  • Have knowledge of the basic concepts of linear algebra
  • Have knowledge of basic algorithms and data structures
  • Be able to program

Course introduction

The aim of the course is to enable the student to choose and use techniques from Data Mining and Machine Learning, which is important for analyzing large datasets in many financial, medical, commercial, and scientific applications.

Data Mining and Machine Learning techniques enable computational systems to identify meaningful patterns in the data and to adaptively improve their performance with experience accumulated from the observed data.
This course introduces the most common techniques for performing basic data mining and machine learning tasks, and covers the basic theory, algorithms, and applications. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. Computational learning methods are introduced at a general level, with their basic ideas and intuition.

Moreover, the students have the opportunity to experiment with data mining and machine learning techniques and apply them to selected problems.

The course gives an academic basis for conducting large-scale data analysis and for conducting bachelor and master thesis projects as well as other practically oriented study activities that are part of the degree.

In relation to the competence profile of the degree it is the explicit focus of the course to:
  • Give knowledge of common data mining and machine learning tasks and methods
  • Give skills to apply common data mining and machine learning methods to real world problems
  • Give the competence to design data mining and machine learning methods
  • Give knowledge to understand and reflect on theories, methods, and practices in the computer science field
  • Give skills to acquire new knowledge in an effective and independent manner and be able to apply this knowledge in a reflective way
  • Give skills to describe, analyze and solve computer science problems applying methods and modeling formalisms from the core area and its mathematical support disciplines
  • Give skills in analyzing the advantages and disadvantages of various algorithms, especially in terms of resource consumption
  • Give skills to make and justify professional decisions
  • Give skills to describe, formulate and communicate issues and results to peers, non-specialists, project partners and users.

Expected learning outcome

The learning objectives of the course are that the student demonstrates the ability to:
  • Describe the data mining and machine learning tasks presented during the course
  • Describe the algorithms and methods presented in the course
  • Describe the topics presented in the course in precise mathematical language
  • Explain the individual steps of the mathematical derivations presented in class
  • Apply the methods to simple problems
  • Apply the methods to situations different from the ones presented in class
  • Reflect on and assess design choices for data mining and machine learning systems
  • Undertake experimental evaluation of data mining and statistical learning methods and report the results

Content

The following main topics are contained in the course:
  • basic probability
  • theory of learning (feasibility of learning, generalization, overfitting)
  • error and noise
  • bias and variance
  • training vs. testing (cross-validation, bootstrap, model selection)
  • methods (for example rule learning, Bayes learning, nearest neighbor classification, decision trees, clustering)
  • frequent pattern mining (item set mining, association rules)

Literature

See Blackboard for syllabus lists and additional literature references.

Examination regulations

Exam element a)

Timing

June

Tests

Written exam

EKA

N340033102

Censorship

Second examiner: External

Grading

7-point grading scale

Identification

Student Identification Card

Language

Normally, the same as teaching language

Examination aids

Aids allowed. A more detailed description of the exam rules will be posted under 'Course Information' on Blackboard.

ECTS value

10

Additional information

The examination form for the re-examination may differ from the form of the regular exam.

Indicative number of lessons

70 hours per semester

Teaching Method

At the Faculty of Science, teaching is organized according to the three-phase model, i.e. intro, training and study phase.

In the intro phase, concepts, theories and models are introduced and put into perspective. In the training phase, students train their skills through exercises and dig deeper into the subject matter. In the study phase, students gain academic, personal and social experiences that consolidate and further develop their scientific proficiency. Focus is on immersion, understanding, and development of collaborative skills.
Activities during the study phase:
  • Reading from textbooks
  • Solving homework
  • Applying acquired knowledge in practical projects

Teacher responsible

Name: Arthur Zimek
E-mail: zimek@imada.sdu.dk
Department: Institut for Matematik og Datalogi, Datalogi

Administrative Unit

Institut for Matematik og Datalogi (datalogi, fiktiv)

Team at Registration & Legality

NAT
