
DM847: Introduction to Bioinformatics
Entry requirements
Academic preconditions
Students taking the course are expected to:
- Have basic knowledge in probability theory
- Have basic knowledge in algorithmics
- Have proficiency in programming
Course introduction
The course provides an academic basis for solving bioinformatics problems by modelling them and implementing computer programs. It also provides a scientific basis for analyzing the advantages and disadvantages of different computational methods in bioinformatics, developing new variants of these methods where a specific problem requires it, communicating research-based knowledge, and discussing professional and scientific problems with both specialists and non-specialists.
Expected learning outcome
- Explain and understand the central dogma of molecular biology, central aspects of gene regulation, the basic principle of epigenetic DNA modifications, and the special features of bacterial and phage genetics
- Model ontologies for biomedical data dependencies
- Design systems biology databases
- Explain and implement DNA & amino acid sequence analysis methods (HMMs, scoring matrices, and efficient statistics with them on data structures like suffix arrays)
- Explain and implement statistical learning methods on biological networks (network enrichment)
- Explain the special features of bacterial genetics (the operon prediction trick)
- Explain and implement methods for suffix trees, suffix arrays, and the Burrows-Wheeler transformation
- Explain de novo sequence pattern screening with the EM algorithm and entropy models
- Explain and implement basic methods for supervised and unsupervised data mining, as well as their application to biomedical OMICS data sets
Content
- Central dogma of molecular genetics, epigenetics, and bacterial and phage genetics
- Design of online databases for molecular biology content (ontologies, and example databases: NCBI, CoryneRegNet, ONDEX)
- DNA and amino acid sequence pattern models (HMMs, scoring matrices, mixed models, efficient statistics with them on big data sets)
- Special features of bacterial genetics (sequence models and functional models for operon prediction)
- De novo identification of transcription factor binding motifs (recursive expectation maximization, entropy-based models)
- Analysis of next-generation DNA sequencing data sets (memory-aware mapping of short sequence reads with the Burrows-Wheeler transformation and suffix arrays, bi-modal peak calling)
- Visualization of biological networks (graph layout: small but highly variable graphs vs. huge but rather static graphs)
- Systems biology and statistics on networks (network enrichment with CUSP, jActiveModules and KeyPathwayMiner)
- Basic supervised and unsupervised classification methods for OMICS data analysis
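As an illustration of two of the topics listed above (suffix arrays and the Burrows-Wheeler transformation), the following is a minimal Python sketch, not course material. The function names and the "$" sentinel are assumptions chosen for this example, and the naive constructions shown here are far slower than the memory-aware methods used for read mapping in practice.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in lexicographic order."""
    # Naive construction by sorting the suffixes directly; fine for short examples only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform of text, derived from its suffix array."""
    # Assumption: the sentinel does not occur in text and sorts before all other symbols.
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    t = text + sentinel
    # The transform collects the character preceding each sorted suffix
    # (index -1 wraps around to the sentinel for the suffix starting at 0).
    return "".join(t[i - 1] for i in suffix_array(t))


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Recover the original text by repeated prepend-and-sort (naive inversion)."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(c + row for c, row in zip(transformed, table))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    print(suffix_array("banana$"))      # [6, 5, 3, 1, 0, 4, 2]
    print(bwt("banana"))                # annb$aa
    print(inverse_bwt(bwt("GATTACA")))  # GATTACA
```

The transform groups identical characters that share a following context, which is what makes compressed indexes over it useful for mapping short reads against a reference.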
Literature
Examination regulations
Exam element a)
Timing
Tests
Portfolio
EKA
Assessment
Grading
Identification
Language
Duration
Examination aids
ECTS value
Additional information
- A written project assignment handed in during the course
- Final oral exam during the exam period
Indicative number of lessons
Teaching Method
Planned lessons
Total number of planned lessons: 86
Common lessons in classroom/auditorium: 41 hours
Team lessons in classroom: 45 hours
In the lectures, concepts, algorithms, and models are introduced along with examples. During these sessions, there are also quizzes that students first solve individually, then discuss with their peers, and finally review with the teacher. During the exercise hours, students practice skills and delve deeper into the contents covered in previous lectures. These hours include both individual and group work activities.
Other planned teaching activities:
- Solve assignments
- Read the assigned literature
- Practice applying the acquired knowledge