Module Details

The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module.
Title: INTRODUCTION TO DATA SCIENCE
Code: COMP229
Coordinator: Dr V Kurlin, Computer Science, Vitaliy.Kurlin@liverpool.ac.uk
Year: Session 2018-19
CATS Level: Level 5 FHEQ
Semester: First Semester
CATS Value: 15

Aims

1. To provide a foundation and overview of modern problems in Data Science.
2. To describe the tools and approaches for the design and analysis of algorithms for data clustering, dimensionality reduction and graph reconstruction from noisy data.
3. To discuss the effectiveness and complexity of modern Data Science algorithms.
4. To review applications of Data Science to Vision, Networks and Materials Chemistry.

Learning Outcomes

describe modern problems and tools in data clustering and dimensionality reduction,

formulate a real data problem in a rigorous form and suggest potential solutions,

choose the most suitable approach or algorithmic method for given real-life data,    

visualise high-dimensional data and extract hidden non-linear patterns from the data.


Syllabus

1.    Metric Geometry (6 lectures): point clouds, distance functions, metric spaces, isometries and invariants, equivalence of point clouds up to linear transformations.
2.    Clustering methods (6 lectures): graphs and trees, a minimum spanning tree, union-find algorithm, clustering based on connectivity, centroids, densities and distributions.
3.    Computational Geometry (6 lectures): Voronoi decompositions, alpha-complexes, the Reeb graph, the Mapper algorithm, the graph reconstruction problem from noisy data.
4.    Dimensionality reduction (6 lectures): linear operators, eigenvalues and eigenvectors, Principal Component Analysis (PCA) and Singular-Value Decomposition (SVD).
5.    Geometric Data Analysis (6 lectures): graph Laplacians in spectral graph theory, graph partitioning algorithms, connectivity of networks, shape descriptors.
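
As an illustration of syllabus item 2, the sketch below shows one way connectivity-based clustering can be built from a minimum spanning tree and a union-find structure. It is a minimal example and not taken from the module materials: the function names find and single_linkage, the parameter k and the choice of Euclidean distance are all assumptions made for this illustration.

```python
import math


def find(parent, i):
    """Return the root of i in the union-find forest, compressing the path."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i


def single_linkage(points, k):
    """Split points into k clusters (hypothetical helper, 1 <= k <= len(points)).

    Runs Kruskal's minimum spanning tree algorithm on the complete
    Euclidean graph and stops as soon as k connected components remain.
    """
    n = len(points)
    # All pairwise edges, sorted by length (shortest first).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))
    components = n
    for _, i, j in edges:
        if components == k:
            break
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            components -= 1
    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(parent, i), len(labels)) for i in range(n)]


cloud = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(single_linkage(cloud, 2))  # [0, 0, 0, 1, 1]
```

Stopping Kruskal's algorithm early is equivalent to deleting the k-1 longest edges of the minimum spanning tree, which is why the two distant groups in the example end up in separate clusters.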

Teaching and Learning Strategies

Lecture - Formal Lectures

Tutorial - Tutorials with 4-5 formative assessments (marked by demonstrators) - using problems similar to exam questions.


Teaching Schedule

Study Hours
Lectures: 30 (Formal Lectures)
Tutorials: 10 (Tutorials with 4-5 formative assessments, marked by demonstrators, using problems similar to exam questions)
Total contact hours: 40
Private Study: 110
TOTAL HOURS: 150

Assessment

EXAM: Unseen Written Exam
Duration: 150
Timing (Semester): Semester 1
% of final mark: 100
Resit/resubmission opportunity: Yes
Penalty for late submission: Standard UoL penalty applies
Notes: Final Exam

CONTINUOUS: Coursework
Timing (Semester): Semester 1
Resit/resubmission opportunity: No reassessment opportunity
Notes: 3-4 formative homework. There is no reassessment opportunity.

Notes (applying to all assessments): 3-4 formative homework (without hard deadlines) that is marked by demonstrators and returned to students without a contribution to the final mark.

Recommended Texts

Reading lists are managed at readinglists.liverpool.ac.uk.