Module Details |
The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module. |
Title | Big Data | ||
Code | CKIT525 | ||
Coordinator |
Prof FP Coenen Computer Science Coenen@liverpool.ac.uk |
||
Year | CATS Level | Semester | CATS Value |
Session 2018-19 | Level 7 FHEQ | Whole Session | 15 |
Aims |
|
|
Learning Outcomes |
|
A comprehensive understanding of the theories, models and frameworks underpinning the concept of Big-Data in a variety of organisational settings. |
|
An ability to critically apply the standard techniques of Big Data so as to design and implement effective Big Data ecosystems to support business analytics. |
|
A complete and systematic understanding of an open-source software framework for distributed data storage and distributed processing, and its practical application. |
|
An in-depth awareness of the critical issues involved in the deployment of distributed data processing pipelines. |
|
Knowledge of the application of Big Data techniques in a wider context, for example with respect to enterprise deployment and data security. |
Syllabus |
|
Week 1
Introduction to Big Data and Apache Hadoop, terminology and basic concepts.
Week 2
Big Data Ecosystems and the Big Data landscape, the six V’s of Big Data.
Week 3
Components of the Hadoop stack; attributes and uses of MapReduce, the Hadoop Distributed File System (HDFS) and YARN; installing Hadoop and running "large-dataset programs" with Hadoop.
Week 4
Modeling and managing Big Data, Big Data Management Systems (BDMS), practical work with the Cloudera Data Management Virtual Machine.
Week 5
Big Data Integration and Processing; configuring and working with BDMS schemes; further work with the Cloudera Virtual Machine.
Week 6
Data frames and document-oriented Big Data systems; predictive analytics with Pandas DataFrames, MongoDB, Splunk and Datameer.
Week 7
Big Data processing pipelines and graph analytics; distributed processing with Apache Spark components (Spark Core, pipelines, transformation engines, Spark SQL and Spark GraphX).
Week 8
Big Data enterprise deployment, integration and security issues.
|
Teaching and Learning Strategies |
|
Online Learning - Weekly seminar supported by asynchronous discussion in a virtual classroom environment facilitated by an online instructor. Students are expected to spend 7.5 hours per week in the virtual classroom, participating in discussion and working on group work and individual assessment. |
Teaching Schedule |
Lectures | Seminars | Tutorials | Lab Practicals | Fieldwork Placement | Other | TOTAL | |
Study Hours |
60 Weekly seminar supported by asynchronous discussion in a virtual classroom environment facilitated by an online instructor. |
60 | |||||
Timetable (if known) |
Students are expected to spend 7.5 hours per week in the virtual classroom, participating in discussion and working on group work and individual assessment.
|
||||||
Private Study | 90 | ||||||
TOTAL HOURS | 150 |
Assessment |
||||||
EXAM | Duration | Timing (Semester) | % of final mark | Resit/resubmission opportunity | Penalty for late submission | Notes |
CONTINUOUS | Duration | Timing (Semester) | % of final mark | Resit/resubmission opportunity | Penalty for late submission | Notes |
Coursework | Weekly Discussion Q | Whole session | 40 | No reassessment opportunity | Standard UoL penalty applies | Eight discussion questions. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: 750 - 1000 | Week 2 | 7 | No reassessment opportunity | Standard UoL penalty applies | Essay - Big Data trends and salient Hadoop ecosystem features. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: Software f | Week 3 | 10 | No reassessment opportunity | Standard UoL penalty applies | Portfolio - Managing and reporting on Big Data using Hadoop, Part 1. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: Software f | Week 4 | 7 | No reassessment opportunity | Standard UoL penalty applies | Practical - using MapReduce. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: Software f | Week 5 | 10 | No reassessment opportunity | Standard UoL penalty applies | Portfolio - Managing and reporting on Big Data using Hadoop, Part 2. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: 750 - 1000 | Week 6 | 8 | No reassessment opportunity | Standard UoL penalty applies | Essay - Big Data analytic schemes. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: Software f | Week 7 | 10 | No reassessment opportunity | Standard UoL penalty applies | Portfolio - Managing and reporting on Big Data using Hadoop, Part 3. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Coursework | one week: 750 - 1000 | Week 8 | 8 | No reassessment opportunity | Standard UoL penalty applies | Essay - Big Data systems. There is no reassessment opportunity: the nature of the adopted online learning paradigm is such that none can be offered. Instead, students who fail the module will be offered the opportunity to retake the entire module. |
Notes (applying to all assessments):
(1) Due to the nature of the online mode of instruction, this work is not marked anonymously.
(2) Students who fail the module have the opportunity to repeat the entire module.
(3) The "Standard UoL Penalty" for late submission is the penalty agreed for online programmes offered in collaboration with Laureate Online Education.
(4) For group work assessments, groups typically comprise 3 to 4 students. Both group and individual contributions are assessed and integrated to produce a final mark for each student. |
Recommended Texts |
|
Reading lists are managed at readinglists.liverpool.ac.uk. Explanation of Reading List:
The online programmes offered by the Department of Computer Science in collaboration with Laureate Online Education use online materials wherever possible, including the online resources available within the University of Liverpool’s libraries. This module does not require a specific textbook.
|