In this course, we will learn

Tools and technologies

During our course we will learn how to process data with

We will learn to store our data in HDFS and MongoDB.

The course will be using the following 2 languages

You can submit your assignments in either of them!

Schedule (part 1)

Week   Lecture  Who?  Topic                                                        Teacher
13/11  1        All   Course introduction, Big and Fast data, Intro to course PLs  GG
13/11  2        All   Programming for Big Data (1)                                 GG
20/11  1        All   Programming for Big Data (2)                                 GG
20/11  2        All   Distributed Systems                                          GG
27/11  1        All   Distributed Databases                                        GG
27/11  2        All   Map/Reduce and Hadoop                                        GG
4/12   1        All   Spark: RDDs and Pair RDDs                                    GG
4/12   2        All   Spark Internals                                              JR

Schedule (part 2)

Week   Lecture  Who?    Topic                                               Teacher
11/12  1        All     Spark SQL, Synonyms with Word2Vec                   GG
11/12  2        All     Recommending bands, Predicting pull request merges  GG
18/12  1        BSc     Stream processing                                   AK
18/12  2        BSc     Stream processing systems                           AK
18/12  1        Minors  Introduction to Data Science                        GG
18/12  2        Minors  Introduction to Data Science                        GG
8/1    1        All     Big Graphs (Maybe)                                  GG
8/1    2        All     Graph processing systems (Maybe)                    GG

Grades

There will be a resit; there is no mid-term.

You can transfer your assignment grade AS A WHOLE. No individual assignment resubmissions!

Assignments

Assignment sign-off

The deadline for each assignment is listed on the course website.

One day before the deadline (before 23:59) you must:

On the deadline date

Forming groups

All assignments are done in pairs

Getting help

Course resources

Slide symbols

Lecture notes

Bibliography