During our course we will learn how to process data with
The course will be using the following 3 programming languages
Week | Date | Topic | Teacher | Assignment (Deadline) |
---|---|---|---|---|
1 | 14/11 | Course introduction, Big and Fast data, Intro to course PLs | GG | |
1 | 15/11 | The Unix programming environment | GG | Unix (jupyter) (28/11) |
2 | 21/11 | Programming for Big Data (1) | GG | Functional programming: Scala (jupyter), Python (jupyter) (4/12) |
2 | 22/11 | Programming for Big Data (2) | GG | |
3 | 28/11 | Distributed Systems | JR | |
3 | 29/11 | Distributed Databases, Distributed filesystems | JR | |
Week | Date | Topic | Teacher | Assignment (Deadline) |
---|---|---|---|---|
4 | 5/12 | Spark: RDDs and Pair RDDs | GG | Spark (18/12) |
4 | 6/12 | Spark Internals | JR | |
5 | 12/12 | Spark SQL, Spark use cases: Synonyms with Word2Vec, Recommending bands, Predicting pull request merges | GG | |
5 | 13/12 | Live Data Processing | GG | |
6 | 19/12 | Stream processing | GG | Streaming (14/1) (Note: Optional for minor students) |
6 | 20/12 | Stream processing systems | GG | |
7 | 8/1 | Recap | GG | |
7 | 9/1 | No lecture | GG | |
There will be a resit; there is no mid-term.
You can transfer your assignment grade to the resit AS A WHOLE. No individual assignment resubmissions!
The course consists of 4 mandatory assignments (3 for minor students)
You always work in pairs: select your teammate!
Grade: \(\frac{\sum_{\mathit{assign}=1}^{4} \mathit{grade}(\mathit{assign})}{4}\), aka the mean
Most assignment grades add up to \(> 10\)
We will use Jupyter for the first 3 assignments, and IntelliJ/Scala for the last one.
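As a minimal sketch of the grade formula above, the snippet below averages four assignment grades. The function name `assignment_grade` and the example grades are hypothetical, chosen only for illustration. The sketch covers only the four-assignment case; how the grade is computed for minor students (who do 3 assignments), and whether grades above 10 are capped, is not stated here and is not modelled.

```python
def assignment_grade(grades):
    """Return the mean of exactly four assignment grades.

    Hypothetical helper: it mirrors the formula sum(grade) / 4
    and nothing more (no capping, no minor-student rule).
    """
    if len(grades) != 4:
        raise ValueError("expected exactly 4 assignment grades")
    return sum(grades) / 4


# Made-up grades, for illustration only.
example_grades = [8.0, 9.5, 7.0, 10.0]
print(assignment_grade(example_grades))
```

Because every assignment has the same weight, a weak grade on one assignment pulls the result down by exactly a quarter of the shortfall.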
We will use CPM. The course name is TI2736-B: Big Data Processing
Lab sessions every Monday morning
You are welcome to join the BDP 2018-2019 Mattermost channel
The course VM: Contains all software (Spark, HDFS, Flink) you may need pre-installed
Q: A question with a known answer; this will be revealed, but we should work together towards it!
D: An open discussion item; we need to think and discuss.
Freely available on the web, on my homepage (http://gousios.org/teaching.html)
You can print/download them before the lecture and bring them along to make additional notes.
I am looking forward to improving them! If you have suggestions, find errors, etc., let me know!
This work is (c) 2017, 2018 - onwards by TU Delft and Georgios Gousios and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.