Big Data for Engineers - Spring 2020



Latest information

Exam Environment (new)

The exam will be computer-based. Students will have at their disposal an exam environment which features the following software:

  • Anaconda 2019.10
    • ipython-sql
    • psycopg2
    • pyspark 
    • Apache Spark 2.4.4
  • PostgreSQL 12
  • Rumble (update: there will not be a Rumble installation, so that you can focus the practical aspects of your preparation on PostgreSQL and Spark. There may still be questions on JSONiq/Rumble, though.)

Feel free to install these locally and try them out, so that you are acquainted with the environment before the exam.

You will find a previous examination in paper form here.

The exam takes place in the Moodle environment that you are already familiar with from the exercise quizzes. Most questions will be in the form of multiple choice questions, filling textboxes with short answers, drag and drop, etc.

Datasets will be stored on the machine or (for PostgreSQL) prepopulated. The exam environment will also contain a Jupyter notebook (a kind of cheatsheet) to get you started with them. You can find a version of this cheatsheet (with the actual datasets used at the exam removed) here. The exam may involve questions asking you to query these provided datasets. As a consequence, we recommend preparing yourself well to write SQL or Spark SQL, as well as (in Python) Spark RDD or DataFrame queries. For such questions, a final result is asked for and used for grading, and you are also asked to provide the query that you used.
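To practice the "final result plus query" pattern described above, a minimal sketch along the following lines may help. It is not taken from the exam cheatsheet: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a database server, and the table `users`, its columns, and the rows are hypothetical. The query only uses standard SQL aggregation, so the same text should also be a reasonable starting point in PostgreSQL or Spark SQL.

```python
import sqlite3

# Hypothetical toy data; the actual exam datasets are not public.
ROWS = [
    ("alice", "zurich", 3),
    ("bob", "zurich", 5),
    ("carol", "basel", 2),
    ("dave", "basel", 7),
    ("erin", "bern", 4),
]

# Keep the query text in one place: both the result and the query
# itself are asked for in this kind of question.
QUERY = """
SELECT city, COUNT(*) AS n_users, SUM(orders) AS total_orders
FROM users
GROUP BY city
ORDER BY total_orders DESC, city
"""


def run_query(rows=ROWS, query=QUERY):
    """Load the rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
    try:
        conn.execute("CREATE TABLE users (name TEXT, city TEXT, orders INTEGER)")
        conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query():
        print(row)
```

Running the script prints one tuple per city, ordered by total orders; the value asked for in a question would be read off this result, and the contents of `QUERY` would be submitted alongside it.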

No physical classes are held any longer due to the COVID-19 pandemic. Please contact your TA via Mattermost or e-mail to set up 1-on-1 meetings during the office hours, as listed here.

TA session

We offer four exercise sessions on Wednesday and two exercise sessions on Friday. To balance attendance across the sessions, we assign students based on the first letters of their family names. Check lecture and exercise times to see which slot is yours. Then please contact your corresponding TA via either Mattermost or e-mail.

The lecture has a 1A component, meaning that you are expected to have practical exposure to technology (1-2 hours). The TAs will have plenty of practical exercises for you and will help you get your computers set up in the exercise sessions.


ETH EduApp

We will try to make the course a bit interactive, using the ETH ticker application during lectures. You can access it as a web app or install it on your smartphone (learn how).