Distributed Stream Processing: Systems and Algorithms

NOTE: This seminar will not be taking place this semester.

Overview

Modern distributed stream processing technology enables continuous, fast, and reliable analysis of large-scale unbounded datasets. Stream processing has recently become highly popular across industry and academia because it both improves established data processing tasks and enables novel applications with real-time requirements. In this seminar, we will study the design and architecture of modern distributed streaming systems as well as fundamental algorithms for analyzing data streams. We will also consider current research topics and open issues in the area of distributed stream processing. In particular, students will read, review, present, and discuss a series of research and industrial papers covering the following topics:

  • Fault-tolerance and processing guarantees
  • State management
  • Windowing semantics and optimizations
  • Query languages and libraries for stream processing (e.g. Complex Event Processing, online machine learning)

Course Program and Materials


Participation

If you are interested in participating, send an email to Vasiliki Kalavri (kalavriv at inf dot ethz dot ch) listing your three preferred topics.


Examination

There is no formal examination at the end of the seminar. Each participant is graded continuously based on participation in the discussions (20%), the presentation (20%), and written reports (60%: 10% for the complete review and 50% for the weekly short reports).

Specifically, each participant is expected to fulfill the following tasks:
- Read one paper per week and, for that paper:
  - prepare a set of questions for discussion during the lecture.
  - write a short (half-page) report listing the paper's strong, weak, and interesting points.
- Prepare and deliver one 20-minute presentation on a seminar paper.
- Write one complete review for one paper of their choosing (different from the one selected for presentation).


Staff