Build an Open Source Streaming Data Pipeline

FOSDEM 2022

Any conversation about Big Data would be incomplete without talking about Apache Kafka and Apache Flink: the winning open source combination for high-volume streaming data pipelines.

In this talk we'll explore how moving from long-running batches to streaming data changes the game completely. We'll show how to build a streaming data pipeline, starting with Apache Kafka for storing and transmitting messages at high throughput and low latency. Then we'll add Apache Flink, a distributed stateful compute engine, to create complex streaming transformations using familiar SQL statements.
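To make the batch-versus-streaming contrast concrete, here is a minimal sketch in plain Python. It is an in-memory stand-in, not Kafka or Flink code: the event shape `(timestamp_seconds, key, amount)`, the function name `tumbling_window_sums`, and the 10-second window are all assumptions made for illustration. It computes roughly what a windowed `GROUP BY` aggregation in SQL would, emitting each window's totals as soon as that window closes instead of waiting for the whole data set.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape: (timestamp_seconds, key, amount).
Event = Tuple[int, str, float]


def tumbling_window_sums(
    events: Iterable[Event], window_seconds: int
) -> Iterator[Tuple[int, str, float]]:
    """Yield (window_start, key, total) as each window closes.

    Assumes events arrive in timestamp order. A real pipeline cannot
    rely on that; Flink deals with out-of-order data using watermarks.
    """
    current_start: Optional[int] = None
    totals: Dict[str, float] = defaultdict(float)
    for timestamp, key, amount in events:
        # Align the timestamp to the start of its fixed-size window.
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is not None and window_start != current_start:
            # The previous window is complete: emit its results now,
            # without waiting for the rest of the stream.
            for k in sorted(totals):
                yield (current_start, k, totals[k])
            totals = defaultdict(float)
        current_start = window_start
        totals[key] += amount
    # Flush the last open window when the finite demo stream ends.
    if current_start is not None:
        for k in sorted(totals):
            yield (current_start, k, totals[k])


if __name__ == "__main__":
    demo = [(0, "a", 1.0), (3, "b", 2.0), (7, "a", 4.0), (12, "a", 5.0)]
    for row in tumbling_window_sums(demo, window_seconds=10):
        print(row)
```

In this sketch the only state kept is the running totals for the currently open window, which is the kind of state a stateful stream processor manages on your behalf; a batch job would instead read every event before producing any result.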

This session is aimed at data professionals who are ready to embrace open source streaming and make their data fly.

Speakers: Olena Kutsenko, Francesco Tisiot