Stream processing is increasingly relevant in today’s world of big data, thanks to the lower latency, higher-value results, and more predictable resource utilization afforded by stream processing engines. At the same time, without a solid understanding of the necessary building blocks, streaming can feel like a complex and subtle beast. It doesn’t have to be that way.
Join Davor Bonaci for a tour of stream processing concepts via a walkthrough of the easiest to use yet most sophisticated stream processing model on the planet, Apache Beam. You’ll explore a series of examples that help shed light on the important topics of windowing, watermarks, and triggers; observe firsthand the different shapes of materialized output made possible by the flexibility of the Beam streaming model; experience the portability afforded by Beam, as you work through examples using the runner of your choice (Apache Flink, Apache Spark, or Google Cloud Dataflow); and interact with engineers who have years of experience with massive-scale stream processing.
Requirements
Course Dates
Location
Course Length
Onsite limited
Language
Davor serves as chair of the Apache Beam Project Management Committee and has been regularly committing code to the project since its inception. He works as a Senior Software Engineer at Google.
Before Beam, Davor worked on its predecessor, Google Cloud Dataflow, from its beginnings, most recently leading the development of the Dataflow SDK for Java.
Gris Cuevas is an Open Source Program Manager at Google Cloud and an aspiring Data Scientist. She is currently pursuing a master's degree in Operations Research and Data Science at UC Berkeley. Gris has worked on developing online communities for the past 7 years and is now collaborating, as part of a research team at Google, on the design of an algorithm to predict author quality in online forums. Gris likes to solve undefined problems and to spearhead solutions no one has designed before. She's learning to juggle, and she loves The Beatles and green tea in all forms.
Pablo is a Software Engineer from Mexico City. He lives in Seattle and works on making Google Cloud Dataflow the best runner for Beam. He's worked all across the stack, mostly in Python and Java. His favorite activities are traveling and getting drunk with the locals.
9:30 - 10:30 AM
Arrival, networking, environment setup
10:30 - 11:30 AM
Introduction to streaming concepts and Apache Beam
11:30 AM - 12:00 PM
Case study: developing a data processing pipeline for a mobile game
12:00 - 2:00 PM
Exercises
2:00 - 3:00 PM
Lunch!
3:00 - 4:00 PM
Unified batch and stream processing model in Apache Beam
4:00 - 5:00 PM
Exercises, LeaderBoard and GameStats
Sign up to receive notifications about upcoming Wizeline Academy courses
Interested in sharing your expertise at Wizeline Academy? Send us an email at academy@wizeline.com