Samza Meetups

March 2019
Apache Samza 1.0: Recent advances and our plans for the future
» Presented At ─ LinkedIn
» Presented By ─ Prateek Maheshwari
Upcoming Event!
Apache Samza has reached a major milestone with its recent 1.0 release. In this talk, we step back and take stock of the major new features and enhancements in Samza 1.0. We also take a sneak peek at what's next on our roadmap. Both Stream Processing veterans and developers new to Stream Processing will discover new features to leverage for their applications.
Oct 2018
Operating Samza at LinkedIn
» Presented At ─ LinkedIn
» Presented By ─ Abhishek Shivanna and Stephan Soilleau
Operating a streaming platform that processes over a trillion messages daily across thousands of applications is a daunting task. This talk shares best practices for operating Samza as a managed service.
July 2018
Beam me up Samza: How we built a Samza Runner for Apache Beam
» Presented At ─ LinkedIn
» Presented By ─ Xinyu Liu
Apache Beam is an open source unified programming model for defining and executing data processing pipelines. This talk explains how LinkedIn built a Samza Runner to leverage Beam.
Concourse - Near real-time notifications platform at LinkedIn
» Presented At ─ LinkedIn
» Presented By ─ Ajith Muralidharan and Vivek Nelamangala
Concourse is LinkedIn’s first near-real-time targeting and scoring platform for notifications. This talk provides an in-depth overview of the design and discusses various scaling optimizations.
June 2018
Stream Processing at LinkedIn with Apache Samza (Bangalore Kafka Group Meetup)
» Presented At ─ LinkedIn
» Presented By ─ Abhishek Shivanna
This talk ranges from an introduction to stream processing concepts using Samza to deep dives into use cases like Notifications and Viewport tracking at LinkedIn.
March 2018
Conquering the Lambda architecture in LinkedIn metrics platform with Apache Calcite and Apache Samza
» Presented At ─ LinkedIn
» Presented By ─ Khai Tran
Metrics play an important role in data-driven companies like LinkedIn. To serve metrics in real time, LinkedIn built an extension to the offline platform that uses Apache Calcite to auto-generate a Samza real-time flow from existing offline transformation code with a single command.
Building Venice with Apache Kafka & Samza
» Presented At ─ LinkedIn
» Presented By ─ Gaojie Liu
Venice is a distributed key-value store, which specializes in serving the datasets computed in Hadoop and Samza at LinkedIn. This talk covers how LinkedIn built Venice by leveraging Kafka and how it empowers new Samza use cases at LinkedIn.
December 2017
Stream processing using Samza-SQL@LinkedIn
» Presented At ─ LinkedIn
» Presented By ─ Srinivasulu Punuru
Deep Dive into Samza's SQL library, API and architecture
Real-time Indexing of LinkedIn’s Economic Graph
» Presented At ─ LinkedIn
» Presented By ─ Almog Gavra
A look into LinkedIn’s Search Engine indexing pipeline, focusing on its use of Kafka and Samza to ingest over 10K events per second of real-time updates
November 2017
Unified Stream Processing at Scale with Apache Samza
» Presented At ─ BigDataSpain 2017
» Presented By ─ Jacob Maes
Deep dive into the stream processing ecosystem at LinkedIn, with use cases in the batch and streaming worlds
September 2017
Samza at Redfin: Using Streaming to Help Home Buyers and Sellers
» Presented At ─ LinkedIn
» Presented By ─ Brian Hanks
Redfin sends millions of notifications per day to its customers to help them buy and sell homes. This talk explains how Redfin developed a streaming system based on Samza that sends those notifications with low latency, resilience, horizontal scalability, and high throughput.
Real-time Indexing of LinkedIn’s Economic Graph
» Presented At ─ LinkedIn
» Presented By ─ Almog Gavra
In this talk, Almog covers the basics of LinkedIn’s Search Engine indexing pipeline, focusing on how they leveraged Kafka and Samza to ingest over 10K events per second of real-time updates.
Unified Batch & Stream Processing with Apache Samza
» Presented At ─ Dataworks Summit Sydney 2017
» Presented By ─ Navina Ramesh
In this talk, Navina shares insights on the convergence of batch data pipelines at LinkedIn with Apache Samza, Samza's unified data processing API, and its flexible deployment model.
August 2017
Samza: Stateful Scalable Stream Processing at LinkedIn
» Presented At ─ VLDB 2017
» Presented By ─
Deep dive into the stream processing infrastructure at LinkedIn, which operates at a scale of 2.1 trillion messages per day
May 2017
Streaming Data Pipelines with Brooklin
» Presented At ─ LinkedIn
» Presented By ─ Samarth Shetty
Brooklin is a system at LinkedIn for creating data pipelines that connect streaming data sources (e.g. Kafka, EventHubs, Change-Capture streams) with nearline applications. The talk covers Brooklin, the problems it addresses, its design, usage and future directions.
Managed or stand alone, streaming or batch; Unified processing with the Samza Fluent API
» Presented At ─ LinkedIn
» Presented By ─ Yi Pan
Samza 0.13 improves the simplicity and portability of Samza applications. The new fluent API supports common operations like windowing, map and join on streams. This talk also covers Samza Standalone, which empowers developers to deploy and scale Samza applications as a simple embedded library.
What it takes to process a trillion events a day? Case studies in scaling stream processing at LinkedIn
» Presented At ─ ApacheCon Big Data 2017
» Presented By ─ Jagadish Venkatraman
Deep dive into hard problems in stream processing and case studies of LinkedIn's communication platform and News feed platform leveraging Samza
February 2017
Asynchronous Processing and Multithreading in Apache Samza
» Presented At ─ LinkedIn
» Presented By ─ Xinyu Liu
Data use cases for streaming, asynchronous model improvements and performance enhancements
Batching to Streaming Analytics at Optimizely
» Presented At ─ LinkedIn
» Presented By ─ Vignesh Sukumar, Mike Davis, Hao Xia
How Optimizely manages billions of events per day and generates real time for Personalization and Recommendation
November 2016
Apache Samza: Past, Present, and Future
» Presented At ─ LinkedIn
» Presented By ─ Kartik Paramasivam
Scaling up Near real-time Analytics
» Presented At ─ QCon'16 SF
» Presented By ─ Yi Pan (LinkedIn) and Chinmay Soman (Uber)
Deep dive into real time analytics platform use cases at Uber and Samza SQL - Apache Calcite integration of Samza at LinkedIn
June 2016
Air Traffic Controller: Using Samza to Manage Communications with Members
» Presented At ─ LinkedIn
» Presented By ─ Cameron Lee & Shubhanhu Naga
Air Traffic Controller (ATC) is a system built on top of Samza that manages many of the communication channels LinkedIn has with its members. This talk covers how Samza was leveraged to build ATC and some of the challenges faced along the way.
Scalable Complex Event Processing on Samza
» Presented At ─ LinkedIn
» Presented By ─ Shuyi Chen
The Marketplace data team at Uber has built a scalable complex event processing platform to solve many challenging real time data needs for various Uber products. They discuss the design and architecture of the platform, and how they employ Samza, Kafka, and Siddhi at scale.
Lambdaless Stream Processing at Scale in LinkedIn
» Presented At ─ Hadoop Summit 2016
» Presented By ─ Yi Pan & Kartik Paramasivam
May 2016
Will it Scale? The Secrets behind Scaling Stream Processing Applications
» Presented At ─ ApacheCon Big Data NA 2016
» Presented By ─ Navina Ramesh
February 2016
StatServer-Samza: Near Real-time Analytics
» Presented At ─ LinkedIn
» Presented By ─ Tomy Tsai
StatServer is a near real-time analytics service widely used at LinkedIn and is in the process of being migrated to the Samza platform.
October 2015
Benchmarking Apache Samza
» Presented At ─ LinkedIn
» Presented By ─ Tao Feng
New Features in Samza 0.10.0
» Presented At ─ LinkedIn
» Presented By ─ Navina Ramesh
Netflix Keystone Pipeline
» Presented At ─ LinkedIn
» Presented By ─ Monal Daxini
Essential ingredients for real time stream processing @Scale
» Presented At ─ BigData 2015 @Spain
» Presented By ─ Kartik Paramasivam
July 2015
Athena - Stream Processing Platform
» Presented At ─ LinkedIn
» Presented By ─ Chinmay Soman
Harvesting the Power of Samza in LinkedIn Feed
» Presented At ─ LinkedIn
» Presented By ─ Mohamed Mahmoud
June 2015
Going Realtime with Kafka and Samza at Improve Digital
» Presented At ─ GeekOut
» Presented By ─ Garry Turkington
May 2015
System Latency Diagnosis for Microservices with Samza and Druid
» Presented At ─ LinkedIn
» Presented By ─ Roger Hoover
Indexing Time Series Streams with Samza and Druid
» Presented At ─ LinkedIn
» Presented By ─ Gian Merlino
Clojure with Samza: application architecture...implementation challenges
» Presented At ─ LinkedIn
» Presented By ─ Gian Merlino
February 2015
Optimizing Streaming SQL Queries
» Presented At ─ LinkedIn
» Presented By ─ Julian Hyde
Scalable real-time data processing with Apache Samza
» Presented At ─ Jfokus
» Presented By ─
November 2014
Moving Towards a Streaming Architecture
» Presented At ─ Strata EU
» Presented By ─
Scalable stream processing with Apache Samza and Apache Kafka
» Presented At ─ ApacheCon EU
» Presented By ─
Samza in LinkedIn: How LinkedIn Processes Billions of Events Everyday in Real-time
» Presented At ─ QCon SF
» Presented By ─
October 2014
Staying agile in the face of the data deluge
» Presented At ─ Span Conference
» Presented By ─ Martin Kleppmann
Building real-time data products at LinkedIn with Apache Samza
» Presented At ─ Strata/Hadoop World
» Presented By ─ Martin Kleppmann
September 2014
Turning the database inside out with Apache Samza
» Presented At ─ Strangeloop
» Presented By ─ Martin Kleppmann
Samza: Reliable Stream Processing atop Apache Kafka and Hadoop YARN
» Presented At ─ Global Big Data Conference
» Presented By ─
May 2014
Samza at LinkedIn: Taking Stream Processing to the Next Level
» Presented At ─ Los Angeles Data Engineering Meetup
» Presented By ─
November 2013
Samza: Real-time Stream Processing at LinkedIn
» Presented At ─ QCon SF 2013
» Presented By ─ Chris Riccomini
Samza: Real-time Stream Processing at LinkedIn
» Presented At ─ HUG at LinkedIn
» Presented By ─ Chris Riccomini
September 2013
Apache Samza: Reliable Stream Processing atop Apache Kafka and Hadoop YARN
» Presented At ─ London HUG
» Presented By ─
Introduction to Samza
» Presented At ─ YARN Meetup
» Presented By ─ Chris Riccomini