Powered By

A list of companies powered by Samza
  • LinkedIn

    Samza was originally developed at LinkedIn, where it is currently used to process tracking data and logs from various services, and for streaming data pipelines.

  • Tripadvisor

    Tripadvisor is the world’s largest travel site, enabling travelers to plan and book the perfect trip. It uses Apache Samza to process billions of events daily for analytics, machine learning, and site improvement.

  • TiVo

    TiVo is a digital video recorder that allows users to save TV programs for later viewing based on an electronic TV programming schedule. It leverages Samza for real-time processing of views and ratings to help power personalized content recommendations and analytics.

  • Slack

    Slack uses Samza to build its streaming data pipelines for monitoring and analytics.

  • Redfin

    Redfin provides real estate search and brokerage services through a combination of real estate web platforms. At Redfin, we use Samza and Kafka to send millions of email and push notifications to our customers every day. We chose Samza for distributed processing because it integrates well with Kafka. Samza also provides managed state and resilient local storage, which we have found very useful.

  • Intuit

    At Intuit, we use Samza to enrich events with more contextual data from various sources (CMDB, Change Management, Incident Management, Problem Management). This gives us more meaningful events that an operations centre person can act on.

  • Netflix

    Netflix uses single-stage Samza jobs to route over 700 billion events (1 petabyte) per day from fronting Kafka clusters to S3/Hive. A portion of these events are routed to Kafka and Elasticsearch with support for custom index creation, basic filtering, and projection. We run over 10,000 Samza jobs in that many Docker containers.

  • Optimizely

    Optimizely, the world’s leader in customer experience optimization, uses Apache Samza to aggregate and enrich billions of events per day to power real-time analytics of experiments and personalization experiences.

  • VMware

    vRealize Network Insight (vRNI) is VMware’s flagship product for delivering intelligent operations for software-defined network environments (e.g. NSX).

    At the heart of the vRNI architecture is a set of distributed processing and analytics modules that crunch large amounts of streaming data on a cluster of multiple machines. It is critical that these operations are carried out in a way that is reliable, efficient, and robust, even in the face of dynamic faults in the underlying infrastructure layers. We have been successfully using Apache Samza as a distributed streaming data processing framework for executing these analytical modules reliably and efficiently at a very large scale, which helps us focus on our core business problems.

  • Banno

    Jack Henry and Associates is an S&P 400 company that supports more than 11,300 financial institutions with core processing services. It leverages Samza to process user-activity data across its Banno suite of products for financial institutions.

  • Fortscale

    Fortscale is redefining behavioral analytics with the industry’s first embeddable engine, making behavioral analytics available for everyone. It uses Samza to process security events as part of its data ingestion pipelines and to create online machine learning models.

  • DoubleDutch

    DoubleDutch provides mobile applications and performance analytics for events, conferences, and trade shows for more than 1,000 customers including SAP, UBM, and Urban Land Institute. It uses Samza to power its analytics platform and stream data live into an event dashboard for real-time insights.

  • Metamarkets

    Metamarkets offers an interactive analytics platform for buyers and sellers of programmatic advertising. It uses Samza to transform and join real-time event streams, then forward them into a Druid cluster for interactive querying.

  • Movio

    Movio offers data-driven marketing solutions for the film industry. At Movio, we use Samza to process and enrich billions of change data capture events on all databases in real time.

  • Ntent

    Ntent blends semantic search with natural language processing technologies to predict and create relevant content experiences. At Ntent, we use Samza to power our streaming content ingestion system. We take crawled web pages and news articles, and pass them through a multi-stage processing pipeline that cleanses, classifies, extracts features that power other learning models, stores, and indexes the content for search.

  • State

    State is a public global opinion network that focuses on empowering individuals, democracy, and social progress. It uses Samza to process and join streams of changes from MongoDB to update a wide range of real-time services that support the website and mobile apps. These include search, user recommendations, opinion metrics, and more.

  • MobileAware

    At MobileAware, we use Samza to enrich events with more contextual data from various sources (CMDB, Change Management, Incident Management, Problem Management). This gives us more meaningful events that an operations centre person can act on.

  • ImproveDigital

    Improve Digital uses Samza as the foundation of its real-time processing capabilities, data analytics, and alerting systems.

  • HappyPancake

    Happy Pancake, Northern Europe’s largest internet dating service, is using Samza for all event handlers and data replication.

  • VinTank

    VinTank is the leading software solution for social media management for the wine and hospitality industry. It uses Samza to power its social media analysis and NLP pipeline. Measuring over one billion conversations about wine, profiling over 30 million social wine consumers, and serving over 1,000 wine brands, VinTank helps wineries, restaurants, and hotels connect with and understand their customers.