Apache Samza 0.14

We are very excited to announce the release of Apache Samza 0.14.0. It is a major release with several highly anticipated features, namely Samza SQL, Azure EventHubs support, and an AWS Kinesis consumer.

Enhancements and Bug Fixes

Overall, 65 JIRAs were resolved in this release. Here are a few highlights:

  • SAMZA-1510 Introduce SQL semantics to Samza
  • SAMZA-1438 Implement Producer and consumer for Azure EventHubs
  • SAMZA-1515 Implement Kinesis consumer
  • SAMZA-1486 Checkpoint provider for Azure tables
  • SAMZA-1421 Support for durable state in high-level API
  • SAMZA-1392 Fix performance and correctness issues with concurrent sends and flushes in the Kafka system producer
  • SAMZA-1406 Enhancements to the Zookeeper based deployment model
  • SAMZA-1321 Support for multi-stage batch processing

Upgrade Notes

  • This release introduces a new mandatory configuration, job.coordination.utils.factory. It affects applications that use non-YARN deployment models. Read more about it here.
  • The following SystemAdmin APIs were deprecated in previous versions and have now been replaced with newer APIs. If you have a custom System implementation, you must update it to use the newer APIs.
    • void createChangelogStream(String streamName, int numOfPartitions); -> boolean createStream(StreamSpec streamSpec);
    • void createCoordinatorStream(String streamName); -> boolean createStream(StreamSpec streamSpec);
    • void validateChangelogStream(String streamName, int numOfPartitions); -> void validateStream(StreamSpec streamSpec) throws StreamValidationException;
  • A new API that clears a stream has been added to SystemAdmin.
    • boolean clearStream(StreamSpec streamSpec); Read more about it in the API docs.
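
For custom System implementations, the SystemAdmin changes listed above can be sketched as follows. This is a minimal, self-contained illustration rather than Samza's actual code: StreamSpec and StreamValidationException are simplified stand-ins for the real Samza classes, and StreamAdminOps and InMemorySystemAdmin are hypothetical names introduced only for this example. The return-value semantics (true when a stream is newly created or actually cleared) and the treatment of "clear" as removing the stream are assumptions of the sketch; consult the API docs for the real contract.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Samza's StreamSpec: only the fields this sketch needs.
final class StreamSpec {
    final String id;
    final String physicalName;
    final int partitionCount;

    StreamSpec(String id, String physicalName, int partitionCount) {
        this.id = id;
        this.physicalName = physicalName;
        this.partitionCount = partitionCount;
    }
}

// Stand-in for Samza's StreamValidationException (modelled as unchecked; an assumption).
class StreamValidationException extends RuntimeException {
    StreamValidationException(String message) {
        super(message);
    }
}

// Hypothetical interface holding only the three replacement signatures from the notes.
interface StreamAdminOps {
    boolean createStream(StreamSpec streamSpec);
    void validateStream(StreamSpec streamSpec) throws StreamValidationException;
    boolean clearStream(StreamSpec streamSpec);
}

// Hypothetical in-memory admin; a real implementation would talk to the underlying system.
class InMemorySystemAdmin implements StreamAdminOps {
    private final Map<String, Integer> partitionsByStream = new HashMap<>();

    // Replaces createChangelogStream(...) and createCoordinatorStream(...).
    // Assumed semantics: true if newly created, false if it already existed.
    @Override
    public boolean createStream(StreamSpec streamSpec) {
        return partitionsByStream.putIfAbsent(
                streamSpec.physicalName, streamSpec.partitionCount) == null;
    }

    // Replaces validateChangelogStream(...): checks existence and partition count.
    @Override
    public void validateStream(StreamSpec streamSpec) throws StreamValidationException {
        Integer actual = partitionsByStream.get(streamSpec.physicalName);
        if (actual == null) {
            throw new StreamValidationException(
                    "Stream does not exist: " + streamSpec.physicalName);
        }
        if (actual.intValue() != streamSpec.partitionCount) {
            throw new StreamValidationException(
                    "Expected " + streamSpec.partitionCount + " partitions but found " + actual);
        }
    }

    // New in this release. This sketch models "clear" as removing the stream entirely.
    @Override
    public boolean clearStream(StreamSpec streamSpec) {
        return partitionsByStream.remove(streamSpec.physicalName) != null;
    }
}

class SystemAdminMigrationSketch {
    public static void main(String[] args) {
        InMemorySystemAdmin admin = new InMemorySystemAdmin();
        StreamSpec changelog = new StreamSpec("changelog", "my-job-changelog", 4);

        System.out.println("created: " + admin.createStream(changelog));
        admin.validateStream(changelog); // throws if the stream is missing or mismatched
        System.out.println("cleared: " + admin.clearStream(changelog));
    }
}
```

The practical difference for implementers is that the stream name and partition count, previously passed as separate arguments, now travel together in a single StreamSpec, and creation reports its outcome through a boolean instead of returning void.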

Sources and Artifacts


For more details about this release, please check out the release blog post.