Samza Event Hubs Connectors Example
The hello-samza project has an example that uses the Samza High Level Streams API to consume and produce from Event Hubs using the Zookeeper deployment model.
Get the Code
Let’s get started by cloning the hello-samza project
The project comes up with numerous examples and for this tutorial, we will pick the Azure Event Hubs demo application.
Setting up the Deployment Environment
For our Azure application, we require ZooKeeper. The hello-samza project comes with a script called “grid” to help with the environment setup
This command will download, install, and start ZooKeeper and Kafka. It will also check out the latest version of Samza and build it. All package files will be put in a sub-directory called “deploy” inside hello-samza’s root folder.
If you get a complaint that JAVA_HOME is not set, then you’ll need to set it to the path where Java is installed on your system.
Configuring the Samza Application
Here are the Event Hubs descriptors you must set before building the project.
Configure these in the src/main/java/samza/examples/AzureApplication.java
file.
In the code snippet above, we create the input and output streams that can consume and produce from the configured Event Hubs entities.
- Line 3: A
EventHubsSystemDescriptor
is created with the name “eventhubs”. You may set different system descriptors here. - Line 6: Event Hubs messages are consumed as key value pairs. The serde is defined for the value of the payload of the Event Hubs’ EventData. You may use any of the serdes that samza ships with out of the box or define your own. The serde for the key is not set since it will always the String from the EventData partitionKey.
- Line 9-17: An
EventHubsInputDescriptor
and anEventHubsOutputDescriptor
are created with the required descriptors to gain access of the Event Hubs entity (STREAM_ID
,EVENTHUBS_NAMESPACE
,EVENTHUBS_ENTITY
,EVENTHUBS_SAS_KEY_NAME
,EVENTHUBS_SAS_KEY_TOKEN
). These must be set to the credentials of the entities you wish to connect to. - Line 10-21: creates an
InputStream
andOutputStream
with the previously definedEventHubsInputDescriptor
andEventHubsOutputDescriptor
, respectively.
Alternatively, you can set these properties in the src/main/config/azure-application-local-runner.properties
file.
Note: the keys set in the .properties
file will override the ones set in code with descriptors.
Refer to the Event Hubs configuration reference for the complete list of configurations.
Building the Hello Samza Project
With the environment setup complete, let’s move on to building the hello-samza project. Execute the following command:
We are now all set to run the application locally.
Running the Azure application
In order to run the application, we will use the run-azure-application script.
The above command executes the helper script which invokes the AzureZKLocalApplication main class, which starts the AzureApplication. This application prints out the messages from the input stream to stdout
and send them the output stream.
The messages consumed should be printed in the following format:
Shutdown
This application can be shutdown by terminating the run-azure-application script. We can use the grid script to tear down the local environment (Kafka and Zookeeper).