Does the heading sound familiar? Two years ago I released an integration testing framework for the AMQP protocol.
Now it’s time for something similar. Say hello to Kafka.TestFramework!
Kafka
Kafka was originally developed by LinkedIn and open-sourced in 2011. It is a distributed streaming platform that uses pub/sub to distribute messages to topics in the form of a partitioned log. It is built to scale while maintaining performance no matter how much data you throw at it.
The Protocol
Kafka's protocol is built around a request/response model. The client sends a request, and the broker sends back a response. Versioning is built into the messages: each and every property is defined with a version range. This makes versioning quite easy to handle and makes it possible to build one platform on top of all versions. Property types are a mix of simple primitive types and variable-length zig-zag encoded types borrowed from Google Protocol Buffers.
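To make that variable-length encoding a bit more concrete, here is a minimal sketch in C# of the zig-zag varint scheme the protocol borrows. It is not code from Kafka.Protocol, just an illustration of how the encoding works:

```csharp
using System.IO;

// A minimal sketch of the zig-zag + varint encoding the Kafka protocol
// borrows from Google Protocol Buffers for its variable-length types.
public static class VarIntExample
{
    // Zig-zag maps signed integers to unsigned ones so that small negative
    // values also get a short encoding: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3 ...
    public static uint ZigZagEncode(int value) =>
        (uint)((value << 1) ^ (value >> 31));

    // The unsigned value is then written 7 bits at a time, with the high bit
    // of each byte flagging whether more bytes follow.
    public static void WriteVarInt(Stream stream, int value)
    {
        var unsigned = ZigZagEncode(value);
        while (unsigned >= 0x80)
        {
            stream.WriteByte((byte)(unsigned | 0x80));
            unsigned >>= 7;
        }
        stream.WriteByte((byte)unsigned);
    }
}
```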
All message definitions can be found on GitHub, and the protocol is explained in detail in the Kafka protocol guide. Both of these resources form the core of the Kafka.Protocol project.
Kafka.Protocol
The testing framework is built upon these protocol definitions. Kafka.Protocol auto-generates all protocol messages, primitives and serializers into typed C# classes. The library makes it possible to write client and server implementations of the Kafka protocol using only .NET. This makes it easier to debug and understand the protocol from a C# point of view and removes any dependency on libraries like librdkafka and its interop performance implications.
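As an illustration of the approach (this is hand-written for the post, not the library's actual generated code), a typed protocol primitive with its serializer could look something along these lines:

```csharp
using System;
using System.Buffers.Binary;

// Illustration only: a hand-rolled analog of the kind of typed primitive
// Kafka.Protocol generates. The real generated classes and member names
// differ; this just shows the idea of wrapping protocol types in C#.
public readonly struct Int16Primitive
{
    public short Value { get; }
    public Int16Primitive(short value) => Value = value;

    // The Kafka protocol encodes Int16 as two bytes, big-endian.
    public void WriteTo(Span<byte> destination) =>
        BinaryPrimitives.WriteInt16BigEndian(destination, Value);

    public static Int16Primitive ReadFrom(ReadOnlySpan<byte> source) =>
        new Int16Primitive(BinaryPrimitives.ReadInt16BigEndian(source));
}
```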
Kafka.TestFramework
Kafka.TestFramework aims to make it possible to test integrations with Kafka clients. It acts as a server and exposes an API that makes it possible to subscribe to requests and send responses to clients. It can run in isolation in-memory or hook up to a TCP socket, and it is based on the protocol definitions found in Kafka.Protocol.
Both Kafka.TestFramework and Kafka.Protocol are built on top of System.IO.Pipelines, a high-performance streaming library that greatly simplifies handling of data buffers and memory management. I'm not going to go into detail on how it works; there are already plenty of good articles about the library. Btw, did I say it's FAST? 🙂
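For reference, a bare-bones System.IO.Pipelines consumer loop looks like this. It is not tied to Kafka at all; it just shows the read/advance pattern the libraries build on:

```csharp
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

// A minimal System.IO.Pipelines consumer loop: the reader hands you a
// buffer, you consume what you can, and the pipe takes care of the rest.
public static class PipeConsumer
{
    public static async Task ConsumeAsync(PipeReader reader)
    {
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            // Process whole frames here; anything left unconsumed is handed
            // back on the next read together with newly arrived bytes.
            SequencePosition consumed = buffer.Start;

            reader.AdvanceTo(consumed, buffer.End);

            if (result.IsCompleted)
                break;
        }

        await reader.CompleteAsync();
    }
}
```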
To see some examples of how the framework can be used, check out the tests where it communicates with Confluent's .NET client for Kafka!
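For context, this is the kind of client-side code the test framework ends up serving. The topic name and bootstrap address below are made up for the example; in the repository's tests the client is pointed at the socket the framework is listening on:

```csharp
using System;
using System.Threading.Tasks;
using Confluent.Kafka;

// A plain Confluent.Kafka producer call; the test framework plays the
// broker on the other end of the connection.
public static class ClientExample
{
    public static async Task ProduceAsync()
    {
        // Hypothetical address; tests would use the framework's endpoint.
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

        using var producer = new ProducerBuilder<Null, string>(config).Build();

        var result = await producer.ProduceAsync(
            "my-topic",
            new Message<Null, string> { Value = "hello" });

        Console.WriteLine($"Delivered to {result.TopicPartitionOffset}");
    }
}
```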
The Future
The Kafka protocol is quite complex, especially when combined with the distributed nature of Kafka. Many of its processes can be hard to understand and might be perceived as too low-level. These processes can probably be packaged in a simpler way, making it possible to focus on the high-level messages of the business domain.
Yet, at the same time, it is powerful to be able to simulate different cluster behaviours that might be hard to replicate with high-level clients. Being able to get both would be nice.
Test it out and let me know what you think!