Thursday, February 27, 2014

Apache Camel meets Kafka!

 In [1], LinkedIn engineers introduced Kafka, a distributed messaging system initially developed for collecting and delivering high volumes of log data with low latency. It was later donated to the Apache Software Foundation (ASF).

 Kafka, due to its architecture, performance and scalability characteristics, has proven to be revolutionary among today's messaging technologies and has been used successfully in several domains and projects - most notably LinkedIn's real-time activity data pipeline [2].

 It was Kafka's design (more details and background are given in the relevant paper [1]), along with the benchmarks, comparisons and discussions found in the community, that made me want to get involved and learn more about this technology. In the process of learning how Kafka works and creating some toy projects, I also started working on an Apache Camel component.

 Apache Camel is another ASF project and one of my favourites. I consider it an absolute must in many cases, especially when it comes to distributed systems and, even more so, integration projects.

 So I decided to bring Apache Kafka and Apache Camel together, much as I did in my previous work on the Camel-Gora component for NoSQL databases.

 All in all, this post announces camel-kafka and provides links to the source code and wiki pages. The camel-kafka component lets you use Apache Kafka through Camel and therefore integrate it into your stack.

Note: documentation and examples to come soon!


[1] Kreps, Jay, Neha Narkhede, and Jun Rao. "Kafka: A distributed messaging system for log processing." Proceedings of the NetDB. 2011.

[2] Goodhope, Ken, et al. "Building LinkedIn's Real-time Activity Data Pipeline." IEEE Data Eng. Bull. 35.2 (2012): 33-45.