Which protocol is more complex to implement: RabbitMQ (AMQP) or Apache Kafka?

3 points by aljun_invictus a year ago

We are considering offering our internally developed message queue service to external users (possibly as an open-source project). We are currently deciding whether to implement the RabbitMQ or Kafka protocol, aiming to integrate with the existing ecosystem more quickly. From our experience, Kafka seems easier to implement. However, Kafka has multiple versions and a variety of clients, which might introduce some challenges.

Our core requirements are:

- Protocol Extensibility: We may need to implement some custom features for internal use.

- Ease of Maintenance and Development: We do not need to implement the entire protocol, only the core functionality. We noticed that AWS’s Kafka services do not provide every feature either.

- Ease of Integration with Existing Ecosystem: We are concerned about potential issues with Kafka, mainly due to frequent protocol changes. RabbitMQ, on the other hand, primarily uses AMQP 0-9-1.

Could you provide advice based on these requirements?

caprock a year ago

From my understanding, the protocols for each of these systems are inherently coupled to the internal service characteristics. RabbitMQ and Kafka take very different approaches to messaging. Neither is generally better, and they each make different tradeoffs.

These tradeoffs have a real impact on client logic. And in this case, where it's not as simple as a line protocol like HTTP, it may be hard to disentangle the "objects" from the protocols.

With that in mind, I'd choose the one which is most similar to your internal messaging service so there's less impedance mismatch.

aljun_invictus a year ago

Honestly, my service is most similar to Kafka (it can replay), but implementing the routing of RabbitMQ doesn't seem too complicated.
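
For a sense of scale, the matching rule for AMQP 0-9-1 topic exchanges can be sketched in a few lines. This is only a rough illustration, not RabbitMQ's actual implementation: the function names `topic_matches` and `route` are made up for this example, and treating an empty routing key as zero words is an assumption. In a binding pattern, `*` stands for exactly one dot-separated word and `#` for zero or more words.

```python
from functools import lru_cache
from typing import List, Tuple


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key."""
    # Assumption for this sketch: an empty string means "zero words".
    p = pattern.split(".") if pattern else []
    k = routing_key.split(".") if routing_key else []

    @lru_cache(maxsize=None)
    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' either matches zero words (skip it) or absorbs one more word.
            return match(i + 1, j) or (j < len(k) and match(i, j + 1))
        if j < len(k) and (p[i] == "*" or p[i] == k[j]):
            # '*' matches exactly one word; a literal must be equal.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(bindings: List[Tuple[str, str]], routing_key: str) -> List[str]:
    """Return the queue names whose binding pattern matches the key."""
    return [queue for pattern, queue in bindings if topic_matches(pattern, routing_key)]
```

For example, `route([("orders.*.created", "q1"), ("orders.#", "q2")], "orders.eu.created")` selects both queues. The matching itself is small; the rest of AMQP (channels, acknowledgements, exchange and queue lifecycle) is where most of the implementation effort would likely go.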

ethegwo a year ago

I worked for TikTok before, and we forked the Kafka client for internal use. The approach of the Kafka client and its protocol is the craziest thing I've ever encountered. I would never want to build it from scratch; even just maintaining the forked version was horrible.

Kinrany a year ago

Why do your users want your queue instead of Rabbit or Kafka?

aljun_invictus a year ago

My new queue has better capabilities and performance, and I want user migration to be smoother. That is why I hope to build on an existing ecosystem.
- sc68cal a year ago
  
  Suppose for a moment that this is true, and your system has better performance than Kafka or RabbitMQ.
  Why would someone be interested in using your system, and give up the ecosystem, operational knowledge, and known pain points of those systems, in exchange for your system?
  Performance is not the be-all-end-all.
  
  aljun_invictus a year ago
  
  I offer some features that they have partially or don't have at all (such as priority queues and label filtering). The reason I raised this question is that I hope to reuse their protocol layers to reduce the burden on people during the migration process. I believe others with similar motivations exist as well (including cloud providers offering Kafka/RabbitMQ, whose internal implementations are not necessarily Kafka/RabbitMQ).
- Kinrany a year ago
  
  How are you going to expose the new capabilities through protocols that don't support them?
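
One possible answer, sketched here purely as an assumption and not as anything the thread confirms: both AMQP 0-9-1 (the `headers` table in message properties) and Kafka (record headers) allow arbitrary key/value metadata per message, so extra attributes such as a priority or filter labels could ride along there. The header names `x-priority` and `x-labels` and both function names below are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical header names chosen for this sketch; neither protocol defines them.
PRIORITY_HEADER = "x-priority"
LABELS_HEADER = "x-labels"


def encode_extensions(priority: int, labels: List[str]) -> Dict[str, bytes]:
    """Pack custom attributes into a string-to-bytes header map."""
    return {
        PRIORITY_HEADER: str(priority).encode("ascii"),
        LABELS_HEADER: ",".join(sorted(labels)).encode("utf-8"),
    }


def decode_extensions(headers: Dict[str, bytes]) -> Tuple[int, List[str]]:
    """Read the attributes back, defaulting when a stock client omits them."""
    priority = int(headers.get(PRIORITY_HEADER, b"0").decode("ascii"))
    raw = headers.get(LABELS_HEADER, b"").decode("utf-8")
    labels = raw.split(",") if raw else []
    return priority, labels
```

The trade-off is that unmodified clients still work (they simply get the defaults), but the extra features stay invisible to existing tooling unless each client is taught the convention.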

leros a year ago

RabbitMQ is simpler to implement but it has fewer capabilities than Kafka.

aljun_invictus a year ago

I asked this question in other communities, and they all expressed that RabbitMQ (AMQP) has more features. Can you tell me which features of Kafka you think are more complex and difficult to implement?

sc68cal a year ago

I honestly think both protocols have sufficient complexity that it's not really the right question.

I also would say that both protocols target different use cases. It may seem on the surface that they are similar but I don't think that gives either system a fair appraisal.

What is your target audience? What do they use currently?

aljun_invictus a year ago

We support both push and pull models (it seems both RabbitMQ and Kafka support both). Our implementation is actually more like Kafka (for example, it supports replay), but implementing RabbitMQ's routing isn't very complicated either. However, I'm concerned about the Kafka protocol, as it seems to have been revised many times historically, which could lead to client compatibility issues. My users typically want to decouple processing logic (not for large-scale data scenarios like logging, more like RabbitMQ use cases), so I'm quite conflicted.
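
On the versioning concern: Kafka versions each request type separately, and a client can send an ApiVersions request to learn the minimum and maximum version the broker supports for each API key, then use the highest version both sides understand. A minimal sketch of that selection step follows. The function names and the version ranges in the usage note are made up for illustration; only the API keys (0 for Produce, 1 for Fetch) come from the Kafka protocol.

```python
from typing import Dict, Optional, Tuple

# (min_version, max_version), both inclusive
VersionRange = Tuple[int, int]


def pick_version(client: VersionRange, broker: VersionRange) -> Optional[int]:
    """Highest version inside both ranges, or None if they do not overlap."""
    lo = max(client[0], broker[0])
    hi = min(client[1], broker[1])
    return hi if lo <= hi else None


def negotiate(
    client_ranges: Dict[int, VersionRange],
    broker_ranges: Dict[int, VersionRange],
) -> Dict[int, int]:
    """Map each API key usable by both sides to the version to speak."""
    chosen: Dict[int, int] = {}
    for api_key, client_range in client_ranges.items():
        broker_range = broker_ranges.get(api_key)
        if broker_range is None:
            # The broker does not advertise this API at all.
            continue
        version = pick_version(client_range, broker_range)
        if version is not None:
            chosen[api_key] = version
    return chosen
```

With illustrative ranges, `negotiate({0: (3, 9), 1: (4, 13)}, {0: (0, 7), 1: (0, 11)})` returns `{0: 7, 1: 11}`. A partial implementation would advertise only the APIs and versions it actually supports, which limits the surface to maintain, though clients that require an API or version outside the advertised range would still fail.
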
- sc68cal a year ago
  
  So, as another commenter asked, why would someone use your solution instead of Kafka or RabbitMQ directly?
  It's not clear to me what use case you are trying to fulfill.
  Like, at least for me, I reach for RabbitMQ when I am doing RPC or want reliable delivery of messages between systems.
  I have a lot more experience with RabbitMQ for that use case.
  With Kafka, my experience is mostly the classic log processing use case, where we ingest syslog messages with Kafka and then use Vector to process and transform them.
  Like, you COULD use RabbitMQ to do log processing and COULD use Kafka to do RPC and reliable message delivery, but each seems to have a more comfortable fit doing the jobs I described, and you don't spend a lot of time forcing them to go in ways that they don't easily go.
  So again, what is your system trying to accomplish, and what protocol matches what you are trying to do more?
  I've seen a couple of your comments and it seems like you are trying to be both?
  
  aljun_invictus a year ago
  
  We have currently implemented both push and pull models, along with disk storage, and we provide services internally through gRPC. Our goal now is to offer these services externally. However, one downside of message queues is the fragmentation of protocols, which results in high migration costs for users. My motivation is quite simple: to leverage existing ecosystems while providing a better implementation for users and reducing migration costs (similar to how Redpanda partially implements the Kafka protocol). Currently, our implementation is in C++ and Rust, but the protocol layer can be in other languages. We have an advantage in throughput and have also implemented features like priority queues that Kafka doesn't have.
  The AMQP protocol is quite complete (and allows for some customization), but I am not very familiar with the details of the Kafka protocol. I've heard it changes frequently, and I'm concerned that this might affect the implementation.
  
  sc68cal a year ago
  
  > However, one downside of message queues is the fragmentation of protocols, which results in high migration costs for users.
  So, I have a little bit of experience with this, because of some work using Celery which abstracted out the message queue protocol, as well as Redis' pub/sub and AMQP directly when I did some work on Ansible EDA.
  Honestly the code for consuming from these message systems was an incredibly small percentage of the code, due to the plethora of existing libraries for those protocols. Like, they were almost one-liners to block and consume from the queue.
  The majority of the code was handling the actual message contents and acting upon them.
  So, I don't think your statement is accurate.
  This may be something to consider, because if you are doing the AMQP or Kafka protocol just to be compatible, it may not actually be worth it, and instead it may be worth just providing a protocol that actually maps to what your system does.
  
  aljun_invictus a year ago
  
  You're right, there are many trade-offs involved. I'm considering this as well, but it's best to use a widely adopted ecosystem for external use.