Introduction of – Apache Pulsar

Apache Pulsar is an open-source distributed pub-sub messaging system originally created at Yahoo! that is part of the Apache Software Foundation.

Pulsar is a multi-tenant, high-performance solution for server-to-server messaging.

Pulsar’s key features include:

Architecture Overview

At the highest level, a Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can replicate data amongst themselves.

The diagram below provides an illustration of a Pulsar cluster:

Pulsar Comparison With Apache Kafka

The table below lists the similarities and differences between Apache Pulsar and Apache Kafka:

KAFKA PULSAR
Concepts Producer-topic-consumer group-consumer Producer-topic-subscription-consumer
Consumption More focused on streaming, exclusive messaging on partitions. No shared consumption. Unified messaging model and API.

  • Streaming via exclusive, failover subscription
  • Queuing via shared subscription
Acking Simple offset management

  • Prior to Kafka 0.8, offsets are stored in ZooKeeper
  • After Kafka 0.8, offsets are stored on offset topics
Unified messaging model and API.

  • Streaming via exclusive, failover subscription
  • Queuing via shared subscription
Retention Messages are deleted based on retention. If a consumer doesn’t read messages before the retention period, it will lose data. Messages are only deleted after all subscriptions consumed them. No data loss even the consumers of a subscription are down for a long time.Messages are allowed to keep for a configured retention period time even after all subscriptions consume them.
TTL No TTL support Supports message TTL

Conclusion

Apache Pulsar is an effort undergoing incubation at The Apache Software Foundation (ASF) sponsored by the Apache Incubator PMC. It seems that it will be a competitive alternative to Apache Kafka due to its unique features.

References

  1. Apache Pulsar homepage
  2. Yahoo! Open Source homepage
  3. Apache homepage
  4. Pulsar concepts and architecture documentation
  5. Comparing Pulsar and Kafka: Unified queueing and streaming
  6. Apache-Pulsar Distributed Pub-Sub Messaging System