User:Jak7878/ApachePulsar

Apache Pulsar is a cloud-native, distributed messaging and stream-processing platform originally created at Yahoo! and now developed as an open-source project of the Apache Software Foundation. It is designed to be scalable, durable, and secure, and to handle high-throughput use cases such as event sourcing, real-time analytics, and data integration.

= History =

Apache Pulsar was developed at Yahoo! in 2014 to address the limitations that existing messaging systems showed at the company's scale. It was first released as an open-source project in 2016 and became a top-level Apache project in 2018. Since then, it has seen significant growth and adoption across various industries.

= Main features =


 * Horizontal Scalability: Scales out to accommodate large data volumes and high throughput. The design aims for performance to grow linearly as new nodes are added to the cluster.
 * Elasticity: Supports scaling compute and storage resources up or down in response to workload changes without data movement across nodes.
 * Multi-Tenancy: Supports multiple tenants in a single Pulsar cluster.
 * Persistent Storage: Uses Apache BookKeeper for durable storage of messages to prevent data loss.
 * Geo-Replication: Provides native support for geo-replicated data, ensuring high availability and disaster recovery.
 * Client Libraries: Offers client libraries for multiple languages, including Java, Python, and C++.
 * Tiered Storage: Moves older data to cheaper long-term storage, so it can be retained without the cost of primary storage systems.
 * Pulsar Functions: A lightweight compute framework for stream processing.
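
As an illustration of the Pulsar Functions item above, the following is a minimal sketch of a function in Python, assuming the plain-function style of the Python runtime in which a function named `process` receives one message and returns the value to publish. The input data and the local loop that stands in for the runtime are hypothetical; input and output topics would be set at deployment time and are not shown.

```python
def process(input):
    """Append an exclamation mark to each incoming message payload."""
    return "{}!".format(input)


if __name__ == "__main__":
    # Local stand-in for the Functions runtime (an assumption for
    # illustration): apply the function to each message in turn.
    incoming = ["hello", "pulsar"]
    outgoing = [process(message) for message in incoming]
    print(outgoing)
```

The function contains no messaging code of its own, which is what makes the framework lightweight: the runtime is responsible for consuming from the input topic and publishing the returned value.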

= Architecture =

Apache Pulsar adopts a segmented architecture that separates the serving layer from the storage layer. This separation allows for independent scaling of processing and storage, providing flexibility and ensuring better resource utilization.

Key components include:


 * Brokers: Stateless components that handle client connections and manage the dispatch of messages to consumers. Message data is persisted by the bookies, not by the brokers.
 * Bookies: Part of Apache BookKeeper, responsible for storing messages on disk.
 * ZooKeeper: Used for coordination and metadata storage among different Pulsar components.
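
The division of labor among these components can be sketched with a toy model. This is not the Pulsar or BookKeeper API: `ToyCluster`, `ToyBroker`, the segment size, and the least-loaded placement rule are all assumptions made for illustration, and each segment is stored on a single bookie, whereas BookKeeper replicates data across several.

```python
class ToyCluster:
    """Storage layer plus metadata store; not a real Pulsar or BookKeeper API."""

    def __init__(self, bookie_names, segment_size=2):
        # Each bookie maps (topic, segment_id) -> list of stored messages.
        self.bookies = {name: {} for name in bookie_names}
        # Stand-in for the metadata store: topic -> ordered [(segment_id, bookie)].
        self.metadata = {}
        self.segment_size = segment_size

    def add_bookie(self, name):
        # New storage capacity joins empty; existing segments stay where they are.
        self.bookies[name] = {}

    def _load(self, bookie):
        # Number of messages currently stored on the given bookie.
        return sum(len(msgs) for msgs in self.bookies[bookie].values())


class ToyBroker:
    """Serving layer: holds only a reference to the cluster, no message data."""

    def __init__(self, cluster):
        self.cluster = cluster

    def publish(self, topic, message):
        c = self.cluster
        segments = c.metadata.setdefault(topic, [])
        if segments:
            segment_id, bookie = segments[-1]
            current = c.bookies[bookie][(topic, segment_id)]
            if len(current) < c.segment_size:
                current.append(message)
                return
        # Current segment is full (or none exists): open a new one on the
        # least-loaded bookie and record its location in the metadata.
        bookie = min(c.bookies, key=c._load)
        segment_id = len(segments)
        c.bookies[bookie][(topic, segment_id)] = [message]
        segments.append((segment_id, bookie))

    def read(self, topic):
        c = self.cluster
        messages = []
        for segment_id, bookie in c.metadata.get(topic, []):
            messages.extend(c.bookies[bookie][(topic, segment_id)])
        return messages


if __name__ == "__main__":
    cluster = ToyCluster(["bookie-1", "bookie-2"])
    broker_a, broker_b = ToyBroker(cluster), ToyBroker(cluster)
    for i in range(4):
        broker_a.publish("orders", f"m{i}")
    cluster.add_bookie("bookie-3")       # scale storage only
    broker_b.publish("orders", "m4")     # a different broker serves the topic
    print(broker_b.read("orders"))
    print(cluster.metadata["orders"])
```

In this model either broker can serve the topic because the data and its location live outside the broker, and a newly added bookie simply receives the next segment while the older segments stay in place.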

= Performance =

Pulsar is designed for high performance and low latency.

Orange Financial, a subsidiary of China Telecom and a large payment provider in China, uses Apache Pulsar to handle over 50 million transactions and more than 1 billion events per day. The company shared these figures during a talk at the O’Reilly Strata Data Conference in New York.
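
To put the daily volumes in perspective, they can be converted to average per-second rates. The short calculation below is only arithmetic on the two figures quoted above; the helper name is hypothetical, and because the source figures are lower bounds and real traffic is rarely uniform, peak rates are presumably higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(count_per_day):
    """Average rate implied by a daily total, assuming uniform load."""
    return count_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(average_rate_per_second(50_000_000)))     # transactions/s, about 579
    print(round(average_rate_per_second(1_000_000_000)))  # events/s, about 11574
```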

= Community and Adoption =

Several companies, from small startups to large enterprises, have adopted Pulsar for real-time applications. The project's governance is managed by the Apache Software Foundation, which ensures that it remains a community-driven project.

= Comparison with Other Systems =

Apache Pulsar is often compared to other messaging and stream processing systems like Apache Kafka, RabbitMQ, and NATS. Each system has its own set of features and trade-offs, with Pulsar often highlighted for its strong multi-tenancy support, elasticity, and performance characteristics.

= External Links =


 * Apache Pulsar Official Website
 * Apache Pulsar GitHub Repository

= See Also =


 * Apache Kafka
 * Stream processing
 * Message queue
 * Event-driven architecture
