
The main driver for Pulsar is that we have a number of different messaging use cases, some more "pub/sub"-like and some more "log"-like. Pulsar really does unify those two worlds while also being far more flexible than any of the hosted options.

For example, Kinesis is quite limiting: retention is capped, and the tiny size of each shard makes it very difficult to do any real ordering at scale.

Similarly, SQS does pub/sub well, but we keep finding that we need the data again after the initial delivery. Instead of storing that data in multiple systems, we have one.

As for why we didn't go with Kafka, the biggest single reason is that Pulsar is easier operationally: there is no need to rebalance, and tiered storage via offloading lets us actually have topics with unlimited retention. Perhaps more important for adoption, though, is that pub/sub is much easier with Pulsar, and the API is much easier for developers to reason about than all the complexity of consumer groups, etc. There are a ton of other nice things too: topics are cheap enough that we can have hundreds of thousands of them, plus the built-in multi-tenancy features, geo-replication, a flexible ACL system, Pulsar Functions, Pulsar IO, and many other things that really have us excited about the capabilities.



> able to have topics be so cheap

For GDPR a lot of us have to provide exportable 'user activity'. Could you, in theory, have one topic per user (we had around 50 million users) and publish each user's activity to that topic?


The Pulsar docs indicate "millions" of topics, but IDK what 50 million would look like. From what I know, I would be a bit nervous about it :)

It might be worth chatting with Pulsar devs on their slack community (https://apache-pulsar.herokuapp.com/).

Most commonly, what I hear people doing for this is one of two approaches (or a combination of both):
- encrypt the user data and delete the key; the user data is unreadable from then on and will eventually get removed
- regularly compact the topic (Pulsar has a compaction feature) and write a tombstone record, which will remove the user's data after compaction
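
To make the tombstone idea concrete, here is a rough pure-Python sketch of what key-based compaction does. It does not talk to Pulsar at all: `compact`, the `(key, payload)` tuples and the `user-N` keys are made up for illustration. The one thing taken from Pulsar's behaviour, as I understand it, is that a message with an empty payload acts as the tombstone for its key.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a keyed topic: each entry is (key, payload).
# An empty payload plays the role of a tombstone for that key.
Message = Tuple[str, bytes]


def compact(log: List[Message]) -> List[Message]:
    """Keep only the latest message per key, dropping tombstoned keys."""
    latest: Dict[str, Tuple[int, bytes]] = {}
    for offset, (key, payload) in enumerate(log):
        # Later messages overwrite earlier ones for the same key.
        latest[key] = (offset, payload)

    # Drop keys whose latest payload is empty, then restore log order.
    survivors = sorted(
        (offset, key, payload)
        for key, (offset, payload) in latest.items()
        if payload
    )
    return [(key, payload) for _, key, payload in survivors]


log = [
    ("user-1", b"signed up"),
    ("user-2", b"signed up"),
    ("user-1", b"changed email"),
    ("user-1", b""),  # tombstone: user-1 asked to be forgotten
    ("user-2", b"logged in"),
]
compacted = compact(log)
```

Note that compaction only keeps the latest value per key, so this fits "current state per user" topics better than a full activity history. For the latter, the encrypt-and-delete-the-key approach is probably the more natural fit.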



