[GH-ISSUE #78] NATS

BreizhHardware commented

2026-05-07 00:19:24 +02:00

Owner

Originally created by @binwiederhier on GitHub (Jan 1, 2022).
Original GitHub issue: https://github.com/binwiederhier/ntfy/issues/78

Continuation of https://github.com/binwiederhier/ntfy/issues/19#issuecomment-1002769688


BreizhHardware

2026-05-07 00:19:24 +02:00

closed this issue
added the server and enhancement labels

BreizhHardware commented

2026-05-07 00:19:24 +02:00

Author

Owner

@gc-ss commented on GitHub (Jan 1, 2022):

> The pure gRPC basis is definitely a nice option.

You got it!

You already know this, but to make sure we are all on the same page: because of gRPC, the pub-sub nodes can be written in any language that has gRPC support.

For example, Tyler and his team focus on Go, and we are a Python shop.

However, because of gRPC, we are easily able to use Python with Liftbridge.

> Does Liftbridge have the ability for the client to automatically connect to the nearest server and then fail over to the next nearest, etc.?

It does and more (it really does a good job of HA replication - something that's critical for prod workloads)

Just ask and confirm with them:

Tyler and his team are fantastic and have open-sourced the very product he is building his company on.

Just drop by on their Slack to ask any questions you have - no matter how basic 👍


BreizhHardware commented

2026-05-07 00:19:25 +02:00

Author

Owner

@gc-ss commented on GitHub (Jan 2, 2022):

Yes, it's being used in production. The biggest gotchas have come from us using Python while the Liftbridge documentation often assumes a Go context. Since they don't seem to be a Python shop, they are not as interested in reproducing Python issues.

However, for core Liftbridge issues, I've obtained responses and fixes (when needed) within days, without a commercial contract.

> a message hub with publishers and consumers of streams decoupled at compile time. So at runtime you can hook a consumer up to a publisher's stream and subject

That's exactly the use case I have for Liftbridge in production.

I can dynamically create/delete (sub)topics and scale readers/writers/monitors in and out on demand.
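To illustrate the runtime decoupling described above, here is a minimal in-memory sketch. It is not Liftbridge or NATS code: the `Hub` class, its methods, and the `alerts.disk` subject are hypothetical names, and a real broker would add persistence, networking, and replication on top.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Hub:
    """Toy in-memory message hub; a stand-in for a real broker."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, subject: str, handler: Handler) -> None:
        # Consumers attach at runtime; publishers never learn who they are.
        self._subscribers[subject].append(handler)

    def unsubscribe(self, subject: str, handler: Handler) -> None:
        self._subscribers[subject].remove(handler)
        if not self._subscribers[subject]:
            # The subject disappears together with its last consumer.
            del self._subscribers[subject]

    def publish(self, subject: str, message: str) -> int:
        # Returns how many consumers received the message.
        handlers = list(self._subscribers.get(subject, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


hub = Hub()
received: List[str] = []
handler = received.append  # the consumer is just a callable

hub.subscribe("alerts.disk", handler)
delivered = hub.publish("alerts.disk", "disk almost full")

hub.unsubscribe("alerts.disk", handler)
dropped = hub.publish("alerts.disk", "nobody is listening")
```

The publisher only knows the subject name, so readers can be added or removed while the system runs without touching the publishing code.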

You might be interested in reading up on why Liftbridge was designed even though NATS JetStream exists.

Feel free to drop by on their Slack to ask any questions you have - I use Liftbridge in a very niche way (strongly decoupled, multi environment (language/OS/onpremise/cloud) node data mesh) - they can answer these Liftbridge specific questions much better than I can 👍


BreizhHardware commented

2026-05-07 00:19:25 +02:00

Author

Owner

@gc-ss commented on GitHub (Jan 2, 2022):

… and yes, your initial proposal to @binwiederhier is correct: ntfy would be drastically simplified and would scale better on Liftbridge/NATS, which is why I joined the conversation.

In this case, the publisher and consumer are the phone apps or the web service, as the case may be.

Liftbridge/NATS is the hosted message bus.

It is a very simple yet powerful cloud-native architecture (one that can be hosted on a laptop if needed).


BreizhHardware commented

2026-05-07 00:19:25 +02:00

Author

Owner

@gc-ss commented on GitHub (Jan 2, 2022):

> I was looking for security stuff like NATS JetStream had so that streams can be restricted by user / roles. It looks like with Liftbridge you have to do this yourself

Sure. Since I already use oso or opa for authz, I prefer using those here as well, instead of yet another separate mechanism that NATS JetStream offers.
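As a rough idea of what "doing it yourself" can look like, here is a minimal role check that a service could run before letting a client publish or subscribe. It is plain Python and deliberately does not use the oso or opa APIs; the `POLICY` table, the role names, and `is_allowed` are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical policy table: role -> stream name -> allowed actions.
# "*" is used here as a wildcard that matches every stream.
POLICY: Dict[str, Dict[str, Set[str]]] = {
    "admin": {"*": {"publish", "subscribe"}},
    "reader": {"alerts": {"subscribe"}},
}


def is_allowed(role: str, action: str, stream: str) -> bool:
    """Return True if `role` may perform `action` on `stream`."""
    rules = POLICY.get(role, {})
    # Combine stream-specific permissions with wildcard permissions.
    allowed = rules.get(stream, set()) | rules.get("*", set())
    return action in allowed


reader_can_subscribe = is_allowed("reader", "subscribe", "alerts")
reader_can_publish = is_allowed("reader", "publish", "alerts")
admin_can_publish = is_allowed("admin", "publish", "billing")
unknown_role = is_allowed("guest", "subscribe", "alerts")
```

With a policy engine in place, the body of `is_allowed` would be replaced by a call into that engine, while the call site in front of the message bus stays the same.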

I am interested to know how you find the latest Liftbridge.


BreizhHardware commented

2026-05-07 00:19:25 +02:00

Author

Owner

@binwiederhier commented on GitHub (Feb 5, 2022):

I was gonna just close this ticket, but I do think I owe you guys the courtesy of fully understanding what you'd want to use NATS for in ntfy.

I think I understand what it is (a pub sub system), and I think I understand what you'd want me to do (replace the guts of ntfy with NATS), but what I don't understand is why? What's the use case you see or want to support that needs NATS? Why would I do weeks of work to implement NATS? It feels like a pointless refactor at this point. I'd add complexity for features I don't want to need, no?


BreizhHardware commented

2026-05-07 00:19:26 +02:00

Author

Owner

@gc-ss commented on GitHub (Feb 7, 2022):

> but what I don't understand is why? What's the use case you see or want to support that needs NATS? Why would I do weeks of work to implement NATS? It feels like a pointless refactor at this point. I'd add complexity for features I don't want to need, no?

This is a very strategic question, and the full answer is a long scenarios-and-requirements spec.

Before writing that out, let me share my use case for ntfy.

ntfy to me is a prototype event sharing producer-consumer system.

The current ntfy prototype has proven to us that:

1. There's a huge demand for such an event sharing producer-consumer system - i.e., you have addressed the Market risk (people are interested in this as it solves a problem they have).
2. There's proof that such an event sharing producer-consumer system can be built - i.e., you have addressed the Product risk (you have the ability to build a product that solves the problem).

The next step is to build out the product further, and there is one big gaping hole in the current ntfy prototype: reliability.

Reliability has to be the cornerstone of an event sharing producer-consumer system.

As people build on ntfy and start depending on it for their event sharing needs, reliability problems will start to rear their ugly heads.

Some of the reports we will see, based on my experience with event sharing producer-consumer systems:

i. Messages that were sent weren't received by all consumers
ii. Duplicate messages were received by consumers
iii. Messages that were thought to have been sent weren't received by any consumer
iv. A temporary spike in events from certain producers overwhelmed the single ntfy service and made it crash, losing state/events, with no ability to "scale out" the system
v. Certain popular topics that are important, but fire only occasionally, have an overwhelming number of consumers, and the resource overhead of idle connections is "too much" for one server (it would be awesome to pause/resume these topics on demand to use compute resources efficiently)

These will show up as ntfy becomes popular.

These are all use cases that Liftbridge/NATS solves well, and we haven't even discussed HA needs yet; @gedw99 has pointed out some scenarios there (which I would classify as the second phase of reliability).
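Items i to iii above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how ntfy, Liftbridge, or NATS work internally: the lossy link, the `loss_rate` parameter, and both function names are made up. It shows the usual trade-off: retrying until an acknowledgement arrives avoids lost messages but produces duplicates, which the consumer then filters by message ID.

```python
import random
from typing import List, Set, Tuple

Message = Tuple[int, str]  # (message id, payload)


def deliver_at_least_once(
    messages: List[Message], loss_rate: float, rng: random.Random
) -> List[Message]:
    """Simulate a lossy link where the sender retries until it sees an ack.

    Returns everything that actually reached the consumer, duplicates included.
    """
    wire: List[Message] = []
    for message in messages:
        acked = False
        while not acked:
            if rng.random() < loss_rate:
                continue  # message lost in transit, so the sender retries
            wire.append(message)  # message reached the consumer
            # The ack can be lost too; in that case the sender resends
            # and the consumer sees the same message again.
            acked = rng.random() >= loss_rate
    return wire


def dedupe(wire: List[Message]) -> List[Message]:
    """Consumer-side filter: process each message id only once."""
    seen: Set[int] = set()
    unique: List[Message] = []
    for msg_id, payload in wire:
        if msg_id in seen:
            continue
        seen.add(msg_id)
        unique.append((msg_id, payload))
    return unique


rng = random.Random(0)  # fixed seed so the run is repeatable
sent = [(i, f"event-{i}") for i in range(100)]
wire = deliver_at_least_once(sent, loss_rate=0.3, rng=rng)
processed = dedupe(wire)
```

After deduplication the consumer has processed every message exactly once and in order, while `wire` contains at least as many entries as were sent. Items iv and v are about capacity rather than delivery semantics and are not covered by this sketch.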


BreizhHardware commented

2026-05-07 00:19:26 +02:00

Author

Owner

@binwiederhier commented on GitHub (Feb 11, 2022):

This certainly answers all the questions. Thank you for the detailed info and the videos.

> Yes, it's a 2nd phase thing!!

So I agree and disagree with this. I don't know if there is a second phase for ntfy, or if I even want any of the features you listed. A single Go service can easily handle thousands of subscribers, and while it's not HA fault tolerant, there is also no need for it to be, because it's not a mission critical system for anyone. The target audience (as of now) is hobbyists and selfhosters, not enterprises. And even if I were to put hours and days and weeks into it, enterprises would still not use a thing that only one guy made.

The argument, I suppose, is the same one I just gave in Discord about MQTT: I don't need the features, I don't want the complexities, and many of the things will just add headaches that I didn't have before.

I am going to close this ticket with a respectful, but firm "thanks, but no thanks".

That of course doesn't mean that I'll never ever do NATS or MQTT, or that you can't fork it and do it yourself. In fact if you do, and it's done right and provides value without adding too much complexity, I may even merge it (no guarantees though).

I hope you understand. Happy weekend.


BreizhHardware referenced this issue

2026-05-07 01:01:02 +02:00

[PR #68] [MERGED] Add Arch Linux installation instructions #1213


[GH-ISSUE #78] NATS #63