[GH-ISSUE #210] Implement Prometheus /metrics endpoint to ntfy server #166

Closed
opened 2026-05-07 00:20:53 +02:00 by BreizhHardware · 21 comments

Originally created by @rogeliodh on GitHub (Apr 11, 2022).
Original GitHub issue: https://github.com/binwiederhier/ntfy/issues/210

I think it would be very useful to implement a /metrics endpoint in the ntfy server so that we could monitor ntfy itself and get some statistics on the service.

See https://prometheus.io/docs/guides/go-application/ for details; there is an official library for Go, so it should be relatively easy to integrate.

One thing to note is that "/metrics" is not reserved right now, so if there is a topic named "metrics", this would be a backwards-incompatible change. If that is a problem, the endpoint could be moved to another path (e.g. "/system/metrics", "/server/metrics", etc.).

BreizhHardware 2026-05-07 00:20:53 +02:00

@binwiederhier commented on GitHub (Apr 12, 2022):

I looked at your link and even briefly implemented it as a test (https://github.com/binwiederhier/ntfy/commit/3b1852541edf66689662d902c0811161b73f3c95). Unfortunately, I don't like it, because it brings in 7 new dependencies and outputs a very Prometheus-specific format. I also can't really expose this on the public server because it gives away too much about the system and may be a security risk (maybe?).

If it was easy to implement without a library, maybe. But even then I thought it was gonna be some custom JSON that I output, not the weird Prometheus format.

If people really really really want this I can be convinced otherwise, but I'm trying to protect the (already large) dependency tree.

![image](https://user-images.githubusercontent.com/664597/162855509-53ff93f1-fbf4-487b-90dd-9c1c1123bae4.png)

Sample format:

$ curl localhost/metrics
# HELP go_gc_cycles_automatic_gc_cycles_total Count of completed GC cycles generated by the Go runtime.
# TYPE go_gc_cycles_automatic_gc_cycles_total counter
go_gc_cycles_automatic_gc_cycles_total 1
# HELP go_gc_cycles_forced_gc_cycles_total Count of completed GC cycles forced by the application.
# TYPE go_gc_cycles_forced_gc_cycles_total counter
go_gc_cycles_forced_gc_cycles_total 0
# HELP go_gc_cycles_total_gc_cycles_total Count of all completed GC cycles.
# TYPE go_gc_cycles_total_gc_cycles_total counter
go_gc_cycles_total_gc_cycles_total 1
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.557e-05
go_gc_duration_seconds{quantile="0.25"} 2.557e-05
go_gc_duration_seconds{quantile="0.5"} 2.557e-05
go_gc_duration_seconds{quantile="0.75"} 2.557e-05
go_gc_duration_seconds{quantile="1"} 2.557e-05
go_gc_duration_seconds_sum 2.557e-05
go_gc_duration_seconds_count 1
# HELP go_gc_heap_allocs_by_size_bytes_total Distribution of heap allocations by approximate size. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_allocs_by_size_bytes_total histogram
go_gc_heap_allocs_by_size_bytes_total_bucket{le="8.999999999999998"} 3187
go_gc_heap_allocs_by_size_bytes_total_bucket{le="24.999999999999996"} 24468
go_gc_heap_allocs_by_size_bytes_total_bucket{le="64.99999999999999"} 33920
go_gc_heap_allocs_by_size_bytes_total_bucket{le="144.99999999999997"} 44153
go_gc_heap_allocs_by_size_bytes_total_bucket{le="320.99999999999994"} 45364
go_gc_heap_allocs_by_size_bytes_total_bucket{le="704.9999999999999"} 45919
go_gc_heap_allocs_by_size_bytes_total_bucket{le="1536.9999999999998"} 46183
go_gc_heap_allocs_by_size_bytes_total_bucket{le="3200.9999999999995"} 46375
go_gc_heap_allocs_by_size_bytes_total_bucket{le="6528.999999999999"} 46533
go_gc_heap_allocs_by_size_bytes_total_bucket{le="13568.999999999998"} 46610
go_gc_heap_allocs_by_size_bytes_total_bucket{le="27264.999999999996"} 46635
go_gc_heap_allocs_by_size_bytes_total_bucket{le="+Inf"} 46652
go_gc_heap_allocs_by_size_bytes_total_sum 6.127456e+06
go_gc_heap_allocs_by_size_bytes_total_count 46652
# HELP go_gc_heap_allocs_bytes_total Cumulative sum of memory allocated to the heap by the application.
# TYPE go_gc_heap_allocs_bytes_total counter
go_gc_heap_allocs_bytes_total 6.127456e+06
# HELP go_gc_heap_allocs_objects_total Cumulative count of heap allocations triggered by the application. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_allocs_objects_total counter
go_gc_heap_allocs_objects_total 46652
# HELP go_gc_heap_frees_by_size_bytes_total Distribution of freed heap allocations by approximate size. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_frees_by_size_bytes_total histogram
go_gc_heap_frees_by_size_bytes_total_bucket{le="8.999999999999998"} 115
go_gc_heap_frees_by_size_bytes_total_bucket{le="24.999999999999996"} 4332
go_gc_heap_frees_by_size_bytes_total_bucket{le="64.99999999999999"} 5493
go_gc_heap_frees_by_size_bytes_total_bucket{le="144.99999999999997"} 11558
go_gc_heap_frees_by_size_bytes_total_bucket{le="320.99999999999994"} 11855
go_gc_heap_frees_by_size_bytes_total_bucket{le="704.9999999999999"} 11997
go_gc_heap_frees_by_size_bytes_total_bucket{le="1536.9999999999998"} 12092
go_gc_heap_frees_by_size_bytes_total_bucket{le="3200.9999999999995"} 12155
go_gc_heap_frees_by_size_bytes_total_bucket{le="6528.999999999999"} 12208
go_gc_heap_frees_by_size_bytes_total_bucket{le="13568.999999999998"} 12217
go_gc_heap_frees_by_size_bytes_total_bucket{le="27264.999999999996"} 12221
go_gc_heap_frees_by_size_bytes_total_bucket{le="+Inf"} 12228
go_gc_heap_frees_by_size_bytes_total_sum 1.9384e+06
go_gc_heap_frees_by_size_bytes_total_count 12228
# HELP go_gc_heap_frees_bytes_total Cumulative sum of heap memory freed by the garbage collector.
# TYPE go_gc_heap_frees_bytes_total counter
go_gc_heap_frees_bytes_total 1.9384e+06
# HELP go_gc_heap_frees_objects_total Cumulative count of heap allocations whose storage was freed by the garbage collector. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks.
# TYPE go_gc_heap_frees_objects_total counter
go_gc_heap_frees_objects_total 12228
# HELP go_gc_heap_goal_bytes Heap size target for the end of the GC cycle.
# TYPE go_gc_heap_goal_bytes gauge
go_gc_heap_goal_bytes 4.268352e+06
# HELP go_gc_heap_objects_objects Number of objects, live or unswept, occupying heap memory.
# TYPE go_gc_heap_objects_objects gauge
go_gc_heap_objects_objects 34424
# HELP go_gc_heap_tiny_allocs_objects_total Count of small allocations that are packed together into blocks. These allocations are counted separately from other allocations because each individual allocation is not tracked by the runtime, only their block. Each block is already accounted for in allocs-by-size and frees-by-size.
# TYPE go_gc_heap_tiny_allocs_objects_total counter
go_gc_heap_tiny_allocs_objects_total 3793
# HELP go_gc_pauses_seconds_total Distribution individual GC-related stop-the-world pause latencies.
# TYPE go_gc_pauses_seconds_total histogram
go_gc_pauses_seconds_total_bucket{le="-5e-324"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-10"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-09"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999998e-08"} 0
go_gc_pauses_seconds_total_bucket{le="1.0239999999999999e-06"} 0
go_gc_pauses_seconds_total_bucket{le="1.0239999999999999e-05"} 1
go_gc_pauses_seconds_total_bucket{le="0.00010239999999999998"} 3
go_gc_pauses_seconds_total_bucket{le="0.0010485759999999998"} 3
go_gc_pauses_seconds_total_bucket{le="0.010485759999999998"} 3
go_gc_pauses_seconds_total_bucket{le="0.10485759999999998"} 3
go_gc_pauses_seconds_total_bucket{le="+Inf"} 3
go_gc_pauses_seconds_total_sum NaN
go_gc_pauses_seconds_total_count 3
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 14
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.18"} 1
# HELP go_memory_classes_heap_free_bytes Memory that is completely free and eligible to be returned to the underlying system, but has not been. This metric is the runtime's estimate of free address space that is backed by physical memory.
# TYPE go_memory_classes_heap_free_bytes gauge
go_memory_classes_heap_free_bytes 229376
# HELP go_memory_classes_heap_objects_bytes Memory occupied by live objects and dead objects that have not yet been marked free by the garbage collector.
# TYPE go_memory_classes_heap_objects_bytes gauge
go_memory_classes_heap_objects_bytes 4.189056e+06
# HELP go_memory_classes_heap_released_bytes Memory that is completely free and has been returned to the underlying system. This metric is the runtime's estimate of free address space that is still mapped into the process, but is not backed by physical memory.
# TYPE go_memory_classes_heap_released_bytes gauge
go_memory_classes_heap_released_bytes 2.400256e+06
# HELP go_memory_classes_heap_stacks_bytes Memory allocated from the heap that is reserved for stack space, whether or not it is currently in-use.
# TYPE go_memory_classes_heap_stacks_bytes gauge
go_memory_classes_heap_stacks_bytes 950272
# HELP go_memory_classes_heap_unused_bytes Memory that is reserved for heap objects but is not currently used to hold heap objects.
# TYPE go_memory_classes_heap_unused_bytes gauge
go_memory_classes_heap_unused_bytes 586880
# HELP go_memory_classes_metadata_mcache_free_bytes Memory that is reserved for runtime mcache structures, but not in-use.
# TYPE go_memory_classes_metadata_mcache_free_bytes gauge
go_memory_classes_metadata_mcache_free_bytes 6000
# HELP go_memory_classes_metadata_mcache_inuse_bytes Memory that is occupied by runtime mcache structures that are currently being used.
# TYPE go_memory_classes_metadata_mcache_inuse_bytes gauge
go_memory_classes_metadata_mcache_inuse_bytes 9600
# HELP go_memory_classes_metadata_mspan_free_bytes Memory that is reserved for runtime mspan structures, but not in-use.
# TYPE go_memory_classes_metadata_mspan_free_bytes gauge
go_memory_classes_metadata_mspan_free_bytes 7888
# HELP go_memory_classes_metadata_mspan_inuse_bytes Memory that is occupied by runtime mspan structures that are currently being used.
# TYPE go_memory_classes_metadata_mspan_inuse_bytes gauge
go_memory_classes_metadata_mspan_inuse_bytes 90032
# HELP go_memory_classes_metadata_other_bytes Memory that is reserved for or used to hold runtime metadata.
# TYPE go_memory_classes_metadata_other_bytes gauge
go_memory_classes_metadata_other_bytes 4.590896e+06
# HELP go_memory_classes_os_stacks_bytes Stack memory allocated by the underlying operating system.
# TYPE go_memory_classes_os_stacks_bytes gauge
go_memory_classes_os_stacks_bytes 0
# HELP go_memory_classes_other_bytes Memory used by execution trace buffers, structures for debugging the runtime, finalizer and profiler specials, and more.
# TYPE go_memory_classes_other_bytes gauge
go_memory_classes_other_bytes 978293
# HELP go_memory_classes_profiling_buckets_bytes Memory that is used by the stack trace hash map used for profiling.
# TYPE go_memory_classes_profiling_buckets_bytes gauge
go_memory_classes_profiling_buckets_bytes 5371
# HELP go_memory_classes_total_bytes All memory mapped by the Go runtime into the current process as read-write. Note that this does not include memory mapped by code called via cgo or via the syscall package. Sum of all metrics in /memory/classes.
# TYPE go_memory_classes_total_bytes gauge
go_memory_classes_total_bytes 1.404392e+07
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 4.189056e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 6.127456e+06
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 5371
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 16021
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 0
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 4.590896e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 4.189056e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 2.629632e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 4.775936e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 34424
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 2.400256e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 7.405568e+06
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.6497237847950306e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 50445
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 9600
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 90032
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 97920
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 4.268352e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 978293
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 950272
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 950272
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.404392e+07
# HELP go_sched_goroutines_goroutines Count of live goroutines.
# TYPE go_sched_goroutines_goroutines gauge
go_sched_goroutines_goroutines 13
# HELP go_sched_latencies_seconds Distribution of the time goroutines have spent in the scheduler in a runnable state before actually running.
# TYPE go_sched_latencies_seconds histogram
go_sched_latencies_seconds_bucket{le="-5e-324"} 0
go_sched_latencies_seconds_bucket{le="9.999999999999999e-10"} 34
go_sched_latencies_seconds_bucket{le="9.999999999999999e-09"} 34
go_sched_latencies_seconds_bucket{le="9.999999999999998e-08"} 41
go_sched_latencies_seconds_bucket{le="1.0239999999999999e-06"} 58
go_sched_latencies_seconds_bucket{le="1.0239999999999999e-05"} 62
go_sched_latencies_seconds_bucket{le="0.00010239999999999998"} 66
go_sched_latencies_seconds_bucket{le="0.0010485759999999998"} 66
go_sched_latencies_seconds_bucket{le="0.010485759999999998"} 66
go_sched_latencies_seconds_bucket{le="0.10485759999999998"} 66
go_sched_latencies_seconds_bucket{le="+Inf"} 66
go_sched_latencies_seconds_sum NaN
go_sched_latencies_seconds_count 66
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 11
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 11
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.0262912e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.64972378424e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.442832384e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 0
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
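For what it's worth, the text exposition format shown above is simple enough that a dependency-free version is conceivable. Below is a minimal, hypothetical sketch of the "without a library" option mentioned earlier: it renders counters in the same `# HELP` / `# TYPE` layout using only the standard library. The metric name is invented for illustration and is not an actual ntfy metric.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderMetrics emits counters in the Prometheus text exposition
// format (# HELP and # TYPE lines followed by one sample line each)
// without any third-party dependency. Illustrative sketch only.
func renderMetrics(counters map[string]int64, help map[string]string) string {
	names := make([]string, 0, len(counters))
	for name := range counters {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic output across scrapes

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "# HELP %s %s\n", name, help[name])
		fmt.Fprintf(&b, "# TYPE %s counter\n", name)
		fmt.Fprintf(&b, "%s %d\n", name, counters[name])
	}
	return b.String()
}

func main() {
	out := renderMetrics(
		map[string]int64{"ntfy_messages_published_total": 42},
		map[string]string{"ntfy_messages_published_total": "Total number of messages published."},
	)
	fmt.Print(out)
}
```

This only covers plain counters, not histograms, summaries, or labels, which is where a hand-rolled approach starts to cost more than the library it avoids.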
<!-- gh-comment-id:1095742026 --> @binwiederhier commented on GitHub (Apr 12, 2022): I looked at your link and even briefly implemented it as a test (https://github.com/binwiederhier/ntfy/commit/3b1852541edf66689662d902c0811161b73f3c95). Unfortunately, I don't like it, because it brings in 7 new dependencies and outputs a very Prometheus specific format. I also can't really expose this on the public server because it gives away too much about the system and may be a security risk (maybe?). If it was easy to implement without a library, maybe. But even then I thought it was gonna be some custom JSON that I output, not the weird Prometheus format. If people really really really want this I can be convinced otherwise, but I'm trying to protect the (already large) dependency tree. ![image](https://user-images.githubusercontent.com/664597/162855509-53ff93f1-fbf4-487b-90dd-9c1c1123bae4.png) Sample format: ``` $ curl localhost/metrics # HELP go_gc_cycles_automatic_gc_cycles_total Count of completed GC cycles generated by the Go runtime. # TYPE go_gc_cycles_automatic_gc_cycles_total counter go_gc_cycles_automatic_gc_cycles_total 1 # HELP go_gc_cycles_forced_gc_cycles_total Count of completed GC cycles forced by the application. # TYPE go_gc_cycles_forced_gc_cycles_total counter go_gc_cycles_forced_gc_cycles_total 0 # HELP go_gc_cycles_total_gc_cycles_total Count of all completed GC cycles. # TYPE go_gc_cycles_total_gc_cycles_total counter go_gc_cycles_total_gc_cycles_total 1 # HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles. 
# TYPE go_gc_duration_seconds summary go_gc_duration_seconds{quantile="0"} 2.557e-05 go_gc_duration_seconds{quantile="0.25"} 2.557e-05 go_gc_duration_seconds{quantile="0.5"} 2.557e-05 go_gc_duration_seconds{quantile="0.75"} 2.557e-05 go_gc_duration_seconds{quantile="1"} 2.557e-05 go_gc_duration_seconds_sum 2.557e-05 go_gc_duration_seconds_count 1 # HELP go_gc_heap_allocs_by_size_bytes_total Distribution of heap allocations by approximate size. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks. # TYPE go_gc_heap_allocs_by_size_bytes_total histogram go_gc_heap_allocs_by_size_bytes_total_bucket{le="8.999999999999998"} 3187 go_gc_heap_allocs_by_size_bytes_total_bucket{le="24.999999999999996"} 24468 go_gc_heap_allocs_by_size_bytes_total_bucket{le="64.99999999999999"} 33920 go_gc_heap_allocs_by_size_bytes_total_bucket{le="144.99999999999997"} 44153 go_gc_heap_allocs_by_size_bytes_total_bucket{le="320.99999999999994"} 45364 go_gc_heap_allocs_by_size_bytes_total_bucket{le="704.9999999999999"} 45919 go_gc_heap_allocs_by_size_bytes_total_bucket{le="1536.9999999999998"} 46183 go_gc_heap_allocs_by_size_bytes_total_bucket{le="3200.9999999999995"} 46375 go_gc_heap_allocs_by_size_bytes_total_bucket{le="6528.999999999999"} 46533 go_gc_heap_allocs_by_size_bytes_total_bucket{le="13568.999999999998"} 46610 go_gc_heap_allocs_by_size_bytes_total_bucket{le="27264.999999999996"} 46635 go_gc_heap_allocs_by_size_bytes_total_bucket{le="+Inf"} 46652 go_gc_heap_allocs_by_size_bytes_total_sum 6.127456e+06 go_gc_heap_allocs_by_size_bytes_total_count 46652 # HELP go_gc_heap_allocs_bytes_total Cumulative sum of memory allocated to the heap by the application. # TYPE go_gc_heap_allocs_bytes_total counter go_gc_heap_allocs_bytes_total 6.127456e+06 # HELP go_gc_heap_allocs_objects_total Cumulative count of heap allocations triggered by the application. 
Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks. # TYPE go_gc_heap_allocs_objects_total counter go_gc_heap_allocs_objects_total 46652 # HELP go_gc_heap_frees_by_size_bytes_total Distribution of freed heap allocations by approximate size. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks. # TYPE go_gc_heap_frees_by_size_bytes_total histogram go_gc_heap_frees_by_size_bytes_total_bucket{le="8.999999999999998"} 115 go_gc_heap_frees_by_size_bytes_total_bucket{le="24.999999999999996"} 4332 go_gc_heap_frees_by_size_bytes_total_bucket{le="64.99999999999999"} 5493 go_gc_heap_frees_by_size_bytes_total_bucket{le="144.99999999999997"} 11558 go_gc_heap_frees_by_size_bytes_total_bucket{le="320.99999999999994"} 11855 go_gc_heap_frees_by_size_bytes_total_bucket{le="704.9999999999999"} 11997 go_gc_heap_frees_by_size_bytes_total_bucket{le="1536.9999999999998"} 12092 go_gc_heap_frees_by_size_bytes_total_bucket{le="3200.9999999999995"} 12155 go_gc_heap_frees_by_size_bytes_total_bucket{le="6528.999999999999"} 12208 go_gc_heap_frees_by_size_bytes_total_bucket{le="13568.999999999998"} 12217 go_gc_heap_frees_by_size_bytes_total_bucket{le="27264.999999999996"} 12221 go_gc_heap_frees_by_size_bytes_total_bucket{le="+Inf"} 12228 go_gc_heap_frees_by_size_bytes_total_sum 1.9384e+06 go_gc_heap_frees_by_size_bytes_total_count 12228 # HELP go_gc_heap_frees_bytes_total Cumulative sum of heap memory freed by the garbage collector. # TYPE go_gc_heap_frees_bytes_total counter go_gc_heap_frees_bytes_total 1.9384e+06 # HELP go_gc_heap_frees_objects_total Cumulative count of heap allocations whose storage was freed by the garbage collector. Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, only tiny blocks. 
# TYPE go_gc_heap_frees_objects_total counter go_gc_heap_frees_objects_total 12228 # HELP go_gc_heap_goal_bytes Heap size target for the end of the GC cycle. # TYPE go_gc_heap_goal_bytes gauge go_gc_heap_goal_bytes 4.268352e+06 # HELP go_gc_heap_objects_objects Number of objects, live or unswept, occupying heap memory. # TYPE go_gc_heap_objects_objects gauge go_gc_heap_objects_objects 34424 # HELP go_gc_heap_tiny_allocs_objects_total Count of small allocations that are packed together into blocks. These allocations are counted separately from other allocations because each individual allocation is not tracked by the runtime, only their block. Each block is already accounted for in allocs-by-size and frees-by-size. # TYPE go_gc_heap_tiny_allocs_objects_total counter go_gc_heap_tiny_allocs_objects_total 3793 # HELP go_gc_pauses_seconds_total Distribution individual GC-related stop-the-world pause latencies. # TYPE go_gc_pauses_seconds_total histogram go_gc_pauses_seconds_total_bucket{le="-5e-324"} 0 go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-10"} 0 go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-09"} 0 go_gc_pauses_seconds_total_bucket{le="9.999999999999998e-08"} 0 go_gc_pauses_seconds_total_bucket{le="1.0239999999999999e-06"} 0 go_gc_pauses_seconds_total_bucket{le="1.0239999999999999e-05"} 1 go_gc_pauses_seconds_total_bucket{le="0.00010239999999999998"} 3 go_gc_pauses_seconds_total_bucket{le="0.0010485759999999998"} 3 go_gc_pauses_seconds_total_bucket{le="0.010485759999999998"} 3 go_gc_pauses_seconds_total_bucket{le="0.10485759999999998"} 3 go_gc_pauses_seconds_total_bucket{le="+Inf"} 3 go_gc_pauses_seconds_total_sum NaN go_gc_pauses_seconds_total_count 3 # HELP go_goroutines Number of goroutines that currently exist. # TYPE go_goroutines gauge go_goroutines 14 # HELP go_info Information about the Go environment. 
# TYPE go_info gauge go_info{version="go1.18"} 1 # HELP go_memory_classes_heap_free_bytes Memory that is completely free and eligible to be returned to the underlying system, but has not been. This metric is the runtime's estimate of free address space that is backed by physical memory. # TYPE go_memory_classes_heap_free_bytes gauge go_memory_classes_heap_free_bytes 229376 # HELP go_memory_classes_heap_objects_bytes Memory occupied by live objects and dead objects that have not yet been marked free by the garbage collector. # TYPE go_memory_classes_heap_objects_bytes gauge go_memory_classes_heap_objects_bytes 4.189056e+06 # HELP go_memory_classes_heap_released_bytes Memory that is completely free and has been returned to the underlying system. This metric is the runtime's estimate of free address space that is still mapped into the process, but is not backed by physical memory. # TYPE go_memory_classes_heap_released_bytes gauge go_memory_classes_heap_released_bytes 2.400256e+06 # HELP go_memory_classes_heap_stacks_bytes Memory allocated from the heap that is reserved for stack space, whether or not it is currently in-use. # TYPE go_memory_classes_heap_stacks_bytes gauge go_memory_classes_heap_stacks_bytes 950272 # HELP go_memory_classes_heap_unused_bytes Memory that is reserved for heap objects but is not currently used to hold heap objects. # TYPE go_memory_classes_heap_unused_bytes gauge go_memory_classes_heap_unused_bytes 586880 # HELP go_memory_classes_metadata_mcache_free_bytes Memory that is reserved for runtime mcache structures, but not in-use. # TYPE go_memory_classes_metadata_mcache_free_bytes gauge go_memory_classes_metadata_mcache_free_bytes 6000 # HELP go_memory_classes_metadata_mcache_inuse_bytes Memory that is occupied by runtime mcache structures that are currently being used. 
# TYPE go_memory_classes_metadata_mcache_inuse_bytes gauge go_memory_classes_metadata_mcache_inuse_bytes 9600 # HELP go_memory_classes_metadata_mspan_free_bytes Memory that is reserved for runtime mspan structures, but not in-use. # TYPE go_memory_classes_metadata_mspan_free_bytes gauge go_memory_classes_metadata_mspan_free_bytes 7888 # HELP go_memory_classes_metadata_mspan_inuse_bytes Memory that is occupied by runtime mspan structures that are currently being used. # TYPE go_memory_classes_metadata_mspan_inuse_bytes gauge go_memory_classes_metadata_mspan_inuse_bytes 90032 # HELP go_memory_classes_metadata_other_bytes Memory that is reserved for or used to hold runtime metadata. # TYPE go_memory_classes_metadata_other_bytes gauge go_memory_classes_metadata_other_bytes 4.590896e+06 # HELP go_memory_classes_os_stacks_bytes Stack memory allocated by the underlying operating system. # TYPE go_memory_classes_os_stacks_bytes gauge go_memory_classes_os_stacks_bytes 0 # HELP go_memory_classes_other_bytes Memory used by execution trace buffers, structures for debugging the runtime, finalizer and profiler specials, and more. # TYPE go_memory_classes_other_bytes gauge go_memory_classes_other_bytes 978293 # HELP go_memory_classes_profiling_buckets_bytes Memory that is used by the stack trace hash map used for profiling. # TYPE go_memory_classes_profiling_buckets_bytes gauge go_memory_classes_profiling_buckets_bytes 5371 # HELP go_memory_classes_total_bytes All memory mapped by the Go runtime into the current process as read-write. Note that this does not include memory mapped by code called via cgo or via the syscall package. Sum of all metrics in /memory/classes. # TYPE go_memory_classes_total_bytes gauge go_memory_classes_total_bytes 1.404392e+07 # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use. 
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 4.189056e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 6.127456e+06
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 5371
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 16021
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 0
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 4.590896e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 4.189056e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 2.629632e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 4.775936e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 34424
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 2.400256e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 7.405568e+06
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.6497237847950306e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 50445
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 9600
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 90032
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 97920
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 4.268352e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 978293
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 950272
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 950272
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.404392e+07
# HELP go_sched_goroutines_goroutines Count of live goroutines.
# TYPE go_sched_goroutines_goroutines gauge
go_sched_goroutines_goroutines 13
# HELP go_sched_latencies_seconds Distribution of the time goroutines have spent in the scheduler in a runnable state before actually running.
# TYPE go_sched_latencies_seconds histogram
go_sched_latencies_seconds_bucket{le="-5e-324"} 0
go_sched_latencies_seconds_bucket{le="9.999999999999999e-10"} 34
go_sched_latencies_seconds_bucket{le="9.999999999999999e-09"} 34
go_sched_latencies_seconds_bucket{le="9.999999999999998e-08"} 41
go_sched_latencies_seconds_bucket{le="1.0239999999999999e-06"} 58
go_sched_latencies_seconds_bucket{le="1.0239999999999999e-05"} 62
go_sched_latencies_seconds_bucket{le="0.00010239999999999998"} 66
go_sched_latencies_seconds_bucket{le="0.0010485759999999998"} 66
go_sched_latencies_seconds_bucket{le="0.010485759999999998"} 66
go_sched_latencies_seconds_bucket{le="0.10485759999999998"} 66
go_sched_latencies_seconds_bucket{le="+Inf"} 66
go_sched_latencies_seconds_sum NaN
go_sched_latencies_seconds_count 66
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 11
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 11
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.0262912e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.64972378424e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.442832384e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 0
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
```
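[Editor's note] The exposition format above is plain line-oriented text, which is part of why the official client library feels heavyweight for a small server. As a dependency-free illustration (not ntfy code; `renderCounter` is a hypothetical helper), here is what producing one metric in that format amounts to:

```go
package main

import (
	"fmt"
	"os"
)

// renderCounter renders one metric in the Prometheus text exposition
// format: a # HELP line, a # TYPE line, and the sample itself.
func renderCounter(name, help string, value float64) string {
	return fmt.Sprintf("# HELP %s %s\n# TYPE %s counter\n%s %g\n",
		name, help, name, name, value)
}

func main() {
	// Hypothetical metric name, for illustration only.
	fmt.Fprint(os.Stdout, renderCounter("ntfy_messages_published_total",
		"Total number of published messages.", 42))
}
```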

@rogeliodh commented on GitHub (Apr 12, 2022):

fair enough!

about dependencies, there is an alternative library which supposedly (I haven't used it) has fewer dependencies: https://github.com/VictoriaMetrics/metrics

about the endpoint being public: yes, I think it should be protected by some auth, or even published on another port

but let's wait to see if there are more users that want this feature :)
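[Editor's note] Serving `/metrics` on a separate internal port, as suggested here, only needs a second `http.ServeMux` bound to a different address. A minimal stdlib-only sketch (the bearer-token check and metric line are illustrative assumptions, not ntfy's actual implementation):

```go
package main

import (
	"fmt"
	"net/http"
)

// metricsAuthorized is a hypothetical guard: only requests carrying the
// expected bearer token may read /metrics. A real deployment would more
// likely firewall the port or put a reverse proxy in front instead.
func metricsAuthorized(authHeader, token string) bool {
	return token != "" && authHeader == "Bearer "+token
}

// newMetricsMux builds a mux that serves only /metrics, so it can be
// bound to an internal address separate from the public server.
func newMetricsMux(token string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		if !metricsAuthorized(r.Header.Get("Authorization"), token) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		fmt.Fprintln(w, "ntfy_messages_published_total 42") // placeholder body
	})
	return mux
}

func main() {
	mux := newMetricsMux("s3cret") // token for illustration only
	_ = mux
	// In a real server: go http.ListenAndServe("10.0.2.1:9090", mux)
	fmt.Println(metricsAuthorized("Bearer s3cret", "s3cret")) // true
}
```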


@mhameed commented on GitHub (Apr 29, 2022):

Hi,
I would also be interested in seeing ntfy prometheus metrics implemented.
I would also second the idea that the prometheus/victoria/[OpenMetrics](https://github.com/OpenObservability/OpenMetrics/) information be served via a different interface/port than the actual ntfy service.
Thanks.

@benisai commented on GitHub (Aug 4, 2022):

Interested in this


@genofire commented on GitHub (Aug 12, 2022):

Maybe we should also implement /ready and /healthy endpoints, so it can run in a k8s environment.
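[Editor's note] For Kubernetes probes, only the HTTP status code matters; the body is for humans. A minimal sketch of what such hypothetical `/healthy` (liveness) and `/ready` (readiness) handlers could return (not ntfy's actual API):

```go
package main

import "fmt"

// healthBody returns the status code and JSON body a hypothetical
// /healthy or /ready endpoint could serve: 200 when fine, 503 when not.
func healthBody(healthy bool) (status int, body string) {
	if healthy {
		return 200, `{"healthy":true}`
	}
	return 503, `{"healthy":false}`
}

func main() {
	status, body := healthBody(true)
	fmt.Println(status, body) // 200 {"healthy":true}
}
```

A Deployment would then point `livenessProbe`/`readinessProbe` `httpGet` entries at those paths.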


@nicois commented on GitHub (Dec 21, 2022):

VictoriaMetrics looks like a good candidate. Adding a configuration option for the path, defaulting to undefined/off, should be sufficient here, I believe. Nginx/Apache can use ACLs to prevent external hosts from reaching whatever endpoint is configured.
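[Editor's note] The ACL approach mentioned here could look roughly like the following nginx sketch (the network range and upstream port are placeholders, not ntfy defaults):

```nginx
# Restrict the metrics path to internal clients; everything else 403s.
location /metrics {
    allow 10.0.0.0/8;               # internal network only (adjust)
    deny  all;
    proxy_pass http://127.0.0.1:2586;  # wherever ntfy is listening
}
```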


@genofire commented on GitHub (Dec 23, 2022):

https://github.com/binwiederhier/ntfy/issues/210#issuecomment-1095742026

I am missing metrics here, like:

  • subscription_total{topic="x"}
  • messages_total{topic="x"}

for finding big topics based on subscription and message rate.
Maybe also a histogram of the time to get and send to all receivers, like: delivery{le="+Inf", topic="x"}
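[Editor's note] The `le="+Inf"` sample above refers to Prometheus histogram semantics: buckets are cumulative, and the implicit `+Inf` bucket equals the total observation count. A dependency-free sketch of that bookkeeping (the `delivery_seconds` name is illustrative, not a real ntfy metric):

```go
package main

import "fmt"

// histogram mimics Prometheus histogram semantics: counts[i] is the
// cumulative number of observations <= bounds[i], and the implicit
// final +Inf bucket equals the total observation count.
type histogram struct {
	bounds []float64
	counts []uint64 // one per bound, cumulative
	count  uint64   // total observations (the +Inf bucket)
	sum    float64
}

func newHistogram(bounds []float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]uint64, len(bounds))}
}

func (h *histogram) observe(v float64) {
	for i, b := range h.bounds {
		if v <= b {
			h.counts[i]++
		}
	}
	h.count++
	h.sum += v
}

func main() {
	h := newHistogram([]float64{0.01, 0.1, 1})
	for _, v := range []float64{0.005, 0.05, 2.5} {
		h.observe(v)
	}
	for i, b := range h.bounds {
		fmt.Printf("delivery_seconds_bucket{le=%q} %d\n", fmt.Sprint(b), h.counts[i])
	}
	fmt.Printf("delivery_seconds_bucket{le=\"+Inf\"} %d\n", h.count)
}
```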


@binwiederhier commented on GitHub (Mar 7, 2023):

@rogeliodh @nicois @genofire @benisai @mhameed, and anyone else who is interested, what metrics would you like to see?

Here's my current WIP:

```go
messagesPublishedSuccess: prometheus.NewCounter(prometheus.CounterOpts{
	Name: "ntfy_messages_published_success",
}),
messagesPublishedFailure: prometheus.NewCounter(prometheus.CounterOpts{
	Name: "ntfy_messages_published_failure",
}),
messagesCached: prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "ntfy_messages_cached_total",
}),
firebasePublishedSuccess: prometheus.NewCounter(prometheus.CounterOpts{
	Name: "ntfy_firebase_published_success",
}),
firebasePublishedFailure: prometheus.NewCounter(prometheus.CounterOpts{
	Name: "ntfy_firebase_published_failure",
}),
emailsPublishedSuccess: prometheus.NewCounter(prometheus.CounterOpts{
	Name: "ntfy_emails_sent_success",
}),
emailsPublishedFailure: prometheus.NewCounter(prometheus.CounterOpts{
	Name: "ntfy_emails_sent_failure",
}),
visitors: prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "ntfy_visitors_total",
}),
subscribers: prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "ntfy_subscribers_total",
}),
topics: prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "ntfy_topics_total",
}),
httpRequests: prometheus.NewCounterVec(prometheus.CounterOpts{
	Name: "ntfy_http_requests_total",
}, []string{"http_code", "ntfy_code", "http_method"}),
```

Here's my dabbling dashboard:

![image](https://user-images.githubusercontent.com/664597/223312298-9104e347-44b5-46a5-8271-3e11c0be047b.png)

@genofire commented on GitHub (Mar 7, 2023):

Maybe, for finding big topics:

```go
subscribers: prometheus.NewGaugeVec(prometheus.GaugeOpts{
	Name: "ntfy_subscribers_total",
}, []string{"topic"}),
```

(Also on other metrics, based on topic).


@rogeliodh commented on GitHub (Mar 7, 2023):

I think adding a `topic` label is not a good idea because there could be thousands of topics, and adding a high-cardinality dimension to Prometheus metrics is bad practice.

So I think the original proposal is OK, but I was wondering whether these metrics are enough to replicate the statistics you generate daily on the stats topic:

```
IPs: 23000
Clients: 23551
- Google Play: 7905
- F-Droid: 12104
- iOS: 2910
- curl: 5
- other: 627
Messages:
- Successful: 179595
- Failed (rate limited): 49016
- Failed (other): 443320
```

and I think we would be missing:

  • an `error` label on `ntfy_messages_published_failure` (and the Firebase and email metrics, too)

for IPs, I'm not really sure ntfy could generate that stat...

for Clients... maybe fold it into the "published" metrics: add a `client_type` label to all `ntfy_*_published_*` metrics so we can get the number of published messages per client type.

I'm not sure about the last one. @binwiederhier should know better ;)
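[Editor's note] A `client_type` label like the one proposed above would stay low-cardinality because it maps arbitrary User-Agent strings onto a handful of values. A sketch of such a mapping (the matched substrings are illustrative guesses, not the actual User-Agent strings the ntfy apps send):

```go
package main

import (
	"fmt"
	"strings"
)

// clientType maps a User-Agent header to a low-cardinality label value,
// mirroring the categories from the daily stats (Android, iOS, curl,
// other). The substrings are assumptions, not ntfy's real client IDs.
func clientType(userAgent string) string {
	ua := strings.ToLower(userAgent)
	switch {
	case strings.Contains(ua, "ntfy-ios"):
		return "ios"
	case strings.Contains(ua, "ntfy-android"):
		return "android"
	case strings.Contains(ua, "curl"):
		return "curl"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(clientType("curl/7.79.1")) // curl
	fmt.Println(clientType("Mozilla/5.0")) // other
}
```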


@xenrox commented on GitHub (Mar 9, 2023):

What about number of user registrations?
That could be useful for abuse prevention/detection.


@binwiederhier commented on GitHub (Mar 14, 2023):

I think I am content with the way I have implemented this. I may add a few more metrics and then call it a day for now. We can always add more or change it later. The majority of the work for me will be to Ansible-ize the Grafana+Prometheus installation.

![image](https://user-images.githubusercontent.com/664597/224882937-2d043e5c-f587-4613-a177-e2f4289a1045.png)

@binwiederhier commented on GitHub (Mar 14, 2023):

The funniest thing: Apparently some other dependency had pulled in the prometheus client already, so this won't add any dependencies, whooo.

![image](https://user-images.githubusercontent.com/664597/225029948-6ad8a1ba-15d1-40dc-ad04-7d334f5992b9.png)

@binwiederhier commented on GitHub (Mar 15, 2023):

📢 Question around the /metrics endpoint for Prometheus: Right now I have it so that if you configure listen-metrics-http, it'll listen on a different IP/port for the metrics. But I feel like some people may want to just use the same interface potentially.

Are there any best practices around this?

What I have:

```yaml
listen-http: ":80"                   # ntfy server, listen on all IPs
listen-metrics-http: "10.0.2.1:9090" # only /metrics endpoint, listen on internal network
```

I want to keep it simple, but flexible enough to be useful for everyone. Thoughts?

Plus, I don't want 100 different metrics config options...
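[Editor's note] For reference, a Prometheus scrape job pointed at the dedicated metrics listener sketched above would look roughly like this (target address taken from the example config, job name is an arbitrary choice):

```yaml
scrape_configs:
  - job_name: "ntfy"
    static_configs:
      - targets: ["10.0.2.1:9090"]
```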


@xenrox commented on GitHub (Mar 15, 2023):

> 📢 Question around the /metrics endpoint for Prometheus: Right now I have it so that if you configure listen-metrics-http, it'll listen on a different IP/port for the metrics. But I feel like some people may want to just use the same interface potentially.
>
> Are there any best practices around this?
>
> What I have:
>
> ```yaml
> listen-http: ":80" # ntfy server, listen on all IPs
> listen-metrics-http: "10.0.2.1:9090" # only /metrics endpoint, listen on internal network
> ```
>
> I want to keep it simple, but flexible enough to be useful for everyone. Thoughts?
>
> Plus, I don't want 100 different metrics config options...

Most of the services I scrape with Prometheus expose their metrics on their default listening port under some variant of "/metrics". As examples, the following services handle it like that: [keycloak](https://www.keycloak.org/server/configuration-metrics#_querying_metrics), [sourcehut](https://meta.sr.ht/metrics), [gitea](https://docs.gitea.io/en-us/config-cheat-sheet/#metrics-metrics), [hedgedoc](https://docs.hedgedoc.org/dev/api/#hedgedoc-server).

[synapse](https://github.com/matrix-org/synapse/blob/develop/docs/metrics-howto.md) is an example that supports both.

I personally prefer a simple "/metrics" on the normal port, but if that somehow does not work with ntfy because of conflicts with topics, a different port seems fine as well and does not really cause any annoyances for setups.

@binwiederhier commented on GitHub (Mar 15, 2023):

Thanks. I'll check them out tonight. It feels weird to me that these tools allow them to be publicly available. Metrics are sensitive information and can give attackers insight.


@xenrox commented on GitHub (Mar 15, 2023):

For sourcehut at least, it is a design decision that everything around metrics/alerts is public, so that users can get a better grasp of what is going on.
Besides that, it is easy enough for sysadmins to protect the metrics with e.g. their reverse proxy.
Setting up [basic auth](https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/), or internal-only access ([1](https://git.xenrox.net/~xenrox/ansible/tree/3c279881e01e922bf2bfcef64302017f31739819/item/roles/nginx/templates/internal_access.conf.j2), [2](https://git.xenrox.net/~xenrox/ansible/tree/3c279881e01e922bf2bfcef64302017f31739819/item/roles/keycloak/files/keycloak.conf#L19)), is pretty straightforward in nginx.

@binwiederhier commented on GitHub (Mar 16, 2023):

@xenrox Thanks again for the examples. They were great. I did this, and I think I like it:

```yaml
# ntfy can expose Prometheus-style metrics via a /metrics endpoint, or on a dedicated listen IP/port.
# Metrics may be considered sensitive information, so before you enable them, be sure you know what you are
# doing, and/or secure access to the endpoint in your reverse proxy.
#
# - enable-metrics enables the /metrics endpoint for the default ntfy server (i.e. HTTP, HTTPS and/or Unix socket)
# - metrics-listen-http exposes the metrics endpoint via a dedicated [IP]:port. If set, this option implicitly
#   enables metrics as well, e.g. "10.0.1.1:9090" or ":9090"
#
enable-metrics: false
metrics-listen-http:
```

The names are a bit weird, but they match the other variables. There is enable-(reservations|login|..). It could be listen-metrics-http to match the other listen- ones, but there's also smtp-listen-addr, so it's already inconsistent...
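[Editor's note] The "implicitly enables" rule described in the config comment boils down to a single disjunction; a sketch of that semantics (struct and field names are illustrative, not the actual ntfy source):

```go
package main

import "fmt"

// config mirrors the two options above; field names are assumptions.
type config struct {
	EnableMetrics     bool   // enable-metrics
	MetricsListenHTTP string // metrics-listen-http, e.g. "10.0.1.1:9090"
}

// metricsEnabled applies the documented rule: setting a dedicated
// metrics listener implicitly enables metrics.
func metricsEnabled(c config) bool {
	return c.EnableMetrics || c.MetricsListenHTTP != ""
}

func main() {
	fmt.Println(metricsEnabled(config{}))                           // false
	fmt.Println(metricsEnabled(config{MetricsListenHTTP: ":9090"})) // true
}
```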


@binwiederhier commented on GitHub (Mar 17, 2023):

This is done and will be in the next release.


@genofire commented on GitHub (Mar 18, 2023):

Do you have a Grafana dashboard for it? I'd like to make it part of my Helm chart.


@binwiederhier commented on GitHub (Mar 19, 2023):

Here's one: https://cdn.discordapp.com/attachments/874398661709295629/1086744190312001556/ntfy-1679170350783.json

It's a WIP, and I'll add one to the docs once I think it's good enough.
