[GH-ISSUE #956] Thousands and thousands of defunct ssl client orphans... #671

Closed
opened 2026-05-07 00:26:28 +02:00 by BreizhHardware · 6 comments

Originally created by @emigrating on GitHub (Nov 21, 2023).
Original GitHub issue: https://github.com/binwiederhier/ntfy/issues/956

🐞 Describe the bug

Not really sure, TBH. I just noticed this when doing my monthly system updates.

💻 Components impacted

Dockerized ntfy server. Running behind Traefik, which in turn is behind a Cloudflare proxy (it was a pain to get running properly at first, but it's been running fine for ages now).

💡 Screenshots and/or logs

❯ dps
    CONTAINER ID   NAMES            SIZE                     STATUS
    0123456789ab   container        0B (virtual 606MB)       Up 2 weeks
    83a65aa860c2   ntfy             0B (virtual 53.2MB)      Up 2 weeks (unhealthy)  
    0123456789ab   container        23.8kB (virtual 564MB)   Up 3 weeks
    629eb6192dc3   traefik          62.9kB (virtual 148MB)   Up 3 weeks
    0123456789ab   container        0B (virtual 6.27MB)      Exited (2) 4 weeks ago
    0123456789ab   container        78.8kB (virtual 765MB)   Up 3 weeks
    0123456789ab   container        64.9kB (virtual 765MB)   Up 3 weeks
    0123456789ab   container        2B (virtual 387MB)       Up 3 weeks
    0123456789ab   container        145kB (virtual 491MB)    Up 2 days
    0123456789ab   container        0B (virtual 38.7MB)      Up 3 weeks (healthy)
    0123456789ab   container        0B (virtual 412MB)       Up 3 weeks (healthy)
    0123456789ab   container        51.5MB (virtual 195MB)   Up 3 weeks
    0123456789ab   container        22.2kB (virtual 203MB)   Up 3 weeks

❯ pstree -p
systemd(1)-+-ModemManager(544846)-+
           ├─containerd-shim(1526704)─┬─ntfy(1526724)─┬─ssl_client(1527025)
           │                          │               ├─ssl_client(1527129)
           │                          │               ├─ssl_client(1527555)
           │                          │               ├─ssl_client(1527620)
           │                          │               ├─ssl_client(1527734)
           │                          │               ├─ssl_client(1527802)
           │                          │               ├─ssl_client([...])
           │                          │               ├─ssl_client(2700900)
           │                          │               ├─{ntfy}(1526760)
           │                          │               ├─{ntfy}(1526761)
           │                          │               ├─{ntfy}(1526762)
           │                          │               ├─{ntfy}(1526763)
           │                          │               ├─{ntfy}(1526764)
           │                          │               ├─{ntfy}(1526776)
           │                          │               ├─{ntfy}(1527782)
           │                          │               └─{ntfy}(1559488)
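
For anyone wanting to check for the same buildup, a rough sketch for counting the ssl_client processes attached to the ntfy container (assumes the container is named ntfy, as in the listing above; adjust to your setup):

```
# Count ssl_client processes inside the ntfy container (run on the Docker host).
docker top ntfy | grep -c ssl_client

# Alternatively, count ssl_client children of the ntfy server process directly.
pgrep -c -P "$(pgrep -o -x ntfy)" ssl_client
```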

🔮 Additional context

Not really sure what I'm expecting with this post, perhaps that someone else has run into a similar thing or perhaps just to log my issue if it happens time and time again.

But basically, I have a few ntfy servers running here and there; this has happened on one of them, and I don't think there are any config differences between them.

After the initial headache getting them to run properly behind Cloudflare's proxied DNS and Traefik, this was running fine. I then initiated an upgrade a few weeks ago (i.e. 'docker compose pull && docker compose down ntfy && docker compose up ntfy -d') and made sure the service spun up again properly.

Have since left it alone as it's seemingly been working just fine - as in, my Android client app is still receiving notifications and has been doing so throughout. I only noticed this today when I was doing the monthly system updates. When I did notice the million or so defunct ssl sessions I immediately tried the web UI, only to be greeted by a completely blank page, which makes no sense as the Android client uses the web to connect, no? But either way, the web UI is no longer showing me data, whereas the Android app received a notification as recently as this morning.

I have since rebooted the entire server as there were kernel upgrades and the likes, but...

BreizhHardware closed this issue and added the 🪲 bug label on 2026-05-07 00:26:28 +02:00.

@emigrating commented on GitHub (Nov 21, 2023):

Just logged for future.


@DatDucati commented on GitHub (Jun 8, 2025):

This issue is popping up on my machine.
2025-06-06: 225 threads
2025-06-07: 1655 threads

After 1.5 days it runs into the WARN territory of my CheckMK monitoring... which is very annoying.

docker-compose:

services:
  ntfy:
    image: binwiederhier/ntfy
    command:
      - serve
    environment:
      TZ: Europe/Berlin
      NTFY_BEHIND_PROXY: true
      NTFY_BASE_URL: https://ntfy.domain.com
      NTFY_CACHE_FILE: /var/lib/ntfy/cache.db
      NTFY_AUTH_FILE: /var/lib/ntfy/auth.db
      NTFY_AUTH_DEFAULT_ACCESS: deny-all
      NTFY_ATTACHMENT_CACHE_DIR: /var/lib/ntfy/attachments
      NTFY_ENABLE_LOGIN: true
      UPSTREAM_BASE_URL: "https://ntfy.domain.com"
      WEB_PUSH_PUBLIC_KEY: <snip>
      WEB_PUSH_PRIVATE_KEY: <snip>
      WEB_PUSH_FILE: /var/lib/ntfy/webpush.db # or similar
      WEB_PUSH_EMAIL_ADDRESS: webmaster@ntfy.domain.com
    user: 1000:1000 # optional: replace with your own user/group or uid/gid
    volumes:
      - ./data/lib/:/var/lib/ntfy/
    networks:
      proxy:
        ipv4_address: 172.20.0.20
    hostname: ntfy
    healthcheck: # optional: remember to adapt the host:port to your environment
        test: ["CMD-SHELL", "wget -q --tries=1 https://ntfy.domain.com/v1/health -O - | grep -Eo '\"healthy\"\\s*:\\s*true' || exit 1"]
        interval: 60s
        timeout: 10s
        retries: 3
        start_period: 40s
    restart: unless-stopped
networks:
  proxy:
    external: true
    name: proxy_net

Nginx config:

server {
  listen 80;
  server_name ntfy.domain.com;
  location / {
    # Redirect HTTP to HTTPS, but only for GET topic addresses, since we want
    # it to work with curl without the annoying https:// prefix
    set $redirect_https "";
    if ($request_method = GET) {
      set $redirect_https "yes";
    }
    if ($request_uri ~* "^/([-_a-z0-9]{0,64}$|docs/|static/)") {
      set $redirect_https "${redirect_https}yes";
    }
    if ($redirect_https = "yesyes") {
      return 302 https://$http_host$request_uri$is_args$query_string;
    }

    proxy_pass http://172.20.0.20;
    proxy_http_version 1.1;

    proxy_buffering off;
    proxy_request_buffering off;
    proxy_redirect off;

    proxy_set_header Host $http_host;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    proxy_connect_timeout 3m;
    proxy_send_timeout 3m;
    proxy_read_timeout 3m;

    client_max_body_size 0; # Stream request body to backend
  }
}

server {
  listen 443 ssl;
  server_name ntfy.domain.com;
  http2 on;
  include /etc/letsencrypt/options-ssl-nginx.conf;

  ssl_certificate /etc/letsencrypt/live/ntfy.domain.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/ntfy.domain.com/privkey.pem;

  location / {
    proxy_pass http://172.20.0.20;
    proxy_http_version 1.1;

    proxy_buffering off;
    proxy_request_buffering off;
    proxy_redirect off;

    proxy_set_header Host $http_host;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    proxy_connect_timeout 3m;
    proxy_send_timeout 3m;
    proxy_read_timeout 3m;

    client_max_body_size 0; # Stream request body to backend
  }
}

Have you noticed that issue?
I could implement a restart cron job, but that is also not the best solution.
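
For reference, a rough sketch of what such a cron-based workaround could look like (the /opt/ntfy path and the schedule are placeholders, not taken from this thread):

```
# Hypothetical crontab entry: restart the ntfy container nightly to clear
# accumulated ssl_client processes. Adjust the path to your compose project.
0 4 * * * cd /opt/ntfy && docker compose restart ntfy
```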


@wunter8 commented on GitHub (Jun 8, 2025):

I'm pretty sure this is the result of the healthcheck command, but I'm not sure why it's staying around and not cleaning itself up. You could maybe try adding `&& exit` at the end of the command, so that if the grep fails it will `exit 1` and if the grep succeeds, it will `exit`.


@DatDucati commented on GitHub (Jun 9, 2025):

I changed the health check to the following:

healthcheck: # optional: remember to adapt the host:port to your environment
        test: ["CMD-SHELL", "wget -q --tries=1 https://ntfy.domain.com/v1/health -O - | grep -Eo '\"healthy\"\\s*:\\s*true' && exit || exit 1"]
        interval: 60s
        timeout: 10s
        retries: 3
        start_period: 40s
(Screenshot: https://github.com/user-attachments/assets/f7c8b63c-e48d-47ac-9134-c56f632e39c9)

On Sunday evening I updated the docker container stack to use that health check. No luck there, as you can see.

@wunter8 commented on GitHub (Jun 9, 2025):

At least in OP's case, the leftover process was from `ssl_client`, which would seem to be a result of checking the ntfy health status using https.

I'm pretty sure the healthcheck runs inside the container (and not on the docker host), right? If so, you should be able to just change the URL to `http://localhost/v1/health` to avoid using `ssl_client`. Maybe that will make a difference 🤷‍♂️
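
That change would look roughly like this in the healthcheck test command (a sketch; assumes ntfy is listening on port 80 inside the container, which is the image default):

```
# Healthcheck against the container-local listener instead of the public HTTPS
# URL, so the busybox wget no longer needs ssl_client for TLS.
wget -q --tries=1 http://localhost/v1/health -O - | grep -Eo '"healthy"\s*:\s*true' || exit 1
```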


@DatDucati commented on GitHub (Jun 10, 2025):

Okay, the change to localhost in the healthcheck does appear to help. Threads are stable at ~230 for 5 hours now. I'll be monitoring (ha) the behavior in the long run. Thanks for your help! The example docker-compose file should then be changed, as I got the health check from there :)
