[GH-ISSUE #1254] Option to prevent SEO indexing with `robots.txt` #886

Closed
opened 2026-05-07 00:28:28 +02:00 by BreizhHardware · 3 comments

Originally created by @litetex on GitHub (Jan 8, 2025).
Original GitHub issue: https://github.com/binwiederhier/ntfy/issues/1254

💡 Idea

It would be nice to have an option that prevents SEO indexing by serving a [`robots.txt`](https://en.wikipedia.org/wiki/Robots.txt):

```
User-agent: *
Disallow: /
```

💻 Target components
ntfy server

BreizhHardware 2026-05-07 00:28:28 +02:00

@pixitha commented on GitHub (Jun 4, 2025):

Technically the index page for ntfy already has the meta tags in place to stop indexing:

```html
<!-- Never index -->
<meta name="robots" content="noindex, nofollow" />
```

Robots.txt and noindex are a bit of a catch-22:

> A page that's disallowed in robots.txt can still be indexed if linked to from other sites.
>
> While Google won't crawl or index the content blocked by a robots.txt file, we might still find and index a disallowed URL if it is linked from other places on the web. As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the page can still appear in Google search results. To properly prevent your URL from appearing in Google search results, password-protect the files on your server, use the noindex meta tag or response header, or remove the page entirely. [googleref](https://developers.google.com/search/docs/crawling-indexing/robots/intro)

This is why we want to allow Googlebot to read the page, rather than block it with a robots.txt disallow:

> [!IMPORTANT]
> For the noindex rule to be effective, the page or resource must not be blocked by a robots.txt file, and it has to be otherwise accessible to the crawler. If the page is blocked by a robots.txt file or the crawler can't access the page, the crawler will never see the noindex rule, and the page can still appear in search results, for example if other pages link to it. [googleref](https://developers.google.com/search/docs/crawling-indexing/block-indexing)


@binwiederhier commented on GitHub (Jan 18, 2026):

I don't think I'll add this. I don't think bots really care about the robots.txt anymore in 2026...


@litetex commented on GitHub (Jan 18, 2026):

> I don't think bots really care about the robots.txt

Search engine crawlers usually do - at least the ones that I observe on my infrastructure.

> I don't think I'll add this

Well, I guess users will have to manually configure their reverse proxies then...
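
For anyone taking the reverse-proxy route, a minimal workaround might look like this nginx snippet (a hypothetical config fragment; adjust to your own setup):

```nginx
# Serve a "disallow all" robots.txt in front of ntfy,
# since the server itself does not provide one.
location = /robots.txt {
    default_type text/plain;
    return 200 "User-agent: *\nDisallow: /\n";
}
```

Note the tradeoff quoted above: a blanket robots.txt disallow also prevents crawlers from ever seeing ntfy's `noindex` meta tag.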
