mirror of
https://github.com/ovh/the-bastion.git
synced 2026-05-09 16:35:33 +02:00
[GH-ISSUE #584] Prevent The Bastion from running if /home isn't mounted yet #153
Originally created by @jon4hz on GitHub (Sep 11, 2025).
Original GitHub issue: https://github.com/ovh/the-bastion/issues/584
Hi,
To ensure that The Bastion always has consistent data, I would like to prevent connections to it if the `/home` partition isn't mounted yet. For example, if I reboot one of my Bastion slave nodes, I only want to allow connections to it once the `/home` partition has been (manually) decrypted and mounted. Is there a way to achieve this?
@speed47 commented on GitHub (Sep 12, 2025):
Hello,
Interestingly, this is a discussion we've had a few times internally (including this very week!), because when a node reboots, the "Permission denied (publickey)" message is a bit confusing to users.
The thing is that, as authentication is handed off entirely to OpenSSH, we have limited options for telling users that the node is currently in maintenance, and there is no way for users to tell the difference between "you couldn't authenticate because your key is invalid" and "you couldn't authenticate because this node is currently sealed".
There are a few options, though:
A) Have two distinct OpenSSH server daemon instances: one for bastion users (port 22, using the default `sshd_config` shipped by the bastion), and one for bastion admins (port 222, with a distinct `sshd_config` only allowing `root` to connect). When a node reboots, the port 22 service would stay down (not auto-started at boot) until an admin connects to port 222, unlocks `/home`, and starts the port 22 SSH daemon. This way, port 22 would be closed until the service is actually up and running.
B) Modify `/etc/ssh/banner` so that it clearly states that this node is currently sealed and that another node of the cluster should be used. This could be done by having an `/etc/ssh/banner.sealed` file stating this, and an `/etc/ssh/banner.ok` file containing the usual default banner. Then a systemd unit running once at boot could create a symlink `/etc/ssh/banner => banner.sealed`, so that users can tell that this node is sealed, as the banner is always shown before the authentication phase starts. Then, when an admin unseals the node, they can move the symlink to `/etc/ssh/banner => banner.ok`.
@jon4hz commented on GitHub (Sep 13, 2025):
Thanks a lot for your answer.
I think both of those approaches have their pros and cons.
A) would be a bit more complex, and you'd have to maintain a second `sshd_config`, but it would also bring a great benefit in HA environments: if you have a load balancer or geo DNS, you could set up probes to only send traffic to The Bastion if the node is actually available.
B) wouldn't prevent The Bastion from actually receiving traffic, but it would be much simpler to implement.
What do you think about implementing approach B for now? I think the systemd unit that creates the symlink at boot could be created by the `setup-encryption.sh` script, and the `unlock-home.sh` script could then adjust the banner once the node is unsealed.
I'd be happy to submit a PR for that.
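The seal/unseal flow proposed above could be sketched roughly as follows. This is a minimal illustration, not The Bastion's actual implementation: the function names are hypothetical, and only the `banner.sealed`/`banner.ok` file names and the `/home` mountpoint check follow the discussion.

```shell
#!/bin/sh
# Hypothetical sketch of the banner-switching logic: "seal" would run at
# boot (e.g. from a oneshot systemd unit), "unseal" after /home has been
# decrypted and mounted (e.g. called from unlock-home.sh).

# Banner path; overridable for testing. Real path would be /etc/ssh/banner.
BANNER="${BANNER:-/etc/ssh/banner}"

seal_banner() {
    # Point the banner at the "sealed" message so users can tell why
    # authentication fails on this node.
    ln -sf banner.sealed "$BANNER"
}

unseal_banner() {
    # Refuse to unseal while /home still isn't a mountpoint, then
    # restore the normal banner.
    mountpoint -q /home || {
        echo "refusing to unseal: /home is not mounted" >&2
        return 1
    }
    ln -sf banner.ok "$BANNER"
}
```

Because `sshd` sends the `Banner` file before authentication starts, flipping this one symlink is enough to change what every connecting user sees, with no sshd restart required.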
@speed47 commented on GitHub (Sep 13, 2025):
Yes, exactly. Geo DNS would work out of the box, or even a simple DNS round-robin where faulty nodes are automatically removed and the TTL is kept to a low value. But using a load balancer would require OpenSSH to support the PROXY protocol, so that we get the real client IP, which is paramount for traceability and for ensuring the `from=""` part of the pubkeys is applied. Unfortunately OpenSSH doesn't support it, and I don't really expect it to, even in the future. There is a workaround though: a setup using something similar to https://github.com/cloudflare/mmproxy should work. This is actually the solution we will probably use internally!
The systemd unit could pre-exist, distributed in `etc/systemd` like the other shipped unit files, but be disabled by default. Just dropping a new service file in this folder will do exactly that, as per: github.com/ovh/the-bastion@c8b86b718a/bin/admin/install (L1118-L1127)
The `setup-encryption.sh` script could then automatically enable it, however!
It would be great to also craft a good old `init.d` file, as you'll find in `etc/init.d`, for distros not using systemd... (do those still exist?). github.com/ovh/the-bastion@c8b86b718a/bin/admin/install (L1075-L1079)
The install script automatically detects whether systemd is present or not, and installs the proper files.
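Such a unit, shipped disabled in `etc/systemd` and enabled later by `setup-encryption.sh`, could look roughly like this. The unit name and the exact `ExecStart` line are assumptions for illustration, not files that exist in The Bastion:

```ini
# etc/systemd/bastion-sealed-banner.service (hypothetical name)
[Unit]
Description=Mark this bastion node as sealed until /home is unlocked
# Flip the banner before the SSH daemon starts serving users
Before=ssh.service sshd.service

[Service]
Type=oneshot
# Point /etc/ssh/banner at the "sealed" message; unlock-home.sh would
# later switch the symlink back to banner.ok once /home is mounted.
ExecStart=/bin/ln -sf banner.sealed /etc/ssh/banner

[Install]
WantedBy=multi-user.target
```

Shipping it disabled means a plain (unencrypted) install is unaffected; only nodes configured through `setup-encryption.sh` would ever enable it.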
That would be great!