mirror of
https://github.com/ovh/the-bastion.git
synced 2026-05-09 08:25:27 +02:00
[GH-ISSUE #194] Question on master - slave in DR scenario #48
Originally created by @ghost on GitHub (Jun 7, 2021).
Original GitHub issue: https://github.com/ovh/the-bastion/issues/194
Imagine that you have your master bastion in region1 and your slave bastion in region2. Could I make the slave a master if region1 goes offline for a longer period of time? Is there a way to roll back once region1 comes online again?

I would like to avoid hosting multiple masters, as that adds an administrative burden for managing users and keys.
@speed47 commented on GitHub (Jun 7, 2021):
Hello,
Good point, this needs to be documented. I'll summarize the tl;dr here and use it as a basis to write the documentation.
Imagine the A bastion instance is a master, and the B and C instances are configured as slaves, synchronized to the A instance.
If B or C goes down for a short period of time (hours, days), there is nothing to do. If they go down and you don't plan to bring them back up for any reason, you'll just have to remove their declaration from the master's configuration (the `remotehostlist` variable in `/etc/bastion/osh-sync-watcher.sh` on the A instance), so that A stops trying to sync to the missing slaves.

If the A instance goes down for a short period of time, and you can accept that your cluster denies all modifications (account creation/deletion, group membership changes, etc.) during that time, there is nothing to do: the B and C instances will keep working (accepting connections to remote machines, etc.).
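As a sketch of the slave-removal step above: the snippet below edits a scratch copy of `/etc/bastion/osh-sync-watcher.sh`. The file layout, variable values, and IPs are illustrative assumptions; only the `remotehostlist` name comes from the text.

```shell
# Hypothetical excerpt of /etc/bastion/osh-sync-watcher.sh on A (the master);
# written to /tmp here so the example is self-contained.
cat > /tmp/osh-sync-watcher.sh <<'EOF'
enabled=1
remotehostlist="192.0.2.11 192.0.2.12"   # B and C (illustrative IPs)
EOF

# Dropping C (192.0.2.12) from the list so A stops trying to sync to it:
sed -i 's/ 192\.0\.2\.12//' /tmp/osh-sync-watcher.sh
grep remotehostlist /tmp/osh-sync-watcher.sh
```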
If the A instance goes down for a longer period and you want to promote B or C, here's what you need to do:

- Remove A's key from the `authorized_keys` of the `bastionsync` user on the B and C instances. This way, the A instance won't be able to push data even if it wakes up from the dead and tries to. You can do this in other ways, such as removing A's IPs from the B and C instances' firewalls, for example. What's important is that in the end, A can no longer connect to B or C.
- On the instance you want to promote, say B, set the `readOnlySlaveMode` option in `/etc/bastion/bastion.conf` to `0` instead of `1`. When this is done (you don't need to restart anything), this instance will start to accept modifying commands (account creation/deletion, etc.).
- Make C a slave of B: add the `/root/.ssh/id_master2slave.pub` key of B to the `authorized_keys` of the `bastionsync` user on C, then enable the `osh-sync-watcher` daemon on B (using systemd or sysVinit), with `enabled=1` set in the `/etc/bastion/osh-sync-watcher.sh` of B.

You should then observe in the `/var/log/bastion/bastion-scripts.log` file of B (if you're using our provided syslog template) that `osh-sync-watcher` is now syncing to C successfully. You now have B as a master and C as a slave. Your infra is stable again and service is fully up.
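The promotion steps above can be sketched as follows. The snippet only exercises the `readOnlySlaveMode` edit, against a scratch copy of the config; the JSON-ish line layout is an assumption, and the commented commands at the end (remote paths, unit name) are illustrative, not verified against a live install.

```shell
# Promote B: flip readOnlySlaveMode from 1 to 0 (no restart needed).
# On a real instance you would edit /etc/bastion/bastion.conf in place.
conf=/tmp/bastion.conf.example
echo '"readOnlySlaveMode": 1,' > "$conf"

sed -i 's/"readOnlySlaveMode": *1/"readOnlySlaveMode": 0/' "$conf"
grep readOnlySlaveMode "$conf"

# Remaining steps on a live system (not runnable here; paths are assumptions):
#   ssh root@C 'cat >> ~bastionsync/.ssh/authorized_keys' < /root/.ssh/id_master2slave.pub
#   systemctl enable --now osh-sync-watcher   # assumes a systemd unit of this name
#   # and set enabled=1 in /etc/bastion/osh-sync-watcher.sh on B
```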
If/when A comes back up: on A, set `readOnlySlaveMode` back to `1` and disable the sync daemon on it, then push B's key to A's `bastionsync` user, add A's IP to B's `remotehostlist`, and reload the sync daemon on B; B will then sync its data to A.

There are a few configuration choices you can make to shorten these steps even more, such as having the sync configuration properly set on all nodes but the daemon enabled on only one, and the `bastionsync` keys shared between the nodes with just a `from="IP.OF.INSTANCE.A"` in front of the declared key everywhere, so that this is the only thing to change when promoting another node. Or you can trade a bit of security to remove yet more steps: allow any node to connect to any other node from the beginning, so that you mainly have to enable the sync daemon on the new master (and STONITH the old one). It's a tradeoff depending on what you can accept in your environment. I'll document that too.
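A sketch of the shared-key setup described above, using OpenSSH's `from=` key option. The key material and the `IP.OF.INSTANCE.*` placeholders are illustrative, and the file is a scratch copy rather than the real `authorized_keys` of the `bastionsync` user.

```shell
# One shared bastionsync key on every node, locked to the current master's IP:
ak=/tmp/authorized_keys.example
echo 'from="IP.OF.INSTANCE.A" ssh-ed25519 AAAAC3...example bastionsync' > "$ak"

# Promoting B: on every node, re-point the restriction at B's IP; the key
# itself, and everything else, stays the same.
sed -i 's/from="IP\.OF\.INSTANCE\.A"/from="IP.OF.INSTANCE.B"/' "$ak"
cat "$ak"
```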