This repository has been archived by the owner on Feb 18, 2025. It is now read-only.

No FailureDetection read_only=on on all servers in a topology #865

Closed
dontstopbelieveing opened this issue Apr 17, 2019 · 10 comments
Labels
contribution-friendly good for new contributors

Comments

@dontstopbelieveing

dontstopbelieveing commented Apr 17, 2019

I have a case in which every instance in a topology has read_only=ON, which means that there is no writable master available. However, this is not detected as a failure: the OnFailureDetectionProcesses hooks do not kick off, and neither does auto-recovery.

Is there something I am missing in the configuration that would enable this detection and recovery?

~  ./orchestrator -c topology -i 10.176.3.10
10.176.3.10:3306      [0s,ok,5.7.23-23-log,ro,ROW,>>,GTID]
+ 10.176.12.136:3306  [0s,ok,5.7.23-23-log,ro,ROW,>>,GTID]
  + 10.176.14.19:3306 [0s,ok,5.7.23-23-log,ro,ROW,>>,GTID]
+ 10.176.13.44:3306   [0s,ok,5.7.23-23-log,ro,ROW,>>,GTID]
@shlomi-noach
Collaborator

This scenario was not designed for failover. What would be the solution? Perhaps just turn off read_only on the master of the topology?

I'm happy to make this a structure warning analysis, but I'm not sure I want to take it to failure detection + recovery.

Do you know at this time how you ended up with a read-only master? Does your monitoring cover such a scenario?

@dontstopbelieveing
Author

We have machines that restart quite frequently for various reasons that can't be avoided. The my.cnf file sets read_only to ON to avoid scenarios that would lead to writes going to two different servers. In these cases, a DBA currently sets read_only to OFF manually on the servers that are masters.

Currently we don't have monitoring that is "topology-aware". Since Orchestrator is topology-aware, I was wondering if we could use it to notify us when such a restart happens. The remediation would be to set read_only to OFF, but I am not depending on Orchestrator to take this action. It would still be a manual step, since deciding which node should have read_only turned off can be complicated. I am just looking at whether I can configure Orchestrator to detect such a scenario and notify us.

@shlomi-noach
Collaborator

Here's one way of doing it:
orchestrator-client -c api -path all-instances | jq '.[] | select(.MasterKey.Hostname == "")' | jq -r '. | select (.ReadOnly == true) | .Key.Hostname'

See some more ideas at https://github.com/github/orchestrator/blob/master/docs/script-samples.md
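
For anyone who prefers a script to a jq pipeline, the same filter can be sketched in Python. This is only an illustration: it assumes the JSON array returned by `orchestrator-client -c api -path all-instances` has already been parsed, and it relies only on the fields used in the one-liner above (`Key.Hostname`, `MasterKey.Hostname`, `ReadOnly`). The function name and the sample records are hypothetical.

```python
def read_only_masters(instances):
    """Return hostnames of instances that have no master of their own
    (empty MasterKey.Hostname) but report ReadOnly == true."""
    hosts = []
    for inst in instances:
        # Treat a missing or null MasterKey as an empty dict; like the jq
        # filter, only an explicitly empty hostname counts as "no master".
        master_key = inst.get("MasterKey") or {}
        if master_key.get("Hostname") == "" and inst.get("ReadOnly") is True:
            hosts.append(inst["Key"]["Hostname"])
    return hosts


# Hand-written sample records (not real API output), shaped after the
# fields referenced in the jq filter.
sample = [
    {"Key": {"Hostname": "10.176.3.10"}, "MasterKey": {"Hostname": ""}, "ReadOnly": True},
    {"Key": {"Hostname": "10.176.12.136"}, "MasterKey": {"Hostname": "10.176.3.10"}, "ReadOnly": True},
]
print(read_only_masters(sample))  # ['10.176.3.10']
```

A cron job or monitoring hook could parse the real API output with `json.load`, pass it to this function, and raise an alert whenever the returned list is non-empty.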

@dontstopbelieveing
Author

Yes, those might be helpful; I am trying to figure out a way of doing this as a script. However, it would help if this could be a structure warning analysis, perhaps as part of the replication-analysis command? A warning would definitely make my use case easier :)

@shlomi-noach
Collaborator

yes, a structure warning would be part of replication-analysis.

@shlomi-noach
Collaborator

@jfudally is this something you might be interested in?

@jfudally
Contributor

@dontstopbelieveing PR #878 was just merged into master. This PR introduces the NoWriteableMasterStructureWarning structure warning when the master node is read-only. Please take a look and let me know if this addresses your use-case.
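
As a rough sketch of how the new warning could feed a notification, the snippet below scans already-parsed replication-analysis entries for `NoWriteableMasterStructureWarning`. The warning name comes from the PR above; the JSON field names `StructureAnalysis` and `AnalyzedInstanceKey` are assumptions about the shape of the analysis output and should be checked against what your Orchestrator version actually returns. The function name and sample records are hypothetical.

```python
WARNING = "NoWriteableMasterStructureWarning"


def instances_with_warning(analysis_entries, warning=WARNING):
    """Return (hostname, port) pairs for entries carrying the given
    structure warning. Field names are assumed, not confirmed."""
    flagged = []
    for entry in analysis_entries:
        # Assumed: a list of structure warning names per analysis entry.
        warnings = entry.get("StructureAnalysis") or []
        if warning in warnings:
            # Assumed: the analyzed instance is identified by this key.
            key = entry.get("AnalyzedInstanceKey") or {}
            flagged.append((key.get("Hostname"), key.get("Port")))
    return flagged


# Hand-written sample entries (not real Orchestrator output).
entries = [
    {"AnalyzedInstanceKey": {"Hostname": "10.176.3.10", "Port": 3306},
     "StructureAnalysis": ["NoWriteableMasterStructureWarning"]},
    {"AnalyzedInstanceKey": {"Hostname": "10.176.12.136", "Port": 3306},
     "StructureAnalysis": []},
]
print(instances_with_warning(entries))  # [('10.176.3.10', 3306)]
```

Since this is a structure warning and not a failure detection, nothing here triggers recovery; the result is only meant to be passed to whatever alerting you already have.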

@liortamari

@shlomi-noach sorry for bumping this. I happened to notice that Vitess handles the read-only master case:
https://github.com/vitessio/vitess/blob/25762d88e50d77352dfd5cc0bb41902f7215a3d9/go/vt/orchestrator/logic/topology_recovery.go#L1563
Do you think that Vitess code could be useful here as well? I would be happy to try writing a PR if you see no reason not to.

@liortamari

@dontstopbelieveing may I ask how you eventually solved this scenario?

@shlomi-noach
Collaborator

@liortamari good timing. Please look at #1332, which is not merged yet. It solves the issue.

4 participants