One of the key aspects of a highly available system is that routine maintenance can be carried out without any service interruption to users. MySQL Cluster achieves this through its shared-nothing architecture, and in this recipe we will show how to restart each of the three types of nodes online (without taking the cluster down as a whole).
The setup
For this post, we will be using the following cluster setup:
- Four storage nodes
- One management node
- Two SQL nodes
The output of the SHOW command on the management client for this cluster is as follows:
ndb_mgm> SHOW
Cluster Configuration
---------------------
[ndbd(NDB)] 4 node(s)
id=3 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=4 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 0)
id=5 @10.0.0.3 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1, Master)
id=6 @10.0.0.4 (mysql-5.1.34 ndb-7.0.6, Nodegroup: 1)
[ndb_mgmd(MGM)] 1 node(s)
id=1 @10.0.0.5 (mysql-5.1.34 ndb-7.0.6)
[mysqld(API)] 4 node(s)
id=11 @10.0.0.1 (mysql-5.1.34 ndb-7.0.6)
id=12 @10.0.0.2 (mysql-5.1.34 ndb-7.0.6)
id=13 (not connected, accepting connect from any host)
id=14 (not connected, accepting connect from any host)
Restarting a storage node
There are two ways to restart a storage node. For both methods, the first step is to check the output of the SHOW command in the management client to ensure that there is at least one other online node (not starting or shut down) in the same nodegroup. In our example cluster, storage nodes 3 and 4 are in nodegroup 0 and storage nodes 5 and 6 are in nodegroup 1.
The two options for restarting a node are as follows. Firstly, a node can be restarted from the management client with the [nodeid] RESTART command:
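The nodegroup check can also be scripted. The following is a minimal sketch rather than part of the original recipe: the function name peer_online is hypothetical, and it assumes that a storage node which is still starting is marked with the word "starting" on its line of SHOW output and that a disconnected node is marked "not connected".

```shell
#!/bin/sh
# peer_online NODE_ID
# Reads the text of the management client SHOW output on stdin.
# Exit status: 0 = another started storage node shares the nodegroup of NODE_ID,
#              1 = no such peer was found,
#              2 = NODE_ID is not listed as a started storage node.
# Assumption: nodes that are starting or disconnected carry the text
# "starting" or "not connected" on their line.
peer_online() {
    awk -v target="$1" '
        /^\[ndbd/ { in_data = 1; next }   # start of the storage node section
        /^\[/     { in_data = 0 }         # any other section header ends it
        in_data && /^[ \t]*id=/ && /Nodegroup: [0-9]+/ && !/starting|not connected/ {
            id = $1; sub(/^id=/, "", id)
            match($0, /Nodegroup: [0-9]+/)
            # "Nodegroup: " is 11 characters long; keep only the number after it
            group[id] = substr($0, RSTART + 11, RLENGTH - 11)
        }
        END {
            if (!(target in group)) exit 2
            for (id in group)
                if (id != target && group[id] == group[target]) exit 0
            exit 1
        }'
}
```

It could be combined with the management client's -e option, which runs a single command and exits, for example: ndb_mgm -e SHOW | peer_online 3 && ndb_mgm -e "3 RESTART".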
ndb_mgm> 3 status
Node 3: started (mysql-5.1.34 ndb-7.0.6)
ndb_mgm> 3 RESTART
Node 3: Node shutdown initiated
Node 3: Node shutdown completed, restarting, no start.
Node 3 is being restarted
ndb_mgm> 3 status
Node 3: starting (Last completed phase 4) (mysql-5.1.34 ndb-7.0.6)
Node 3: Started (version 7.0.6)
ndb_mgm> 3 status
Node 3: started (mysql-5.1.34 ndb-7.0.6)
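Rather than typing the status command repeatedly, the wait for the node to come back can be automated. This is only a sketch under stated assumptions: the helper names query_status, node_state and wait_until_started are hypothetical, the POLL_INTERVAL variable is an invention of this example, and the parsing assumes status lines of the form shown in the transcript ("Node 3: started ...").

```shell
#!/bin/sh
# Sketch: wait for a storage node to report "started" again after a restart.
# query_status, node_state and wait_until_started are hypothetical helper names.

# Ask the management server for the status of one node
# (ndb_mgm -e runs a single command and exits).
query_status() {
    ndb_mgm -e "$1 STATUS"
}

# Print the state word from a line such as
# "Node 3: started (mysql-5.1.34 ndb-7.0.6)"; other lines are ignored.
node_state() {
    sed -n 's/^Node [0-9][0-9]*: \([A-Za-z]*\).*/\1/p' | head -n 1
}

# wait_until_started NODE_ID [MAX_POLLS]
# Polls every POLL_INTERVAL seconds (default 5). Returns 0 once the node
# reports "started", or 1 if it has not done so after MAX_POLLS polls (default 60).
wait_until_started() {
    polls_left=${2:-60}
    while [ "$polls_left" -gt 0 ]; do
        state=$(query_status "$1" | node_state)
        [ "$state" = "started" ] && return 0
        polls_left=$((polls_left - 1))
        sleep "${POLL_INTERVAL:-5}"
    done
    return 1
}
```

A restart could then be followed by wait_until_started 3 before moving on to the next node, so that two nodes in the same nodegroup are never down together.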
Secondly, on the storage node itself the ndbd process can simply be killed and restarted. Remember that ndbd runs as two processes: an angel process in addition to the main process. You must kill both of these processes at the same time.
[root@node4 ~]# ps aux | grep ndbd
root 4082 0.0 0.4 33480 2316 ? Ss Jul08 0:00 ndbd --initial
root 4134 0.1 17.4 426448 91416 ? Sl Jul08 0:02 ndbd --initial
root 4460 0.0 0.1 61152 720 pts/0 R+ 00:11 0:00 grep ndbd
[root@node4 ~]# kill 4082 4134
[root@node4 ~]# ps aux | grep ndbd | grep -v grep | wc -l
0
Once we have killed both ndbd processes and confirmed that no process named ndbd is still running, we can start ndbd again:
[root@node4 ~]# ndbd
2009-07-09 00:12:03 [ndbd] INFO -- Configuration fetched from '10.0.0.5:1186', generation: 1
If you leave a management client connected during this process, you will see the management node notice the dead node and then allow it to rejoin the cluster:
ndb_mgm> Node 6: Node shutdown completed. Initiated by signal 15.
ndb_mgm> Node 6: Started (version 7.0.6)
ndb_mgm> 6 status
Node 6: started (mysql-5.1.34 ndb-7.0.6)
Restarting a management node
Restarting a management node is best done by simply killing the ndb_mgmd process and restarting it. When there is no management node in the cluster, there is no central logging for the cluster, and the storage and the API nodes cannot start or restart (so if they fail they will stay dead). In addition, processes that are initiated from the management client (such as hot backups) cannot be run.
Firstly, we will pass the process ID of the ndb_mgmd process to the kill command:
[root@node5 mysql-cluster]# kill $(pidof ndb_mgmd)
This will kill the management node, so now start it again:
[root@node5 mysql-cluster]# ndb_mgmd --config-file=config.ini
2009-07-09 00:30:00 [MgmSrvr] INFO -- NDB Cluster Management Server. mysql-5.1.34 ndb-7.0.6
2009-07-09 00:30:00 [MgmSrvr] INFO -- Loaded config from '//mysql-cluster/ndb_1_config.bin.1'
Finally, verify that the management node is working:
[root@node5 mysql-cluster]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> 1 status
Node 1: connected (Version 7.0.6)
Restarting a SQL node
Restarting a SQL node is trivial: just restart the mysqld process as normal, then check the output of the SHOW command in the management client to confirm that the node has reconnected to the cluster.
[root@node1 ~]# service mysql restart
Shutting down MySQL.. [ OK ]
Starting MySQL.. [ OK ]
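To confirm that the SQL node has rejoined the cluster, the [mysqld(API)] section of the SHOW output can be inspected. The sketch below is illustrative only: connected_api_nodes is a hypothetical helper name, and it assumes that a connected API node is listed with an @host address while an unused slot is listed as "not connected", as in the SHOW output earlier in this post.

```shell
#!/bin/sh
# connected_api_nodes
# Reads the text of the management client SHOW output on stdin and prints
# the node ID of every connected SQL/API node, one per line.
# Assumption: connected nodes show "@<address>", unused slots show "not connected".
connected_api_nodes() {
    awk '
        /^\[mysqld/ { in_api = 1; next }   # start of the SQL/API node section
        /^\[/       { in_api = 0 }         # any other section header ends it
        in_api && /^[ \t]*id=/ && /@/ && !/not connected/ {
            id = $1; sub(/^id=/, "", id)
            print id
        }'
}
```

For the example cluster, running ndb_mgm -e SHOW | connected_api_nodes after the restart should list node 11 again once mysqld on node1 has reconnected.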