
Galera setup troubleshooting

In a previous post on Galera, I mentioned it’s important to look at both the joiner and donor logs to get a full picture of the problems you may be running into. Sometimes even that is not enough and you’ll need to spend time narrowing down the issue. Keep in mind that Galera is relatively young, so its documentation, error messages and online help are not yet comparable to MySQL’s.

We had a situation where we were building a staging environment to duplicate our production Galera setup, yet we were running into problems not previously encountered. Two nodes (one an arbitrator) were online and we were attempting to join a third node and have xtrabackup transfer over the data. The node would initiate xtrabackup and appear to be in operation, with both nodes participating, but then quit after a few moments. Typically you might run into permissions issues, file locations and the like with xtrabackup, and the logs, either /var/log/mysqld.log on the donor or the accompanying /var/lib/mysql/innobackup.backup.log, will let you know what the problem is. In this case, however, we were getting no obvious error.

The joiner log simply quit with:

SST failed: 32 (Broken pipe)

while the donor had a little more information:

WSREP_SST: [ERROR] innobackupex finished with error: 2.  Check /var/lib/mysql//innobackup.backup.log (20130422 11:23:08.361)

[ERROR] WSREP: Failed to read from: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address 'xx.xx.xxx.xxx:4444/xtrabackup_sst' --auth 'xxxxx:xxxxxxxx' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid 'a77b0b24-ab77-11e2-0800-f92413e82717:0'

Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address 'xx.xx.xxx.xxx:4444/xtrabackup_sst' --auth 'xxxxx:xxxxxxxx' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid 'a77b0b24-ab77-11e2-0800-f92413e82717:0': 22 (Invalid argument)

In retrospect the donor messages did subtly point out the issue, but the reference to innobackup.backup.log distracted from the real problem. Looking at that log, again a broken pipe (error 32) was noted, which doesn’t really tell you anything.

And so commenced the testing of port, database and file permissions, switching which node was the donor vs. the joiner to see if the issue went both ways, comparisons to our production environment, and so on. MySQL seemed to be fine and xtrabackup seemed to work correctly.

The test that made it obvious was switching the SST method to mysqldump, which proceeded to fail in the same way, but with some much more obvious error messages:

ERROR 2003 (HY000): Can't connect to MySQL server on 'xx.xx.xxx.xxx' (4)
130422 11:24:11 [ERROR] WSREP: Process completed with error: wsrep_sst_mysqldump --user 'root' --password 'xxxxxxxx' --host 'xx.xx.xxx.xxx' --port '3306' --local-port '3306' --socket '/var/lib/mysql/mysql.sock' --gtid 'a77b0b24-ab77-11e2-0800-f92413e82717:0': 1 (Operation not permitted)
130422 11:24:11 [ERROR] WSREP: Try 1/3: 'wsrep_sst_mysqldump --user 'root' --password 'xxxxxxx' --host 'xx.xx.xxx.xxx' --port '3306' --local-port '3306' --socket '/var/lib/mysql/mysql.sock' --gtid 'a77b0b24-ab77-11e2-0800-f92413e82717:0'' failed: 1 (Operation not permitted)
hostname: Unknown host
hostname: Unknown host

The ‘Invalid argument’ in the first set of errors was referring to an unrecognized host. The staging servers were using internal IPs, which work fine as far as Galera is concerned, except that SST (xtrabackup/mysqldump) does not use the node IPs specified in ‘wsrep_cluster_address’ directly; a hostname lookup is done, and SST subsequently used the public IPs, which were not open.

The solution is simple: you can set the variable

wsrep_sst_receive_address=xx.xx.xxx.xxx

in your my.cnf to explicitly specify which IP SST should use.
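
As a minimal sketch (the IPs here are placeholders rather than our real addresses), the relevant portion of the joiner’s my.cnf would look something like this:

[mysqld]
# other nodes are still listed by their internal addresses
wsrep_cluster_address=gcomm://10.0.0.1
wsrep_sst_method=xtrabackup
# force SST to receive the snapshot on the internal IP rather than
# whatever the hostname lookup resolves to
wsrep_sst_receive_address=10.0.0.3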


Galera gotchas

We’ve recently implemented Galera clustering and have been pleased with the relatively easy install and implementation. A quick description of Galera: it joins individual MySQL databases as nodes to create a cluster featuring multi-threaded synchronous replication, which allows for true high availability while still using your original database and engine (InnoDB), along with the ability to quickly break the cluster back down into individual servers if need be. Our specific setup includes two database nodes and an arbitrator to avoid ‘split-brain’ in case of a node failure. The process of adding a node simply involves connecting to the group; the data transfer and sync is done automatically via SST (state snapshot transfer), with the overhead typical of the familiar backup methods: mysqldump, rsync and, more recently, xtrabackup. With xtrabackup you truly have a non-blocking method to add and re-add nodes. Recent improvements also include IST (incremental state transfer), which allows you to disconnect a node, do work, reconnect, and quickly catch up on the transactions that node missed.

As mentioned, the install and implementation have been quite smooth; however, here are a few things to keep in mind.

1. On installation, when running into errors, it’s important to analyze both the joiner AND donor error logs, as they will of course have differing messages. We ran into what ended up being an xtrabackup issue, which was misleading from the joiner logs, but clear as day in the donor logs.

2. On initial installation, you’ll be told to set wsrep_cluster_address=gcomm:// for your first node, as there is nothing to join to. However, DON’T keep this in your my.cnf: on restart you’ll end up creating a new cluster again, not joining the one you’ve made. Change it to specify the IP of one of the other nodes, as in the sketch below.
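
A minimal sketch of the difference (placeholder IP):

# first node only, and only for the very first start (bootstraps a new cluster)
wsrep_cluster_address=gcomm://
# afterwards, on every node, point at another member of the cluster
wsrep_cluster_address=gcomm://10.0.0.2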

3. Similar to standard replication, Galera will offset auto-increment values by the number of nodes. This is automatic; however, keep it in mind regarding large tables and datatype limits. You can check the values Galera has chosen as shown below.
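
Assuming wsrep_auto_increment_control is left at its default (ON), each node’s session variables show the offsets in effect:

SHOW VARIABLES LIKE 'auto_increment%';
-- on a three-node cluster you might see, for example:
--   auto_increment_increment = 3
--   auto_increment_offset    = 1   (2 and 3 on the other nodes)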

4. You may be surprised to learn that, without some care, some fundamental operations will lock the entire cluster for the duration of the operation. Here are two examples, by no means the only statements that can be long running and cause grief:

  • An ALTER TABLE locks the entire cluster, even on a table that is not in use
  • A LOAD DATA INFILE also locks the entire cluster, even on a table that is not in use

The first is partially due to how Galera handles DDL statements: total order isolation (TOI) is the default. More info can be found here:

http://www.codership.com/wiki/doku.php?id=rolling_schema_upgrade

The fact that it affects all tables and databases is a bug / feature; more details here:

https://bugs.launchpad.net/galera/+bug/928919

I assume the LOAD DATA INFILE lock is due to the synchronous requirement that the cluster wait for confirmation of the commit on the second node.

You have a few options to avoid a cluster-wide lock:

a) For the ALTER TABLE scenario, as detailed at the link above, you can use a DDL handling method known as rolling schema upgrade (RSU), which automatically detaches the node for the duration of the operation and then resynchronizes when finished (see the sketch after these options).

b) For both LOAD DATA INFILE and ALTER TABLE you can do a more manual version of this by simply disconnecting each node, performing the operation, and reconnecting.

c) A third option is to tell Galera to apply the statement locally only:

SET wsrep_on=0;
-- run your statement here (ALTER TABLE, LOAD DATA INFILE, etc.)
SET wsrep_on=1;

All of these methods would have to be performed on each node, and particularly with ALTER, you’ll need to consider whether temporarily inconsistent schemas across the nodes will cause replication problems.
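
A minimal sketch of option a), assuming your Galera build supports RSU (the table and column names are made up for illustration):

SET GLOBAL wsrep_OSU_method='RSU';
ALTER TABLE big_table ADD COLUMN extra_col INT;  -- RSU applies this locally only, so repeat on each node
SET GLOBAL wsrep_OSU_method='TOI';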

5. As noted in the limitations section,

http://www.codership.com/wiki/doku.php?id=limitations

only InnoDB is supported (MyISAM support is starting to become experimental). When dealing with the mysql permissions tables (which are MyISAM), use GRANT and CREATE statements instead of INSERT if you want the commands to replicate, as in the example below.
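
For example (the user, host and database here are hypothetical, shown only to illustrate the difference):

-- handled as DDL by Galera and replicated to all nodes:
GRANT SELECT ON mydb.* TO 'report'@'10.0.0.%' IDENTIFIED BY 'secret';

-- by contrast, a direct INSERT INTO mysql.user (a MyISAM table) would only
-- take effect on the local node and would not be replicated.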

6. A caveat and further explanation to the last point: on tables other than InnoDB, DML statements are not replicated, but DDL statements are, because DDL is passed on using a different method than DML. This difference has implications that can cause some confusion. On our two nodes, we were mystified as to why DDL statements would replicate in both directions (node1 <-> node2), yet DML statements would only replicate node1 -> node2. Since the DDL was replicating both ways, we ruled out a configuration restriction, which was wrong: the eventual cause was a replicate-wild-do-table= specification in my.cnf for a particular db, while we were using the ‘test’ db for the test. The setting prevented the DML from replicating, yet Galera still replicated the DDL (CREATE and ALTER) of the table in the ‘test’ db. A sketch of the offending filter is below.
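
The filter looked something like this (the database name is a placeholder for the one we were actually restricting to):

# my.cnf on the affected node
replicate-wild-do-table=proddb.%
# DML against the 'test' db was filtered out, yet DDL such as CREATE TABLE
# and ALTER TABLE on 'test' was still propagated by Galera.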

7. It may take some time for a node to become fully operational; for instance, you might not be able to log in at first, or you’ve logged in and issued a command only to have ‘command not found’ returned. Just give it a little time and/or check the error logs to confirm it’s on track. The wsrep status variables are also a quick way to check, as sketched below.
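
A quick way to confirm where a node is in the process (these are standard wsrep status variables):

SHOW GLOBAL STATUS LIKE 'wsrep_ready';                -- ON once the node will accept queries
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- e.g. Joining, Donor/Desynced, Joined, Synced
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';         -- number of nodes currently in the cluster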