Recreation? Version 2?

I am contemplating recreating the cluster. Part of the motivation is physical: moving it from one network to another, taking a few network hops out of the picture, connecting every node directly to my main switch, and putting them all on a UPS rather than just one node from each service set. None of that would by itself require recreating the setup, but after watching the process usage on the different nodes, they don’t use half of their available capacity, which is saying a lot for a Raspberry Pi.

So version two would make fully redundant use of each node rather than segregating the two services onto two groups of Pis.

Each Pi, aside from the haproxy host, would get both the GlusterFS-backed webserver and a clustered database node, taking the redundancy from three to six.
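
To make that concrete, the haproxy backends in version two might look something like this sketch, with every node carrying both services (hostnames and addresses are placeholders, not my actual config):

# version two: all six Pis appear in both backends
backend web_cluster
    mode http
    balance roundrobin
    server pi1 192.168.1.11:80 check
    server pi2 192.168.1.12:80 check
    server pi3 192.168.1.13:80 check
    server pi4 192.168.1.14:80 check
    server pi5 192.168.1.15:80 check
    server pi6 192.168.1.16:80 check

backend galera_cluster
    mode tcp
    balance leastconn
    server pi1 192.168.1.11:3306 check
    server pi2 192.168.1.12:3306 check
    server pi3 192.168.1.13:3306 check
    server pi4 192.168.1.14:3306 check
    server pi5 192.168.1.15:3306 check
    server pi6 192.168.1.16:3306 check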

I may also add a VM to back up the lot: not in any way part of the active system, but joined to each cluster service so it keeps a copy of the data.
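
On the gluster side that could be as simple as extending the replica count to cover the VM, and on the Galera side it would just be one more node that never appears in the haproxy backend. A rough sketch, assuming a volume named www-vol and a host named backupvm:

# join the backup VM to the gluster volume as a fourth replica
gluster peer probe backupvm
gluster volume add-brick www-vol replica 4 backupvm:/bricks/www
# its Galera node would join wsrep_cluster_address as usual but stay out
# of the haproxy backend, so it only ever receives replication traffic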

Recovering From a Power Outage

These Pis are not on a UPS.

They will be eventually, but for now I don’t have them close enough to one big enough to support all seven of them at once. This became a problem yesterday morning when there was a power blip around 2 AM: everything died, and it didn’t automatically come back up.

After a little investigation I found the cause in both the gluster automount and MariaDB/Galera: when a node in either cluster comes online, it expects at least one other node to already be up, and if none is, it fails. Worse yet was trying to find out which MariaDB/Galera node had the latest data set. I started two nodes, got them syncing, and when I tried to start the third node it would not start, complaining:

2018-05-02 8:45:31 1618998080 [ERROR] WSREP: gcs/src/gcs_group.cpp:group_post_state_exchange():321: Reversing history: 682322 -> 633380, this member has applied 48942 more events than the primary component.Data loss is possible. Aborting.

After a bit of googling I found that this means the node has newer data than the other two and won’t start, because joining the cluster would mean discarding that new data. Thus I had to bring the other two nodes back down, start this one as the primary node, and when the other two came online they were updated with the new data.
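
For the record, the general recovery procedure looks something like this sketch (paths are the MariaDB defaults, and the exact steps can differ between versions):

# on each node, check which one is most advanced
cat /var/lib/mysql/grastate.dat          # compare the 'seqno' values
sudo -u mysql mysqld --wsrep-recover     # recovers the position if seqno is -1 after a crash

# on the node with the highest seqno, allow it to bootstrap
sed -i 's/safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat
galera_new_cluster                       # start it as the new primary component

# then start the remaining nodes normally; they sync from the bootstrapped one
systemctl start mariadb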

For the time being I have two small two-outlet UPSes. I plan to put one node from each service set on them, making one node of each cluster power-loss resistant; this should allow the others in the cluster to come back online with minimal disruption.
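
The gluster automount half of the boot problem can also be softened by letting systemd mount the volume on first access instead of at boot, so a lone node doesn’t fail outright while waiting for a peer. A sketch of the /etc/fstab entry, assuming a volume named www-vol:

localhost:/www-vol  /var/www  glusterfs  defaults,_netdev,noauto,x-systemd.automount  0  0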

Day Three

These won’t continue to be daily posts.

I noticed that the first Pi in the HTTP/S line is failing checks once in a while and going down. Not a big problem, as the other two are there to pick up the slack, and I actually expected this, since I have a few services running on the “primary” that are not on the other two.
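
The checks in question are just haproxy’s normal health checking, along these lines (names and timings are illustrative, not my exact config):

backend web_cluster
    mode http
    option httpchk GET /
    server pi-web1 192.168.1.11:80 check inter 2s fall 3 rise 2
    server pi-web2 192.168.1.12:80 check inter 2s fall 3 rise 2
    server pi-web3 192.168.1.13:80 check inter 2s fall 3 rise 2

With fall 3, a server has to miss three consecutive checks before haproxy marks it down, so a brief stall on the “primary” only causes a momentary blip.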

The primary SQL node is the one running on the flash stick; the backups are on the spinning HDDs over USB, which are not fast by any means, but they were a means to an end in this project and will eventually be upgraded.
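
haproxy has a notion of backup servers that matches this split; a sketch of what the MySQL listener might look like, with made-up names and addresses:

listen mysql_cluster
    bind *:3306
    mode tcp
    # mysql-check requires the named user to exist in MariaDB
    option mysql-check user haproxy_check
    server pi-db1 192.168.1.21:3306 check           # flash-stick primary
    server pi-db2 192.168.1.22:3306 check backup    # HDD nodes only take
    server pi-db3 192.168.1.23:3306 check backup    # traffic if db1 is down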

Day Two

This is the second day with this cluster running. I have downed the other webservers I had running, added a few more web apps, oh, and most importantly, I have made the entire cluster HTTPS-enabled.

While I read that you can terminate (originate? I digress) the SSL connections on the haproxy box itself, I couldn’t get this to work, so I used the ssl-passthrough method: I configured Let’s Encrypt on the “primary” webserver and have it distribute the keys to the others.
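
In practice, ssl-passthrough just means haproxy forwards the raw TLS stream and the webservers terminate it themselves; a minimal sketch (placeholder names again):

frontend https_in
    bind *:443
    mode tcp
    default_backend web_tls

backend web_tls
    mode tcp
    server pi-web1 192.168.1.11:443 check
    server pi-web2 192.168.1.12:443 check
    server pi-web3 192.168.1.13:443 check

For the key distribution, something like a certbot deploy hook that rsyncs /etc/letsencrypt/live/ from the “primary” to the other two would do the job, though that mechanism is an assumption on my part.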

I have also semi-stress-tested it with JMeter: it handles fifty connections just fine, and JMeter errors out when I try five hundred.
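
For anyone wanting to repeat that, the non-GUI JMeter run is along the lines of (the test plan name is a placeholder):

jmeter -n -t webcluster.jmx -l results.jtl    # -n: no GUI, -t: test plan, -l: results log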

That’s it for now!

Hello from the Pis

As you will see in the About This Site page, it runs on seven Raspberry Pis:

One Raspberry Pi 2 B acts as a proxying load balancer for both incoming HTTP connections and MySQL connections.

Three Raspberry Pi 3B+s, each equipped with a 32GB class 10 microSD card, a 32GB stick as swap (overkill, but they were on sale), and a 128GB stick set up with GlusterFS between the three of them, mounted at /var/www (a rough sketch of the volume setup follows this list).

Three Raspberry Pi 3s as a clustered MariaDB server. One runs the same setup as the Pi 3B+s above; the other two have slower SD cards, smaller swap sticks, and mechanical HDDs. The HDDs and the 128GB stick house the databases. The slower Pis in this lot are set up as backups.
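
As promised above, here is roughly how a replicated volume like the webserver one is put together (volume, host, and brick names are placeholders):

# from the first webserver Pi: form the trusted pool and create a 3-way replica
gluster peer probe pi-web2
gluster peer probe pi-web3
gluster volume create www-vol replica 3 \
    pi-web1:/bricks/www pi-web2:/bricks/www pi-web3:/bricks/www
gluster volume start www-vol
# each node then mounts its own copy of the volume at the web root
mount -t glusterfs localhost:/www-vol /var/www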