29/10/2018

The Zero-Waste OpenIO cluster - Part 2

Software and Test

In the first part of this series, we explained how we put together a working three-node cluster using only old parts we had lying around. Working with limited resources and second-hand legacy hardware proved to be a bit of a challenge, and our team encountered quite a few setbacks before finally managing to get three machines to boot up and be stable. For more details on the specs and the build itself, see the first part of our Zero-Waste Cluster series. 

While the build itself was an enriching experiment, our end goal was to use the jury-rigged cluster as a platform for an OpenIO object store. In this article, we’re going to discuss the entire setup process, from the drawing board to connecting and benchmarking Nextcloud, an S3-enabled application, on the zero-waste OpenIO cluster.

The drawing board 

Once we had actual hardware to work with, it was time to come up with a plan detailing how we were going to turn it into a cluster. We would obviously start with an operating system, then proceed to set up OpenIO by following the Three-Node Setup Guide, which would give us three S3 gateways connected to a working OpenIO SDS cluster. 

We would then load-balance those gateways with HAProxy, which would give us a virtual IP to plug our storage application into. We chose Nextcloud as the storage app for our use case, mainly because it is easy to set up and use. We also see an opportunity to fine-tune Nextcloud for low-performance OpenIO clusters in the future.

Finally, we wanted to monitor our cluster, so we planned to set up our standard monitoring stack on it.

The following diagram sums up the architecture we planned:
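To illustrate what putting a load balancer in front of three S3 gateways means, here is a minimal Python sketch of round-robin selection that skips gateways marked as down. It is only a toy model of the idea, not HAProxy's implementation, and the article does not state which balancing algorithm was configured. The gateway names (gw-1, gw-2, gw-3) and the class name are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # A failed health check would trigger this in a real balancer.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, in rotation order.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy gateway available")


# Hypothetical gateway names, one per node.
balancer = RoundRobinBalancer(["gw-1", "gw-2", "gw-3"])
print([balancer.pick() for _ in range(4)])

balancer.mark_down("gw-2")  # simulate a node failure
print([balancer.pick() for _ in range(4)])
```

The application only ever talks to the balancer's address, so a failed node removes one gateway from the rotation without changing the endpoint the application uses.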

[Diagram: planned architecture of the zero-waste cluster]

The setup 

Equipped with our trusty Ubuntu Xenial CDs, we fired up the machines and went through the setup process. We had to burn two different CDs as the blue node requires a 32-bit setup. We went with the following partitioning scheme: 

Device   Partition size     Filesystem   Mounted on
SSD      20 GB              EXT4         / (bootable)
SSD      100 GB / 205 GB    XFS          /mnt/ssd1
HDD1     100%               XFS          /mnt/hdd1
HDD2     100%               XFS          /mnt/hdd2

 

Once the operating system was installed, we moved on to installing OpenIO SDS. We did our best to stick closely to the three-node install, tweaking only a few parameters. We had to bind the mount points listed above to the data and metadata services, and swap the default three-copy replication policy for 4+2 erasure coding, which still tolerates the loss of two chunks while storing less raw data. Apart from these small changes, the default setup suited our needs, and, with the help of Ansible, we completed the setup process quite effortlessly. We also wanted to add HAProxy and keepalived, so we added the corresponding roles to our playbook. This ensured that our S3 gateway was load-balanced and remained accessible even in the case of a node failure. As a side note, OpenIO packages for the i386 architecture have been built and are still available in our 18.04 repository for Ubuntu Xenial.
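As a rough illustration of the trade-off between the two storage policies mentioned above, the following Python sketch compares how many raw bytes each one writes per byte of user data. It relies only on the standard definitions of replication and k+m erasure coding. The function names are our own and do not come from OpenIO.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes written per byte of user data with full copies."""
    return Fraction(copies, 1)


def erasure_coding_overhead(k: int, m: int) -> Fraction:
    """Raw bytes written per user byte with k data and m parity chunks."""
    return Fraction(k + m, k)


three_copies = replication_overhead(3)
ec_4_2 = erasure_coding_overhead(4, 2)

# Both policies survive the loss of any two copies/chunks,
# but erasure coding needs half the raw space to do so.
print(f"three copies: {float(three_copies):.1f}x raw data")
print(f"4+2 erasure coding: {float(ec_4_2):.1f}x raw data")
```

The price of the lower overhead is extra CPU work for encoding and decoding, which matters on hardware as old as ours.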

The next step was to set up monitoring on all nodes. Again, we opted for our fully Ansible-managed monitoring setup, which uses Prometheus as the metric storage system. At this point we put some thought into installing the administration machine on an external server, but considering the amount of unused computing resources still available on the zero-waste cluster, we figured that a standalone setup might do the job. The only problem was the “blue” node, which unfortunately didn’t support the SSE2 instruction set. We had to manually compile both netdata and blackbox-exporter, which serve as metric collector and health check tool respectively, with the -mno-sse2 flag to make them work on our hardware. Apart from that minor setback, the setup went smoothly and we soon had a Grafana instance packed with our custom dashboards at our disposal.

Finally, we installed Nextcloud 14 and connected it to our cluster. Again, we chose to keep the standalone aspect of our cluster, and installed the application on the second most powerful node (white), the most powerful one being in charge of metric storage and processing. To sum up, it was just a matter of unpacking the app onto a webroot served by Nginx and PHP-FPM with the right modules, and going through the web installer. The only special configuration to be added was, of course, the object store, which had to be configured manually. We will make sure to revisit the Nextcloud setup process in more detail in a future article. Below is our fully functional OpenIO-enabled Nextcloud app.

[Screenshot: Nextcloud connected to the OpenIO cluster]

Results

We first ran a reference test, using the AWS CLI to upload a 1.6 GB file. We obtained a decent speed of around 127 Mbps (as seen by our gateways). We then uploaded the same file via Nextcloud and got only about a third of that, at 42 Mbps. After we applied the optimizations described in the Nextcloud administration manual for handling big files, performance peaked at 315 Mbps, roughly two and a half times the reference. Here are the combined results:

Test                       Actual bandwidth (peak)
Reference - AWS CLI        127 Mbps
Nextcloud - unoptimized    42 Mbps
Nextcloud - optimized      315 Mbps
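To put these figures in perspective, the short Python sketch below derives each result's ratio to the AWS CLI reference and the minimum time the 1.6 GB upload would take at each peak rate. It assumes decimal units (1 GB = 8000 megabits). Because the table reports peak bandwidth, the computed durations are lower bounds rather than measured times.

```python
FILE_SIZE_GB = 1.6  # size of the test file, decimal gigabytes assumed
REFERENCE = "Reference - AWS CLI"

# Peak bandwidth per test, in Mbps, as listed in the table above.
RESULTS_MBPS = {
    "Reference - AWS CLI": 127,
    "Nextcloud - unoptimized": 42,
    "Nextcloud - optimized": 315,
}


def upload_seconds(size_gb: float, rate_mbps: float) -> float:
    """Minimum upload time if the peak rate were sustained throughout."""
    megabits = size_gb * 8 * 1000
    return megabits / rate_mbps


def ratios_to_reference(results: dict, reference: str) -> dict:
    """Express every result as a multiple of the reference bandwidth."""
    base = results[reference]
    return {name: rate / base for name, rate in results.items()}


ratios = ratios_to_reference(RESULTS_MBPS, REFERENCE)
for name, rate in RESULTS_MBPS.items():
    seconds = upload_seconds(FILE_SIZE_GB, rate)
    print(f"{name}: {ratios[name]:.2f}x reference, at least {seconds:.0f} s")
```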

 

We could have made more optimizations and tweaks to our setup, and we certainly aim to play with them in the future, but for this first run we decided that the performance we achieved was more than acceptable. After all, our three nodes shared a gigabit connection, which we used at a decent level (as expected, the raw bandwidth was more than double the actual bandwidth in the last test).

In terms of capacity, we could achieve 1.5 TB of actual storage space, but only 400 GB of it would be erasure coded. Since we are using all our disks to securely write chunks, we are limited by the capacity of the smallest device, which is 100 GB, resulting in an actual capacity of k * smallest device = 4 * 100 GB = 400 GB (where k is the number of data chunks configured in the erasure coding policy). The rest of the data would then be stored on the remaining disks with a three-copy replication policy, which would result in an additional 1 TB of space. Replacing the smallest drive with a larger one would clearly be the way to go, but using top-notch hardware or buying a new HDD would certainly go against our initial motivation, which is why we decided to keep the heterogeneous storage distribution.
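The capacity arithmetic above can be written out as a short Python sketch. It follows the simplified model used in this article, in which each of the k + m chunk positions can hold at most as much as the 100 GB device that limits the layout. The function names are our own, and the raw-space figures are derived from that model rather than measured on the cluster.

```python
def ec_usable_gb(limiting_device_gb: int, k: int) -> int:
    """Usable erasure-coded capacity: k data chunks per limiting device."""
    return k * limiting_device_gb


def ec_raw_gb(limiting_device_gb: int, k: int, m: int) -> int:
    """Raw space consumed by data plus parity chunks."""
    return (k + m) * limiting_device_gb


def replicated_raw_gb(usable_gb: int, copies: int) -> int:
    """Raw space consumed by fully replicated data."""
    return usable_gb * copies


K, M = 4, 2                 # the 4+2 erasure coding policy
LIMITING_DEVICE_GB = 100    # the 100 GB device from the text
REPLICATED_USABLE_GB = 1000 # the additional 1 TB stored as three copies

usable_ec = ec_usable_gb(LIMITING_DEVICE_GB, K)
raw_ec = ec_raw_gb(LIMITING_DEVICE_GB, K, M)
raw_replicated = replicated_raw_gb(REPLICATED_USABLE_GB, 3)

print(f"erasure coded: {usable_ec} GB usable on {raw_ec} GB raw")
print(f"replicated: {REPLICATED_USABLE_GB} GB usable on {raw_replicated} GB raw")
```

Under this model, every gigabyte added to the limiting device adds k gigabytes of erasure-coded capacity, which is why swapping the smallest drive would pay off so quickly.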

Below is a screenshot of the Grafana overview dashboard during the final run: 

[Screenshot: Grafana overview dashboard during the final run]

Final thoughts

From a crazy idea to a functional product: that sums up our work on the zero-waste OpenIO cluster. While we initially had doubts about the hardware, we eventually managed to get a working cluster, and soon enough turned it into an almost production-ready platform that achieved decent results in simple performance tests. Most of the tools we employed were the same ones we use to set up and operate larger platforms, and, while this setup still requires some tinkering, it is a remarkable feat considering where we started from. In the future we are planning to revisit the zero-waste project, either by pitting it against other hardware (e.g. ARM), or by extending it to create an even larger cluster.
