22/05/2017

The Importance of Performance in Object Storage

I recently read an internal report regarding a benchmark related to a PoC we are conducting with a potential customer. I can't share anything about it, but it reminded me of some old articles I wrote on Juku.it and how important overall system performance is, even for object storage infrastructures.

We usually consider object storage to be very scalable, but with limited data access speed and latency consistency. These two limitations narrow the range of applications for object stores. The general idea is that you have to trade speed for $/GB and scalability; but this is not the case for everyone.

Defining performance

I’ve been using this slide recently to show the differences between different storage types.

The content of this table is true in general terms. While you can't expect anything close to the performance of a block device, object stores can be much faster than you may think.

OpenIO’s main differentiators are its lightweight backend, flexibility, and event-driven integrated data processing. All of this translates to efficiency, and performance is a direct consequence of efficiency, yet performance is not always part of the message.

I have to say that I do not often mention performance on purpose. Most object storage customers are still focused on traditional use cases, such as archiving, where performance is less relevant than other factors.

But with the rapid growth of object storage, and a larger audience interested in object stores, efficiency and performance will make a huge difference. For smaller customers (those storing between 100TB and 1PB), overall scalability is less important, and efficiency and performance become key. And since we are talking about small clusters (just a few nodes), nothing is more important than "do more with less."

When it comes to performance, OpenIO offers at least two options. Depending on your needs, you can:

  1. Configure a cluster to run applications directly on the storage system and boost overall infrastructure efficiency and performance.
  2. Optimize your storage infrastructure for pure performance.

The full list of options is longer, but let's stick to these two for this article.

Flexibility and performance redefined

In the benchmark results I mentioned earlier, two findings stood out. The first is the fact that our throughput is much higher than that of our competitors. This is important, but it doesn't tell the whole story. The most important thing to me was consistency. Our graph was practically a flat line, while our competitors show a much more scattered progression with a lot of peaks.

Those peaks reflect erratic behavior of the object store, which in turn affects latency. This is what you would expect from a traditional object store, but it is also why this kind of technology isn't very good for building a future-proof storage infrastructure.
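To make the "flat line versus peaks" idea concrete, here is a minimal Python sketch of how consistency can be quantified from per-request latencies. The function name and the two sample series are invented for illustration only; they are not taken from the benchmark mentioned above, and a real test would feed in measured values.

```python
import statistics


def consistency_report(latencies_ms):
    """Summarise per-request latencies (in milliseconds).

    Returns the mean, the nearest-rank 99th percentile, and the
    coefficient of variation (standard deviation divided by mean).
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    mean = statistics.mean(ordered)
    # Nearest-rank p99: ceil(0.99 * n) - 1, computed with integers only.
    p99 = ordered[(99 * n + 99) // 100 - 1]
    cv = statistics.pstdev(ordered) / mean
    return {"mean": mean, "p99": p99, "cv": cv}


# Hypothetical samples: similar average latency, very different shape.
flat = [10.0] * 95 + [11.0] * 5     # practically a flat line
spiky = [5.0] * 95 + [110.0] * 5    # mostly fast, with occasional peaks

for name, series in (("flat", flat), ("spiky", spiky)):
    report = consistency_report(series)
    print(name, {key: round(value, 2) for key, value in report.items()})
```

Both series average roughly 10 ms, so a mean alone would make them look equivalent. The tail percentile and the coefficient of variation are what expose the peaks, which is why I look at consistency and not only at throughput.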

Better performance, lower TCO

Everything about object storage comes down to $/GB. And even though you can use object storage software for free today (think about all the open source options out there, including OpenIO SDS), the reality is that you need something efficient and manageable, especially at scale.

Thanks to its overall efficiency, OpenIO SDS allows you to build smaller infrastructures with fewer servers and obtain the same performance as competing solutions. For example, think about a cluster that could be made with 50% fewer nodes, or with denser nodes. This means less datacenter footprint, less power consumption, and less hardware to buy and manage. In other words, lower TCO.

The perfect storage for containers?

But there is much more than just TCO. Lower and consistent latency, higher throughput, and the ability to handle smaller objects at speed can open the door to a larger number of use cases.

One of the use cases I'm looking at is storage for containers. Most of the initiatives around stateful containers and persistent storage for containers are not going very well. This is partly because of the nature of containers, but also because these solutions approach the problem the wrong way.

If you look at what is happening in the storage industry, it is clear to me that the best approach isn't to reuse a traditional storage system. It is to leave the containers as stateless and efficient as they are, working on RAM as primary storage and syncing data to an object store for persistence. But you need the right level of performance to do something like this! Unfortunately, it doesn’t make sense to redesign a legacy application to do it… but is a container the right tool to run a legacy application?
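As a rough illustration of the "RAM as primary storage, object store for persistence" pattern, here is a minimal Python sketch. Everything in it is hypothetical: `InMemoryObjectStore` is only a stand-in with an assumed `put`/`get` interface, not the OpenIO SDS API or any real client library, and `RamFirstState` is a name I made up for the example.

```python
class InMemoryObjectStore:
    """Stand-in for an object store client (hypothetical put/get interface)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


class RamFirstState:
    """Keeps working data in RAM and syncs changed keys to an object store."""

    def __init__(self, store):
        self._store = store
        self._ram = {}
        self._dirty = set()

    def write(self, key, data):
        # Writes only touch RAM; the key is remembered for the next sync.
        self._ram[key] = data
        self._dirty.add(key)

    def read(self, key):
        # Serve from RAM, falling back to the object store on a miss.
        if key not in self._ram:
            self._ram[key] = self._store.get(key)
        return self._ram[key]

    def sync(self):
        """Flush changed keys to the object store; return how many were sent."""
        flushed = 0
        for key in sorted(self._dirty):
            self._store.put(key, self._ram[key])
            flushed += 1
        self._dirty.clear()
        return flushed


store = InMemoryObjectStore()
first = RamFirstState(store)
first.write("session/42", b"cart=3 items")
first.sync()

# A replacement container starts with empty RAM and reloads from the store.
second = RamFirstState(store)
print(second.read("session/42"))
```

The container itself stays disposable: anything written since the last `sync()` is lost if it dies. That is why the speed and latency of the object store matter here, because they determine how often you can afford to sync.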

Thinking about next-generation applications

Containers are not the leading edge. We will soon be dealing with Industrial IoT, artificial intelligence, and machine learning on a daily basis, and applications will need to access a huge amount of data continuously and consistently. Performance, again, will be the key, alongside scalability.

Traditional object stores can't cope with that. Performance will be important to get applications closer to data for faster results. OpenIO developed Grid for Apps as an answer to this data/application relationship, but if a customer simply wants fast storage, they can have that too.

With millions of sensors producing data, and hundreds, even thousands of applications accessing it for the most diverse needs, performance, consistency, and predictability will be key. Is your object store ready for this?

On May 24, we will host a webinar discussing this type of application, and we will show how to use AI libraries on OpenIO SDS to get more data, more quickly, directly from the storage system.

Takeaways

Traditional object storage, with its rigidity and lacklustre performance, poses serious limits to the development of next-generation storage infrastructures. It is good for specific use cases, but it's not future-proof.

In the long term, building an infrastructure based on an object store that does not offer consistent performance and latency alongside scalability puts at risk the adoption of a whole set of next-generation applications. And it will cost much more than expected.

OpenIO SDS is a next-generation solution that can deliver the best $/GB today because of its efficiency and proven scalability. At the same time, thanks to its flexibility and performance, it is ready to support the end user with applications that would be far out of reach with other object storage platforms.

Want to know more about OpenIO SDS?

OpenIO SDS is available for testing in several flavors, including Linux packages, the Docker image, and Raspberry Pi.

Stay in touch with us and our community through Twitter and our Slack community channel to receive the latest info, get support, and chat with other users.
