Understanding Event-Driven Storage
This is the first in a series of articles about the event-driven framework that is part of our SDS object storage solution. This framework, which we call Grid for Apps, allows users to process data at scale.
I recommend that you first read this article to understand what we describe below: Run Applications Directly on the Storage Infrastructure.
There are many use cases for Grid for Apps. Today, this technology is used for video transcoding (including in real time) and watermarking, metadata enrichment, image recognition and manipulation, pattern recognition in images and data files, and more. But if you think of the future and the quantity of data we expect to produce, the number of use cases is even bigger, with applications in fields like industrial IoT, artificial intelligence, and big data; the only limit is your imagination.
Let’s give it a try
Let’s start with a very simple use case: adding a new metadata field to an object right after its upload. We will tackle more complex use cases in the coming weeks.
To deploy an OpenIO SDS cluster, we will use the Docker container that we provide as a quick and easy way to use the software. But you can use the same steps to implement OpenIO Grid for Apps and use it on a very large platform with hundreds of nodes and billions of objects.
Retrieve the OpenIO SDS Docker container:
Start your new OpenIO SDS environment:
You should now be at the prompt with an OpenIO SDS instance up and running.
Next, we will configure the trigger, so that every time you add a new object, the data is processed and a new metadata field is added.
Add the following content to the file /etc/oio/sds/OPENIO/oio-event-agent-0/oio-event-handlers.conf:
As you can see in the configuration file, there are many events that can be triggered (such as storage.container.new, storage.content.deleted, etc.), but for this tutorial we will just focus on the storage.content.new event.
According to the configuration file, each time we put new content in the object store ([handler:storage.content.new]), we will use the pipeline “process” (pipeline = process).
The pipeline “process” will then take the event and put it in the tube oio-process in the local beanstalk instance, as described at the end of the configuration file:
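To make the handler, pipeline, and tube chain concrete, here is a minimal Python sketch that resolves which tube an event type ends up in. It is only an illustration of the routing idea, not the event agent's actual code. The handler name, "pipeline = process", and the tube name oio-process come from the text above; the [filter:process] section name and the "tube" key are assumptions made for this example, as is the helper tubes_for_event.

```python
import configparser

# Hypothetical fragment: only the handler name, "pipeline = process" and the
# tube name "oio-process" come from the article; the [filter:process] section
# name and the "tube" key are assumptions made for this illustration.
SAMPLE_CONF = """
[handler:storage.content.new]
pipeline = process

[filter:process]
tube = oio-process
"""


def tubes_for_event(conf_text, event_type):
    """Return the tubes an event would reach, following handler -> pipeline -> filter."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    handler = "handler:" + event_type
    if not parser.has_section(handler):
        return []  # no handler declared for this event type
    tubes = []
    # A pipeline is read here as a whitespace-separated list of filter names.
    for name in parser.get(handler, "pipeline", fallback="").split():
        section = "filter:" + name
        if parser.has_option(section, "tube"):
            tubes.append(parser.get(section, "tube"))
    return tubes


print(tubes_for_event(SAMPLE_CONF, "storage.content.new"))
print(tubes_for_event(SAMPLE_CONF, "storage.content.deleted"))
```

With this sample fragment, the first call yields the oio-process tube, while the second returns an empty list because no handler is declared for that event type.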
Then restart the OpenIO event agent to apply the change:
Your event-driven system is now up and running. The next step is to write a small script that will take the events queued in beanstalk and process the corresponding objects.
Let's create a script called add-metadata.py with the following content:
Finally, launch it in the background:
Please note that the script is written in Python, but you can write it in any other language.
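As a rough illustration of the processing step such a script performs, the sketch below parses one event and asks a storage client to set the new metadata field. It is a sketch under stated assumptions, not the script itself: the event layout (the "event" and "url" fields and their keys) is hypothetical, and set_properties is a stand-in callable for whatever client call updates object metadata in your environment. Reserving jobs from the oio-process tube and deleting them afterwards is left out so that the example runs without a cluster.

```python
import json


def handle_event(raw_event, set_properties):
    """Tag the object described by one storage.content.new event.

    raw_event: JSON text of the event (layout assumed, see SAMPLE_EVENT).
    set_properties: callable(account, container, obj, properties); a stand-in
    for whatever storage client call updates object metadata.
    Returns True if the object was tagged, False if the event was skipped.
    """
    event = json.loads(raw_event)
    if event.get("event") != "storage.content.new":
        return False  # ignore event types this worker does not handle
    url = event["url"]
    set_properties(url["account"], url["container"], url["path"], {"uploaded": "true"})
    return True


# Hypothetical event payload; the field names are assumptions for this sketch.
SAMPLE_EVENT = json.dumps({
    "event": "storage.content.new",
    "url": {"account": "myaccount", "container": "mycontainer", "path": "fstab"},
})

# In-memory stand-in for the object store, so the sketch runs without a cluster.
fake_store = {}


def fake_set_properties(account, container, obj, properties):
    fake_store.setdefault((account, container, obj), {}).update(properties)


handle_event(SAMPLE_EVENT, fake_set_properties)
print(fake_store)
```

Passing the storage call in as a parameter keeps the event handling testable on its own; in a real worker you would replace fake_set_properties with a call to your storage client and wrap handle_event in a loop that reads jobs from the tube.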
How does it work?
It's time to add a new object to see if it works. Using the OpenIO CLI, let's upload the new object /etc/fstab to the container mycontainer in the account myaccount:
And check that the new metadata was properly set:
With the following result:
You can see that the metadata was added to the object: meta.uploaded | true
Join us on May 25
As I mentioned above, this is the first of a series of articles that will demonstrate our Grid for Apps technology with some interesting use cases (image recognition and manipulation, pattern recognition, content indexation, and more).
We are also planning a webinar for May 24, and we’ll give you a glimpse of what you can expect from Grid for Apps in the near future. This will be the chance for you to ask all your questions about how this technology works and how you can implement it in your environment.