29/05/2017

Face recognition, integrated into the object store

It’s time for the third in our series of articles about Grid for Apps, the event-driven framework that is part of the OpenIO SDS object storage solution. For those who missed the previous articles, here’s a link to the first article A Technical Introduction to Grid for Apps, and to the second one Simple Metadata indexing through Grid for Apps.

In this article, we will tackle a use case often raised by our customers: how can we recognise patterns in pictures at scale? By “scale,” we mean storing billions of files. We have prepared three possible scenarios, using the most common open-source frameworks. One, with TensorFlow, was showcased during our webinar on May 24. The following week, we will do the same using the deep learning framework Caffe.

But let's start with OpenCV. We will use it to detect faces in pictures and locate them. Our event-driven processing framework will analyse each object that is uploaded, and a pre-trained cascade classifier will detect faces in the images. As usual, we will enrich the object with new metadata: the number of faces detected, their location, and their size. If you think about it, there are many use cases for this sort of procedure: face blurring, face detection for CCTV, or annotating a picture like the cover image of this article.

As a side note, using Elasticsearch, presented in the second article, we will demonstrate how you can build more complex workflows with OpenIO Grid for Apps technology: store, enrich, index, and search!

Let’s do it!

As in our previous articles, we will use our Docker image to easily spawn an OpenIO SDS environment. Retrieve the OpenIO SDS Docker image:

# docker pull openio/sds

Start your new OpenIO SDS environment:

# docker run -ti --tty openio/sds

You should now be at the prompt with an OpenIO SDS instance up and running.

Next, we will configure the trigger so that each time you add a new object, an event describing it is pushed to the oio-process tube of the beanstalk queue, where our script will pick it up. Add the following content to the file /etc/oio/sds/OPENIO/oio-event-agent-0/oio-event-handlers.conf:

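[handler:storage.content.new]
pipeline = process

[handler:storage.content.deleted]
pipeline = content_cleaner

[handler:storage.container.new]
pipeline = account_update

[handler:storage.container.deleted]
pipeline = account_update

[handler:storage.container.state]
pipeline = account_update

[handler:storage.chunk.new]
pipeline = volume_index

[handler:storage.chunk.deleted]
pipeline = volume_index

[filter:content_cleaner]
use = egg:oio#content_cleaner

[filter:account_update]
use = egg:oio#account_update

[filter:volume_index]
use = egg:oio#volume_index

[filter:process]
use = egg:oio#notify
tube = oio-process
queue_url = beanstalk://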

If you want to learn more about this configuration file, please refer to our previous blog post.
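To see how the pieces of such a file relate to each other, here is a minimal sketch that parses a handlers file with Python's standard configparser module and resolves each event type to the filters it triggers. The `[handler:...]` and `[filter:...]` section names, the trimmed sample, and the helper `resolve_pipelines` are assumptions made for illustration; they are not part of OpenIO itself. The sketch also assumes that a pipeline may list several filter names separated by whitespace.

```python
import configparser

# Assumed sample: a trimmed handlers file. Section names are illustrative.
SAMPLE_CONF = """
[handler:storage.content.new]
pipeline = process

[handler:storage.content.deleted]
pipeline = content_cleaner

[filter:content_cleaner]
use = egg:oio#content_cleaner

[filter:process]
use = egg:oio#notify
tube = oio-process
queue_url = beanstalk://
"""


def resolve_pipelines(conf_text):
    """Map each event type to the list of (filter name, options) it runs."""
    # Interpolation is disabled so that values are read exactly as written.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    # Collect the options of every [filter:<name>] section.
    filters = {
        section[len("filter:"):]: dict(parser[section])
        for section in parser.sections()
        if section.startswith("filter:")
    }

    # Resolve every [handler:<event type>] section to its filters.
    resolved = {}
    for section in parser.sections():
        if not section.startswith("handler:"):
            continue
        event_type = section[len("handler:"):]
        names = parser[section]["pipeline"].split()
        resolved[event_type] = [(name, filters[name]) for name in names]
    return resolved


if __name__ == "__main__":
    for event_type, steps in resolve_pipelines(SAMPLE_CONF).items():
        for name, options in steps:
            print(event_type, "->", name, options)
```

Read this way, the only entry that differs from a stock setup is the one for new content: it points to a filter that forwards the event to the oio-process tube instead of handling it inside the agent.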

Then, restart the OpenIO event agent to apply the change:

# gridinit_cmd restart @oio-event-agent

Your event-driven system is now up and running. The next step is to write the script that will analyse objects using OpenCV. To do so, we first need to install the OpenCV Python module:

# yum install opencv-python

Download the pre-trained cascade classifier:

# curl https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_frontalface_alt.xml -o /etc/oio/sds/OPENIO/haarcascade_frontalface_alt.xml

Then write the script; let’s call it detect-faces.py:

#!/usr/bin/env python
import cv2
import json
import numpy as np

from oio.api import object_storage
from oio.event.beanstalk import Beanstalk, ResponseError


def faceclassifier(image):
    # Specify the trained cascade classifier
    face_cascade_name = "/etc/oio/sds/OPENIO/haarcascade_frontalface_alt.xml"

    # Create a cascade classifier
    face_cascade = cv2.CascadeClassifier()

    # Load the specified classifier
    face_cascade.load(face_cascade_name)

    # Run the classifier: scale factor 1.1, 5 neighbours, 30x30 minimum size
    faces = face_cascade.detectMultiScale(image, 1.1, 5, 0|cv2.cv.CV_HAAR_SCALE_IMAGE, (30, 30))

    return faces


b = Beanstalk.from_url("beanstalk://")
# Listen on the tube declared in oio-event-handlers.conf
b.watch("oio-process")

while True:
    try:
        event_id, data = b.reserve()
    except ResponseError:
        # No usable job this time; try again
        continue

    # Retrieve the information from the event (namespace, bucket, object name ...)
    meta = json.loads(data)
    url = meta["url"]

    s = object_storage.ObjectStorageAPI(url["ns"], "")
    meta, stream = s.object_fetch(url["account"], url["user"], url["path"])

    # Decode the downloaded bytes into a colour image
    image = cv2.imdecode(np.frombuffer("".join(stream), np.uint8), 1)
    faces = faceclassifier(image)

    # Update the object with new metadata: number of faces + positions.
    # np.asarray() also covers the case where no face is found.
    s.object_update(url["account"], url["user"], url["path"], {"face_number" : str(len(faces)), "position" : json.dumps(np.asarray(faces).tolist())})

    # Remove the processed job from the queue
    b.delete(event_id)

Finally, launch it in the background:

# python detect-faces.py &

Please note that the script is written in Python, but you can write it in any other language.
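The last step of the script, turning detections into object properties, is easy to isolate and try out without OpenCV or a running cluster. The sketch below assumes that detections are sequences of (x, y, w, h) rectangles and that an event carries a `url` dictionary with the `ns`, `account`, `user`, and `path` keys read by the script (where `user` appears to hold the container name). The sample event and both helper names are hypothetical.

```python
import json


def parse_event(data):
    """Extract the object coordinates from a raw JSON event payload."""
    url = json.loads(data)["url"]
    return url["ns"], url["account"], url["user"], url["path"]


def faces_to_properties(faces):
    """Turn detected rectangles into string-valued object properties.

    `faces` may be empty; every coordinate is cast to a plain int so that
    the result stays JSON-serializable even when the input holds NumPy ints.
    """
    rects = [[int(value) for value in rect] for rect in faces]
    return {
        "face_number": str(len(rects)),
        "position": json.dumps(rects),
    }


if __name__ == "__main__":
    # Hypothetical event, reduced to the fields the script reads.
    sample_event = json.dumps({
        "url": {"ns": "OPENIO", "account": "myaccount",
                "user": "mycontainer", "path": "family.jpg"},
    })
    print(parse_event(sample_event))
    print(faces_to_properties([(370, 285, 46, 46), (472, 289, 44, 44)]))
    print(faces_to_properties(()))
```

Keeping both values as strings mirrors what the script sends to the object store: a count and a JSON-encoded list that any client can decode later.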

How does it work?

It’s time to add a new picture to see if the process works. First, download a picture to upload it to OpenIO. Let's use the family picture we took during our last hackathon:

# curl http://community.openio.io/uploads/default/original/1X/bfb14e31430cd10b3f719db3d20359864ef611a8.jpg -o /tmp/family.jpg

Using the OpenIO CLI, let’s upload this new object family.jpg to the container mycontainer in the account myaccount:

# openio --oio-ns OPENIO --oio-account myaccount object create mycontainer /tmp/family.jpg

Well done! You’ve just uploaded the picture, and OpenCV was launched as a background task to process it. To conclude, let’s check the metadata added to the object:

# openio --oio-ns OPENIO --oio-account myaccount object show mycontainer family.jpg

We obtain the following result:

| Field | Value |
| --- | --- |
| account | myaccount |
| container | mycontainer |
| ctime | 1495148185 |
| hash | B2F24793EF43A0837F13B74090B4B0F1 |
| id | 0A117650D44F050027DFB3C35803A77B |
| meta.face_number | 10 |
| meta.position | [[370, 285, 46, 46], [472, 289, 44, 44], [844, 296, 48, 48], [758, 319, 43, 43], [358, 344, 48, 48], [642, 348, 44, 44], [192, 305, 52, 52], [270, 330, 41, 41], [570, 306, 45, 45], [50, 223, 66, 66]] |
| mime-type | application/octet-stream |
| object | family.jpg |
| policy | SINGLE |
| size | 92434 |
| version | 1495148185129235 |

As you can see, OpenCV detected 10 faces (meta.face_number = 10). The position and size of each face, in the form (x, y, w, h), are stored in meta.position.
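Once the rectangles are stored as metadata, any client can use them without running the detector again. As a small sketch, the code below parses the meta.position value shown above, converts each (x, y, w, h) rectangle into corner coordinates (the form many drawing and blurring routines expect), and picks the largest face. The helper names are hypothetical.

```python
import json

# Value of meta.position, copied from the output above.
POSITION = (
    "[[370, 285, 46, 46], [472, 289, 44, 44], [844, 296, 48, 48], "
    "[758, 319, 43, 43], [358, 344, 48, 48], [642, 348, 44, 44], "
    "[192, 305, 52, 52], [270, 330, 41, 41], [570, 306, 45, 45], "
    "[50, 223, 66, 66]]"
)


def to_corners(rect):
    """Convert (x, y, w, h) into (left, top, right, bottom)."""
    x, y, w, h = rect
    return (x, y, x + w, y + h)


def largest_face(rects):
    """Return the rectangle covering the biggest area, or None if empty."""
    return max(rects, key=lambda r: r[2] * r[3], default=None)


if __name__ == "__main__":
    rects = json.loads(POSITION)
    # Sorting the rectangles orders them by x, i.e. from left to right.
    for rect in sorted(rects):
        print(rect, "->", to_corners(rect))
    print("largest:", largest_face(rects))
```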

Want to know more about OpenIO SDS and Grid for Apps?

OpenIO SDS is available for testing in several flavors, including Linux packages, the Docker image, and Raspberry Pi.

Stay in touch with us and our community through Twitter and our Slack community channel to receive the latest info, get support, and chat with other users.
