Why Object Detection Is a Necessity for E-Commerce

Object detection used for labradoodles and fried chicken

Machine learning and object detection models are steadily changing the landscape of E-commerce. Much of their success comes from convolutional neural networks (CNNs), which can identify both low-level and abstract features of images and objects.

CNN Processes and Outputs

The convolution layer is the first element of a CNN. An image (say, of a blue skirt) is broken down into a series of overlapping tiles of a predetermined size in pixels. Each tile is run through the same small, single-layer neural network, with the same weights applied to every tile, which turns each tile into a set of output values. Keeping the tiles small keeps the processing requirements manageable. The resulting output values are arranged into a three-dimensional array that numerically represents the image content, with axes representing height, width, and channel (color channels in the input image, learned features after the convolution).
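The tiling step can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: it assumes a single-channel (grayscale) image and a single filter, and the names `convolve`, `image`, `kernel` and `stride` are made up for this example. The point it shows is that the same weights are reused for every overlapping tile.

```python
# Minimal convolution sketch in plain Python (no libraries).
# Assumptions: one grayscale channel, one filter, ReLU activation.
# All names here are illustrative, not part of any real API.

def convolve(image, kernel, stride=1):
    """Slide one shared-weight kernel over overlapping tiles of a 2-D image."""
    k = len(kernel)  # the kernel is k x k
    rows = (len(image) - k) // stride + 1
    cols = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            top, left = r * stride, c * stride
            # Weighted sum of the tile, using the SAME weights for every tile.
            total = sum(
                image[top + i][left + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            out_row.append(max(0.0, float(total)))  # ReLU: keep positive responses
        output.append(out_row)
    return output


# A 4x4 "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve(image, kernel)
print(feature_map)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

A real CNN applies many such filters to all color channels at once and learns the kernel values during training, which is where the third axis of the output array comes from.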

A pooling layer then takes the three-dimensional array and downsamples it along the spatial dimensions (height and width). The pooled array keeps only the strongest responses in each region of the image (including those from our blue skirt) and discards the rest. This lowers computational requirements and helps reduce overfitting.
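As a rough sketch of the downsampling step, the snippet below uses max pooling, which is one common choice and an assumption here, since the post does not name a specific pooling method. The function name `max_pool` and the sample values are illustrative only.

```python
# Minimal max-pooling sketch in plain Python.
# Assumption: non-overlapping size x size blocks on a single 2-D feature map.

def max_pool(feature_map, size=2):
    """Downsample a 2-D feature map by keeping the max of each size x size block."""
    pooled = []
    for top in range(0, len(feature_map) - size + 1, size):
        row = []
        for left in range(0, len(feature_map[0]) - size + 1, size):
            block = [
                feature_map[top + i][left + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(block))  # keep only the strongest response
        pooled.append(row)
    return pooled


# Made-up 4x4 feature map for illustration.
feature_map = [
    [0.0, 2.0, 0.0, 1.0],
    [1.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0, 5.0],
]

pooled = max_pool(feature_map)
print(pooled)  # [[2.0, 3.0], [4.0, 5.0]]
```

The 4x4 map shrinks to 2x2, so every later layer has a quarter as many values to process.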

Finally, the downsampled array is fed into a fully connected neural network. The output of this final step represents how confident the system is that the image contains our blue skirt. CNNs make it practical to build object recognition APIs that enable advanced forms of categorization, product search, smart recommendations and content filtering.
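The final step can be sketched as a single fully connected unit with a sigmoid output. This is a simplification under stated assumptions: real networks stack several such layers, and the weights and bias below are made-up numbers rather than trained values. The names `confidence`, `weights` and `bias` are hypothetical.

```python
import math

# Minimal sketch of a fully connected output unit.
# Assumption: one output neuron with a sigmoid, giving a score between 0 and 1.

def confidence(pooled, weights, bias):
    """Flatten the pooled array, take a weighted sum, and squash it to (0, 1)."""
    flat = [value for row in pooled for value in row]
    score = sum(v * w for v, w in zip(flat, weights)) + bias
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid


pooled = [[2.0, 3.0], [4.0, 5.0]]   # e.g. the output of a pooling layer
weights = [0.5, -0.25, 0.1, 0.2]    # made-up values; a real model learns these
bias = -1.0                         # made-up value

p_blue_skirt = confidence(pooled, weights, bias)
print(f"Confidence that the image shows a blue skirt: {p_blue_skirt:.2f}")
```

A value near 1 means the network is confident the blue skirt is present, and a value near 0 means it is confident it is not.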

Object Detection and Cognition

Object detection can be defined as the “problem of finding and classifying a variable number of objects on an image.” It’s the variability of objects in an image that makes it problematic, since “the number of objects detected may change from image to image.” Another problem is cognition, or the ability of the object recognition API to distinguish between object types that differ but closely resemble one another. A favorite example is the “Chihuahua vs. Muffin” test.

Chihuahuas Vs. Blueberry Muffins

Implications for E-Commerce

Although the potential uses for object and image analysis are numerous, E-commerce applications typically fall into three categories: image classification, augmented reality and content filtering. For the purposes of this post, we’ll focus on image classification and content filtering.

The implications for an E-commerce retailer may best be illustrated by the following use case. Let’s suppose an Instagram follower sees a celebrity wearing our favorite blue skirt, and she wants to purchase a similar one from an online retailer. She can upload an inspirational image to the retailer’s webpage where an object recognition API returns a list of possible keywords and images, such as “white blouse,” “blue skirt” and “brown handbag.” This enables our shopper to quickly narrow her search to blue skirts and evaluate options recommended by the retailer’s website. No human intervention is needed, and the path from inspirational image to the checkout page is streamlined.
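To make the narrowing step concrete, here is a minimal sketch of how a retailer's site might filter the returned keywords. The response shape (a list of labels with confidence scores), the function `matching_labels` and the threshold are all assumptions made for illustration. They do not represent the output format of any particular object recognition API, and the scores are made up.

```python
# Hypothetical sketch: narrowing a search using keywords returned for an image.
# The tag structure and confidence values below are invented for illustration.

def matching_labels(tags, query, min_confidence=0.6):
    """Return labels that mention the query and meet the threshold, best first."""
    hits = [
        tag for tag in tags
        if query in tag["label"] and tag["confidence"] >= min_confidence
    ]
    hits.sort(key=lambda tag: tag["confidence"], reverse=True)
    return [tag["label"] for tag in hits]


# Made-up tags for the shopper's uploaded image.
tags = [
    {"label": "white blouse", "confidence": 0.91},
    {"label": "blue skirt", "confidence": 0.88},
    {"label": "brown handbag", "confidence": 0.74},
    {"label": "blue scarf", "confidence": 0.32},
]

print(matching_labels(tags, "skirt"))  # ['blue skirt']
print(matching_labels(tags, "blue"))   # ['blue skirt']
```

The low-confidence “blue scarf” tag is dropped by the threshold, so the shopper only sees suggestions the model is reasonably sure about.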

Image recognition APIs can also analyze uploads and filter out inappropriate content. Since this adds a layer of protection, the online retailer can encourage more shoppers to upload images without fear of inappropriate content landing on its website.

Machine Learning Made Simple

Proper image management is a continuous struggle. Tagging, categorizing and screening all cause headaches and take precious time. Image analysis that uses machine learning can ease the pain. Filestack uses state-of-the-art platforms to analyze images and return actionable data.

Talk to us to see how Filestack Image Intelligence can help ease the burden of image management.
