Machine learning and object detection models are changing the landscape of E-commerce daily. The key to the success of these technologies lies in the use of convolutional neural networks (CNNs), which can identify both low-level and abstract features of images and objects.
CNN Processes and Outputs
The convolution layer is the first element of a CNN. An image (say, of a blue skirt) is broken down into a series of overlapping tiles, each a predetermined number of pixels in size. Every tile is run through the same small, single-layer neural network, with the same weights applied to each tile, and produces an output value. Because the tiles are small, the processing requirements remain manageable. The resulting output values are arranged into a three-dimensional array that numerically represents the content of each area of the image, with axes representing height, width, and color channel.
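As a rough illustration of the tiling step, the sketch below slides a small window over a tiny single-channel (grayscale) image and applies the same weights to every tile. The function name `convolve`, the 4x4 image, and the hand-picked 2x2 weights are assumptions made for this example; a real CNN learns its weights during training and works on color channels as well.

```python
def convolve(image, weights, stride=1):
    """Apply one shared set of weights to every overlapping tile of the image."""
    size = len(weights)  # tiles are size x size pixels
    rows = range(0, len(image) - size + 1, stride)
    cols = range(0, len(image[0]) - size + 1, stride)
    return [
        [
            sum(
                image[r + i][c + j] * weights[i][j]
                for i in range(size)
                for j in range(size)
            )
            for c in cols
        ]
        for r in rows
    ]


# Toy 4x4 image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked weights that respond to a dark-to-bright vertical edge.
weights = [
    [1, -1],
    [1, -1],
]

feature_map = convolve(image, weights)
for row in feature_map:
    print(row)
```

Only the middle column of the resulting 3x3 array is non-zero, because that is where the tiles straddle the edge between the dark and bright halves of the image.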
A pooling layer then takes the three-dimensional array and downsamples it along the spatial dimensions (height and width). The pooled array keeps only the strongest responses from each region of the image (including those for our blue skirt) and discards the rest. This reduces computational requirements and helps guard against overfitting.
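One common way to downsample is max pooling, which keeps the largest value in each small block. The post does not say which pooling method is meant, so treat the choice of max pooling, the `max_pool` helper, and the sample values below as assumptions for illustration.

```python
def max_pool(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size block."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            )
            for c in cols
        ]
        for r in rows
    ]


# Made-up 4x4 feature map, as a convolution layer might produce.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input shrinks to 2x2, so the next layer has a quarter as many values to process.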
Finally, the downsampled array is used as the input to a fully connected neural network. The output of this final step represents how confident the system is that the image shows our blue skirt. CNNs underpin object recognition APIs that enable advanced forms of categorization, product search, smart recommendations, and content filtering.
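To make the final step concrete, here is a minimal sketch of a single fully connected output unit that turns a pooled array into a confidence between 0 and 1. The weights and bias are invented for the example (a trained network would learn them), and using a sigmoid to squash the score is an assumption; real classifiers often use several layers and a softmax over many classes.

```python
import math


def blue_skirt_confidence(pooled, weights, bias):
    """Flatten the pooled array, take a weighted sum, squash it to (0, 1)."""
    flat = [value for row in pooled for value in row]
    score = sum(x * w for x, w in zip(flat, weights)) + bias
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid


# Made-up pooled array and hand-picked (untrained) parameters.
pooled = [[4, 2], [2, 8]]
weights = [0.1, -0.2, 0.05, 0.3]
bias = -1.0

confidence = blue_skirt_confidence(pooled, weights, bias)
print(f"Confidence that this is a blue skirt: {confidence:.2f}")
```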
Object Detection and Cognition
Object detection can be defined as the “problem of finding and classifying a variable number of objects on an image.” It’s the variability of objects in an image that makes it problematic, since “the number of objects detected may change from image to image.” Another problem is cognition, or the ability of the object recognition API to distinguish between object types that differ but closely resemble one another. A favorite example is the “Chihuahua vs. Muffin” test.
Uses for Object Detection
You might think that object detection is too complex to show up much in everyday life. However, it’s far more widespread than that, and most of us probably come across it every day. From unlocking our mobile phones with our faces to tracking the movement of a ball, checking whether there’s movement in an area, and even guiding self-driving cars, the use of object detection is becoming more and more common. Let’s look at some of these uses in this section.
Let’s start off with one of the most ambitious and futuristic uses of object detection: self-driving cars. These vehicles can travel the road without a driver, providing a comfortable, sci-fi-like experience for passengers. The cars (or rather, the computers integrated in them) can move safely by detecting objects and performing the correct action in response. While self-driving cars are not yet as mature or trusted as conventional vehicles, they’re heading there. Whatever happens in the future, object detection will remain an important part of building these vehicles.
Our second object detection use case also comes from the world of vehicles. This time, it’s about monitoring roads, traffic, and the behavior of pedestrians and vehicles. This is mostly used for security or research purposes and can be helpful in a number of scenarios, such as investigating accidents and crimes, checking the average number of vehicles that pass along a certain route, or even estimating the potential number of customers for a food stall. Object detection can pick out people or vehicles in images or CCTV feeds, helping authorities, scientists, and business owners alike.
Facial recognition has become one of the most popular uses of object detection. Once seen mainly in sci-fi and spy movies, it is now used across the world for easy identity verification. For instance, businesses use it to grant access to office buildings or to restricted areas like data centers. People may also use facial recognition daily if they have set their smartphone to authenticate them with it. Facial recognition will likely remain a popular form of authentication, thanks to object detection technology.
Iris authentication is another easy way to authenticate users. Similar to how facial recognition works, it uses machine learning to detect someone’s iris in order to grant access to a device, a room, or another protected resource.
Object detection can help describe or estimate people’s emotions in photos and videos. Emotion recognition is typically used in photo apps (e.g., apps that say “smile to take a picture”) or for analysis.
Have you ever used Google’s image search, where you take a picture of something you don’t recognize and hope Google will identify it? Object detection directly contributes to that kind of image search. An application with object detection can analyze the image and match it against its data sets, usually coming up with the correct information about it.
People can also use object detection for counting objects in images or videos because, well, counting them manually can be taxing. This applies to fields like research, business and marketing, security, and more.
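Once a detector has produced its results, counting is mostly bookkeeping. The sketch below assumes a hypothetical detector output, a list of dictionaries with `label` and `confidence` keys, and tallies the detections that clear a confidence threshold. The data format, the `count_objects` helper, and the 0.5 threshold are all assumptions for this example, not the output of any particular API.

```python
from collections import Counter


def count_objects(detections, min_confidence=0.5):
    """Tally detected labels, ignoring low-confidence detections."""
    return Counter(
        d["label"] for d in detections if d["confidence"] >= min_confidence
    )


# Hypothetical detector output for one frame of a traffic camera.
detections = [
    {"label": "car", "confidence": 0.97},
    {"label": "car", "confidence": 0.84},
    {"label": "pedestrian", "confidence": 0.91},
    {"label": "car", "confidence": 0.32},  # too uncertain, not counted
]

counts = count_objects(detections)
print(counts["car"], counts["pedestrian"])  # 2 1
```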
Sports nowadays involve a lot of research and analysis. As such, people have found ways to have more information about ball movement with the use of object detection. Sports analysts and athletes can then digest the information to make better decisions in the future.
There’s another popular use for object detection – e-commerce. To find out how object detection works in that industry, let’s look at the section below.
Implications for E-Commerce
Although people use object and image analysis for different purposes, E-commerce applications typically fall into three categories: image classification, augmented reality, and content filtering. For the purposes of this post, we’ll focus on image classification and content filtering.
The implications for an E-commerce retailer may best be illustrated by the following use case. Suppose an Instagram user sees a celebrity wearing our favorite blue skirt and wants to purchase a similar one from an online retailer. She can upload the inspirational image to the retailer’s webpage, where an object recognition API returns a list of possible keywords and images, such as “white blouse,” “blue skirt” and “brown handbag.” This enables our shopper to quickly narrow her search to blue skirts and evaluate the options recommended by the retailer’s website. No human intervention is needed, and the path from inspirational image to checkout page is streamlined.
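The flow in this use case might look something like the following sketch. The response format, the catalog records, the SKUs, and both helper functions are hypothetical and invented for illustration; they do not describe any specific object recognition API.

```python
# Hypothetical response from an object recognition API for the uploaded photo.
api_response = [
    {"keyword": "white blouse", "confidence": 0.91},
    {"keyword": "blue skirt", "confidence": 0.88},
    {"keyword": "brown handbag", "confidence": 0.74},
]

# Hypothetical slice of the retailer's catalog, already tagged with keywords.
catalog = [
    {"sku": "SK-101", "name": "Pleated midi skirt", "keywords": {"blue skirt"}},
    {"sku": "BL-204", "name": "Linen blouse", "keywords": {"white blouse"}},
    {"sku": "SK-317", "name": "Denim mini skirt", "keywords": {"blue skirt"}},
]


def suggested_keywords(response, min_confidence=0.5):
    """Return keywords to show the shopper, most confident first."""
    ranked = sorted(response, key=lambda tag: tag["confidence"], reverse=True)
    return [tag["keyword"] for tag in ranked if tag["confidence"] >= min_confidence]


def products_for(keyword, products):
    """Return the SKUs of catalog items tagged with the chosen keyword."""
    return [p["sku"] for p in products if keyword in p["keywords"]]


keywords = suggested_keywords(api_response)
matches = products_for("blue skirt", catalog)
print(keywords)
print(matches)
```

Here the shopper sees three suggested keywords, picks "blue skirt", and is shown the two matching items without anyone tagging the photo by hand.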
Image recognition APIs can also analyze uploads and filter out inappropriate content. Since this adds a layer of protection, the online retailer can encourage more shoppers to upload images without fear of inappropriate content landing on its website.
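A content filter of this kind often comes down to comparing per-category scores against a threshold. The category names, scores, 0.2 threshold, and `is_safe_to_publish` helper below are assumptions for the sake of a small sketch rather than the behavior of a real moderation service.

```python
def is_safe_to_publish(moderation_scores, threshold=0.2):
    """Accept an upload only if every category score is below the threshold."""
    return all(score < threshold for score in moderation_scores.values())


# Hypothetical moderation scores (0 = clearly fine, 1 = clearly inappropriate).
street_style_photo = {"adult": 0.01, "violence": 0.02}
flagged_photo = {"adult": 0.85, "violence": 0.03}

print(is_safe_to_publish(street_style_photo))  # True
print(is_safe_to_publish(flagged_photo))       # False
```

A stricter retailer could lower the threshold, at the cost of rejecting more borderline but harmless uploads.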
Machine Learning Made Simple
Proper image management is a continuous struggle. Tagging, categorizing, and screening all cause headaches and take precious time. Image analysis that uses machine learning can ease the pain. Filestack uses state-of-the-art platforms to analyze images and return actionable data.
Talk to us to see how Filestack Image Intelligence can help ease the burden of image management.