Overview
Have you ever thought about how your brain understands the world when you look at it?
Your brain is an extremely powerful processor that interprets everything you see throughout your life. But what actually happens in your mind? When you look at a scene, your eyes capture it and store it in a part of your memory, then your brain processes every detail in that image. It compares those details to familiar, learned facts, and if they match, the neurons in your brain respond with a description of what has been seen. Depending on the extent of your visual and verbal memory, this unconscious process can take milliseconds or much longer.
Now the main question is: how can a computer do this, that is, look at an image and describe it in phrases or sentences? At Filestack, we have developed a new intelligence service that does exactly that. Using state-of-the-art machine learning algorithms, the service ingests your image, extracts its visual features, uses trained neural networks with a learned vocabulary to generate a semantic sequence of words correlated with those features, and finally describes the image.
How to use the Image Captioning service
To use our Image Captioning service, you can use either the FS Processing API (synchronous method) or FS Workflows (asynchronous method). To run the caption task through the Processing API, open a URL built from one of the following templates in your web browser:
https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/caption/<IMAGE_HANDLE>
or
https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/caption/<IMAGE_EXTERNAL_URL>
In both URL templates, replace the <UPPERCASE> values as follows:
<FILESTACK_API_KEY>: The API Key you can find in your Filestack Developer Portal
<POLICY> and <SIGNATURE>: Filestack Security Policy and Signature. To use the “caption” service, you must use an FS Security Policy and Signature.
<IMAGE_HANDLE>: Filestack Handle generated after uploading your image
<IMAGE_EXTERNAL_URL>: Any URL to your image
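As a minimal sketch, the two URL templates above can be filled in programmatically. The function names and the placeholder values below are hypothetical and are not part of any Filestack SDK; the sketch only performs string substitution, and it assumes the external image URL can be appended as-is, since the templates do not say whether it needs encoding.

```python
# Base of both URL templates shown above.
CDN_BASE = "https://cdn.filestackcontent.com"


def caption_url_for_handle(policy: str, signature: str, handle: str) -> str:
    """Fill in the first template (image already uploaded to Filestack)."""
    return f"{CDN_BASE}/security=p:{policy},s:{signature}/caption/{handle}"


def caption_url_for_external(
    api_key: str, policy: str, signature: str, image_url: str
) -> str:
    """Fill in the second template (image hosted at an external URL).

    Assumption: the external URL is appended unchanged.
    """
    return (
        f"{CDN_BASE}/{api_key}/security=p:{policy},s:{signature}"
        f"/caption/{image_url}"
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own credentials and image.
    print(caption_url_for_handle("<POLICY>", "<SIGNATURE>", "<IMAGE_HANDLE>"))
    print(
        caption_url_for_external(
            "<FILESTACK_API_KEY>",
            "<POLICY>",
            "<SIGNATURE>",
            "https://example.com/photo.jpg",
        )
    )
```

The resulting string can be pasted into a browser or requested with any HTTP client.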
If you would like to learn more about how to use this feature, please contact our support team.
Examples
The following examples show how to use the Filestack Image Captioning service through the FS Processing API and what results it returns. Clicking each URL processes the input image and shows its caption in real time.
Example 1
{
"result": {
"caption": "a large skyscraper in a city"
},
"url": "https://unsplash.com/photos/KTSX2GJ6OKg"
}
Example 2
{
"result": {
"caption": "a man riding a motorcycle down a street"
},
"url": "https://unsplash.com/photos/BwXsi8tcXlk"
}
Example 3
{
"result": {
"caption": "a dog lying on a bed"
},
"url": "https://unsplash.com/photos/QTGoDM3bGuE"
}
Example 4
{
"result": {
"caption": "a car parked on the side of a road"
},
"url": "https://unsplash.com/photos/GnOhjv0QoCc"
}
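A response shaped like the examples above can be read with a standard JSON parser. The sketch below assumes the response body always has the "result" and "caption" keys seen in these examples; the helper name extract_caption is hypothetical.

```python
import json


def extract_caption(response_text: str) -> str:
    """Return the caption string from a caption response body."""
    payload = json.loads(response_text)
    # Key names are taken from the example responses above.
    return payload["result"]["caption"]


# Response body copied from Example 3.
sample = """
{
  "result": {
    "caption": "a dog lying on a bed"
  },
  "url": "https://unsplash.com/photos/QTGoDM3bGuE"
}
"""

if __name__ == "__main__":
    print(extract_caption(sample))
```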
Useful Resources
- Image Captioning Documentation
- Blog: Filestack Image Captioning: An Image Describer Using Attention Networks
- Filestack Security
- Filestack Processing API
- Filestack Workflows