Translate Images to Words with Filestack Image Captioning

Overview

Have you ever thought about how your brain understands the world when you look at it?

Figure 1: Human brain describes what it sees

 

Your brain is an extremely powerful processor that interprets everything you see throughout your life. But what actually happens in your mind? When you look at a scene, your eyes capture it and hold it in memory while your brain processes the details of the image. It compares those details with facts it has already learned, and if they match, the neurons in your brain respond with a description of what you have seen. Depending on how extensive your visual and verbal memory is, this unconscious process can take milliseconds or much longer.

Now the main question is: how can a computer do this, that is, look at an image and describe it in phrases or sentences? At Filestack we have developed a new intelligence service that does exactly this. Using state-of-the-art machine learning algorithms, the service ingests your image, extracts its visual features, uses trained neural networks with a learned vocabulary to generate a semantic sequence of words correlated with those features, and finally returns that sequence as a description of the image.

 

How to use the Image Captioning service

To use our Image Captioning service, you can use either the FS Processing API (synchronous method) or FS Workflows (asynchronous method). To run the captioning task through the Processing API, use one of the following URL templates, for example by opening it in your web browser:

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/caption/<IMAGE_HANDLE>

or

https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/caption/<IMAGE_EXTERNAL_URL>

In both URL templates, replace the <UPPERCASE> placeholders with the following values:

  • <FILESTACK_API_KEY>: Your API key, which you can find in the Filestack Developer Portal
  • <POLICY> and <SIGNATURE>: Filestack Security Policy and Signature. The “caption” service requires a Filestack Security Policy and Signature.
  • <IMAGE_HANDLE>: Filestack Handle generated after uploading your image
  • <IMAGE_EXTERNAL_URL>: Any URL to your image
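
The two URL templates above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function names `make_policy_and_signature` and `build_caption_url` are hypothetical, and the signing step assumes the policy is URL-safe base64 of a JSON object and the signature is the hex HMAC-SHA256 of that encoded policy keyed with your application secret. The `call` value and the policy fields are illustrative only; check the Filestack Security documentation before relying on them.

```python
import base64
import hashlib
import hmac
import json

CDN_BASE = "https://cdn.filestackcontent.com"


def make_policy_and_signature(app_secret, expiry, call=("convert",), handle=None):
    """Return (policy, signature) for the security=p:...,s:... URL segment.

    ASSUMPTION: policy = URL-safe base64 of a JSON object, and
    signature = hex HMAC-SHA256 of the encoded policy, keyed with the app secret.
    """
    fields = {"expiry": int(expiry), "call": list(call)}
    if handle is not None:
        fields["handle"] = handle  # optionally restrict the policy to one file
    policy_json = json.dumps(fields, separators=(",", ":"))
    policy = base64.urlsafe_b64encode(policy_json.encode("utf-8")).decode("ascii")
    signature = hmac.new(
        app_secret.encode("utf-8"), policy.encode("ascii"), hashlib.sha256
    ).hexdigest()
    return policy, signature


def build_caption_url(policy, signature, handle=None, api_key=None, external_url=None):
    """Fill in one of the two caption URL templates.

    With a handle, the first template is used. Otherwise both api_key and
    external_url are required and the second template is used.
    """
    security = f"security=p:{policy},s:{signature}"
    if handle is not None:
        return f"{CDN_BASE}/{security}/caption/{handle}"
    if api_key is None or external_url is None:
        raise ValueError("provide a handle, or both api_key and external_url")
    return f"{CDN_BASE}/{api_key}/{security}/caption/{external_url}"
```

For example, `build_caption_url(policy, signature, handle="<IMAGE_HANDLE>")` yields a URL in the shape of the first template, while passing `api_key` and `external_url` instead yields the second.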

If you wish to learn more about how to use this feature, please contact our support team.

 

Examples

The following examples show how to use the Filestack Image Captioning service as part of the FS Processing API and what results it returns. Clicking a URL processes the input image and shows you the image caption in real time.

Example 1

http://cdn.filestackcontent.com/security=p:eyJleHBpcnkiOjE4OTM1NjIyMDAsImNhbGwiOlsiY29udmVydCJdLCJoYW5kbGUiOiJEZDUwTDhKVGNxZGhsRWp5eWpXUyJ9,s:ff3cc1505d57f37fea063b73f6a1cad7a6ba481c5a540be98a2a6a7e63ced4c5/caption/Dd50L8JTcqdhlEjyyjWS

 

Figure 2: Example 1
{
    "result": {
        "caption": "a large skyscraper in a city"
    },
    "url": "https://unsplash.com/photos/KTSX2GJ6OKg"
}
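
The response is plain JSON, so the caption can be read with any JSON parser. The sketch below assumes only the response shape shown in Example 1 (a `result` object containing a `caption` string); the helper name `extract_caption` is hypothetical, and the response body can come from whichever HTTP client you prefer.

```python
import json


def extract_caption(response_text):
    """Return the caption string from a caption response body."""
    payload = json.loads(response_text)
    return payload["result"]["caption"]


# Response body copied from Example 1.
sample = """
{
    "result": {
        "caption": "a large skyscraper in a city"
    },
    "url": "https://unsplash.com/photos/KTSX2GJ6OKg"
}
"""

print(extract_caption(sample))  # a large skyscraper in a city
```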

Example 2

http://cdn.filestackcontent.com/security=p:eyJleHBpcnkiOjE4OTM1NjIyMDAsImNhbGwiOlsiY29udmVydCJdLCJoYW5kbGUiOiIyRkJtZEtMd1RtVzFzSHFiU3BSRCJ9,s:6e333877c4b18fe05a49e0644a31816d2afc4a424e69d6b373850ee7e407fe40/caption/2FBmdKLwTmW1sHqbSpRD

Figure 3: Example 2
{
    "result": {
        "caption": "a man riding a motorcycle down a street"
    },
    "url": "https://unsplash.com/photos/BwXsi8tcXlk"
}

Example 3

http://cdn.filestackcontent.com/security=p:eyJleHBpcnkiOjE4OTM1NjIyMDAsImNhbGwiOlsiY29udmVydCJdLCJoYW5kbGUiOiJkQWdHY3hheFQ0bVBqZDlqNFBWbiJ9,s:28482adb5c686502833fa2da2dbfb4452df86a42fb5bce16430866a5ccc50f48/caption/dAgGcxaxT4mPjd9j4PVn

Figure 4: Example 3
{
    "result": {
        "caption": "a dog lying on a bed"
    },
    "url": "https://unsplash.com/photos/QTGoDM3bGuE"
}

Example 4

http://cdn.filestackcontent.com/security=p:eyJleHBpcnkiOjE4OTM1NjIyMDAsImNhbGwiOlsiY29udmVydCJdLCJoYW5kbGUiOiJHM1BRY0Q0OVNOV2J2U25TWmNEUyJ9,s:a85b57cf684f9ada183eabb4aab611980fce6c72149a7222eb5e98dfaa972662/caption/G3PQcD49SNWbvSnSZcDS

Figure 5: Example 4
{
    "result": {
        "caption": "a car parked on the side of a road"
    },
    "url": "https://unsplash.com/photos/GnOhjv0QoCc"
}
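
The `p:` segment in each example URL is a base64-encoded JSON policy, so you can inspect what a URL is allowed to do before sharing it. The sketch below decodes the policy string copied from the Example 1 URL; the helper name `decode_policy` is hypothetical, and the padding step is only a precaution for policies whose length is not a multiple of four.

```python
import base64
import json


def decode_policy(policy):
    """Decode a URL-safe base64 policy string into a dict."""
    padded = policy + "=" * (-len(policy) % 4)  # restore padding if it was stripped
    return json.loads(base64.urlsafe_b64decode(padded))


# Policy segment copied from the Example 1 URL.
EXAMPLE_1_POLICY = (
    "eyJleHBpcnkiOjE4OTM1NjIyMDAsImNhbGwiOlsiY29udmVydCJdLCJoYW5kbGUiOiJEZDUwTDhKVGNxZGhsRWp5eWpXUyJ9"
)

print(decode_policy(EXAMPLE_1_POLICY))
# {'expiry': 1893562200, 'call': ['convert'], 'handle': 'Dd50L8JTcqdhlEjyyjWS'}
```

The decoded policy carries an expiry timestamp, the permitted call type, and the handle the policy applies to, which matches the handle at the end of the Example 1 URL.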

 

Useful Resources

  1. Image Captioning Documentation
  2. Blog: Filestack Image Captioning: An Image Describer Using Attention Networks
  3. Filestack Security
  4. Filestack Processing API
  5. Filestack Workflows
