Develop an Image Captioning Web App with Filestack in 2024

How are automatic captions generated for images on websites? Images make up the bulk of web content today, so describing them through image captioning is becoming more important. A commonly quoted figure holds that 93% of the information our brain processes is visual. Yet captions are often missing, so users with visual impairments or slow internet connections can lose vital information.

This blog helps solve that problem. We’ll show how to build an image captioning web app using Filestack. It’s a tool that simplifies image uploading and management. Filestack reduces the complexity of creating photo collages and captions. It enhances usability, practicality, and SEO.

We’ll guide you through each step, from getting an API key to building and testing your app. Let’s get started.

Key Takeaways

Image captions enhance accessibility, user experience, and SEO.
Filestack provides an easy tool for building apps that add captions to images.
The blog outlines steps to get an API key, build the app, and test it.
You’ll learn the advantages of using Filestack for image captioning.
Following the guide, you’ll create a basic app that adds captions to images.

What is image captioning?

Image captioning means generating relevant text for a given image. It combines computer vision, which recognizes what is in the image, with natural language processing, which phrases the description.

For example, suppose a picture shows a dog playing on a beach. The system might caption it as: “A dog playing with a ball on the sandy beach.” It focuses on the objects and actions to describe the image better.

Image captioning has many benefits. E-commerce websites can automatically create product captions. It also helps visually impaired people, because screen readers can read the captions aloud.

Image captioning is also used in apps like Google Photos. These apps tag pictures with labels like ‘cat’ or ‘beach’ to make searching easier.

Applications

Image captioning has many real-life uses. It increases accessibility and improves user experience. In e-commerce, it helps businesses automatically generate product descriptions. For instance, a caption could say, “blue cotton T-shirt with short sleeves.”

Social media sites like Instagram or Facebook use captioning to enhance user interaction. It also helps blind users as screen readers describe images.

In medicine, image captioning helps analyze images like X-rays or MRIs. It speeds up diagnosis. Google Photos uses image tags to make searching for specific pictures easier.

Why should you choose Filestack to create an image captioning app?

Filestack is recommended for developing image captioning applications for several reasons:

Easy Integration

Filestack offers a simple-to-use API and SDKs. This makes adding functionalities like file uploads and management easy.

Robust File Handling

Filestack supports various file types, including images. It allows resizing, cropping, and conversion effortlessly.

Dependable Performance

Filestack is fast and reliable. It handles large quantities of files smoothly, ensuring a seamless user experience.

Security Features

Filestack includes file encryption and access control. These features protect user data and prevent unauthorized access.

Scalable Solution

One great advantage of Filestack is its cloud-based storage and management systems. It can serve a handful of users or millions, and no changes to your app are required as usage grows, because its scalable structure absorbs the increase.

Filestack for image captioning

Filestack’s Image Captioning service describes images using sophisticated attention networks. It is built into the Filestack platform and can run synchronously or through the Workflows service.

The service analyzes an image and generates text that describes it. This helps users better understand and manage image-based information.

This service is accessed via the Processing API. Filestack offers a captioning task through its Description service. A security policy and signature are required to use these functionalities.
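The policy and signature mentioned above can also be generated in code instead of being copied from the dashboard. The following is a minimal Node.js sketch, assuming the commonly documented Filestack scheme: the policy is a URL-safe base64 encoding of a JSON object with an expiry and a list of allowed calls, and the signature is the hex HMAC-SHA256 of that policy string, keyed with your app secret. The helper name createSecurity, the chosen calls, and the placeholder secret are assumptions made for illustration, so check the Filestack security documentation before relying on it. The app secret belongs on a server, not in browser code.

```javascript
const crypto = require("crypto");

// Hypothetical helper: builds a policy/signature pair for Filestack URLs.
// Assumes policy = URL-safe base64(JSON) and signature = hex HMAC-SHA256(policy, secret).
function createSecurity(appSecret, { expiry, call }) {
  const json = JSON.stringify({ expiry, call });
  // URL-safe base64: swap "+" and "/" for "-" and "_" (padding is kept).
  const policy = Buffer.from(json, "utf8")
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_");
  const signature = crypto
    .createHmac("sha256", appSecret)
    .update(policy)
    .digest("hex");
  return { policy, signature };
}

// Example: a policy valid for one hour. The "read" and "convert" calls are an
// assumption about what the caption task needs.
const oneHourFromNow = Math.floor(Date.now() / 1000) + 3600;
const security = createSecurity("<YOUR_APP_SECRET>", {
  expiry: oneHourFromNow,
  call: ["read", "convert"],
});
console.log(`security=p:${security.policy},s:${security.signature}`);
```

The printed string has the same shape as the security segment used in the caption URLs in this article.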

Example Code:

Get Caption of an Uploaded Image:

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/caption/<HANDLE>

Chain Caption with Other Tasks (e.g., Resize):

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/resize=h:1000/caption/<HANDLE>

Caption with External URL:

https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/caption/<EXTERNAL_URL>

Caption with Storage Aliases:

https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/caption/src

Response Format

{
    "caption": "a close up of a bird"
}

Response Parameters:

caption: The descriptive text of the image.

Filestack’s Image Captioning provides a powerful tool for describing and managing image content, enhancing your application’s capabilities.
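The URL patterns above differ only in a few segments, so they can be assembled by one small function. The sketch below is illustrative: buildCaptionUrl and readCaption are hypothetical helper names, not part of any Filestack SDK, and the segment order simply mirrors the example URLs in this section.

```javascript
const CDN_BASE = "https://cdn.filestackcontent.com";

// Hypothetical helper: assembles a caption URL from its parts.
// `tasks` are optional transformations that run before the caption task.
function buildCaptionUrl({ policy, signature, source, apiKey, tasks = [] }) {
  const segments = [CDN_BASE];
  // The API key segment appears only in the external URL and storage alias forms.
  if (apiKey) segments.push(apiKey);
  segments.push(`security=p:${policy},s:${signature}`);
  segments.push(...tasks, "caption", source);
  return segments.join("/");
}

// Reads the caption out of a parsed response body, or returns null.
function readCaption(body) {
  return body && typeof body.caption === "string" ? body.caption : null;
}

console.log(
  buildCaptionUrl({
    policy: "<POLICY>",
    signature: "<SIGNATURE>",
    source: "<HANDLE>",
    tasks: ["resize=h:1000"],
  })
);
// prints https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/resize=h:1000/caption/<HANDLE>

console.log(readCaption({ caption: "a close up of a bird" })); // prints "a close up of a bird"
```

Keeping the URL construction in one place makes it harder to mix up the segment order when chaining tasks.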

How do you create an image captioning app via Filestack?

Here are all the steps you need to create an image captioning app via Filestack.

Getting the API key

Visit the Filestack website at: https://www.filestack.com/
Create an account by clicking the Sign Up button.
Log in to your account with your credentials.
The Filestack dashboard opens. You will see the API key at the top right corner.
Copy the API key, and don’t share it with anyone else.

Testing the API key

Create an account at the Postman API testing tool: https://www.postman.com/
Open the workspace and create a new collection.
Right-click the new collection and create a new request.
Set the request method to GET.
Enter the image captioning URL with the API key. Also enter the policy and signature, which you can access through the Filestack dashboard.
Finally, click the Send button. The API response appears in the response window.

Let’s build an application now. We will use React JS for it.

1. Setup Your React App

First, create a new React project if you don’t have one:

npx create-react-app image-captioning-app

cd image-captioning-app

npm install filestack-js

2. Set Up Filestack Client

Create a .env file in the root of your project and add your Filestack API key:

REACT_APP_FILESTACK_API_KEY=<YOUR_API_KEY>
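Create React App exposes variables prefixed with REACT_APP_ on process.env at build time. A missing or placeholder key tends to fail silently, so it can help to validate the value once at startup. The following is a minimal sketch: readFilestackConfig is a hypothetical helper, and the policy and signature variable names are assumptions that do not appear elsewhere in this tutorial.

```javascript
// Hypothetical helper: reads Filestack settings from an env-like object.
// Only REACT_APP_FILESTACK_API_KEY comes from this tutorial; the policy and
// signature variable names are assumptions made for this sketch.
function readFilestackConfig(env) {
  const apiKey = env.REACT_APP_FILESTACK_API_KEY;
  if (!apiKey || apiKey === "<YOUR_API_KEY>") {
    throw new Error("Set REACT_APP_FILESTACK_API_KEY in your .env file.");
  }
  return {
    apiKey,
    policy: env.REACT_APP_FILESTACK_POLICY || null,
    signature: env.REACT_APP_FILESTACK_SIGNATURE || null,
  };
}

// In the app you would call readFilestackConfig(process.env).
const config = readFilestackConfig({ REACT_APP_FILESTACK_API_KEY: "demo-key" });
console.log(config.apiKey); // prints "demo-key"
```

Remember to restart the development server after editing the .env file, because the values are read when the build starts.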

3. Create the React Component

In this step, we create a React component called ImageUploader.js. This component has several responsibilities.

It lets users upload images through the Filestack API.
For every image uploaded, Filestack creates captions automatically.
The component also generates tags to make images easier to search.

Let’s break down the component.

1. Importing Required Libraries

We start by importing React’s useState hook and the Filestack client library:

import React, { useState } from "react";

import * as filestack from "filestack-js";

useState: This React hook allows us to store and update data (images, captions, and tags).

filestack: A JavaScript library to interact with Filestack’s APIs, particularly for uploading and fetching images.

2. Initialize Filestack Client

We initialize the Filestack client with the API key stored in the .env file:

const client = filestack.init(process.env.REACT_APP_FILESTACK_API_KEY);

process.env.REACT_APP_FILESTACK_API_KEY: This environment variable stores your Filestack API key, which is used to authenticate your app with Filestack’s service.

3. State Management

We create three state variables to store:

The list of uploaded images.
The captions for each uploaded image.
The tags generated from the captions.

const [images, setImages] = useState([]);
const [captions, setCaptions] = useState({});
const [tags, setTags] = useState({});

images: Holds information (e.g., URLs) of uploaded images.
captions: Stores the generated captions for each image.
tags: Stores the tags generated from each image’s caption.
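Because captions and tags are plain objects keyed by image URL, a second upload can either replace them or be merged into them. The component in this tutorial replaces them. If you would rather keep earlier uploads, one possible approach is a merge through a functional state update, sketched below. mergeByUrl is a hypothetical helper and the example URLs are placeholders.

```javascript
// Hypothetical helper: merges new per-image entries into the previous state
// without mutating either object. Keys are image URLs, as in the component.
function mergeByUrl(previous, incoming) {
  return { ...previous, ...incoming };
}

// Inside the component this could be used with a functional update, e.g.
//   setCaptions((prev) => mergeByUrl(prev, newCaptions));
const before = { "https://example.com/a.jpg": "a dog on a beach" };
const after = mergeByUrl(before, {
  "https://example.com/b.jpg": "a close up of a bird",
});
console.log(Object.keys(after).length); // prints 2
console.log(Object.keys(before).length); // prints 1, the original is untouched
```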

4. File Upload Handler

This function handles the upload process when the user clicks the “Upload Images” button:

const handleUpload = () => {
  const options = {
    // Called once every selected file has finished uploading.
    onUploadDone: (res) => {
      const uploadedFiles = res.filesUploaded;
      setImages(uploadedFiles);
      generateCaptions(uploadedFiles);
    },
    accept: "image/*", // only allow image files
    maxFiles: 10, // batch upload
  };
  client.picker(options).open();
};

client.picker(): Opens the Filestack file picker to allow users to upload images.
onUploadDone: This callback is triggered when the upload is complete. It retrieves the uploaded file details and stores them in the images state using setImages().
generateCaptions(): This function is called after the images are uploaded to fetch captions for each uploaded image.
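The upload callback receives more metadata than the component uses. As a hedged sketch, the result could be reduced to just the fields needed before it is stored in state. toImageRecords is a hypothetical helper, and it assumes each entry in filesUploaded exposes handle, url, and filename.

```javascript
// Hypothetical helper: keeps only the fields the component needs from the
// picker result. Assumes each entry exposes handle, url, and filename.
function toImageRecords(pickerResult) {
  const uploaded = (pickerResult && pickerResult.filesUploaded) || [];
  return uploaded
    // Skip entries that cannot be displayed or captioned.
    .filter((file) => file && file.handle && file.url)
    .map(({ handle, url, filename }) => ({
      handle,
      url,
      filename: filename || handle, // fall back to the handle as a label
    }));
}

// Example with a hand-written result object (placeholder values).
const sample = {
  filesUploaded: [
    { handle: "abc123", url: "https://cdn.filestackcontent.com/abc123", filename: "dog.jpg", size: 1024 },
    { handle: "", url: "" },
  ],
};
console.log(toImageRecords(sample).length); // prints 1
```

In the handler this would be called as toImageRecords(res) before setImages() and generateCaptions().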

5. Generate Captions for Uploaded Images

This function makes a request to the Filestack captioning service to generate captions for each uploaded image:

const generateCaptions = async (uploadedFiles) => {
  const newCaptions = {};
  const newTags = {};

  await Promise.all(
    uploadedFiles.map(async (file) => {
      // Replace <POLICY> and <SIGNATURE> with your own security values.
      const url = `https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/caption/${file.handle}`;
      try {
        const response = await fetch(url);
        // fetch() only rejects on network failure, so check the HTTP status too.
        if (!response.ok) {
          throw new Error(`Caption request failed with status ${response.status}`);
        }
        const data = await response.json();
        newCaptions[file.url] = data.caption;
        newTags[file.url] = extractTags(data.caption);
      } catch (error) {
        console.error("Error fetching captions:", error);
      }
    })
  );

  setCaptions(newCaptions);
  setTags(newTags);
};

uploadedFiles.map(): Loops through each uploaded file to fetch its caption.
fetch(url): Sends a request to Filestack’s captioning API for the given file.
newCaptions: Captions for each image are stored in this object.
extractTags(): A function that converts captions into tags (explained below).
setCaptions(): Updates the state with the generated captions.
setTags(): Updates the state with the generated tags.
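The function above logs failures to the console, so the UI cannot tell which images ended up without a caption. One possible variation, sketched here, uses Promise.allSettled to collect successes and failures separately. captionAll and its fetchCaption parameter are hypothetical names introduced for this example. Passing the fetching function in also makes the logic easy to try without network access.

```javascript
// Hypothetical helper: captions every file and reports failures separately.
// `fetchCaption` is any async function that resolves to a caption string.
async function captionAll(files, fetchCaption) {
  const settled = await Promise.allSettled(
    files.map((file) => fetchCaption(file))
  );
  const captions = {};
  const failed = [];
  settled.forEach((result, index) => {
    const file = files[index];
    if (result.status === "fulfilled") {
      captions[file.url] = result.value;
    } else {
      failed.push(file.url); // remember which image has no caption
    }
  });
  return { captions, failed };
}

// Example with a stub instead of a real request (placeholder data).
const demoFiles = [{ url: "bird.jpg" }, { url: "broken.jpg" }];
const stubFetch = async (file) => {
  if (file.url === "broken.jpg") throw new Error("request failed");
  return "a close up of a bird";
};
captionAll(demoFiles, stubFetch).then(({ captions, failed }) => {
  console.log(captions["bird.jpg"]); // prints "a close up of a bird"
  console.log(failed); // prints [ 'broken.jpg' ]
});
```

The failed list could then drive a retry button or an error message next to the affected image.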

6. Extract Tags from Captions

This simple function splits the caption into individual words and uses them as tags:

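const extractTags = (caption) => {
  // Guard against a missing caption, then split on any whitespace.
  return (caption || "")
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean);
};

The (caption || "") guard returns an empty list when no caption is available.
toLowerCase() converts the caption to lowercase, ensuring consistency in the tags.
split(/\s+/) splits the caption into individual words on whitespace, and filter(Boolean) drops empty strings.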

7. Rendering the UI

Finally, we render the component’s UI:

return (
  <div className="App">
    <button onClick={handleUpload}>Upload Images</button>

    <div style={{ marginTop: "20px" }}>
      {images.map((image) => (
        <div className="image-container" key={image.url}>
          <img
            src={image.url}
            alt={captions[image.url] || "Uploaded Image"}
          />
          <p><strong>Caption:</strong> {captions[image.url]}</p>
          <p><strong>Tags:</strong> {tags[image.url]?.join(", ")}</p>
        </div>
      ))}
    </div>
  </div>
);

Button: A button that, when clicked, opens the Filestack picker (handleUpload()).

Image Display: Each uploaded image is displayed with its corresponding caption and tags.

images.map(): Loops over the uploaded images and renders an image with its caption and tags.
alt: Provides a caption as an alternative text for the image.

8. Exporting the Component

Lastly, we export the component so it can be used in other parts of the app:

export default ImageUploader;

4. Add the Component to Your App

In the src/App.js file, import and use the ImageUploader component.

import React from 'react';
import './App.css';
import ImageUploader from './ImageUploader';

function App() {
  return (
    <div className="App">
      <h1>Batch Image Captioning App</h1>
      <ImageUploader />
    </div>
  );
}

export default App;

5. Explanation of Key Parts:

Batch Upload

Filestack’s picker allows batch uploads of several images at once. The maximum file limit is 10.

Caption Generation

Each time an image is uploaded, Filestack’s captioning service is called. The image file handle is used to generate captions. These captions are stored in the captions state.

Tag Generation

A basic extractTags method splits captions into words. These words are treated as tags. You can create more complex methods that use advanced linguistics.
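As one example of a slightly more complex method, the sketch below strips punctuation, drops common filler words, and removes duplicates. extractKeywordTags and the short stop-word list are illustrative assumptions, not part of Filestack, and the character filter only handles unaccented Latin letters and digits.

```javascript
// A small, illustrative stop-word list; a real app would use a fuller one.
const STOP_WORDS = new Set(["a", "an", "the", "of", "on", "in", "with", "and"]);

// Hypothetical helper: turns a caption into de-duplicated keyword tags.
function extractKeywordTags(caption) {
  const words = (caption || "")
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ") // replace punctuation with spaces
    .split(/\s+/)
    .filter((word) => word && !STOP_WORDS.has(word));
  return [...new Set(words)]; // a Set keeps the first occurrence of each word
}

console.log(extractKeywordTags("A dog playing with a ball on the sandy beach."));
// prints [ 'dog', 'playing', 'ball', 'sandy', 'beach' ]
```

Compared with a plain split, this keeps words such as “a” and “the” out of the tag list, which makes the tags more useful for search.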

6. Run the App

npm start

Get the Complete Code for this App Here.

Output

When you run the app, a web page with the Upload Images button opens.

Click Upload Images to open the Filestack file picker.

Select the desired files from your local directory or by searching through the file picker.

After the upload finishes, the images appear with their captions and tags.

Conclusion

In 2024, developing an image captioning app with Filestack is fast and cost-effective. Filestack simplifies image uploads and caption generation. The API handles files efficiently, making organization and SEO easier.

Setting up a React app with Filestack is straightforward. Using the generated captions as alt text improves the experience for users who rely on screen readers. The captions also give each photo a text description, which supports search engine optimization.

Filestack is ideal for processing large numbers of pictures and captions. It is useful in sectors like digital marketing, e-commerce, and social networking. The result is a user-friendly app for creating image captions.

FAQs

How do you get a caption for an image?

Upload the image to Filestack. Use its API to generate a caption automatically.

What is the best method for image captioning?

The best method for image captioning uses deep learning with computer vision and NLP.

How do you build an image captioning model?

You build an image captioning model by using computer vision and natural language processing techniques.

What is the goal of the image captioning task?

The goal is to generate descriptive text for images using computer vision and NLP.

Ayesha

Ayesha Zahra is a Geo Informatics Engineer with hands-on experience in web development (both frontend & backend). Also, she is a technical writer, a passionate programmer, and a video editor. She is always looking for opportunities to excel in her skills & build a strong career.