Implementing Efficient Mobile OCR with Filestack: A Developer’s Guide

Optical Character Recognition (OCR) is widely used across industries today and has transformed the way we extract text from documents. OCR automatically extracts printed or handwritten text from scanned documents, PDF files, and images. With the advent of mobile OCR data capture, OCR has become even more accessible and integral to modern apps. Various OCR SDKs and APIs are available for iOS and Android, making it easier to integrate OCR functionality into mobile apps. Filestack is one such tool.

Filestack offers dedicated SDKs for both iOS and Android. You can use these SDKs to integrate Filestack's file management capabilities into your mobile apps. Filestack also offers OCR through its Processing API, which you can use for mobile data extraction.

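In this comprehensive guide, we'll cover setting up Filestack's mobile SDKs for OCR, optimizing camera capture, real-time and structured data capture, specialized use cases, offline OCR, performance, user experience, security, and cross-platform considerations.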

Setting Up Filestack’s Mobile OCR SDK

Filestack offers OCR as part of its intelligence services. We can use it through the Filestack Processing API.

Filestack's OCR uses machine learning models and neural networks to recognize and extract text. It is also backed by document detection and pre-processing, which lets the OCR engine handle difficult inputs such as rotated, folded, and wrinkled documents.

The illustration below shows how Filestack OCR software works:

Integration Steps for Filestack iOS SDK

Here, we’ll show you the basic steps for integrating the Filestack iOS SDK into your apps. The SDK adds Filestack’s file uploader to your iOS app. We can use this uploader for mobile scanning and to upload scanned documents for OCR. The OCR itself is performed through the Filestack Processing API.

First, install CocoaPods if you don't already have it:

gem install cocoapods

Now, add the Filestack pod to your Xcode project's Podfile:

source ''
platform :ios, '16.0'

target '<Your Target Name>' do
    pod 'Filestack', '~> 2.0'
end

Next, run the following command:

pod install

Here is how you can present the Filestack File Picker/Uploader in your app:

// Create a `Config` object.
let config = Filestack.Config.builder
    .with(appUrlScheme: "YOUR-APP-URL-SCHEME")
    .with(availableCloudSources: [.dropbox, .googledrive, .googlephotos, .customSource])
    .with(availableLocalSources: [.camera, .photoLibrary, .documents])
    .build()

// Instantiate the Filestack `Client` with your API key and the `Config` object.
// If your account has security enabled, also pass a `Security` object; otherwise it can be omitted.
let client = Filestack.Client(apiKey: filestackAPIKey, config: config)

// Store options for your uploaded files.
// Here the storage location is S3 and uploaded files are publicly accessible.
let storeOptions = StorageOptions(location: .s3, access: .public)

// Instantiate the picker with the `StorageOptions` object we just set up.
let picker = client.picker(storeOptions: storeOptions)

// Optional. Set the picker's delegate.
picker.pickerDelegate = self

// Finally, present the picker on the screen.
present(picker, animated: true)


iOS OCR implementation

We can use this CDN URL to implement OCR through the Filestack Processing API: <FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/ocr/<EXTERNAL_URL/CDN URL>

Here is example code for implementing OCR mobile data extraction in your app:

import Foundation

func performOCRwithProcessingAPI(fileURL: String) {
    // Construct the Processing API URL
    let processingAPIURL = "<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/ocr/<EXTERNAL_URL/CDN URL>"

    // Create the URLRequest; bail out if the URL string is not valid
    guard let url = URL(string: processingAPIURL) else {
        print("Error: invalid Processing API URL")
        return
    }
    var request = URLRequest(url: url)
    request.httpMethod = "POST"

    // Set up the request body with the file URL
    let requestBody = ["url": fileURL]
    request.httpBody = try? JSONSerialization.data(withJSONObject: requestBody)

    // Create a URLSession task to make the API request
    let task = URLSession.shared.dataTask(with: request) { (data, response, error) in
        // Handle the API response
        if let error = error {
            print("Error: \(error)")
        } else if let data = data {
            // Parse and handle OCR results
            if let ocrResults = try? JSONSerialization.jsonObject(with: data, options: []) as? [String: Any] {
                print("OCR Results: \(ocrResults)")
            }
        }
    }

    // Start the URLSession task
    task.resume()
}

Also Read: New Capabilities in iOS SDKs for OCR Data Capture and Image Processing.

Integration Steps for Filestack Android SDK

Add the following Filestack Android SDK dependency to your app’s build.gradle file:

implementation 'com.filestack:filestack-android:6.0.0'

Here is how you can set up and configure the Filestack picker in your Android app:

FilestackPicker picker = new FilestackPicker.Builder()
        // Chain builder options (config, storage options, sources, and so on) here.
        .build();
picker.launch(activity); // Use an Activity instance to launch the picker.

You can then use the Filestack OCR URL to perform OCR: <FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/ocr/<EXTERNAL_URL/CDN URL>
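
On Android, it can help to assemble that URL in one place instead of concatenating strings at each call site. The helper below is a minimal sketch, not part of the Filestack SDK: `OcrUrlBuilder` and its parameters are hypothetical names, and the CDN base URL is passed in by the caller because it is not spelled out in the pattern above.

```java
/** Illustrative helper (not part of the Filestack SDK) that assembles an OCR processing URL. */
final class OcrUrlBuilder {
    private OcrUrlBuilder() {}

    /**
     * Follows the pattern <API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/ocr/<SOURCE>.
     *
     * @param cdnBase   base URL of the processing endpoint, supplied by the caller (assumption)
     * @param apiKey    your Filestack API key
     * @param policy    security policy value
     * @param signature signature matching the policy
     * @param source    file handle or external URL of the document to read
     */
    static String build(String cdnBase, String apiKey, String policy, String signature, String source) {
        // Drop a trailing slash so the result never contains "//" after the host.
        String base = cdnBase.endsWith("/") ? cdnBase.substring(0, cdnBase.length() - 1) : cdnBase;
        return base + "/" + apiKey + "/security=p:" + policy + ",s:" + signature + "/ocr/" + source;
    }
}
```

A call such as `OcrUrlBuilder.build(base, key, policy, signature, handle)` returns a single string you can hand to whichever HTTP client your app already uses.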

Filestack OCR Demo

Input image

Filestack OCR output

Optimizing Camera Capture for OCR Mobile Data Capture

Here are best practices for optimizing camera capture for physical documents:

Implementing Real-Time OCR

Here are some techniques to consider to implement real-time OCR:

Structured Data Capture

Structured OCR data capture involves extracting specific text from forms and other structured documents. This includes:

We can use template matching to extract data from structured documents. Template matching involves using predefined templates to match the structure of the document. The OCR engine then identifies key areas or regions of interest based on the template.
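
As a rough illustration of template-based extraction, the sketch below assigns recognized words to named regions of interest. It assumes the OCR step has already produced words with bounding boxes in the same coordinate space as the template, and that words arrive in reading order. `TemplateExtractor`, `Box`, and `Word` are hypothetical types introduced only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal template-matching sketch: words whose centre falls inside a region belong to that field. */
final class TemplateExtractor {
    /** Axis-aligned rectangle in page coordinates. */
    static final class Box {
        final int x, y, width, height;
        Box(int x, int y, int width, int height) {
            this.x = x; this.y = y; this.width = width; this.height = height;
        }
        boolean contains(int px, int py) {
            return px >= x && px < x + width && py >= y && py < y + height;
        }
    }

    /** A recognized word and where it was found. */
    static final class Word {
        final String text;
        final Box box;
        Word(String text, Box box) { this.text = text; this.box = box; }
    }

    /** Returns field name -> text of all words centred inside that field's region. */
    static Map<String, String> extract(Map<String, Box> template, List<Word> words) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Box> region : template.entrySet()) {
            List<String> parts = new ArrayList<>();
            for (Word word : words) {
                int centreX = word.box.x + word.box.width / 2;
                int centreY = word.box.y + word.box.height / 2;
                if (region.getValue().contains(centreX, centreY)) {
                    parts.add(word.text);
                }
            }
            fields.put(region.getKey(), String.join(" ", parts));
        }
        return fields;
    }
}
```

Using the word's centre rather than requiring the whole box to fit makes the match tolerant of small alignment errors between the scan and the template.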

We can also use text detection and layout analysis techniques to locate fields within a specific document. Then, we can apply OCR to the detected fields to extract text.

It’s also crucial to implement validation rules. These rules help ensure the extracted data is accurate and well-formed, for example, that dates follow the expected format. Regular expressions and checksums for specific data types are also helpful for data validation.
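
The following sketch shows two such rules: a date check that combines a regular expression with strict calendar parsing, and a Luhn checksum of the kind used for card-style numbers. The class name `FieldValidator` and the choice of the `yyyy-MM-dd` format are assumptions for illustration; adapt the patterns to the documents you actually process.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.regex.Pattern;

/** Example validation rules for OCR-extracted fields. */
final class FieldValidator {
    private static final Pattern ISO_DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");
    // STRICT rejects impossible dates such as February 30 instead of adjusting them.
    private static final DateTimeFormatter STRICT_ISO =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    private FieldValidator() {}

    /** True if the value looks like yyyy-MM-dd and is a real calendar date. */
    static boolean isValidIsoDate(String value) {
        if (value == null || !ISO_DATE.matcher(value).matches()) {
            return false;
        }
        try {
            LocalDate.parse(value, STRICT_ISO);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** True if the digit string passes the Luhn checksum. */
    static boolean passesLuhn(String digits) {
        if (digits == null || !digits.matches("[0-9]+")) {
            return false;
        }
        int sum = 0;
        boolean doubleIt = false; // every second digit from the right is doubled
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9;
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }
}
```

A failed check is a good trigger for asking the user to rescan or to correct the field by hand, since OCR errors often swap visually similar characters such as 0 and O.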

Specialized OCR Use Cases

OCR has various use cases across different industries. Here are some specialized OCR use cases:

Invoice Scanning and Data Extraction

Extracting data, such as invoice numbers, dates, and vendor details, from invoices is a common task for invoice processing. OCR automates this process by extracting relevant data from invoices accurately. This eliminates the need for manual data entry and reduces human errors.

You can utilize a specialized OCR solution for invoice data extraction. These OCR solutions are specially designed to detect various invoice formats and extract relevant data from different types of invoices.

For example, Filestack OCR can efficiently extract data from invoices.
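
Once the OCR step returns plain text, pulling out individual invoice fields is often a post-processing job. The sketch below is one simple way to do that with regular expressions. It is not how Filestack itself parses invoices; `InvoiceFieldParser` and both patterns are hypothetical and assume labels like "Invoice No:" and dates in `yyyy-MM-dd` form.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Pulls a few labelled fields out of raw OCR text (illustrative patterns only). */
final class InvoiceFieldParser {
    // Matches "Invoice No: X", "Invoice Number X", "Invoice #X" (case-insensitive).
    private static final Pattern INVOICE_NO = Pattern.compile(
            "(?i)invoice\\s*(?:no\\.?|number|#)\\s*[:\\-]?\\s*([A-Z0-9][A-Z0-9\\-]*)");
    // Matches "Date: 2024-03-15" (case-insensitive label).
    private static final Pattern DATE = Pattern.compile(
            "(?i)date\\s*[:\\-]?\\s*(\\d{4}-\\d{2}-\\d{2})");

    private InvoiceFieldParser() {}

    static Optional<String> invoiceNumber(String ocrText) {
        return firstGroup(INVOICE_NO, ocrText);
    }

    static Optional<String> invoiceDate(String ocrText) {
        return firstGroup(DATE, ocrText);
    }

    private static Optional<String> firstGroup(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

Returning `Optional` keeps "field not found" explicit, so the app can fall back to manual entry for that field rather than storing an empty string.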

Business Card Information Capture

Advanced OCR solutions, such as Filestack, can also extract useful details from business cards accurately. These include names, phone numbers, email addresses, and company details.
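
If the OCR output for a card is plain text, contact details can be located with a couple of patterns before they are sent to a CRM. This is a minimal sketch under that assumption. `BusinessCardParser` is a hypothetical name, and the phone pattern is deliberately loose, so expect to tune it for the regions you support.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Finds email addresses and phone numbers in raw OCR text from a business card. */
final class BusinessCardParser {
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    // Optional "+", then digits mixed with spaces, dots, hyphens, or parentheses on one line.
    private static final Pattern PHONE =
            Pattern.compile("\\+?\\d[\\d ().-]{6,}\\d");

    private BusinessCardParser() {}

    static List<String> emails(String ocrText) {
        return findAll(EMAIL, ocrText);
    }

    static List<String> phones(String ocrText) {
        return findAll(PHONE, ocrText);
    }

    private static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }
}
```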

Businesses can use these automated OCR solutions to directly import useful data into their CRM platforms or contact management systems. This helps with lead management, personalized marketing, and effective communication with clients.

ID Document Scanning and Verification

OCR is also quite helpful in ID card verification. We can use it to extract accurate information from various types of ID cards automatically. These include national IDs, driver’s licenses, and passports. This information can then be used for identity verification processes for security, access control, and registration purposes.

Filestack OCR is designed to efficiently extract relevant information from ID cards.
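
For documents that carry a machine-readable zone, such as passports, the extracted fields can be checked against their check digits before they are used for verification. The sketch below assumes the ICAO Doc 9303 scheme (weights 7, 3, 1; letters A to Z valued 10 to 35; the filler `<` valued 0). `MrzCheckDigit` is a hypothetical helper and is independent of any OCR provider.

```java
/** Check-digit calculation for machine-readable zone fields, assuming the ICAO Doc 9303 scheme. */
final class MrzCheckDigit {
    private static final int[] WEIGHTS = {7, 3, 1};

    private MrzCheckDigit() {}

    /** Computes the check digit (0-9) for one MRZ field. */
    static int compute(String field) {
        int sum = 0;
        for (int i = 0; i < field.length(); i++) {
            sum += valueOf(field.charAt(i)) * WEIGHTS[i % WEIGHTS.length];
        }
        return sum % 10;
    }

    /** True if the OCR-extracted check digit agrees with the field contents. */
    static boolean matches(String field, char checkDigit) {
        return checkDigit >= '0' && checkDigit <= '9' && compute(field) == checkDigit - '0';
    }

    private static int valueOf(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        if (c >= 'A' && c <= 'Z') {
            return c - 'A' + 10;
        }
        if (c == '<') {
            return 0; // filler character
        }
        throw new IllegalArgumentException("Unexpected MRZ character: " + c);
    }
}
```

A mismatch usually means the OCR misread a character, which makes this a cheap signal for prompting a rescan.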

OCR technology is also widely used for:

Offline OCR Capabilities

To support OCR without an internet connection (offline mobile OCR), the app must perform OCR locally. This allows apps to work even in areas with no connectivity.

We can use on-device OCR libraries to perform OCR locally. For example, we can use:

It’s also essential to manage offline data and sync it with a server once connectivity is restored. For example, we can store the OCR results locally using local file storage or a database. We also need to implement a queuing mechanism to track data that needs to be synced.
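
The queuing idea can be sketched as follows. This version keeps pending results in memory only, to stay short; a real app would back the queue with local file storage or a database as described above. `OcrSyncQueue` and its `Uploader` hook are hypothetical names, and the uploader stands in for whatever network call your app makes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** In-memory sketch of a queue that holds OCR results until they can be synced. */
final class OcrSyncQueue {
    /** Hypothetical hook for the network call; returns true when the upload succeeded. */
    interface Uploader {
        boolean upload(String documentId, String text);
    }

    private static final class Pending {
        final String documentId;
        final String text;
        Pending(String documentId, String text) {
            this.documentId = documentId;
            this.text = text;
        }
    }

    private final Deque<Pending> pending = new ArrayDeque<>();

    /** Records a result produced while offline. */
    void enqueue(String documentId, String text) {
        pending.addLast(new Pending(documentId, text));
    }

    int size() {
        return pending.size();
    }

    /**
     * Call when connectivity is restored. Uploads in order and stops at the first
     * failure so nothing is lost or reordered. Returns the number of items synced.
     */
    int flush(Uploader uploader) {
        int synced = 0;
        while (!pending.isEmpty()) {
            Pending next = pending.peekFirst();
            if (!uploader.upload(next.documentId, next.text)) {
                break;
            }
            pending.removeFirst();
            synced++;
        }
        return synced;
    }
}
```

An item is removed only after its upload succeeds, so a dropped connection in the middle of a flush leaves the remaining items queued for the next attempt.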

We can use Filestack capabilities to manage and store OCR data.

Optimizing OCR Performance on Mobile

Here are some effective tips and techniques for optimizing OCR performance on mobile:

Enhancing User Experience


Here’s how you can enhance user experience in your OCR mobile data capture applications:

OCR Security Considerations

Your OCR app may process documents containing sensitive and confidential information. Thus, it’s crucial to implement robust security features to protect user data.

If you’re using Filestack, you get the following security features:

You can also implement Role-Based Access Control to protect OCR data.
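
The OCR URL shown earlier carries a `security=p:<POLICY>,s:<SIGNATURE>` segment. The sketch below shows one way such a pair could be produced. It assumes the policy is a base64url-encoded JSON document and the signature is a hex-encoded HMAC-SHA256 of that encoded policy, keyed with the app secret. Confirm the exact scheme against Filestack's security documentation before relying on it. `PolicySigner` is a hypothetical name, and this code belongs on a server, since the app secret should never ship inside a mobile app.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Server-side sketch for producing a policy/signature pair (scheme is an assumption; see lead-in). */
final class PolicySigner {
    private PolicySigner() {}

    /** Encodes the policy JSON as URL-safe base64. */
    static String encodePolicy(String policyJson) {
        return Base64.getUrlEncoder().encodeToString(policyJson.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns the lowercase hex HMAC-SHA256 of the encoded policy, keyed with the app secret. */
    static String sign(String encodedPolicy, String appSecret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(appSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(encodedPolicy.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 is unavailable on this runtime", e);
        }
    }
}
```

The mobile app would then request a short-lived policy and signature from your backend and place them in the URL, so the secret itself stays on the server.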

Also Read: Integrating Robust Security Features in the Latest Android OCR SDK.

Cross-Platform Mobile OCR Considerations

If you’re looking to create cross-platform OCR mobile data capture applications, you can consider using cross-platform frameworks like React Native, Flutter, or Xamarin. These frameworks allow developers to write a single codebase for both platforms (iOS and Android) while accessing native functionality.

Moreover, it’s essential to choose an OCR library or SDK that supports both iOS and Android, such as Tesseract OCR. It’s also recommended to implement the OCR logic in a separate module or service that can be shared across both platforms. This way, any changes to the OCR logic apply to both platforms without duplication.


OCR technology is widely used on mobile devices (iOS and Android) these days. With the range of OCR SDKs and libraries available for mobile OCR data capture, adding OCR functionality to mobile apps is easier than ever.

For example, Filestack offers a specialized SDK for both iOS and Android. With these SDKs, you can seamlessly integrate Filestack file management capabilities and intelligence services (including OCR) into your mobile apps.

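In this article, we’ve discussed the implementation of efficient mobile OCR data capture using Filestack’s mobile SDKs. We’ve also explored techniques for optimizing camera capture, capturing real-time and structured data, working offline, improving performance and user experience, and securing OCR data.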


How does Filestack’s mobile OCR perform in low-light conditions?

Filestack’s OCR SDK includes image enhancement features to improve performance in various lighting conditions. However, optimal lighting will always yield the best results.

Can Filestack’s mobile OCR SDK handle handwritten text?

Yes, the SDK can process handwritten text. However, accuracy may vary depending on the clarity of the handwriting.

How does implementing OCR affect app size and performance?

While OCR functionality can increase app size due to necessary libraries, Filestack’s SDK is optimized for mobile use.

Sign up for Filestack and leverage its powerful mobile OCR capabilities!