I have a rather bad habit of taking screenshots of random things I want to remember, such as a product I want to buy so I can compare prices later. On my Mac, the Desktop folder is full of these screenshots. And of course it is almost impossible to find anything there weeks or months later among hundreds of files.

When I needed to find something once again, I thought it would be nice to have a way to search these images by the text they contain. I spent a few moments searching for a tool that could do this, found nothing and decided to write some code instead. I suspect I didn’t search thoroughly enough and was just looking for an excuse to code that evening though 😄. It helped that I didn’t have any experience with OCR (Optical Character Recognition) libraries, so it was a good opportunity to learn something new.

I did some quick research and picked the EasyOCR library. It is lightweight, easy to use, supports multiple languages and provides quite good results according to various benchmarks. Since it is a Python library, it also determined my choice of programming language for this task.
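To give an idea of what using EasyOCR looks like, here is a minimal sketch. It is not the code from my app: the helper names `make_reader` and `extract_text` are made up for illustration, and joining the recognized fragments with spaces is just one simple way to get a searchable string.

```python
from pathlib import Path


def make_reader(languages):
    """Build an EasyOCR reader for the given language codes, e.g. ["en"]."""
    import easyocr  # imported lazily because loading the library is slow

    return easyocr.Reader(list(languages))


def extract_text(image_path, reader):
    """Run OCR on one image and return the recognized text as one string."""
    # detail=0 makes readtext return plain strings instead of
    # (bounding box, text, confidence) tuples.
    fragments = reader.readtext(str(Path(image_path)), detail=0)
    return " ".join(fragment.strip() for fragment in fragments if fragment.strip())
```

Usage would be something like `extract_text("screenshot.png", make_reader(["en"]))`. Creating the reader is the expensive part, so it makes sense to create it once and reuse it for every file.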

I ended up with a CLI app that has two subcommands:

load_and_index

Scans a folder for image files and processes each file with OCR. The file path and extracted text are saved to a PostgreSQL database. During subsequent runs, only new files are processed and added to the database. A list of languages to use for OCR can be specified in the config file.
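The "only new files" part can be sketched roughly like this. It is a simplified illustration under a few assumptions: the folder is scanned non-recursively, the set of image extensions is hard-coded, and the names `IMAGE_SUFFIXES`, `find_new_images` and `index_new_images` are hypothetical. The `ocr` argument is any function that turns a path into text, and writing the result to PostgreSQL is left to the caller.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed set of supported formats


def find_new_images(folder, indexed_paths):
    """Return image files in `folder` that are not in `indexed_paths`, sorted."""
    known = {str(p) for p in indexed_paths}
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file()
        and path.suffix.lower() in IMAGE_SUFFIXES
        and str(path) not in known
    )


def index_new_images(folder, indexed_paths, ocr):
    """Run `ocr` on each new image and return {path: text} for the caller to store."""
    return {str(path): ocr(path) for path in find_new_images(folder, indexed_paths)}
```

Here `indexed_paths` would be the file paths already stored in the database, so a second run over the same folder has nothing left to process.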

search

Searches the database for a specified query and returns a list of files containing the query.
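A simple match with `ILIKE` could look roughly like the sketch below. The table name `screenshots`, the columns `path` and `text`, and the function name `build_search` are assumptions made for this example, not necessarily what the repository uses. The `%s` placeholder is the style used by psycopg.

```python
def build_search(query):
    """Return (sql, params) for a case-insensitive substring match."""
    # Escape the ILIKE wildcards so that "%" and "_" in the query match
    # literally; backslash is PostgreSQL's default escape character for LIKE.
    escaped = (
        query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    sql = "SELECT path FROM screenshots WHERE text ILIKE %s ORDER BY path"
    return sql, (f"%{escaped}%",)
```

With a psycopg cursor this would be run as `cursor.execute(*build_search("headphones"))`. Passing the query as a parameter instead of formatting it into the SQL string also avoids SQL injection.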

On my laptop it takes around 3 seconds to process a single image file, so the initial run over a folder with hundreds of screenshots can take a while.

I’m quite happy with the results; it is good enough for my needs. In the future I may implement actual full-text search instead of a simple match using the ILIKE operator. That sounds like a good reason to try Meilisearch, which I have wanted to play with for a while.

The code, along with instructions on how to use it, can be found in this repository on GitHub. Feel free to use it or adapt it to your needs.