TIL: Resolving RAGatouille OOM Error and `faiss-gpu` Warning

information retrieval

deep learning

RAGatouille

A couple of fixes as I work on indexing large document collections (6M+) using RAGatouille.

Author

Vishal Bakshi

Published

May 8, 2025

I’m in the process of indexing the UKPLab/DAPR datasets, which range in size from ~70k to ~32M documents. Using an RTX3090, I ran into two problems: an OOM error during search, and a warning that faiss-cpu was being used instead of faiss-gpu, which made indexing take longer.

I found this RAGatouille GitHub issue which recommended lowering the batch_size in ColBERT’s IndexScorer.score_pids method. I made that change (from 2^20 to 2^16) and that resolved the OOM error, at least for the 2.68M document collection (NaturalQuestions).
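To make the idea behind that change concrete, here is a minimal sketch of batched scoring. It is not ColBERT's actual score_pids code: `iter_batches`, `score_in_batches`, and `score_fn` are hypothetical names, and the sketch assumes only that peak memory grows with the number of candidates scored at once.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(pids: Sequence[int], batch_size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `pids`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(pids), batch_size):
        yield pids[start:start + batch_size]


def score_in_batches(
    pids: Sequence[int],
    score_fn: Callable[[Sequence[int]], List[float]],  # hypothetical scorer
    batch_size: int = 2**16,
) -> List[float]:
    """Score candidates one batch at a time so only one batch is live at once."""
    scores: List[float] = []
    for batch in iter_batches(pids, batch_size):
        scores.extend(score_fn(batch))
    return scores
```

Going from 2^20 to 2^16 means each batch holds 16x fewer candidates, at the cost of more loop iterations.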

When I was using Google Colab GPUs, the following install commands correctly installed faiss-gpu after installing RAGatouille:

pip uninstall -y faiss-cpu
pip install faiss-gpu-cu12

On an RTX3090 (not on Colab), these commands did not correctly install faiss-gpu. RAGatouille showed the following warning during indexing and used the CPU instead, which eventually crashed the kernel:

________________________________________________________________________________
WARNING! You have a GPU available, but only `faiss-cpu` is currently installed.
This means that indexing will be slow. To make use of your GPU
Please install `faiss-gpu` by running:
pip uninstall --y faiss-cpu & pip install faiss-gpu
________________________________________________________________________________

This warning is thrown in RAGatouille’s PLAIDModelIndex.build if hasattr(faiss, "StandardGpuResources") is False.

Looking at the faiss repo, they recommend using conda for installation. I ran conda install pytorch::faiss-gpu, restarted the kernel, and confirmed that hasattr(faiss, "StandardGpuResources") returns True. The warning no longer appeared, RAGatouille was able to use faiss-gpu, and it was able to index 2M documents.
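The check can be run on its own after restarting the kernel. This is a small sketch built around the same hasattr test. The function name `faiss_gpu_status` is my own, and the sketch only reports which build is importable, not whether a GPU is actually usable.

```python
import importlib


def faiss_gpu_status() -> str:
    """Report which faiss build (if any) is importable in this environment."""
    try:
        faiss = importlib.import_module("faiss")
    except ImportError:
        return "faiss is not installed"
    # Same attribute check that triggers (or suppresses) the warning.
    if hasattr(faiss, "StandardGpuResources"):
        return "faiss-gpu build detected"
    return "faiss-cpu build detected"


print(faiss_gpu_status())
```

If this prints "faiss-cpu build detected" after installing faiss-gpu, the kernel may still be holding the old import, or both packages may be installed side by side.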

It’s still TBD if this allows me to finish indexing all of my datasets (especially the 13M and 32M ones).

In a conversation with Claude, I outlined a few different scenarios that I may have to (get to) pursue:

Since both repos are open sourced, I can fork them (which I have) and add print statements/modify code to debug as needed.

I am running into a couple issues that I’m trying to resolve. I don’t want you to suggest any code yet, let’s think this through.

When performing retrieval on a 2.6M document collection on an RTX3090, RAGatouille.search throws an OOM error.

So I chose to run retrieval on the RAGatouille index using vanilla ColBERT and it did not run out of memory.

However, the retrieval results are significantly different between ColBERT and RAGatouille.

Each of these gives me a uniquely interesting direction to pursue:

Why does RAGatouille throw the OOM error? 2.6M documents (index with 8.5GB disk space) is not small, but not terribly large. There’s an issue open in RAGatouille where they note that changing batch_size in score_pids in IndexScorer resolves an OOM error during search. I want to give this a try!

Why does ColBERT not run out of memory when RAGatouille does?

Why are the retrieval results between RAGatouille and ColBERT different? The RAGatouille documentation says the following, which leads me to believe they should yield the same results:

If you’d like to use more than RAGatouille, ColBERT has a growing number of integrations, and they all fully support models trained or fine-tuned with RAGatouille! The official ColBERT implementation has a built-in query server (using Flask), which you can easily query via API requests and does support indexes generated with RAGatouille! This should be enough for most small applications, so long as you can persist the index on disk.

Each of these explorations is fascinating, and I think I’m going to pursue each one.

Resolving the RAGatouille OOM error would solve my immediate problem. Ideally I tackle this first.

Understanding memory usage between RAGatouille and ColBERT has been an ongoing interest of mine. I have memory profiled both before during indexing, but not during search. This would be a very interesting research task.

Debugging the searching/scoring difference would probably be the hardest task. I would likely have to trace all the function calls, checking intermediate values and comparing them between the two frameworks. Absolutely fascinating, and I would learn a ton. It would also be a significant achievement to resolve the discrepancy (maybe something in the Config? Maybe something more fundamental?)

TBD on whether I pursue points 2 and 3.
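If I do get to point 3, a first step might be to put a number on "significantly different" before tracing any function calls. Here is a minimal sketch, assuming each framework's results have been exported as a dict mapping a query ID to a ranked list of document IDs. That format and the function names are my own choices, not something either library provides.

```python
from typing import Dict, List


def overlap_at_k(a: List[str], b: List[str], k: int = 10) -> float:
    """Size of the intersection of the two top-k ID sets, divided by k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(a[:k]) & set(b[:k])) / k


def mean_overlap(
    run_a: Dict[str, List[str]], run_b: Dict[str, List[str]], k: int = 10
) -> float:
    """Average overlap@k over the query IDs present in both runs."""
    shared = sorted(set(run_a) & set(run_b))
    if not shared:
        raise ValueError("the two runs share no query IDs")
    return sum(overlap_at_k(run_a[q], run_b[q], k) for q in shared) / len(shared)


# Toy data for illustration only, not real retrieval results.
ragatouille_run = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5", "d6"]}
colbert_run = {"q1": ["d3", "d1", "d9"], "q2": ["d5", "d4", "d7"]}
print(mean_overlap(ragatouille_run, colbert_run, k=2))
```

A mean overlap near 1.0 would suggest the same documents are being returned and only the ordering differs. A low value would suggest that candidate generation or scoring differs.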