Very long RDB load times with 190 GB RediSearch vector index

Simon Gibbs

Jan 15, 2023, 3:21:14 PM
to Redis DB
We are running a single Redis instance with a very large vector index. Saving the index to .rdb takes about 10 minutes, while loading takes about 20 hours. When loading starts, redis-cli info reports a loading_eta_seconds of about 40k, but this increases to over 70k seconds before it starts decreasing.

Just wanted to ask if a 20+ hour load is expected behavior for an index of this size.

Some details of the configuration:

Redis version=7.0.0
RediSearch version 2.4.5
Host OS: Ubuntu 20.04
Host memory: 256 GB
Index Size: about 190 GB (estimate from htop)
Index Vector Count: 43,741,918 (from ft.info 'num_docs')
Index Type: HNSW (metadata fields are not indexed)
Vector Dimension: 384
Vector Schema Type: hash
RDB File size: 70 GB (using 'rdbcompression no' in Redis config file)
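
For anyone who wants to track the loading figures mentioned above over time, here is a minimal sketch that parses the text output of redis-cli info and summarizes the loading fields. The function names are hypothetical and the sample values are illustrative only, not captured from the server in this thread. It assumes the standard loading, loading_loaded_perc and loading_eta_seconds fields of the persistence section.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse raw INFO output ("key:value" lines, "# Section" headers) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def loading_summary(fields: dict[str, str]) -> str:
    """Return a one-line progress summary, or 'not loading' if no load is running."""
    if fields.get("loading") != "1":
        return "not loading"
    eta_seconds = int(fields["loading_eta_seconds"])
    loaded_pct = float(fields["loading_loaded_perc"])
    return f"{loaded_pct:.2f}% loaded, ETA {eta_seconds / 3600:.1f} h"


# Illustrative values only (assumption), not real output from this server.
SAMPLE = """\
# Persistence
loading:1
loading_loaded_perc:12.50
loading_eta_seconds:40000
"""

print(loading_summary(parse_info(SAMPLE)))
```

Saving the output of redis-cli info persistence to a file at intervals and feeding each snapshot through parse_info would show how the ETA drifts during the load.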

Alon Reshef

Jan 17, 2023, 12:38:45 PM
to Redis DB
The short answer is yes: a 20+ hour load is expected behavior for an index of this size. The reason is that when we load the index from RDB, we actually re-index all the vectors. So over 20 hours of loading, we get an average of 43,741,918/(20*60*60) ≈ 607 vectors indexed per second.
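
The arithmetic above can be reproduced with a short sketch. The function name is hypothetical; the vector count and the 20-hour figure are the ones quoted in this thread, and the per-vector time is simply the inverse of the average rate, not a measured value.

```python
def indexing_rate(num_vectors: int, load_hours: float) -> float:
    """Average number of vectors indexed per second over the whole load."""
    return num_vectors / (load_hours * 60 * 60)


NUM_VECTORS = 43_741_918  # num_docs reported by ft.info in this thread
LOAD_HOURS = 20           # approximate load time reported in this thread

rate = indexing_rate(NUM_VECTORS, LOAD_HOURS)
ms_per_vector = 1000 / rate  # average time spent per vector, in milliseconds

print(f"{rate:.1f} vectors/s, about {ms_per_vector:.2f} ms per vector")
```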
Note that you can reduce the loading time if you use a cluster with multiple shards. In that case, the vectors are split among the shards and indexing is done in parallel. In addition, we are currently working on a background indexing mechanism for HNSW that will enable parallel indexing on a single Redis instance and reduce the loading time as a result.

Also, be aware that RediSearch version 2.6.3 is available and has some performance improvements compared to the 2.4.x versions, so you are encouraged to upgrade.

Regards,
Alon

Simon Gibbs

Jan 19, 2023, 10:48:21 AM
to Redis DB
Hello Alon,

Thanks very much for the reply and the suggestion about upgrading. We'll definitely do so.

Simon