Bookmark
Quantized Retrieval - a Hugging Face Space by sentence-transformers | Tom Aarsen | 10 comments
→ linkedin.com
"You can perform 200ms search over 40 million texts using just a CPU server, 8GB of RAM, and 40GB of disk space. The trick: Binary search with int8 rescoring."
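
The quoted trick, binary search followed by int8 rescoring, can be sketched roughly as below. This is a minimal NumPy illustration, not the Space's actual code. The function names, the sign-based binarization, the per-dimension min/max calibration for int8, and the `rescore_multiplier` parameter are all assumptions made for the example. The idea is to keep a 1-bit-per-dimension index for a cheap first pass by Hamming distance, then rerank only a small candidate set using the int8 embeddings and the full-precision query.

```python
import numpy as np


def quantize_binary(emb: np.ndarray) -> np.ndarray:
    """1 bit per dimension (assumed rule: positive -> 1), packed 8 dims per byte."""
    return np.packbits(emb > 0, axis=-1)


def quantize_int8(emb: np.ndarray, lo: np.ndarray, hi: np.ndarray) -> np.ndarray:
    """Map each dimension's calibrated [lo, hi] range linearly onto [-128, 127]."""
    scale = np.maximum((hi - lo) / 255.0, 1e-12)  # guard against zero-width ranges
    q = np.round((emb - lo) / scale) - 128
    return np.clip(q, -128, 127).astype(np.int8)


def search(query, bin_index, int8_index, top_k=10, rescore_multiplier=4):
    """Hypothetical two-stage search: Hamming shortlist, then int8 rescoring."""
    # Stage 1: Hamming distance between packed binary codes (XOR, then count bits).
    q_bin = quantize_binary(query[None, :])
    xor = np.bitwise_xor(bin_index, q_bin)
    hamming = np.unpackbits(xor, axis=-1).sum(axis=-1)

    # Keep more candidates than needed so rescoring can fix binary ranking errors.
    n_candidates = min(top_k * rescore_multiplier, len(bin_index))
    candidates = np.argpartition(hamming, n_candidates - 1)[:n_candidates]

    # Stage 2: rescore only the shortlist with int8 documents and a float query.
    scores = int8_index[candidates].astype(np.float32) @ query.astype(np.float32)
    order = np.argsort(-scores)[:top_k]
    return candidates[order], scores[order]
```

In a setup like the one the quote describes, the binary index would plausibly be the part held in RAM and the int8 embeddings the part read from disk for the shortlisted rows only. That split is an inference from the quoted numbers rather than something the snippet states. The bit counting via `np.unpackbits` is also only practical for small demos; a real index would use a dedicated Hamming-distance implementation.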