Ask HN: Feedback on "QSS", a quantized vector search engine written in C
3 points • by wmolino • 6 months ago
Hi HN,

I've been working on a vector search engine called QSS (Quantized Similarity Search). It's written in C and explores the idea of aggressively quantizing embedding vectors to 1 bit per dimension. It uses XOR + popcount for fast approximate search, followed by re-ranking with cosine similarity on the original vectors.

The main goal is to see how far you can push quantization without sacrificing too much search quality, while gaining significantly in memory usage and speed.

How it works
Embeddings are quantized to 1 bit per dimension (e.g. 300D → 300 bits → ~40 bytes).

Search is done using bitwise XOR and popcount (Hamming distance).

A shortlist is re-ranked using cosine similarity on the original (float) embeddings.

Supports GloVe, Word2Vec, and fastText formats.
Analyze the trade-offs between quantization and search accuracy.

Measure potential speed and memory gains.

Explore how this approach scales with larger datasets.

Preliminary tests
I've only run a few small-scale tests so far, but the early signs are encouraging:

For some queries (e.g. "hello", "italy"), the top 30 results matched the full-precision cosine search.

On Word2Vec embeddings, the quantized pipeline was up to 18× faster than the standard cosine similarity loop.

These results are anecdotal for now; I'm sharing the project early to get feedback before going deeper into benchmarks.

Other notes
Word lookup is linear and unoptimized for now; the focus is on the similarity search logic.

Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).

If you're interested in vector search, quantization, or just low-level performance tricks, I'd love your thoughts:

Do you think this kind of aggressive quantization could work at scale?

Are there other fast approximate search techniques you'd recommend exploring?

Repo is here: https://github.com/buddyspencer/QSS

Thanks for reading!