Ask HN: Feedback on "QSS", a quantized vector search engine written in C
3 points • by wmolino • 6 months ago
Hi HN,

I've been working on a vector search engine called QSS (Quantized Similarity Search). It's written in C and explores the idea of aggressively quantizing embedding vectors to 1 bit per dimension. It uses XOR + popcount for fast approximate search, followed by re-ranking with cosine similarity on the original vectors.

The main goal is to see how far you can push quantization without sacrificing too much search quality, while gaining significantly in memory usage and speed.

How it works
Embeddings are quantized to 1 bit per dimension (e.g. 300D → 300 bits → ~40 bytes).

Search is done using bitwise XOR and popcount (Hamming distance).

A shortlist is re-ranked using cosine similarity on the original (float) embeddings.

Supports GloVe, Word2Vec, and fastText formats.
Analyze the trade-offs between quantization and search accuracy.

Measure potential speed and memory gains.

Explore how this approach scales with larger datasets.

Preliminary tests
I've only run a few small-scale tests so far, but the early signs are encouraging:

For some queries (e.g. "hello", "italy"), the top 30 results matched the full-precision cosine search.

On Word2Vec embeddings, the quantized pipeline was up to 18× faster than the standard cosine similarity loop.

These results are anecdotal for now; I'm sharing the project early to get feedback before going deeper into benchmarks.

Other notes
Word lookup is linear and unoptimized for now; the focus is on the similarity search logic.

Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).

If you're interested in vector search, quantization, or just low-level performance tricks, I'd love your thoughts:

Do you think this kind of aggressive quantization could work at scale?

Are there other fast approximate search techniques you'd recommend exploring?

Repo is here: https://github.com/buddyspencer/QSS

Thanks for reading!