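Why do we still "flatten" embedding spaces?
5 points • by Intrinisical-AI • 5 months ago
Most dense retrieval systems rely on cosine similarity or dot product, which implicitly treats the embedding space as flat. But embedding spaces often live on curved manifolds with non-uniform structure: dense regions, semantic gaps, asymmetric paths.
I've been exploring the use of:
* Ricci curvature as a reranking signal
* Soft graphs to preserve local density
* Geodesic-aware losses during training
Curious whether others have tried anything similar, especially in information retrieval, QA, or explainability. Happy to share some experiments (FiQA/BEIR) if there's interest.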
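To make the "flat vs. curved" point concrete, here is a minimal toy sketch, not the experiments mentioned above. It assumes the geodesic is approximated Isomap-style by shortest paths over a k-nearest-neighbor graph. The data (points on a three-quarter circle) and all function names (`cosine_distance`, `knn_graph`, `geodesic_distances`) are made up for illustration, and the symmetric graph does not model asymmetric paths.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity: the usual 'flat' retrieval score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(points, k):
    """Symmetric kNN graph as {node: {neighbor: edge_length}}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted(
            (euclidean(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest paths: a graph-based stand-in for geodesics."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


# Toy "curved manifold": 31 points along three quarters of the unit circle.
# Index 0 is the query, 15 is halfway along the arc, 30 is the far end.
points = [
    (math.cos(1.5 * math.pi * t / 30), math.sin(1.5 * math.pi * t / 30))
    for t in range(31)
]
graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, source=0)

for idx in (15, 30):
    print(idx, "cosine:", cosine_distance(points[0], points[idx]),
          "geodesic:", geo[idx])
```

On this toy data cosine distance ranks the far end of the arc (index 30) closer to the query than the midpoint (index 15), while the graph distance ranks them the other way around. A real setup would need approximate nearest-neighbor search and a more careful choice of `k`.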