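Matthew Russo (MIT) on Semantic Query Engines
3 points • by CShorten • 7 months ago
AI is transforming database systems. Perhaps the biggest impact so far has been translating natural language into query languages, known as Text-to-SQL. However, another major innovation is brewing.
I am super excited to publish the 131st episode of the Weaviate Podcast with Matthew Russo, a Ph.D. student at MIT!
AI brings new semantic operators to our query languages. For example, we are all familiar with the WHERE filter. Now we have AI_WHERE, in which an LLM or another AI model computes the filter value at query time, so the value does not need to be stored in the database already!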
```sql
SELECT * FROM podcasts AI_WHERE 'Text-to-SQL' IN topics
```
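To make the idea concrete, here is a minimal Python sketch of a semantic filter. It is not how any particular engine implements AI_WHERE: the function names `ai_where` and `keyword_judge`, the `description` column, and the two toy rows are all hypothetical, and the "model" is a plain keyword check standing in for a real LLM call.

```python
from typing import Callable

Row = dict[str, str]


def keyword_judge(text: str, condition: str) -> bool:
    """Hypothetical stand-in for an LLM call.

    A real engine would prompt a model with the text and the condition;
    here a case-insensitive substring check keeps the sketch runnable offline.
    """
    return condition.lower() in text.lower()


def ai_where(
    rows: list[Row],
    column: str,
    condition: str,
    judge: Callable[[str, str], bool] = keyword_judge,
) -> list[Row]:
    """Keep rows whose `column` text satisfies `condition` according to `judge`.

    The filter value is computed at query time from unstructured text,
    so no precomputed `topics` column has to exist in the table.
    """
    return [row for row in rows if judge(row[column], condition)]


# Toy rows invented for illustration only.
podcasts = [
    {"title": "Episode A", "description": "A conversation about Text-to-SQL and query languages."},
    {"title": "Episode B", "description": "A conversation about vector index compression."},
]

matches = ai_where(podcasts, "description", "Text-to-SQL")
print([row["title"] for row in matches])
```

Passing the judge as a parameter is a deliberate choice in this sketch: the filter logic stays the same whether the judgment comes from a keyword check, a small classifier, or an LLM.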
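Semantic filters are just the tip of the iceberg. The roster of semantic operators also includes semantic joins, map, rank, classify, group-by, and aggregation!
And it doesn't stop there! One of the core ideas of relational algebra, and of how it has influenced database systems, is query planning: finding the optimal order in which to apply filters. For example, say you have two filters: the car is red, and the car is a BMW. Now say the dataset contains only 100 BMWs but 50,000 red cars! Applying the BMW filter first shrinks the set that the next filter has to scan.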
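The filter-ordering argument can be checked with a small Python sketch that counts predicate evaluations under both orders. The 100 BMWs and 50,000 red cars come from the example above; the total of 100,000 cars, the way colors are assigned, and the helper name `run_filters` are assumptions made only so the code runs.

```python
from typing import Callable

Car = dict[str, str]

# Assumed dataset: 100,000 cars, the first 100 are BMWs, every other car is red.
# That yields 100 BMWs and 50,000 red cars, matching the numbers in the text.
cars: list[Car] = [
    {
        "make": "BMW" if i < 100 else "Other",
        "color": "red" if i % 2 == 0 else "blue",
    }
    for i in range(100_000)
]


def is_bmw(car: Car) -> bool:
    return car["make"] == "BMW"


def is_red(car: Car) -> bool:
    return car["color"] == "red"


def run_filters(
    rows: list[Car], predicates: list[Callable[[Car], bool]]
) -> tuple[list[Car], int]:
    """Apply predicates in the given order and count how many rows each one examines."""
    evaluations = 0
    current = rows
    for predicate in predicates:
        evaluations += len(current)  # this predicate runs once per surviving row
        current = [row for row in current if predicate(row)]
    return current, evaluations


bmw_first, bmw_first_evals = run_filters(cars, [is_bmw, is_red])
red_first, red_first_evals = run_filters(cars, [is_red, is_bmw])

print(len(bmw_first), bmw_first_evals)
print(len(red_first), red_first_evals)
```

Both orders return the same rows, but the selective filter first does less work. If the second predicate were an LLM call rather than a string comparison, each evaluation would be a model invocation, so the gap between the two plans would matter far more.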
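This foundational idea has all sorts of extensions now that LLMs are involved! The opportunity is giving rise to new query engines and declarative optimizers such as Palimpzest, LOTUS, and others!
There are so many interesting nuggets in this podcast. I loved discussing these things with Matthew, and I hope you find it interesting too!
YouTube: https://youtu.be/koPBr9W4qU0
Spotify: https://spotifycreators-web.app.link/e/ddUhVMmLoYb
Medium: https://medium.com/@connorshorten300/semantic-query-engines-with-matthew-russo-weaviate-podcast-131-131a42bbc521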