Show HN: Streaming Kafka data into Ducklake

1 point by dm03514, 6 months ago
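Ducklake is a new lake storage format from the DuckDB team. It aims to solve some of the issues with Iceberg by centralizing metadata in a SQL database such as Postgres, instead of storing it directly on blob storage.

SQLFlow is a stream processing engine that ingests data from Kafka, runs SQL against that stream, and sinks the output.

SQLFlow has a DuckDB context available during stream processing. This makes it trivial to stream data from Kafka to Ducklake!

[https://sql-flow.com/docs/tutorials/ducklake-sink/](https://sql-flow.com/docs/tutorials/ducklake-sink/)

[https://github.com/turbolytics/sql-flow](https://github.com/turbolytics/sql-flow)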
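
For anyone who wants a feel for the ingest, query, sink loop described above, here is a rough sketch of the pattern. It is not SQLFlow's code or configuration. The names `batches`, `run_pipeline`, and `fake_topic` are made up for illustration, a plain Python list stands in for a Kafka topic, and the standard-library `sqlite3` module stands in for the DuckDB context, so nothing here touches Kafka or Ducklake.

```python
import json
import sqlite3
from itertools import islice


def batches(messages, size):
    """Yield lists of up to `size` messages from any iterable."""
    it = iter(messages)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_pipeline(messages, conn, batch_size=2):
    """Stage each batch, aggregate it with SQL, and append the result to the sink."""
    conn.execute("CREATE TABLE IF NOT EXISTS sink (city TEXT, events INTEGER)")
    for batch in batches(messages, batch_size):
        # Decode the JSON payloads and stage them in a temporary table.
        rows = [(json.loads(m)["city"],) for m in batch]
        conn.execute("CREATE TEMP TABLE batch (city TEXT)")
        conn.executemany("INSERT INTO batch VALUES (?)", rows)
        # The per-batch SQL: count events per city and write them to the sink.
        conn.execute(
            "INSERT INTO sink SELECT city, COUNT(*) FROM batch GROUP BY city"
        )
        conn.execute("DROP TABLE batch")
    conn.commit()


# Hypothetical stand-in for messages consumed from a Kafka topic.
fake_topic = [
    '{"city": "Oslo"}',
    '{"city": "Oslo"}',
    '{"city": "Lima"}',
]

conn = sqlite3.connect(":memory:")
run_pipeline(fake_topic, conn, batch_size=2)
result = conn.execute(
    "SELECT city, SUM(events) FROM sink GROUP BY city ORDER BY city"
).fetchall()
print(result)
```

In the real setup the sink would presumably be a table in a Ducklake catalog attached to the DuckDB context, rather than a local table; the linked tutorial has the actual configuration.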