Show HN: SJT - A lightweight structured JSON table format for APIs
1 point • by yukiakai • 9 months ago
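Hi HN,
I built a small experimental format called SJT (Structured JSON Table) to optimize data transport in APIs.
The idea is simple: instead of repeating object keys for every row, SJT separates the structure (headers) from the values. This makes it both more compact and easier to stream.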
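To make the header/value split concrete, here is a minimal sketch of the general idea. It is not the SJT wire format and does not use the sjt.js API: the `toTable` and `fromTable` helpers and the `{ header, rows }` layout are hypothetical, and the sketch assumes flat records that all share the same keys.

```javascript
// Hypothetical sketch: NOT the actual SJT format or the sjt.js API.
// Assumes flat records that all have the same keys in the same order.

// Store the keys once in a header, then one array of values per record.
function toTable(records) {
  if (records.length === 0) return { header: [], rows: [] };
  const header = Object.keys(records[0]);
  const rows = records.map((rec) => header.map((key) => rec[key]));
  return { header, rows };
}

// Rebuild the original objects by pairing each value with its header key.
function fromTable({ header, rows }) {
  return rows.map((row) =>
    Object.fromEntries(header.map((key, i) => [key, row[i]]))
  );
}

const messages = [
  { id: 1, author: "alice", content: "hi" },
  { id: 2, author: "bob", content: "hello" },
];

const table = toTable(messages);
console.log(JSON.stringify(table));
console.log(fromTable(table));
```

With only two records the saving is marginal; it grows with the number of rows, because each key is written once instead of once per record.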
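For example, with Discord's /messages endpoint:
Raw JSON payload: ~50,110 bytes
Same data encoded with SJT: ~26,494 bytes
That is a reduction of about 47% in size, and the data can still be decoded incrementally (record by record). Surprisingly, decoding can even be faster than plain JSON, because there's less string parsing overhead.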
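Record-by-record decoding could look roughly like the sketch below. The line-delimited layout (first line is a JSON array of keys, each later line is a JSON array of values) is an assumption made for illustration, not the layout defined by the SJT spec, and `decodeRecords` is a hypothetical helper.

```javascript
// Hypothetical line-delimited layout (an assumption, not the SJT spec):
// line 1 is a JSON array of keys, every later line is one JSON array of values.
function* decodeRecords(lines) {
  let header = null;
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    const values = JSON.parse(line);
    if (header === null) {
      header = values; // the first non-empty line carries the keys
      continue;
    }
    // Yield one record as soon as its line has been parsed.
    yield Object.fromEntries(header.map((key, i) => [key, values[i]]));
  }
}

const payload = [
  '["id","author","content"]',
  '[1,"alice","hi"]',
  '[2,"bob","hello"]',
];

for (const record of decodeRecords(payload)) {
  console.log(record);
}
```

Because the generator only needs the header plus the current line, a consumer can start processing records before the full payload has arrived.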
Quick benchmark:
| Format      | Size (KB) | Encode Time | Decode Time |
| ----------- | --------- | ----------- | ----------- |
| JSON        | 3849.34   | 41.81 ms    | 51.86 ms    |
| JSON + Gzip | 379.67    | 55.66 ms    | 39.61 ms    |
| MessagePack | 2858.83   | 51.66 ms    | 74.53 ms    |
| SJT (json)  | 2433.38   | 36.76 ms    | 42.13 ms    |
| SJT + Gzip  | 359.00    | 69.59 ms    | 46.82 ms    |
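Test conditions:
Dataset: Synthetic tabular dataset containing 50,000 records with mixed primitive fields, nested arrays, and nested objects (representative of large REST API payloads).
Runtime: Node.js 20 (V8 engine).
Implementation: JavaScript (via sjt.js).
Size (KB): Serialized size in kilobytes, uncompressed except for the Gzip rows (estimated for binary formats).
Encode / Decode (ms): Average time in milliseconds to serialize/deserialize the entire dataset.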
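For anyone who wants to reproduce this kind of measurement, a rough sketch of the timing loop for the plain JSON baseline follows. The dataset shape, the run count, the warm-up run, and the use of 1 KB = 1024 bytes are all assumptions; this is not the benchmark script behind the numbers above, and `makeDataset` and `averageMs` are hypothetical helpers.

```javascript
// Sketch of a measurement loop for the plain JSON baseline.
// Dataset shape, run count, and KB = 1024 bytes are assumptions.

// Build synthetic records with primitives, a nested array, and a nested object.
function makeDataset(n) {
  return Array.from({ length: n }, (_, i) => ({
    id: i,
    name: `user-${i}`,
    tags: ["a", "b"],
    profile: { active: i % 2 === 0, score: i * 1.5 },
  }));
}

// Average wall-clock time of fn over several runs, in milliseconds.
function averageMs(fn, runs = 5) {
  fn(); // warm-up run, not counted
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

const data = makeDataset(50_000);
const encoded = JSON.stringify(data);

const sizeKB = Buffer.byteLength(encoded, "utf8") / 1024;
const encodeMs = averageMs(() => JSON.stringify(data));
const decodeMs = averageMs(() => JSON.parse(encoded));

console.log({
  sizeKB: sizeKB.toFixed(2),
  encodeMs: encodeMs.toFixed(2),
  decodeMs: decodeMs.toFixed(2),
});
```

The same `averageMs` wrapper can be pointed at any other encoder/decoder pair so that all formats are timed the same way.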
Spec: [https://github.com/SJTF/SJT](https://github.com/SJTF/SJT)
JS implementation: [https://github.com/yukiakai212/SJT.js](https://github.com/yukiakai212/SJT.js)
Curious to hear feedback from people who have worked with JSON-heavy APIs, streaming, or compact data formats (CSV, Parquet, etc.).