Ask HN: Building a relatively SOTA LLM agent from scratch?
2 points • by solsane • 18 days ago
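As we know, OpenAI is not so open.
In 2023, I was playing with transformers and RNNs, and I understood how they worked from top to bottom (e.g. I made my own Keras and could whiteboard small nets). I can throw things together in Keras or TensorFlow pretty quickly.
Then I got a job and never touched any of it again.
Data and compute notwithstanding, how hard would it be to make a pet-project foundation model using the latest techniques? I've heard about MoE and things like that, and I figure we're not just throwing a bunch of layers and dropout together in Keras anymore.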