Experiences with research tooling

1 · by hodltothestars · 8 months ago
I've been using wandb quite a bit and have looked into neptune.ai and some open-source alternatives, but collaboration and version control (e.g. associating code snapshots with training runs) have always felt clunky. It would also be nice to have some kind of monitoring on my longer runs that alerts me when certain criteria are met, or even lets me stop or restart a run with modified hyperparameters remotely (potentially taking agentic actions), for example from my phone.

I'm curious what your experiences have been with these (and similar) AI developer platforms / observability layers, and what you've found lacking or what gripes you have with the existing solutions, if any. I've found the research process extremely painful and am wondering whether it's just me.
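
To make the alerting idea concrete, here is a rough sketch of the kind of rule check I have in mind. It is plain Python and not tied to wandb, neptune.ai, or any other tool; `AlertRule`, `diverged`, `plateau`, and `fired_rules` are hypothetical names made up for illustration, and it assumes a lower-is-better metric logged once per step. How an alert would actually reach a phone, or stop a run, is left out.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlertRule:
    """A named condition evaluated against one metric's history (hypothetical type)."""
    name: str
    metric: str
    condition: Callable[[List[float]], bool]


def diverged(values: List[float]) -> bool:
    """Fire when the most recent value is NaN or infinite."""
    return bool(values) and not math.isfinite(values[-1])


def plateau(patience: int, min_delta: float = 0.0) -> Callable[[List[float]], bool]:
    """Build a check that fires when the last `patience` steps did not beat
    the earlier best value by at least `min_delta` (lower is better)."""
    def check(values: List[float]) -> bool:
        if len(values) <= patience:
            return False  # not enough history to compare yet
        best_before = min(values[:-patience])
        best_recent = min(values[-patience:])
        return best_recent > best_before - min_delta
    return check


def fired_rules(history: Dict[str, List[float]], rules: List[AlertRule]) -> List[str]:
    """Return the names of all rules whose condition holds for the logged history."""
    return [r.name for r in rules if r.condition(history.get(r.metric, []))]


rules = [
    AlertRule("diverged", "val_loss", diverged),
    AlertRule("plateau", "val_loss", plateau(patience=3, min_delta=0.05)),
]
```

Something like `fired_rules({"val_loss": [...]}, rules)` could be called after each evaluation step, with whatever notification or stop/restart hook the platform offers attached to the returned names.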