Tag: reinforcement-learning
All the articles with the tag "reinforcement-learning".
Swirling Thoughts on AI, LLMs, Novel Hypothesis Generation and Crypto?
Published: at 09:45 PM in 14 min read
In this post, I swirl my own thoughts on the very topic of swirling thoughts in the context of AI, LLMs, and their potential for novel hypothesis generation. I ground the discussion in several parallel lines of thought and investigation: GRPO, reinforcement learning, the future of science and scientific peer review, and the notion of being able to "purchase" scientific innovation.
Multistep Reasoning Agents (with GRPO & RLEF) - Project Euler Edition
Published: at 02:38 AM in 21 min read
Converting chat LLMs into reasoning agents through GRPO and reinforcement learning from execution feedback (RLEF), achieving performance improvements on Project Euler's algorithmic challenges with multi-step reasoning, tool use, and code execution.
o3 and the Future of Science
Published: at 10:04 PM in 23 min read
o3 and Human-AI scientific collaboration