Tag: reinforcement-learning

All the articles with the tag "reinforcement-learning".

Swirling Thoughts on AI, LLMs, Novel Hypothesis Generation and Crypto?
Published: Apr 3, 2025 at 09:45 PM (14 min read)
In this post, I swirl my own thoughts on the very topic of swirling thoughts in the context of AI, LLMs, and their potential for novel hypothesis generation. I ground this discussion in several parallel lines of thought and investigation, including GRPO, reinforcement learning, the future of science and scientific peer review, and the notion of being able to "purchase" scientific innovation.
Multistep Reasoning Agents (with GRPO & RLEF) - Project Euler Edition
Published: Mar 23, 2025 at 02:38 AM (21 min read)
Converting chat LLMs into reasoning agents through GRPO and reinforcement learning from execution feedback (RLEF), achieving performance improvements on Project Euler's algorithmic challenges with multi-step reasoning, tool use, and code execution.
o3 and the Future of Science
Published: Dec 23, 2024 at 10:04 PM (23 min read)
o3 and Human-AI scientific collaboration
