Posts
All the articles I've posted.
o3 and the Future of Science
Published: at 10:04 PM in 22 min read
o3 and Human-AI scientific collaboration
GPU Puzzles
Published: at 05:32 AM in 10 min read
My solutions to srush's GPU Puzzles
Computing the Jacobian of a Matrix Product
Published: at 03:31 AM in 6 min read
A step-by-step illustration of how you can visualize and derive the Jacobian of a matrix product, using a concrete example.
AutoDiff Puzzles
Published: at 05:32 AM in 19 min read
My solutions to srush's AutoDiff Puzzles. This is useful as a quick refresher for computing gradients.
Back to Backprop
Published: at 06:42 AM in 29 min read
A review of backpropagation, the workhorse of deep learning.
Estimating Transformer Model Properties: A Deep Dive
Published: at 05:44 AM in 8 min read
In this post, we'll explore how to estimate the size of a Transformer model, including the number of parameters, FLOPs, peak memory footprint, and checkpoint size.