
Reinforcement Learning in LLMs - Why and How
From imitation to optimization: when LLMs need RL, how verifiable rewards unlock reasoning, and a minimal GRPO playbook.
Thoughts on machine learning, life, and everything else.

Concentration of measure pushes Gaussian samples onto a thin shell—here's the intuition, the math, and why typicality matters for generative models.

This note gives a high-level summary of progress in large language models (LLMs), covering major milestones from Transformers to ChatGPT. It serves as a fast-paced recap for readers who want to catch up on the field quickly.

A living collection of advice from mentors, friends, and books.

FIRE solves the money constraint, not the life question. What changes after work becomes optional: days, meaning, contribution, optional work, relationships, health, and how to experiment without over-optimizing retirement.

An overview of major philosophers from the Pre-Socratics to Existentialism and beyond, with curated books, videos, podcasts, and references.

How we built a custom markdown pipeline that handles LaTeX math, image galleries, and rich embeds while keeping content in plain .md files, with no MDX required.

Music notation gives us a tidy grid of notes, but physics delivers a messy spectrum of vibrations. Here's why tuning is always a compromise.

A short list of interview preparation resources for Data Scientists, Machine Learning Engineers, Machine Learning Scientists, Quant Developers and Quant Researchers.