Posts
Transformer Notes (V): Training Techniques
2025-12-20
Transformer Notes (II): Core Components
2025-12-20
Transformer Notes (I): Fundamentals
2025-12-20
RL Notes (6): LLM Alignment (Part 2)
2025-12-19
RL Notes (5): LLM Alignment (Part 1)
2025-12-19
RL Notes (4): Model-Based Methods & MARL
2025-12-19
RL Notes (3): Policy-Based RL
2025-12-19
RL Notes (2): Value-Based RL
2025-12-19
RL Notes (1): Fundamentals
2025-12-19
Tags
Transformer (8)
RLHF (4)
Inference (3)
Reasoning (2)
PPO (2)
LLM (2)
GRPO (2)
Efficiency (2)
Alignment (2)
Training (1)
Speculative Decoding (1)
Reproducibility (1)
RLVR (1)
Negative Samples (1)
Multimodal (1)
MoE (1)
MCTS (1)
Evaluation (1)
Entropy (1)
Determinism (1)
DQN (1)
CUDA (1)
Batch Invariance (1)
Attention (1)
AlphaZero (1)
RL (10)