<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>ML Notes — English</title>
<link>https://notes.iwase.dev/en/</link>
<atom:link href="https://notes.iwase.dev/en/index.xml" rel="self" type="application/rss+xml"/>
<description>Machine learning and deep learning research notes (English edition)</description>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Fri, 15 May 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>Diffusion Language Models</title>
  <dc:creator>Naoto Iwase</dc:creator>
  <link>https://notes.iwase.dev/en/dllm/</link>
  <description><![CDATA[ 
<p>Diffusion Language Models (DLLMs) bring the ideas behind the diffusion models that succeeded in image generation to language modeling, with recent large-scale implementations such as LLaDA and Dream. This book organizes the key references needed to understand modern DLLMs, covering their formulation, their sampling strategies, and their correspondence with continuous diffusion models.</p>
 ]]></description>
  <category>LLM</category>
  <category>Generative Models</category>
  <guid>https://notes.iwase.dev/en/dllm/</guid>
  <pubDate>Fri, 15 May 2026 00:00:00 GMT</pubDate>
  <media:content url="https://notes.iwase.dev/en/dllm/images/dllm.png" medium="image" type="image/png" height="76" width="144"/>
</item>
<item>
  <title>One-Step Generation</title>
  <dc:creator>Naoto Iwase</dc:creator>
  <link>https://notes.iwase.dev/en/one-step-generation/</link>
  <description><![CDATA[ 
<p>Between 2025 and 2026, methods that overcome the multi-step inference of diffusion models and Flow Matching to <strong>generate high-quality images with a single network evaluation (1-NFE)</strong> have advanced rapidly. This series curates four papers driving the field, tracing its technical evolution from extensions of Flow Matching to entirely new paradigms.</p>
 ]]></description>
  <category>Deep Learning</category>
  <category>Generative Models</category>
  <guid>https://notes.iwase.dev/en/one-step-generation/</guid>
  <pubDate>Wed, 11 Feb 2026 00:00:00 GMT</pubDate>
  <media:content url="https://notes.iwase.dev/en/one-step-generation/images/one-step-generation.png" medium="image" type="image/png" height="76" width="144"/>
</item>
<item>
  <title>Molmo2</title>
  <dc:creator>Naoto Iwase</dc:creator>
  <link>https://notes.iwase.dev/en/molmo2/</link>
  <description><![CDATA[ 
<p>Molmo2 (Multimodal Open Language Model 2) is a fully open Vision-Language Model (VLM) family developed by the Allen Institute for AI (AI2) and the University of Washington. Its distinguishing feature is <strong>video grounding</strong>: the ability to indicate precisely “when and where” specific events or objects occur within a video.</p>
<p>Using 9 new datasets (constructed entirely without relying on proprietary models), Molmo2 achieves state-of-the-art performance among open-source models. In particular, it surpasses proprietary models such as Gemini 3 Pro in video pointing and tracking.</p>
<p><strong>Paper</strong>: <a href="https://arxiv.org/abs/2601.10611">arXiv:2601.10611</a></p>
<p><strong>Code</strong>: <a href="https://github.com/allenai/molmo2">github.com/allenai/molmo2</a></p>
<p><strong>Demo</strong>: <a href="https://playground.allenai.org">playground.allenai.org</a></p>
 ]]></description>
  <category>VLM</category>
  <category>Multimodal</category>
  <guid>https://notes.iwase.dev/en/molmo2/</guid>
  <pubDate>Tue, 03 Feb 2026 00:00:00 GMT</pubDate>
  <media:content url="https://notes.iwase.dev/en/molmo2/images/molmo2.png" medium="image" type="image/png" height="76" width="144"/>
</item>
<item>
  <title>Olmo 3</title>
  <dc:creator>Naoto Iwase</dc:creator>
  <link>https://notes.iwase.dev/en/olmo-3/</link>
  <description><![CDATA[ 
<p>Olmo 3 is a family of state-of-the-art, fully open language models at the 7B and 32B parameter scales developed by the Allen Institute for AI (AI2). The release includes the entire Model Flow, that is, the full lifecycle of the model family: every stage, checkpoint, data point, and dependency used to build it.</p>
<p><strong>Paper</strong>: <a href="https://arxiv.org/abs/2512.13961">arXiv:2512.13961</a></p>
 ]]></description>
  <category>LLM</category>
  <category>Reasoning</category>
  <guid>https://notes.iwase.dev/en/olmo-3/</guid>
  <pubDate>Mon, 02 Feb 2026 00:00:00 GMT</pubDate>
  <media:content url="https://notes.iwase.dev/en/olmo-3/images/olmo-3.png" medium="image" type="image/png" height="76" width="144"/>
</item>
</channel>
</rss>
