Blog & Demos

Tutorials, case studies, benchmarks, and open-source demos — everything you need to build with small language models.

Making FunctionGemma Work: Multi-Turn Tool Calling at 270M Parameters
Demo: Tool Calling

Google's FunctionGemma scores just 10-39% on multi-turn tool calling out of the box, but after fine-tuning with distil labs it reaches 90-97% accuracy across three benchmarks, matching or exceeding a 120B teacher model at 270M parameters.
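
Multi-turn tool calling means the model emits a structured call, the runtime executes it, and the result is fed back so the next turn is grounded. A minimal sketch of that loop, with an illustrative tool and message format (not FunctionGemma's actual schema):

```python
import json

# Hypothetical tool the model can call; name and schema are illustrative.
def get_weather(city: str) -> str:
    return json.dumps({"city": city, "temp_c": 21})

TOOLS = {"get_weather": get_weather}

def run_tool_call(call: dict) -> dict:
    """Execute one model-emitted tool call and wrap the result as a
    'tool' role message so the model can use it on the next turn."""
    fn = TOOLS[call["name"]]
    result = fn(**json.loads(call["arguments"]))
    return {"role": "tool", "name": call["name"], "content": result}

# A two-turn exchange: the assistant emits a call, the runtime answers,
# and the conversation grows before the model's next reply.
messages = [
    {"role": "user", "content": "What's the weather in Lisbon?"},
    {"role": "assistant", "tool_calls": [
        {"name": "get_weather", "arguments": '{"city": "Lisbon"}'}
    ]},
]
for call in messages[-1]["tool_calls"]:
    messages.append(run_tool_call(call))
```

Benchmarks like the ones in the post score whether the model picks the right tool, fills the arguments correctly, and stays consistent across turns of this loop.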

When Does Reinforcement Learning Help Small Language Models?
Benchmark: Classification, Question Answering, Tool Calling

A controlled experiment across 12 datasets reveals that adding RLVR after SFT consistently improves text generation tasks (+2.0pp) but provides no reliable benefit for structured tasks like classification and function calling.

The LLM in Your Voice Assistant Is the Latency Bottleneck. Replace It with an SLM.
Guide: Tool Calling, On-Prem / Edge

Voice assistants that rely on cloud LLMs add 700+ ms of latency per turn. A fine-tuned small language model cuts the language-understanding ("brain") stage to ~40 ms while matching or exceeding LLM accuracy on bounded tasks, with full data privacy.

pytest-generator: AI-Powered Unit Test Generation
Demo: On-Prem / Edge

Generate high-quality pytest test cases from Python function signatures and docstrings. Runs entirely on your local machine with zero API costs and complete privacy.
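
The inputs such a generator works from are just the signature and docstring; the output is ordinary pytest code. A sketch of what that looks like (the source function and the generated tests are illustrative, not actual tool output):

```python
# Source function: the signature and docstring are the only inputs
# the generator needs.
def slugify(title: str) -> str:
    """Lowercase a title and replace spaces with hyphens."""
    return title.lower().replace(" ", "-")

# The kind of pytest cases a generator might emit from that
# docstring alone: one per documented behavior.
def test_slugify_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_slugify_idempotent_on_slugs():
    assert slugify("already-a-slug") == "already-a-slug"
```

Because the output is plain pytest, it drops straight into an existing test suite and runs under `pytest` discovery with no extra tooling.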

Helping Rocketgraph's customers with an OpenCypher-specialized small language model
Case Study: On-Prem / Edge, Tool Calling

How distil labs partnered with Rocketgraph to fine-tune a small language model that translates user questions into Rocketgraph-compliant OpenCypher queries, running on IBM Power hardware.

Vibe-Tuning: The Art of Fine-Tuning Small Language Models with a Prompt
Guide: Classification

Fine-tuning is a pain – you need datasets, ML expertise, and a stack of GPUs just to get started. Not anymore. With model vibe-tuning, you go from prompt to production-ready model without these headaches. This blog post shows you exactly how to build one, starting with just a prompt.

Teaching Small Language Models New Skills - Training a Local Cybersecurity Agent
Case Study: Agentic AI, On-Prem / Edge

How distil labs partnered with Octodet to train a small language model that outperforms LLMs 30x its size at analyzing cybersecurity logs, while running entirely on-premises to meet strict privacy requirements.

AI Slop Detector: Catch AI-generated text with a 270M model that runs in your browser
Demo: Classification, On-Prem / Edge

A fine-tuned 270M parameter model that detects AI-generated text entirely in your browser — no API keys, no cloud, no data leakage. Matches 120B teacher accuracy at 400x smaller size.

Train Your SLM with the distil labs Claude Skill
Guide: Question Answering

A step-by-step walkthrough of training a Text2SQL small language model using the distil labs Claude Code skill, going from raw conversation data to a working local model in a single conversation.
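
The core contract of a Text2SQL model is simple: natural-language question in, executable SQL out against a known schema. A sketch of one such input/output pair, verified by actually running the SQL (the schema, question, and query are illustrative, not from the walkthrough):

```python
import sqlite3

# One example in the shape a Text2SQL fine-tune consumes and produces.
example = {
    "schema": "CREATE TABLE orders (id INT, customer TEXT, total REAL);",
    "question": "What is the total revenue per customer?",
    "sql": "SELECT customer, SUM(total) FROM orders GROUP BY customer;",
}

# Execute the generated SQL against an in-memory database to confirm
# it is well-formed and answers the question.
conn = sqlite3.connect(":memory:")
conn.execute(example["schema"])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ada", 10.0), (2, "ada", 5.0), (3, "bob", 7.5)],
)
rows = dict(conn.execute(example["sql"]).fetchall())
# rows == {"ada": 15.0, "bob": 7.5}
```

Executing the model's output like this is also a cheap automatic check during training data curation: SQL that fails to run or returns the wrong rows can be filtered out.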