Replace LLMs with custom SLMs
Faster, cheaper, just as accurate
curl -fsSL https://distillabs.ai/install.sh | sh
Talk to us →
Trusted by
What distil labs Does
Every Agent Type Supported
All supported out of the box.
- Routing & classification
- Function calling — single and multi-turn
- Structured data extraction
- Question answering
Problem In, Model Out
Start training in 30 minutes from a prompt, 5–50 examples, or production traces. Use the CLI or connect your observability platform directly — whatever fits your workflow.
- Start training with minimal data
- Automated data generation & training
- LLM-level accuracy, up to 280x smaller
Easy Integration
Side-by-side evaluation and hosted API endpoint out of the box. No infrastructure to provision, no GPU clusters to manage.
- Managed inference endpoint included
- Automated teacher evaluation & metrics
- Integrate directly into your stack
From Problem to Model
$ distil model create my-classifier
ID: $MODEL_ID
Name: my-classifier
Created At: 2026-03-23 15:11:08
$ distil model upload-data $MODEL_ID --data ./data-dir
Upload successful. Upload ID: $UPLOAD_ID
01
Upload Data
Create a model and upload your dataset in one go — 10 to 50 diverse examples is usually enough. Supports classification, QA, tool calling, multi-turn tool calling, and more.
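The exact on-disk dataset format depends on the task, so treat the following as an illustrative sketch only — the `input`/`label` field names, the `train.jsonl` file name, and the directory layout are assumptions for this example, not the documented distil labs format. For a classification task, a small dataset directory could be prepared like this:

```python
import json
import pathlib

# Hypothetical example records -- field names here are illustrative,
# not the documented distil labs schema.
examples = [
    {"input": "I want to return my order", "label": "return_request"},
    {"input": "Where is my package?", "label": "shipping_status"},
    {"input": "Cancel my subscription", "label": "cancellation"},
]

data_dir = pathlib.Path("data-dir")
data_dir.mkdir(exist_ok=True)

# One JSON object per line (JSONL); the file name is an assumption.
with (data_dir / "train.jsonl").open("w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

print(f"Wrote {len(examples)} examples to {data_dir / 'train.jsonl'}")
```

A handful of diverse examples like these is the starting point; the platform's data generation fills in the rest.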
$ distil model run-training $MODEL_ID
Kicked off SLM training ID $TRAINING_ID
$ distil model training $MODEL_ID
Training ID: $TRAINING_ID
Status: ◐ Distilling
02
Train Model
Start training with a single command. Get feedback on task performance in minutes and a trained model in a few hours. Trained SLMs consistently match frontier models 100x larger.
$ distil model deploy remote $MODEL_ID
Training ID: $TRAINING_ID
Deployment ID: efd60b29-...-4f56dbc0b13f
Status: ✓ Active
URL: https://your.deployment.url/$TRAINING_ID/v1
Secrets api_key: QtwT1Ah7Jaf6DPIHBPDMYdNTKT6ujrnn1hZkGtsb21U
$ distil model invoke $MODEL_ID
╭────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ℹ Run uv run .../$TRAINING_ID/remote_client.py --question "Your question here" to invoke the model │
╰────────────────────────────────────────────────────────────────────────────────────────────────────╯
03
Deploy & Invoke
Deploy your trained model to a hosted endpoint with one command, then invoke it immediately. No infrastructure to set up — just deploy and call.
from openai import OpenAI
# Just change the base URL — everything else stays the same
client = OpenAI(
base_url="https://your.deployment.url/$TRAINING_ID/v1",
api_key="your-distil-api-key",
)
response = client.chat.completions.create(
model="model",
messages=[{"role": "user", "content": "Classify: I want to return my order"}],
)
print(response.choices[0].message.content)
# → {"label": "return_request", "confidence": 0.97}
04
Integrate
Swap one URL in your existing code — that’s it. The distil labs endpoint is OpenAI-compatible, so any SDK or client that talks to OpenAI works out of the box.
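Because the endpoint speaks the OpenAI wire format, you can also call it without any SDK at all. A minimal sketch of the raw request shape — the base URL and API key below are placeholders from the deploy step, and no network call is made here:

```python
import json

def build_chat_request(base_url: str, api_key: str, user_message: str) -> dict:
    """Assemble an OpenAI-compatible chat completion request (no network call)."""
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": "model",
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

req = build_chat_request(
    "https://your.deployment.url/$TRAINING_ID/v1",
    "your-distil-api-key",
    "Classify: I want to return my order",
)
print(req["url"])
# → https://your.deployment.url/$TRAINING_ID/v1/chat/completions
```

This is the same request the OpenAI SDK sends under the hood, which is why any HTTP client or OpenAI-compatible tool can talk to the endpoint.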
What Our Customers Say
We needed a small model that could power our product on an IBM P11, entirely on-premises. distil labs’ fine-tuned models allowed us to ship a self-contained solution where the SLM and our graph platform coexist on the same hardware. For customers in regulated industries, this means AI-powered query generation with complete data privacy – nothing ever leaves their environment.
David J. Haglin
Co-Founder and CTO at Rocketgraph
Using distil labs, we were able to spin up highly accurate custom small models tailored to our workflows in no time. Those models cut our inference costs by roughly 50% without sacrificing quality. The distil labs team was incredibly supportive as we got started and helped us get to production smoothly.
Lucas Hild
Co-Founder & CTO at Knowunity
The distil labs platform accelerated the release of our cybersecurity-specialized language model, KINDI, enabling faster iterations with greater confidence. As a result, we ship InovaGuard improvements sooner and continuously boost investigation accuracy with every release.
Samir Bennacer
Co-Founder and CTO at Octodet
The Team
Backed by
Demos & Blog
The 10x Inference Tax You Don't Have to Pay
Benchmarking fine-tuned small language models (0.6B-8B) against 10 frontier LLMs across 8 datasets shows that task-specific SLMs match or beat frontier models at 10-100x lower inference cost.
Read more →
What Small Language Model Is Best for Fine-Tuning
We benchmarked 15 small language models across 9 tasks to find the best base model for fine-tuning. Qwen3-8B ranks #1 overall. Liquid AI's LFM2 family is the most tunable. Fine-tuned Qwen3-4B matches a 120B+ teacher on 8 of 9 benchmarks.
Read more →
Train Your SLM with the distil labs Claude Skill
A step-by-step walkthrough of training a Text2SQL small language model using the distil labs Claude Code skill, going from raw conversation data to a working local model in a single conversation.
Read more →