
The Subnet an AI Agent Built: Inside Distil (SN97)

An AI agent built a Bittensor subnet in 48 hours using decentralized inference from Chutes. It pays 100% of emissions to whoever builds the best compressed version of a 35B AI model. The winner's model is free at chat.arbos.life. Agents are already running businesses on Bittensor.


No human team wrote the code for Bittensor subnet 97. The founder set a goal in a text file, pointed an AI agent called Arbos at it, and let the loop run. Two days later, Distil existed on Finney mainnet, registered and live, with active miners submitting models and validators scoring them.

Arbos runs on Chutes (SN64), Bittensor's inference subnet. It uses agcli, a Rust command line interface (CLI) built for non-interactive chain operations, with structured JSON output designed to be parsed and acted on by an LLM. The agent reads the chain state, decides what to do next, calls agcli, and repeats until the goal is complete. The OpenTensor Foundation highlighted this in a March 2026 community call: an autonomous agent, running on decentralized AI, using a purpose-built chain CLI to build and operate a decentralized model-distillation subnet. Arbos continues to write code, answer Discord questions, and onboard miners today.

What the agent built is called Distil, currently #3 in emissions payout on taostats. What Distil produces is more interesting than the origin story.

What the AI Industry Pays Millions to Solve

Large AI models are expensive to run. Qwen3.5-35B-A3B, the model Distil uses as its reference point, has 35 billion parameters, the numerical values baked into a model that determine how it reasons. Running it requires roughly 67GB of GPU memory, which means expensive hardware, high electricity costs, and slow inference for anyone serving it at scale. No startup puts a 35B model in a consumer app.
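As a rough sanity check on that figure, weight storage alone accounts for most of it. The back-of-envelope arithmetic below assumes 16-bit (bfloat16) weights at 2 bytes per parameter; real deployments add KV-cache and activation memory on top.

```python
# Weights-only memory at 2 bytes per parameter (bfloat16).
# KV-cache and activations push the real serving footprint higher.
teacher_params = 35e9
student_params = 4e9

print(f"35B teacher: {teacher_params * 2 / 1e9:.0f} GB")  # 70 GB of weights
print(f"4B student:  {student_params * 2 / 1e9:.0f} GB")  # 8 GB -- laptop territory
```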

The solution is called knowledge distillation. Take the large model, called the teacher, and train a much smaller model, called the student, to behave like it. The goal is to copy its reasoning, not its weights. A well-distilled student model at 3-5 billion parameters runs on a laptop-grade GPU and responds in milliseconds. Done well, it’s nearly indistinguishable from the teacher for most tasks.

A master chef has 30 years of instinct built into every decision in the kitchen. Training a culinary school graduate to cook at the same level doesn’t mean handing them the chef’s brain. It means having them work alongside the chef, observe every choice, and train until their instincts match. Knowledge distillation is that process applied to AI. The student watches the teacher generate text, token by token, and trains until its probability distribution over the next word matches the teacher’s as closely as possible.
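In code, that objective is compact. The sketch below is a minimal PyTorch version of one training step, assuming Hugging Face-style causal language models that share the teacher's tokenizer; miners' actual training scripts are more elaborate, but the core loss looks like this.

```python
import torch
import torch.nn.functional as F

def distill_step(teacher, student, input_ids, optimizer):
    """One distillation step: nudge the student's next-token
    distribution toward the teacher's at every position."""
    with torch.no_grad():              # the teacher is frozen
        t_logits = teacher(input_ids).logits

    s_logits = student(input_ids).logits
    vocab = s_logits.size(-1)

    # KL(teacher || student), averaged over every token position.
    loss = F.kl_div(
        F.log_softmax(s_logits.view(-1, vocab), dim=-1),
        F.log_softmax(t_logits.view(-1, vocab), dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```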

That matching is measured by a metric called KL divergence, a number that quantifies how different two probability distributions are. When both models see the same text, they each assign a probability to every possible next word across a vocabulary of 248,320 tokens. A KL score of 0 means the distributions are identical. A score above 2.0 means the student is reasoning so differently from the teacher that it has no practical value. The best student models on SN97 currently score around 0.24, which puts them in the same quality range as Qwen’s own pre-trained 4.66B-parameter model, built by a major AI lab with significant compute.
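A toy example makes the metric concrete. With a three-token vocabulary standing in for the real 248,320, the sketch below shows how KL behaves: zero for identical distributions, small when the student tracks the teacher closely, large when it diverges.

```python
import math

def kl_divergence(p, q):
    """KL(p || q): how poorly q approximates p, in nats. Zero means identical."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher      = [0.70, 0.20, 0.10]   # toy 3-token vocabulary
good_student = [0.65, 0.24, 0.11]
bad_student  = [0.10, 0.10, 0.80]

print(kl_divergence(teacher, teacher))       # 0.0    -> identical reasoning
print(kl_divergence(teacher, good_student))  # ~0.006 -> close behavioral match
print(kl_divergence(teacher, bad_student))   # ~1.29  -> drifting toward the 2.0 cutoff
```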

What Distil’s Product Is

Distil produces a publicly available, continuously improving compressed version of Qwen3.5-35B-A3B, shrunk by open, incentivized competition while preserving as much of the teacher's behavior as possible.

The winning model at any given time is called the King. You interact with it directly at chat.arbos.life, where it accepts queries like any chat interface. Every time a miner beats the current King’s KL score, the new model becomes the live public endpoint. The product improves in real time, driven entirely by competitive pressure, producing a small model with verified behavioral proximity to a 35B frontier model that anyone can query, fine-tune, or build on.

How the Competition Runs

Each epoch, 120 prompts are drawn from ClimbMix-400B, a 400-billion-token pretraining corpus, with the selection seeded by the current on-chain block number. Because the block changes every 12 seconds, miners cannot predict or pre-cache answers.
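The subnet's exact sampling code isn't shown here, but the principle is simple to sketch: seed a random draw with the block number, and every validator independently derives the same unpredictable prompt set. A minimal illustration, with corpus_size and block_number as placeholder values:

```python
import random

def epoch_prompts(corpus_size: int, block_number: int, n: int = 120) -> list[int]:
    """Deterministically draw n prompt indices, seeded by the block number.
    Every validator derives the same set; miners can't know it in advance
    because the seed changes with every ~12-second block."""
    rng = random.Random(block_number)
    return rng.sample(range(corpus_size), n)

prompts = epoch_prompts(corpus_size=50_000_000, block_number=4_512_000)
```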

The teacher generates 512-token continuations for each prompt. Both teacher and student forward-pass those sequences, and KL divergence is computed at every continuation position across the full 248,320-token vocabulary. Each score feeds into an exponential moving average (EMA) with a smoothing factor of 0.3, updating gradually so a single strong epoch doesn’t mask consistently weak performance. Only the miner with the lowest EMA KL score receives any weight, collecting 100% of SN97’s alpha token emissions for that epoch. Every other miner earns zero. Models scoring above KL 2.0 are disqualified entirely. The target for competitive mining is below 0.3.
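A sketch of that scoring loop, assuming the conventional EMA form in which the smoothing factor weights the newest observation (the subnet's actual implementation may differ in details):

```python
def update_ema(prev: float | None, epoch_kl: float, alpha: float = 0.3) -> float:
    """Blend the newest epoch's KL into the running average. With alpha=0.3,
    a single strong epoch moves the score only 30% of the way."""
    return epoch_kl if prev is None else alpha * epoch_kl + (1 - alpha) * prev

def assign_weights(ema: dict[str, float], cutoff: float = 2.0) -> dict[str, float]:
    """Winner-take-all: the lowest EMA KL under the cutoff gets weight 1.0,
    every other miner gets 0.0. Scores above the cutoff are disqualified."""
    eligible = {hotkey: s for hotkey, s in ema.items() if s <= cutoff}
    if not eligible:
        return {hotkey: 0.0 for hotkey in ema}
    winner = min(eligible, key=eligible.get)
    return {hotkey: 1.0 if hotkey == winner else 0.0 for hotkey in ema}
```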

What Miners Actually Submit

The requirement is a language model with at most 5.25 billion total parameters, using the teacher's exact tokenizer, stored in safetensors format, and publicly accessible on HuggingFace. Failing any of those four requirements disqualifies a submission immediately.

Each hotkey gets one commitment, permanently. A miner who submits a weak model and wants to try again must register a new hotkey. This filters for preparation over speed.
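Given those stakes, a miner would want to verify everything locally before committing. The sketch below is a hypothetical pre-flight check using the Hugging Face transformers library; the teacher repo id and helper name are illustrative, and the validator's own checks are the only ones that count.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MAX_PARAMS = 5_250_000_000
TEACHER_REPO = "Qwen/Qwen3.5-35B-A3B"  # assumed repo id for the teacher

def preflight(student_repo: str) -> list[str]:
    """Checks a miner might run before burning their one commitment.
    Illustrative only; the subnet's validator code is authoritative."""
    problems = []

    student_tok = AutoTokenizer.from_pretrained(student_repo)
    teacher_tok = AutoTokenizer.from_pretrained(TEACHER_REPO)
    if student_tok.get_vocab() != teacher_tok.get_vocab():
        problems.append("tokenizer vocabulary differs from the teacher's")

    # use_safetensors=True refuses to load a repo without safetensors weights,
    # and a private repo fails at download time -- two requirements in one call.
    model = AutoModelForCausalLM.from_pretrained(student_repo, use_safetensors=True)
    n = sum(p.numel() for p in model.parameters())
    if n > MAX_PARAMS:
        problems.append(f"{n:,} parameters exceeds the 5.25B cap")

    return problems
```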

How Distil Makes Money

Miners earn SN97’s alpha token through winner-take-all emissions. Under Bittensor’s Taoflow upgrade (November 2025), subnet emissions are determined by net TAO staking inflows rather than token price, meaning the emission rate reflects genuine staker demand. Hold the top KL score and collect 100% of emissions. Fall behind and earn nothing.

Validators earn their standard Bittensor consensus share for holding stake and setting weights accurately. No API billing layer has been published. The value argument for SN97’s alpha token rests on one observable fact: competitive distillation of frontier models is a problem the AI industry has paid large sums to solve, and Distil is solving it in public, with open submissions, a live leaderboard, and a continuously improving output anyone can query for free.

Notable community contributions have strengthened the evaluation pipeline. caseus (@winglian) submitted the KL distillation training script in Pull Request #1 and proposed the top-128 sparse KL approach, in which the teacher returns only its 128 most probable tokens per position and the student re-normalizes over that shared support. This cut validator compute significantly and made evaluation tractable at scale. For reference, Qwen3.5-4B scores approximately 0.24 KL, Qwen3.5-2B approximately 0.35, and Qwen3.5-0.8B approximately 0.58.
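A sketch of that top-k idea in PyTorch. Whether the teacher's retained mass is also re-normalized is an implementation detail assumed here:

```python
import torch

def sparse_kl(teacher_logits, student_logits, k=128):
    """Top-k sparse KL: keep the teacher's k most probable tokens per
    position, re-normalize both distributions over that shared support,
    then compute KL(teacher || student). Avoids shipping a full
    248,320-way distribution for every position."""
    t_probs = teacher_logits.softmax(dim=-1)
    top_t, top_idx = t_probs.topk(k, dim=-1)          # teacher's top-k mass
    top_t = top_t / top_t.sum(dim=-1, keepdim=True)   # re-normalize teacher

    top_s = student_logits.softmax(dim=-1).gather(-1, top_idx)
    top_s = top_s / top_s.sum(dim=-1, keepdim=True)   # shared support

    kl = (top_t * (top_t.log() - top_s.log())).sum(dim=-1)
    return kl.mean()
```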

The Threshold That Has Already Been Crossed

Most discussions about AI agents running businesses treat it as a future event. Arbos crossed that threshold in March 2026. It registered a Bittensor subnet, attracted miners, built a live evaluation pipeline, and began operating a competitive model distillation market with no human writing the code, managing the infrastructure, or approving the decisions. The business is running. The agent is running it.

What makes this significant beyond the novelty is the infrastructure stack underneath it. Arbos draws inference from Chutes (SN64), the decentralized compute layer we covered earlier this year. The storage, GPU compute, and cybersecurity subnets documented in previous issues, Hippius (SN75), ComputeHorde (SN12), and RedTeam (SN61), represent the other layers an agent needs to operate without touching a centralized service. Distil is where those threads converge into a working example: an AI agent, running on Bittensor-native inference, that built and now operates a live subnet for open model distillation, whose output improves every epoch and is free for anyone to use.


The next generation of AI agents will need the same stack. Inference from Chutes. Storage from Hippius. Compute from ComputeHorde. An evaluation market from whatever subnet produces the capability they need. Each piece was built by humans. Each piece is now available to agents. Distil is the proof that the assembly works.

Go to chat.arbos.life and query the King model. That single interaction is the end product of a decentralized AI agent building a Bittensor subnet, running a competitive distillation market, and publishing the winner’s output for free. If that doesn’t clarify what Bittensor is building toward, read the leaderboard and watch it change.


Disclaimer: This article is for informational purposes only and does not constitute financial, investment, or trading advice. The information provided should not be interpreted as an endorsement of any digital asset, security, or investment strategy. Readers should conduct their own research and consult with a licensed financial professional before making any investment decisions. The publisher and its contributors are not responsible for any losses that may arise from reliance on the information presented.
