What Are Manako's Vision Agents And How Do They Work?

Every warehouse, retail store, fuel station, and stadium already has cameras recording continuously (and running up footage storage costs), yet most enterprises act on less than 2% of what those cameras capture.

That is despite total employee theft losses reaching ~$50 billion in 2025, while shoplifting cost U.S. businesses $40-$50 billion.

Those losses could be significantly reduced if cameras acted on what they see. Manako is building to capture that opportunity, and more, with its Vision Agents.

What Are Vision Agents?

Vision Agents are AI-powered systems that turn enterprise cameras into real-time operational tools. Rather than simply recording footage for someone to review later, a Vision Agent is built to watch for a specific event, understand when that event occurs, and trigger an action the moment it happens.

Businesses describe what they want monitored in plain English, and Manako builds a Vision Agent designed for that task. One agent might watch for fuel spills at a gas station, another might detect unsafe activity in a warehouse, while another monitors self-checkout behavior in a retail store. When an event occurs, the Vision Agent can send alerts, generate timestamped evidence, and integrate directly into the workflows businesses already use.

Imagine hiring a security guard whose only job is to watch the aisle where merchandise goes missing most often. But that guard never gets tired, never checks their phone, never misses a shift, and the moment they spot someone pocketing an item, they send your store manager a Slack message with a timestamped clip attached. The manager can then review it immediately and take action.

That guard is a Vision Agent.

And the best part? Businesses can deploy as many of these guards as they need across as many locations as they run.
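The alert the guard sends in the example above can be pictured as a small structured message. The sketch below is illustrative only and is not Manako's integration code: the function name, the event and camera labels, and the clip path are all hypothetical. The only external convention it relies on is that Slack incoming webhooks accept a JSON body with a `text` field. It builds that body and does not send anything.

```python
import json
from datetime import datetime, timezone


def build_alert(event: str, camera: str, detected_at: datetime, clip_path: str) -> str:
    """Return a JSON body with a single `text` field (hypothetical helper)."""
    # Normalise the detection time to UTC so every site reports on the same clock.
    stamp = detected_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    text = f"{event} on {camera} at {stamp}. Clip: {clip_path}"
    return json.dumps({"text": text})


# Example with made-up values: an event on a hypothetical "aisle-7" camera.
body = build_alert(
    event="Item concealed",
    camera="aisle-7",
    detected_at=datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    clip_path="clips/aisle-7.mp4",
)
print(body)
```

Keeping the timestamp and the clip reference in the same message is what lets a manager review the evidence immediately rather than searching through footage.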

Setting Up A Vision Agent

Setting up a Vision Agent involves four no-code steps.

First, you describe what you want watched in plain English. For example: "Alert me when a vehicle parks in the loading bay for more than five minutes" or "Flag any customer who spends more than ninety seconds near self-checkout without scanning an item."
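To make the loading-bay example concrete, here is a rough sketch of the kind of dwell-time rule such a sentence implies. It is not how Manako compiles a description into an agent, which the article does not detail. The `Detection` record, the zone names, and the assumption that an upstream tracker supplies a stable `track_id` per object are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One hypothetical tracker output: which object, in which zone, and when."""
    track_id: str     # stable ID per tracked object (assumed to come from a tracker)
    zone: str         # named area of the frame, e.g. "loading_bay"
    timestamp: float  # seconds since the start of the clip


def dwell_alerts(detections, zone, max_seconds):
    """Return (track_id, time) once for each object that stays in `zone` longer than `max_seconds`."""
    first_seen = {}  # track_id -> time the object entered the zone
    alerted = set()  # objects already reported for their current stay
    alerts = []
    for det in sorted(detections, key=lambda d: d.timestamp):
        if det.zone != zone:
            # Leaving the zone resets the timer for that object.
            first_seen.pop(det.track_id, None)
            alerted.discard(det.track_id)
            continue
        start = first_seen.setdefault(det.track_id, det.timestamp)
        if det.timestamp - start > max_seconds and det.track_id not in alerted:
            alerted.add(det.track_id)
            alerts.append((det.track_id, det.timestamp))
    return alerts


# Made-up observations: one van overstays, another leaves in time.
observations = [
    Detection("van-1", "loading_bay", 0.0),
    Detection("van-2", "loading_bay", 0.0),
    Detection("van-2", "loading_bay", 100.0),
    Detection("van-1", "loading_bay", 120.0),
    Detection("van-2", "street", 200.0),
    Detection("van-1", "loading_bay", 301.0),
]
print(dwell_alerts(observations, zone="loading_bay", max_seconds=300))
```

With the five-minute limit expressed as 300 seconds, only the van that is still in the bay at 301 seconds is flagged, and it is flagged once rather than on every later frame.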

Manako then automatically selects the vision skills the job requires and composes them into a Vision Agent optimized for that task.

Before going live, businesses can upload their own footage and watch the Vision Agent run on real-world clips from their actual location, without connecting a camera. Based on how the agent performs, they can make adjustments to refine it further.

Once the agent is ready to run, Manako finds cameras on the network and connects to them automatically. Everything runs locally, so Manako itself has zero access to company cameras or footage.

Realistic Costs For Businesses

Most enterprise vision deployments fail because the infrastructure costs kill the business case before the rollout leaves the pilot phase.

Manako's model is 19MB and runs on any standard CPU. Meta's SAM3 foundation model, by comparison, has 848 million parameters. SAM3 is built to handle thousands of possible tasks across any environment, whereas a Vision Agent only has to handle the one task it was built for.
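The two figures above are in different units (megabytes versus parameter count), so a quick conversion helps put them side by side. The sketch below assumes the weights dominate the file size and that each parameter is stored at either 16-bit or 32-bit precision. Both are assumptions for illustration; the article does not say how either model is stored.

```python
def weights_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of a model's weights in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bytes_per_param / 1_000_000


SAM3_PARAMS = 848_000_000  # parameter count quoted in the article
MANAKO_MB = 19             # model size quoted in the article

for label, width in (("16-bit", 2), ("32-bit", 4)):  # assumed storage precisions
    size = weights_size_mb(SAM3_PARAMS, width)
    print(f"{label}: {size:.0f} MB, about {size / MANAKO_MB:.0f}x the 19MB model")
```

Under those assumptions, 848 million parameters come to roughly 1.7GB at 16-bit precision and 3.4GB at 32-bit, which is on the order of 90 to 180 times the 19MB footprint.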

"The standard answer for the last ten years was to hire an internal vision team, license a platform, label training data, train custom models per use case, deploy on dedicated GPU hardware, and integrate the outputs with existing dispatch and ticketing systems. The cost of doing that for a single use case at a single site routinely hit the high six figures. The cost of repeating it across a hundred sites and ten use cases was a number procurement consistently said no to." - Manako

Manako's Vision Efficiency Score measures accuracy per megabyte of model size, benchmarked against independently generated ground truth with no model involvement in ground truth creation. The accuracy holds at 19MB. At that footprint, deploying Vision Agents across a business's full camera estate is economically viable, not just at the two locations that got approved for a proof of concept.
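Read literally, "accuracy per megabyte" suggests a simple ratio. The sketch below takes that reading as an assumption, since the exact formula behind the Vision Efficiency Score is not given here. The accuracy values in the example are placeholders chosen for illustration, not measured results for any model.

```python
def efficiency_score(accuracy: float, size_mb: float) -> float:
    """Accuracy divided by model size in MB (an assumed reading of 'accuracy per megabyte')."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    return accuracy / size_mb


# Placeholder inputs only: the same accuracy at two different model sizes.
small = efficiency_score(accuracy=0.90, size_mb=19.0)
large = efficiency_score(accuracy=0.90, size_mb=1900.0)
print(small > large)  # the smaller model scores higher at equal accuracy
```

Whatever the exact formula, a metric of this shape rewards a model for holding its accuracy while shrinking, which is the property the 19MB claim rests on.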

Examples of live deployments include Reading Football Club, which uses Manako to process football game data. Manako also formed a strategic alliance with PwC France and PwC Maghreb to integrate Manako's Business Operations World Model into PwC's advisory services, targeting large enterprises with full camera estates.

The Network Behind the Agent

Manako is powered by Bittensor's Score Subnet 44. Score is a decentralized computer vision network where miners compete continuously to produce the most accurate models for specific tasks. The best models then get selected and deployed.
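In its simplest form, "the best models get selected" means keeping the top scorer for each task. The sketch below shows only that idea. The miner IDs, task names, and accuracy numbers are invented, and the subnet's real scoring and selection rules are not described in this article.

```python
def select_best(submissions):
    """Keep the highest-accuracy submission per task.

    `submissions` is an iterable of (miner_id, task, accuracy) tuples,
    a hypothetical shape chosen for this illustration.
    """
    best = {}
    for miner_id, task, accuracy in submissions:
        current = best.get(task)
        # Replace the incumbent only when the challenger is strictly better.
        if current is None or accuracy > current[1]:
            best[task] = (miner_id, accuracy)
    return best


# Invented example data.
submissions = [
    ("miner_a", "spill_detection", 0.91),
    ("miner_b", "spill_detection", 0.93),
    ("miner_a", "vehicle_dwell", 0.88),
]
print(select_best(submissions))
```

Because the comparison reruns as new submissions arrive, the deployed model for a task can change without the camera operator doing anything.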

That competition is what keeps Manako's Vision Agents improving without requiring any operator to manage model updates. The network handles the improvement cycle automatically, which means the Vision Agent watching your loading bay today is more accurate than the one from six months ago, without you touching a setting.

Activating Idle Footage

Traditional security platforms can carry six-figure rollout costs and still leave businesses with a notification queue that gets ignored.

With Manako deploying Vision Agents for enterprises, the more than 98% of footage that sits idle becomes actionable and could make a meaningful difference to businesses' annual losses.

If early adopters of Vision Agents can reduce even a fraction of their theft, operational inefficiencies, and missed incidents, Manako and Score have a shot at bringing the world's largest enterprises into the Bittensor fold as they look to save billions.

Disclaimer: This article is for informational purposes only and does not constitute financial, investment, or trading advice. The information provided should not be interpreted as an endorsement of any digital asset, security, or investment strategy. Readers should conduct their own research and consult with a licensed financial professional before making any investment decisions. The publisher and its contributors are not responsible for any losses that may arise from reliance on the information presented.