Trained on one of the world’s largest real‑world video datasets, collected by Grass, and hosted on Inference.net’s scalable AI infrastructure, the model delivers high‑accuracy video annotation at a fraction of the cost of general‑purpose models and is available today via API.
NEW YORK, August 14th, 2025 – Grass and Inference.net today announced the launch of ClipTagger‑12b, a new video annotation model built to identify actions, objects, and logos in video with high accuracy and detail. Applicable across domains from autonomous vehicles to warehouse robotics, it strengthens the perception capabilities that many AI systems rely on.
In benchmark tests, ClipTagger‑12b outperforms Claude 4 and GPT‑4.1 on annotation metrics such as ROUGE and BLEU, while costing up to 17x less to run.
See full benchmarks on Hugging Face
Developed through a collaboration between Grass and Inference.net, ClipTagger‑12b was trained by Inference.net on a subset of the more than 1 billion videos that Grass has collected from the public web, and it is hosted on Inference.net’s distributed compute network.
“It’s entirely possible to train low-cost, state-of-the-art models with the right data and good engineering,” said Sam Hogan, CEO at Inference.net.
“We believe the future of AI depends on keeping the web open and building the infrastructure needed to turn it into something models can learn from. This was a step in that direction,” said Andrej Radonjic, CEO at Wynd Labs.
The collaboration shows how specialized teams can build and deploy high‑performance models once limited to large AI labs, making advanced video annotation accessible to more developers and businesses.
ClipTagger‑12b is live now on Inference.net, where developers and businesses can access it via API. Model weights and additional resources are also available through the Hugging Face repository. Researchers can apply for up to $10,000 of credits at inference.net/grants.
About Grass
Grass is an application anyone can download to share their unused internet connection, powering a global network used to gather real‑world data for training AI models.
About Inference.net
Inference.net is a distributed compute network optimized for running AI models at scale, enabling developers to deploy and serve models without relying on centralized cloud infrastructure.
Media Contact
media@grassfoundation.io