Training AI that acts in the real world — robots, autonomous vehicles, drones — requires diverse, accurate, and reliable data. Acquiring that data solely in real environments is expensive, slow, and sometimes dangerous. NVIDIA has upgraded its Open World Foundation Models (WFMs) to close that gap: Cosmos Predict 2.5 and Cosmos Transfer 2.5 integrate with Omniverse and the Isaac ecosystem to generate physically plausible synthetic data at scale and accelerate the transition from simulation to the real world.
The update centers on two ideas: unifying world generation in a single model, and varying those worlds at will (weather, lighting, terrain) with fine-grained controls and multi-camera consistency. The goal is to test and validate physics-based AI models with scenario coverage that would be impractical to collect on the street, in the factory, or in the field.
What Cosmos Predict 2.5 and Cosmos Transfer 2.5 Bring
- Cosmos Predict 2.5: consolidates the Text2World, Image2World, and Video2World models within a lightweight architecture. From a prompt, an image, or a video, the system generates coherent, controllable world videos with multi-camera-consistent outputs. This enables, for example, synthesizing a traffic scene or warehouse with multiple views and trajectories for perception and planning pipelines.
- Cosmos Transfer 2.5: enables spatial style transfer from world to world with high fidelity, expanding dataset variability: weather, lighting, or terrain change coherently across all cameras. The model is also 3.5× smaller than its predecessor and faster, with improved prompt alignment and physical plausibility. A code sketch of how the two models divide the work follows this list.
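To make that division of labor concrete, here is a minimal Python sketch of how the two models could be chained in a data pipeline. The `predict_world` and `transfer_style` functions and the `WorldClip` container are hypothetical placeholders, not NVIDIA's actual Cosmos API; the Cosmos Cookbook documents the real entry points.

```python
from dataclasses import dataclass, field

# Hypothetical wrappers -- NOT the real Cosmos API. They only illustrate the
# roles of the two models: Predict generates a world, Transfer restyles it.

@dataclass
class WorldClip:
    """A generated multi-camera clip plus the conditioning that produced it."""
    frames: dict                      # camera name -> list of frames
    prompt: str
    cameras: list = field(default_factory=list)

def predict_world(prompt: str, cameras: list, num_frames: int) -> WorldClip:
    """Stand-in for Cosmos Predict 2.5: text/image/video in, a coherent,
    multi-camera-consistent world video out."""
    ...  # invoke the actual Predict 2.5 checkpoint here

def transfer_style(clip: WorldClip, condition: dict) -> WorldClip:
    """Stand-in for Cosmos Transfer 2.5: re-render the same scene under new
    weather/lighting/terrain, synchronized across all cameras."""
    ...  # invoke the actual Transfer 2.5 checkpoint here

# Generate one base world, then multiply it into variants for training.
base = predict_world(
    prompt="forklift crossing a warehouse aisle, pallets on both sides",
    cameras=["front_left", "front_right", "rear"],
    num_frames=121,
)
rainy_night = transfer_style(base, {"weather": "heavy_rain", "time_of_day": "night"})
```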
Both WFMs fit into synthetic data pipelines built on NVIDIA Omniverse and Isaac Sim — NVIDIA’s open-source robotics simulation framework — reducing the sim-to-real gap with photo-realistic video and consistent annotations.
Synthetic Data Pipeline: From a Smartphone to a Trainable World
NVIDIA proposes a four-stage workflow to create genuinely useful synthetic data; a code sketch of the full loop follows the list:
- Omniverse NuRec: neural reconstruction libraries to build a digital twin in OpenUSD from smartphone captures (real scenes transformed into navigable environments).
- SimReady assets: populate the twin with physically accurate 3D models (materials, masses, friction) ready for simulation.
- Isaac Sim (MobilityGen): generate data (trajectories, sensors, perturbations) at scale, with control over sensors (RGB, LiDAR, depth) and kinematics.
- NVIDIA Cosmos: expand the generated data with plausible variations in weather, lighting, and terrain applied in a spatially consistent and multi-camera manner.
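The skeleton below strings the four stages together as placeholder functions. None of these names correspond to real NuRec, Isaac Sim, or Cosmos entry points; the sketch only shows how each stage hands its artifact to the next.

```python
# Hypothetical orchestration of the four stages above. The function names are
# illustrative placeholders, not real NuRec, Isaac Sim, or Cosmos APIs.

def reconstruct_twin(capture_dir: str) -> str:
    """Stage 1 (Omniverse NuRec): smartphone capture -> OpenUSD digital twin."""
    ...

def populate_simready(usd_path: str, assets: list) -> str:
    """Stage 2: place SimReady assets (materials, masses, friction) in the twin."""
    ...

def generate_rollouts(usd_path: str, sensors: list, n_trajectories: int) -> list:
    """Stage 3 (Isaac Sim / MobilityGen): trajectories plus RGB/LiDAR/depth frames."""
    ...

def augment_with_cosmos(rollouts: list, conditions: list) -> list:
    """Stage 4 (Cosmos): weather/lighting/terrain variants, multi-camera consistent."""
    ...

twin = reconstruct_twin("captures/warehouse_scan/")
twin = populate_simready(twin, ["pallet_rack", "forklift", "safety_cone"])
rollouts = generate_rollouts(twin, sensors=["rgb", "lidar", "depth"], n_trajectories=500)
dataset = augment_with_cosmos(rollouts, [
    {"weather": "fog"},
    {"weather": "rain", "time_of_day": "night"},
])
```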
The outcome: millions of controlled synthetic samples, with perfect labels (segmentation, depth, poses) and targeted diversity, complementing — not replacing — real data.
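For intuition, a single sample in such a dataset might carry the fields below. The schema is an illustrative assumption, not a format prescribed by Omniverse Replicator or Isaac Sim; the point is that every label comes straight from the renderer, so it is exact by construction.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SyntheticSample:
    """One sample with renderer-exact ground truth. Field names are
    illustrative assumptions, not a prescribed Replicator/Isaac format."""
    rgb: np.ndarray                 # (H, W, 3) uint8 rendered image
    depth: np.ndarray               # (H, W) float32 meters, exact from the renderer
    segmentation: np.ndarray        # (H, W) int32 class IDs, pixel-perfect
    object_poses: dict              # object id -> 4x4 world-space transform
    camera_intrinsics: np.ndarray   # (3, 3) pinhole K matrix
    scenario: dict                  # sampled conditions: weather, lighting, occlusion
```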
Real-World Cases: From “Robot Brains” to Autonomous Deliveries
- Skild AI uses Cosmos Transfer to expand data with new variations and validate robotic policies trained in Isaac Lab. Their “simulation-first” approach accelerates the generalization of robot brains across different bodies and tasks.
- Serve Robotics combines Isaac Sim with field data from one of the largest autonomous fleets operating in public spaces: over 100,000 last-mile deliveries and 1 million miles tracked monthly, with about 170 billion image–LiDAR samples feeding the models in simulation. The company has also shown how its robots can deliver computing hardware: they distributed NVIDIA DGX Spark — 1 petaFLOP of AI compute — to creators like Refik Anadol, will.i.am, and Ollama.
- Zipline, a leader in autonomous delivery drones, also received a DGX Spark delivered by drone, and uses NVIDIA Jetson as its edge AI platform in flight systems.
- Lightwheel helps clients close the sim-to-real gap with SimReady and massive synthetic datasets based on OpenUSD, from factory to home.
- In mining, data scientist Santiago Villa uses Omniverse with Blender to generate datasets for training models that detect large rocks blocking crushers. Each incident can stop the plant for about 7 minutes and cost up to $650,000 annually in lost production; synthetic data reduces training costs and improves detection.
- FS Studio created thousands of photorealistic variations of packages with Omniverse Replicator for a logistics leader, boosting detection accuracy and reducing false positives, with direct impact on throughput.
- Robots for Humanity built a comprehensive environment in Isaac Sim for an oil & gas client, generating RGB, depth, and segmentation data, and capturing telemetry from the Unitree G1 robot via teleoperation.
- Omniverse ambassador Scott Dempsey synthesizes cables from manufacturer specifications and creates datasets with Isaac Sim, enriched with Cosmos Transfer to train systems that identify and manipulate cables.
Why Synthetic Data Matters in “Physics AI”
Large Language Models (LLMs) thrive on internet text because it's abundant. Physics AI requires experiences: collisions, occlusions, specular highlights, LiDAR returns degraded by rain, thermal noise, and rare failures. Waiting for these to occur — or provoking them — in the real world is infeasible. Using physically based synthetic data, teams can (see the sampling sketch after this list):
- Cover rare scenarios (near accidents, extreme conditions) without risking anyone.
- Control the distribution (share of night scenes, rainfall intensity, occlusion levels) for balanced, robust training.
- Obtain perfect labels (segmentation, depth, normals) that are very costly or impossible to annotate in real environments.
- Iterate rapidly: when a model fails, generate more instances of that failure case and retrain.
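Controlling the distribution can be as literal as sampling scenario parameters from distributions you choose, then feeding each draw to the generation pipeline. A minimal sketch; the parameter names and ranges are assumptions for illustration, not NVIDIA defaults:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

def sample_scenario() -> dict:
    """Draw one scenario from an explicitly controlled distribution.
    Parameter names and ranges are illustrative, not NVIDIA defaults."""
    return {
        # Force 40% night scenes instead of whatever a fleet happened to log.
        "time_of_day": rng.choice(["day", "dusk", "night"], p=[0.45, 0.15, 0.40]),
        # Long-tailed rainfall so heavy rain is represented, not just drizzle.
        "rain_mm_per_h": float(rng.gamma(shape=1.5, scale=4.0)),
        # Uniform occlusion of the target object, up to 80% hidden.
        "occlusion_level": float(rng.uniform(0.0, 0.8)),
        # Deliberately oversample near-miss events relative to reality.
        "near_miss": bool(rng.random() < 0.05),
    }

scenarios = [sample_scenario() for _ in range(10_000)]
```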
The key is ensuring that synthetic worlds are physically plausible and optically consistent, with a pipeline that maintains view-to-view and sensor-to-sensor coherence. This is where Omniverse, OpenUSD, Isaac, and Cosmos converge.
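One common way to audit view-to-view coherence is depth reprojection between camera pairs: lift a pixel into 3D with one camera's depth, transform it into a second camera's frame, and compare the predicted depth against that camera's rendered depth map. A minimal sketch, assuming pinhole intrinsics `K` and a known 4×4 extrinsic transform `T_ab` from camera A to camera B:

```python
import numpy as np

def reprojection_error(depth_a, depth_b, K, T_ab, u, v):
    """Coherence check for one pixel: lift (u, v) into 3D using camera A's
    depth, move it into camera B's frame via the 4x4 extrinsic T_ab, and
    compare the predicted depth with camera B's rendered depth map.
    Small errors over many pixels indicate geometrically consistent views.
    (A sketch: a real audit batches whole images and handles out-of-frame
    projections and occlusions.)"""
    p_a = depth_a[v, u] * (np.linalg.inv(K) @ np.array([u, v, 1.0]))  # 3D in A
    p_b = T_ab[:3, :3] @ p_a + T_ab[:3, 3]                            # 3D in B
    uv = K @ p_b                                                      # project
    u_b, v_b = int(round(uv[0] / uv[2])), int(round(uv[1] / uv[2]))
    return abs(p_b[2] - depth_b[v_b, u_b])                            # meters
```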
OpenUSD + Omniverse: The Common Language for Industrial 3D
Using OpenUSD (a standard stewarded by the Alliance for OpenUSD, AOUSD) as the scene format enables digital twins, SimReady assets, and synthetic data to move seamlessly across tools and teams. Omniverse acts as the platform to build, simulate, and render these worlds with consistent physics and lighting, while Isaac Sim adds the robotics layer (sensors, controls, ROS 2, teleoperation).
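For a flavor of what that common language looks like, here is a tiny scene authored with the pxr Python bindings (available, for example, via the usd-core package). The contents and values are illustrative; the UsdPhysics schemas show how SimReady-style physical properties travel in the same file as the geometry:

```python
from pxr import Usd, UsdGeom, UsdPhysics

# A tiny OpenUSD scene any USD-aware tool (Omniverse, Isaac Sim, ...) can open.
# Contents and values are illustrative.
stage = Usd.Stage.CreateNew("warehouse_twin.usda")
UsdGeom.Xform.Define(stage, "/World")
crate = UsdGeom.Cube.Define(stage, "/World/Crate")
crate.GetSizeAttr().Set(0.5)  # edge length in scene units (meters here)

# SimReady-style physics metadata rides along in the same file, so the asset
# simulates correctly wherever the scene is loaded.
UsdPhysics.RigidBodyAPI.Apply(crate.GetPrim())
mass_api = UsdPhysics.MassAPI.Apply(crate.GetPrim())
mass_api.CreateMassAttr().Set(12.0)  # kilograms

stage.GetRootLayer().Save()
```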
Getting Started: Learning Path and Resources
Developers and teams can begin with:
- The “Getting Started with Isaac Sim” course (robotic simulation, ROS 2, data generation).
- The recommended workflow for synthetic data with Omniverse.
- The Cosmos Cookbook (technical recipes, examples, detailed steps).
- Guides on capturing scenes with iPhone and reconstructing in Isaac Sim.
- YouTube playlists demonstrating Replicator and Omniverse workflows.
- NVIDIA Brev to access preconfigured GPU environments and launchables tailored for physics AI.
Technical Reading: From “Demo” to Production
Transitioning from lab to operation requires metrics: collision rates, trajectory deviation, planning time, perception false positives/negatives, MTBF (mean time between failures) in the field. Cosmos 2.5's improvements — multi-camera coherence, custom physical variations, and efficiency — aim to make these metrics measurable through controlled experiments and reproducible datasets. If field deployment confirms the benefits, the loop becomes continuous: simulation → synthetic data → training → field validation → back to simulation with harder cases.
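The aggregation itself can start as simply as the sketch below. The `Episode` fields are assumptions about what a validation log might record, not a standard Isaac or Cosmos schema:

```python
from dataclasses import dataclass

@dataclass
class Episode:
    """One validation run. Field names are assumptions about what a log might
    record, not a standard Isaac or Cosmos schema."""
    collided: bool
    planned_path_m: float     # planned trajectory length
    driven_path_m: float      # executed trajectory length
    planning_time_s: float
    false_positives: int      # perception FPs vs. perfect synthetic labels
    false_negatives: int

def summarize(episodes: list) -> dict:
    """Aggregate per-episode logs into the release-gating metrics."""
    n = len(episodes)
    return {
        "collision_rate": sum(e.collided for e in episodes) / n,
        "mean_path_deviation_m": sum(abs(e.driven_path_m - e.planned_path_m)
                                     for e in episodes) / n,
        "mean_planning_time_s": sum(e.planning_time_s for e in episodes) / n,
        "fp_per_episode": sum(e.false_positives for e in episodes) / n,
        "fn_per_episode": sum(e.false_negatives for e in episodes) / n,
    }
```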
Frequently Asked Questions
How does “Physics AI” differ from generative text/image AI?
Physics AI must perceive, reason, and act in real time within dynamic environments: robots, cars, and drones interacting with the world. It needs reality-bound data (physics, sensors, lighting), not just statistical patterns mined from internet text.
Why use synthetic data if I already have real robot logs?
Because real data doesn't cover all cases and lacks labels for much of what matters (depth, precise segmentation). Synthetic data allows control over the distribution, enables rare scenarios risk-free, and provides perfect labels for faster training and validation.
What role do OpenUSD/Omniverse play compared to other engines?
OpenUSD provides an interoperable scene format for complex worlds; Omniverse offers photo-realistic rendering, physics, and industrial-scale composition. Together they integrate with Isaac and Cosmos to close the simulation–data–model loop.
How do Cosmos Predict and Transfer ensure camera and sensor consistency?
Predict 2.5 creates multi-camera-consistent worlds from an input (text, image, video). Transfer 2.5 applies styles/conditions (weather, lighting, terrain) in a spatially controlled and synchronized manner across views, preserving scene geometry and physics.
Will synthetic data replace real data?
No. The best results come from a combination: synthetic for broad coverage and perfect labels; real for anchoring the distribution and validating models before deployment.
Note: This article is based on technical insights and use cases shared by NVIDIA regarding Cosmos 2.5, Omniverse, and Isaac Sim, as well as real-world robotics and logistics applications.
via: blogs.nvidia

