Data Collection

Data for Simulations: 3D Scanning for Robot Training

How photogrammetry and lidar scanning close the data gap for physical AI and connect real scenes to simulation environments.

The Problem

Humanoid robot developers keep running into the same wall: egocentric video data is scarce, expensive to collect, and slow to accumulate — nowhere near the volume needed to meaningfully advance training. Simulation solves the scale problem: you can spin up hundreds of parallel environments and generate millions of iterations in a short time. But those simulation environments need to be populated with realistic spaces and objects the robot will actually interact with.

Building environments by hand is costly and slow — it requires designers, developers, and virtual environment specialists. The goal was to collect simulation data directly from the real world, without a full production team.

The work split into two parallel streams:

Space scanning

Photorealistic 3D models of apartments and rooms, ready to load into simulation environments like IsaacSim — where a robot can be placed and its object interactions recorded.

Object scanning

3D models of everyday items — mugs, boxes, tools — to populate simulation scenes with geometrically accurate, properly textured objects.

Solution

Space Scanning

Building environments manually without a large team is not realistic. Instead, we scan real rooms and load them directly into the simulator. The setup uses a 360-degree camera with an integrated lidar. Lidar provides metric accuracy; the camera provides photorealistic textures.

Coverage is monitored in real time:

the operator walks through the space with the scanner
the software flags zones with insufficient coverage
data is uploaded to volumetric reconstruction software
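
The coverage check can be pictured as a tally of lidar hits per floor cell. The following is a minimal sketch under that assumption; the scanner software's actual algorithm is not described here, and `flag_low_coverage`, `cell_size`, and `min_points` are hypothetical names and thresholds chosen for illustration.

```python
from collections import Counter
from math import floor

def flag_low_coverage(points, room_size, cell_size=0.5, min_points=3):
    """Return floor-grid cells that received fewer than min_points lidar hits.

    points: iterable of (x, y) positions in metres, projected onto the floor.
    room_size: (width, depth) of the scanned area in metres.
    """
    # Count how many hits fall into each grid cell.
    counts = Counter(
        (floor(x / cell_size), floor(y / cell_size)) for x, y in points
    )
    cols = int(room_size[0] / cell_size)
    rows = int(room_size[1] / cell_size)
    # Cells with no hits at all count as zero and are flagged too.
    return [
        (cx, cy)
        for cx in range(cols)
        for cy in range(rows)
        if counts[(cx, cy)] < min_points
    ]

# Toy 1 m x 1 m area split into four 0.5 m cells; one corner is barely scanned.
hits = [(0.1, 0.1)] * 5 + [(0.6, 0.1)] * 4 + [(0.1, 0.7)] * 3 + [(0.9, 0.9)]
under_covered = flag_low_coverage(hits, room_size=(1.0, 1.0))
print(under_covered)
```

In this toy run only the far corner cell is flagged, which is the kind of zone the operator would walk back to before uploading.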

Environments are static — drawers and cabinet doors do not open. For most object manipulation scenarios on surfaces, this is not a meaningful limitation. More importantly, scans are tied to the same spaces where egocentric footage is recorded. That connection is the direct integration point between the two data types.

Object Scanning

Lidar is not suitable for individual objects: resolution is insufficient for fine details and textures. The method of choice is photogrammetry with reconstruction via 3D Gaussian Splatting.

The pipeline works as follows:

approximately 150 shots from different angles under controlled lighting
processing in ColMap: computing the position of each frame relative to the object
reconstruction in 3DGS: a precise model with realistic textures as output
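
The ColMap step above can be sketched as three command-line calls. This is a sketch, not the project's actual configuration: the subcommands and flags shown are standard in the `colmap` CLI, but the choice of exhaustive matching, the folder names, and the helper `colmap_commands` are assumptions made for illustration.

```python
from pathlib import Path

def colmap_commands(image_dir, workspace):
    """Build the three COLMAP CLI calls for sparse reconstruction.

    Nothing is executed here; pass each list to subprocess.run to run it.
    The mapper expects its output directory to exist beforehand.
    """
    workspace = Path(workspace)
    database = workspace / "database.db"
    sparse = workspace / "sparse"
    return [
        # 1. Detect keypoints in every frame.
        ["colmap", "feature_extractor",
         "--database_path", str(database),
         "--image_path", str(image_dir)],
        # 2. Match keypoints between all frame pairs (feasible for ~150 shots).
        ["colmap", "exhaustive_matcher",
         "--database_path", str(database)],
        # 3. Solve for each camera pose and a sparse point cloud.
        ["colmap", "mapper",
         "--database_path", str(database),
         "--image_path", str(image_dir),
         "--output_path", str(sparse)],
    ]

# Hypothetical folder layout for one scanned object.
commands = colmap_commands("shots/mug_01", "work/mug_01")
for cmd in commands:
    print(" ".join(cmd))
```

The resulting camera poses and sparse points are the usual input to a 3DGS training run, which is why frame matching failures at this stage degrade the final model.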

The hardest part is lighting. Three parameters (ISO, aperture, and exposure time) are in constant tension, and there is no universal solution.

Here is what goes wrong when each one is pushed too far:

High ISO reduces the need for light but introduces grain — ColMap stops correctly matching points between frames
Wide aperture lets in more light but narrows depth of field — part of the object goes out of focus and reconstruction in those zones degrades
Long exposure requires the camera to be completely still — any movement interferes with frame matching
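
The trade-off above can be put into numbers with the standard exposure-value and thin-lens depth-of-field formulas. The sketch below is illustrative only: the 50 mm lens, 60 cm subject distance, and 0.03 mm circle of confusion are assumed values, not the settings used in the project.

```python
from math import log2

def ev100(f_number, shutter_s, iso):
    """Exposure value normalised to ISO 100: log2(N^2 / t) - log2(ISO / 100)."""
    return log2(f_number ** 2 / shutter_s) - log2(iso / 100)

def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Total depth of field from the standard thin-lens formulas.

    coc_mm is the circle of confusion; 0.03 mm is a common full-frame value.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    if subject_mm >= hyperfocal:
        return float("inf")
    near = (subject_mm * (hyperfocal - focal_mm)
            / (hyperfocal + subject_mm - 2 * focal_mm))
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return far - near

# Same scene brightness, two ways to expose it (50 mm lens, object at 60 cm).
wide = ev100(4, 1 / 500, 100)    # wide aperture, fast shutter
narrow = ev100(8, 1 / 125, 100)  # two stops narrower, two stops slower
dof_wide = depth_of_field_mm(50, 4, 600)
dof_narrow = depth_of_field_mm(50, 8, 600)
print(round(wide, 2), round(narrow, 2))
print(round(dof_wide, 1), "mm in focus at f/4")
print(round(dof_narrow, 1), "mm in focus at f/8")
```

Both settings admit the same amount of light, but the narrower aperture keeps roughly twice as much of the object in focus at the cost of a shutter four times slower. Raising ISO instead buys back one stop per doubling, with the grain penalty described above.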

After testing phone cameras, we switched to DSLRs: the larger sensor produces acceptable results in low light without a critical increase in ISO. Shooting parameters were calibrated separately for each object class.

Integration with Egocentric Data

Egocentric recordings and scans of the same spaces form a unified dataset where real data and simulation point to the same environment.

This gives the client capabilities that are unavailable when purchasing the two separately:

reproduce a scenario from egocentric video in simulation — with the same geometry and the same objects
adapt data to the physics of a specific robot: different grip, different height, different degrees of freedom
collect additional data in simulation without another field visit — adjust lighting, object placement, trajectories
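
One way to picture the link between the two data types is a manifest keyed by physical space. This is a hypothetical sketch: the delivered dataset's real schema is not described here, and `SpaceRecord`, `scan_for_video`, the file paths, and the `.usd` extension are all assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SpaceRecord:
    """One physical space with every asset captured in it."""
    space_id: str
    scan_path: str  # 3D scene to load into the simulator
    egocentric_videos: list = field(default_factory=list)
    object_ids: list = field(default_factory=list)

def scan_for_video(manifest, video_path):
    """Find the scan of the room in which a given recording was made."""
    for record in manifest:
        if video_path in record.egocentric_videos:
            return record.scan_path
    return None

manifest = [
    SpaceRecord("kitchen_a", "scans/kitchen_a.usd",
                ["ego/kitchen_a_take1.mp4", "ego/kitchen_a_take2.mp4"],
                ["mug_01", "box_03"]),
    SpaceRecord("living_b", "scans/living_b.usd", ["ego/living_b_take1.mp4"]),
]
matched_scan = scan_for_video(manifest, "ego/kitchen_a_take2.mp4")
print(matched_scan)
```

With a lookup like this, reproducing a recorded scenario in simulation starts from the video and resolves to the scene and objects that belong to the same room.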

A single egocentric data collection session becomes a scalable source of simulation data.

Phase	Input	Scope of Work	Quality Control
Preparation & Calibration	Client requirements, list of target spaces and objects	Lidar scanning and photogrammetry setup, shooting parameters for lighting conditions	Coverage accuracy, ISO / aperture / shutter balance
Pilot Scanning	Test room, set of objects	Trial scans of spaces and objects, identifying problem areas: dark corners, reflective surfaces, fine details	Texture quality, geometry completeness, absence of reconstruction artifacts
Space Scanning	360-degree camera with integrated lidar	Walk-through with real-time coverage monitoring, upload to volumetric reconstruction software	Metric dimensional accuracy, photorealistic textures, no uncovered zones
Object Scanning	DSLR camera, interaction objects	~150 frames from different angles, ColMap processing, 3DGS pipeline reconstruction	Full object in focus, correct frame matching in ColMap
Reconstruction & Processing	Raw scanning and photogrammetry data	Building final 3D models of spaces and objects, geometry and scale verification	GPU / RAM resources, no degradation in underlit zones
Egocentric Integration	Room scans, egocentric video from the same spaces	Aligning scans with recordings, test scene loading in IsaacSim, format compatibility check	Geometry match between scan and real space from video
Final Delivery	Validated 3D models and linked dataset	Packaging with format documentation, handoff with instructions for simulator import	IsaacSim compatibility, metadata completeness, pipeline reproducibility
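
The geometry and scale verification in the table can be illustrated with a simple comparison of hand-measured distances against the same distances in the scan. This is a sketch of one possible check, not the procedure actually used; `scale_report`, the 1% tolerance, and the sample measurements are hypothetical.

```python
def scale_report(reference_mm, scanned_mm, tolerance=0.01):
    """Compare tape-measured distances with the same distances in the scan.

    Both arguments map a label to a length in millimetres.
    Returns (mean scale factor, {label: relative error}, passed).
    """
    ratios = {k: scanned_mm[k] / reference_mm[k] for k in reference_mm}
    scale = sum(ratios.values()) / len(ratios)
    errors = {k: abs(r - 1.0) for k, r in ratios.items()}
    passed = all(e <= tolerance for e in errors.values())
    return scale, errors, passed

# Illustrative measurements only.
reference = {"doorway_width": 800.0, "table_length": 1200.0}
scanned = {"doorway_width": 804.0, "table_length": 1194.0}
scale, errors, passed = scale_report(reference, scanned)
print(round(scale, 4), passed)
```

A mean scale factor far from 1 points to a global scaling problem, while a single large error points to local distortion in that part of the reconstruction.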

Week 1: Preparation & Calibration
Week 2: Pilot Scanning
Weeks 3–5: Main Collection
Week 6: Reconstruction & Validation

The Results

Photorealistic 3D scenes of real spaces, ready to load into IsaacSim
A library of 3D objects with accurate geometry and textures for populating simulation environments
A linked dataset: egocentric video tied to scans of the same spaces
A reproducible scanning pipeline that does not require a large team of virtual environment specialists

The key decision was straightforward: record egocentric footage and scan the same space. The client does not receive two separate products; they receive one environment where real data and simulation point to the same place.

Martinian Letunovsky: Head of IT Operations


