Data for Simulations: 3D Scanning for Robot Training

How photogrammetry and lidar scanning help close the data gap for physical AI by connecting real scenes to simulation environments.

The Problem

Humanoid robot developers keep running into the same wall: egocentric video data is scarce, expensive to collect, and slow to accumulate — nowhere near the volume needed to meaningfully advance training. Simulation solves the scale problem: you can spin up hundreds of parallel environments and generate millions of iterations in a short time. But those simulation environments need to be populated with realistic spaces and objects the robot will actually interact with.

Building environments by hand is costly and slow — it requires designers, developers, and virtual environment specialists. The goal was to collect simulation data directly from the real world, without a full production team.

The work split into two parallel streams:

Space scanning

Photorealistic 3D models of apartments and rooms, ready to load into simulation environments like IsaacSim — where a robot can be placed and its object interactions recorded.

Object scanning

3D models of everyday items — mugs, boxes, tools — to populate simulation scenes with geometrically accurate, properly textured objects.

The Solution

Space Scanning

Building environments manually without a large team is not realistic. Instead, we scan real rooms and load them directly into the simulator. The setup uses a 360-degree camera with an integrated lidar. Lidar provides metric accuracy; the camera provides photorealistic textures.

Scanning follows a fixed sequence, with coverage monitored in real time:

  • the operator walks through the space with the scanner
  • the software flags zones with insufficient coverage
  • data is uploaded to volumetric reconstruction software
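The coverage check in the second step can be pictured with a minimal sketch. The scanner software itself is not described here, so this is only an assumption about how such a check might work: lidar returns are binned into a coarse grid, and cells that were seen but only sparsely are flagged. The names `flag_low_coverage`, `cell_size`, and `min_points` are hypothetical.

```python
from collections import Counter
from math import floor

def flag_low_coverage(points, cell_size=0.5, min_points=3):
    """Return grid cells that received some lidar returns, but fewer than min_points."""
    # Count how many points fall into each cubic cell of side cell_size (metres).
    counts = Counter(
        (floor(x / cell_size), floor(y / cell_size), floor(z / cell_size))
        for x, y, z in points
    )
    return sorted(cell for cell, n in counts.items() if n < min_points)

# Toy point cloud in metres: one well-covered cell and one sparsely covered cell.
points = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (0.4, 0.2, 0.3), (1.7, 0.1, 0.2)]
sparse_cells = flag_low_coverage(points)
print(sparse_cells)  # [(3, 0, 0)]
```

In practice the threshold would likely depend on the sensor's point density and the distance to the surface; the fixed values here are illustrative only.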

Environments are static — drawers and cabinet doors do not open. For most object manipulation scenarios on surfaces, this is not a meaningful limitation. More importantly, scans are tied to the same spaces where egocentric footage is recorded. That connection is the direct integration point between the two data types.

Object Scanning

Lidar is not suitable for individual objects: resolution is insufficient for fine details and textures. The method of choice is photogrammetry with reconstruction via 3D Gaussian Splatting.

The pipeline works as follows:

  • approximately 150 shots from different angles under controlled lighting
  • processing in ColMap: estimating the camera position and orientation of each frame relative to the object
  • reconstruction in 3DGS: a precise model with realistic textures as output
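As a rough illustration of the ColMap step, the sketch below builds the three standard `colmap` command-line calls for sparse reconstruction (feature extraction, matching, mapping) without running them. The directory names are hypothetical, and using exhaustive matching is an assumption that seems reasonable for about 150 images; the options the team actually used are not stated. The 3DGS training step is left out because its tooling is not named in the text.

```python
from pathlib import Path

def colmap_commands(image_dir, workspace):
    """Build the COLMAP CLI calls for camera pose estimation (not executed here)."""
    workspace = Path(workspace)
    database = str(workspace / "database.db")   # feature and match storage
    sparse = str(workspace / "sparse")          # output folder; must exist before mapping
    images = str(image_dir)
    return [
        # 1. Detect keypoints and descriptors in every shot.
        ["colmap", "feature_extractor", "--database_path", database, "--image_path", images],
        # 2. Match features between all image pairs.
        ["colmap", "exhaustive_matcher", "--database_path", database],
        # 3. Estimate camera poses and a sparse point cloud.
        ["colmap", "mapper", "--database_path", database, "--image_path", images,
         "--output_path", sparse],
    ]

# Hypothetical paths for a single scanned object.
commands = colmap_commands("shots/mug_01", "work/mug_01")
for cmd in commands:
    print(" ".join(cmd))
    # To run for real (requires COLMAP installed): subprocess.run(cmd, check=True)
```

The resulting camera poses and sparse points are what a 3DGS trainer typically takes as input alongside the original images.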

The hardest part is lighting. Three exposure parameters (ISO, aperture, and shutter speed) are in constant tension, and there is no universal setting.

Here is what happens when each one is off:

  • High ISO reduces the need for light but introduces grain — ColMap stops correctly matching points between frames
  • Wide aperture lets in more light but narrows depth of field — part of the object goes out of focus and reconstruction in those zones degrades
  • Long exposure requires the camera to be completely still — any movement interferes with frame matching
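The trade-off in the list above can be put into numbers with a short sketch. It relies on the standard photographic relations (image brightness scales with ISO × shutter time / f-number², and close-range depth of field is roughly 2·N·c·u²/f²). The reference setting, the 50 mm lens, the 0.5 m distance, and the 0.03 mm circle of confusion are illustrative assumptions, not the values used on the project.

```python
from math import log2

def exposure_stops(iso, f_number, shutter_s, ref=(100, 8.0, 1 / 60)):
    """Stops of image brightness relative to a reference (ISO, f-number, shutter) setting."""
    ref_iso, ref_n, ref_t = ref
    return log2((iso / ref_iso) * (shutter_s / ref_t) * (ref_n / f_number) ** 2)

def depth_of_field_mm(f_number, focal_mm, distance_mm, coc_mm=0.03):
    """Approximate depth of field; valid when the subject is far closer than the hyperfocal distance."""
    return 2 * f_number * coc_mm * distance_mm ** 2 / focal_mm ** 2

# Stopping down from f/8 to f/16 costs two stops of light ...
print(exposure_stops(100, 16.0, 1 / 60))        # -2.0
# ... which ISO 400 (more grain) can buy back ...
print(exposure_stops(400, 16.0, 1 / 60))        # 0.0
# ... as can a four times longer shutter (more sensitivity to camera movement).
print(round(abs(exposure_stops(100, 16.0, 1 / 15)), 6))
# In return, the in-focus zone at 0.5 m with a 50 mm lens roughly doubles.
print(depth_of_field_mm(8.0, 50, 500), depth_of_field_mm(16.0, 50, 500))
```

Each gain in one parameter is paid for in another, which is why the settings had to be tuned per object class rather than fixed once.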

After testing phone cameras, we switched to DSLRs: the larger sensor produces acceptable results in low light without a critical increase in ISO. Shooting parameters were calibrated separately for each object class.

Integration with Egocentric Data

Egocentric recordings and scans of the same spaces form a unified dataset where real data and simulation point to the same environment.

This gives the client capabilities that are unavailable when purchasing the two separately:

  • reproduce a scenario from egocentric video in simulation — with the same geometry and the same objects
  • adapt data to the physics of a specific robot: different grip, different height, different degrees of freedom
  • collect additional data in simulation without another field visit — adjust lighting, object placement, trajectories

A single egocentric data collection session becomes a scalable source of simulation data.
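One way to picture the link between the two data types is a small manifest in which every egocentric recording refers to the scan of the room where it was filmed. This is a sketch under assumptions: the delivered dataset format is not described in this case study, and all class names, field names, and file paths below are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SpaceScan:
    space_id: str
    model_path: str         # reconstructed room model to load into the simulator

@dataclass
class EgocentricRecording:
    recording_id: str
    space_id: str           # the room in which the video was captured
    video_path: str

@dataclass
class LinkedDataset:
    scans: dict = field(default_factory=dict)
    recordings: list = field(default_factory=list)

    def add_scan(self, scan):
        self.scans[scan.space_id] = scan

    def add_recording(self, rec):
        # Reject recordings that have no matching scan, so the link is never broken.
        if rec.space_id not in self.scans:
            raise KeyError(f"no scan for space {rec.space_id!r}")
        self.recordings.append(rec)

    def recordings_for(self, space_id):
        return [r for r in self.recordings if r.space_id == space_id]

dataset = LinkedDataset()
dataset.add_scan(SpaceScan("kitchen_a", "scans/kitchen_a.usd"))
dataset.add_recording(EgocentricRecording("rec_001", "kitchen_a", "video/rec_001.mp4"))
print([r.recording_id for r in dataset.recordings_for("kitchen_a")])  # ['rec_001']
```

With such a link in place, picking a recording immediately identifies which scanned scene to load when reproducing the scenario in simulation.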

Phase: Preparation & Calibration
  • Input: Client requirements, list of target spaces and objects
  • Scope of Work: Lidar scanning and photogrammetry setup, shooting parameters for lighting conditions
  • Quality Control: Coverage accuracy, ISO / aperture / shutter balance

Phase: Pilot Scanning
  • Input: Test room, set of objects
  • Scope of Work: Trial scans of spaces and objects, identifying problem areas: dark corners, reflective surfaces, fine details
  • Quality Control: Texture quality, geometry completeness, absence of reconstruction artifacts

Phase: Space Scanning
  • Input: 360-degree camera with integrated lidar
  • Scope of Work: Walk-through with real-time coverage monitoring, upload to volumetric reconstruction software
  • Quality Control: Metric dimensional accuracy, photorealistic textures, no uncovered zones

Phase: Object Scanning
  • Input: DSLR camera, interaction objects
  • Scope of Work: ~150 frames from different angles, ColMap processing, 3DGS pipeline reconstruction
  • Quality Control: Full object in focus, correct frame matching in ColMap

Phase: Reconstruction & Processing
  • Input: Raw scanning and photogrammetry data
  • Scope of Work: Building final 3D models of spaces and objects, geometry and scale verification
  • Quality Control: GPU / RAM resources, no degradation in underlit zones

Phase: Egocentric Integration
  • Input: Room scans, egocentric video from the same spaces
  • Scope of Work: Aligning scans with recordings, test scene loading in IsaacSim, format compatibility check
  • Quality Control: Geometry match between scan and real space from video

Phase: Final Delivery
  • Input: Validated 3D models and linked dataset
  • Scope of Work: Packaging with format documentation, handoff with instructions for simulator import
  • Quality Control: IsaacSim compatibility, metadata completeness, pipeline reproducibility

  • Week 1: Preparation & Calibration
  • Week 2: Pilot Scanning
  • Weeks 3–5: Main Collection
  • Week 6: Reconstruction & Validation

The Results

  • Photorealistic 3D scenes of real spaces, ready to load into IsaacSim
  • A library of 3D objects with accurate geometry and textures for populating simulation environments
  • A linked dataset: egocentric video tied to scans of the same spaces
  • A reproducible scanning pipeline that does not require a large team of virtual environment specialists
