Egocentric Data Collection for Humanoid Robot Training

Open egocentric datasets give you 2D video with no depth, no pose, no tactile signal. Humanoid training requires all three. How do you build a multimodal setup that captures what open data structurally cannot?
This case study shows how to build a multimodal collection setup that records real 3D spatial data, a body skeleton, and tactile information while keeping the pipeline scalable.

The Problem

The client, a humanoid robot developer, needed training data that captures real first-person human interaction with objects. Existing open datasets did not meet this requirement: 2D video without depth information or camera pose is suitable only for pretraining, not for final model fine-tuning.

The primary objective was collecting first-person video with depth, camera intrinsics, and extrinsics — the data consistently missing from open datasets.

Solution

Hardware Setup

Existing egocentric datasets are flat 2D video: no depth, no camera pose, often no metadata. That is enough for pretraining but not for fine-tuning. The setup had to close this gap and remain scalable.

The foundation is the Pico 4 Ultra, a VR headset with a stereo camera. It provides what standard egocentric capture cannot:

  • a depth map and metric distance to objects, rather than a flat image
  • camera position in space for every frame
  • lens intrinsic parameters, without which full scene reconstruction is not possible
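
To illustrate why these three items matter together, here is a minimal Python sketch of how a single depth pixel becomes a 3D point in world coordinates. It assumes an undistorted pinhole camera model, depth measured along the optical axis in meters, and a camera-to-world pose given as a rotation matrix and a translation. The names `Intrinsics`, `backproject`, and `camera_to_world` are hypothetical and are not part of any Pico SDK.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics (assumed model, lens distortion ignored)."""
    fx: float  # focal length along x, in pixels
    fy: float  # focal length along y, in pixels
    cx: float  # principal point x, in pixels
    cy: float  # principal point y, in pixels


def backproject(u: float, v: float, depth_m: float, k: Intrinsics) -> Vec3:
    """Turn pixel (u, v) with metric depth into a point in the camera frame."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


def camera_to_world(p_cam: Vec3,
                    rotation: Sequence[Sequence[float]],
                    translation: Vec3) -> Vec3:
    """Apply a camera-to-world pose: p_world = R * p_cam + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


if __name__ == "__main__":
    k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    p_cam = backproject(420.0, 240.0, 2.0, k)
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(p_cam, camera_to_world(p_cam, identity, (1.0, 0.0, 0.5)))
```

With these example values, a pixel 100 px to the right of the principal point at 2 m depth sits 0.4 m to the right of the optical axis. Without intrinsics that conversion cannot be made, and without the per-frame pose the point cannot be placed in a shared scene frame.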

The setup is fully self-contained: it needs no mains power and no external computer. Without that autonomy, half of the field scenarios become impractical from the start.

Body and Hand Capture

First-person video carries no information about what the body is doing. For humanoids this is critical: the torso and legs influence how a person ultimately grasps an object. To capture the full skeleton, motion trackers are connected to the headset.

Tracker placement:

  • one on each hand and foot
  • one on the waist
  • algorithms build a full-body skeleton in real time from six points

Hand tracking is a harder problem: the built-in AI palm tracking works in real time but loses accuracy under occlusion, when an object blocks part of the hand. To compensate, an additional monocular camera is mounted on each wrist. The final setup has three cameras in total: the stereo camera on the headset and two monocular cameras on the wrists.

Tactile Layer

Visual and kinematic signals alone are not sufficient. Real robots have tactile sensors on their manipulators, and training data needs to reflect that.

Problems that cannot be addressed without tactile data:

  • Heterogeneous objects with an off-center center of mass are visually indistinguishable from uniform ones
  • Load distribution across grip points differs when handling such objects — the model needs this information
  • Surface texture and stiffness determine the grasping strategy but are not visible in video

Tactile gloves measuring contact-point pressure are included in the setup. This is the data layer teams typically add after the fact, when the model is already failing on real objects. It is built in from the start here.
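
As a rough illustration of what the pressure layer adds, the sketch below computes how load is shared across contact points and where the center of pressure lies. It assumes each sensor reports a non-negative reading proportional to normal force and that contact-point positions are known in a common hand frame. The function names and the data layout are hypothetical and do not represent any glove vendor's API.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def load_shares(readings: Sequence[float]) -> List[float]:
    """Fraction of the total load carried by each contact point."""
    total = sum(readings)
    if total <= 0:
        # No contact registered: every share is zero.
        return [0.0] * len(readings)
    return [r / total for r in readings]


def center_of_pressure(points: Sequence[Vec3],
                       readings: Sequence[float]) -> Vec3:
    """Reading-weighted mean of the contact-point positions."""
    if len(points) != len(readings):
        raise ValueError("points and readings must have the same length")
    total = sum(readings)
    if total <= 0:
        raise ValueError("no contact load to average")
    return tuple(
        sum(r * p[i] for p, r in zip(points, readings)) / total
        for i in range(3)
    )


if __name__ == "__main__":
    # Two fingertips 10 cm apart; the second one carries three times the load.
    contacts = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
    readings = [1.0, 3.0]
    print(load_shares(readings))
    print(center_of_pressure(contacts, readings))
```

In this example the second contact carries 75% of the load and the center of pressure shifts toward it, which is the kind of asymmetry an off-center mass produces and video alone does not show.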

| Phase | Input | Scope of Work | Quality Control |
|---|---|---|---|
| Setup Selection & Configuration | Client requirements, target interaction scenarios | Hardware selection, stereo camera calibration, tracker synchronization | Depth map accuracy, intrinsic and extrinsic correctness |
| Motion Tracking Setup | Motion trackers, Pico 4 Ultra headset | Tracker placement on joints, skeleton construction configuration, palm tracking calibration | Skeleton stability, hand tracking quality under occlusion |
| Tactile Layer Integration | Tactile gloves, objects of varying mass and texture | Pressure sensor integration into the setup, synchronization with video and kinematic streams | Pressure map accuracy, timestamp synchronization |
| Pilot Collection | Test recording sessions | Full stream verification in combination: video, depth, skeleton, tactile | Identification of systematic tracking errors, scenario coverage check |
| Main Data Collection | Actors, target objects and spaces | Recording manipulation scenarios: grasping, transfer, interaction with heterogeneous objects | Stream completeness, tracking stability across the session |
| Validation & Annotation | Raw recordings from all streams | Timestamp synchronization check, tracking failure filtering, dataset packaging | Stream consistency, absence of artifacts in depth map and skeleton |
| Final Delivery | Validated multi-stream dataset | Packaging with format documentation, handoff to client | Compatibility with client training pipeline, metadata completeness |

Timeline:

  • Weeks 1–2: Setup Design
  • Week 3: Pilot Collection
  • Weeks 4–6: Main Data Collection
  • Week 7: Validation & Delivery
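
The timestamp synchronization check in the validation phase can be sketched as a nearest-neighbor match between streams. The example assumes every stream carries sorted timestamps in seconds on a shared clock, and the 5 ms tolerance is an illustrative value rather than the project's actual threshold. `nearest_index` and `align_streams` are hypothetical names.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


def nearest_index(timestamps: Sequence[float], t: float) -> int:
    """Index of the sample closest in time to t (timestamps must be sorted)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer; ties go to the earlier sample.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align_streams(reference: Sequence[float],
                  other: Sequence[float],
                  tolerance_s: float) -> List[Optional[int]]:
    """For each reference timestamp, return the index of the matching sample
    in `other`, or None when the closest sample is outside the tolerance."""
    if not other:
        return [None] * len(reference)
    matches: List[Optional[int]] = []
    for t in reference:
        j = nearest_index(other, t)
        matches.append(j if abs(other[j] - t) <= tolerance_s else None)
    return matches


if __name__ == "__main__":
    video = [0.000, 0.033, 0.066, 0.100]   # video frame times, seconds
    tactile = [0.001, 0.034, 0.090]        # tactile sample times, seconds
    print(align_streams(video, tactile, tolerance_s=0.005))
```

Frames that come back as `None` have no tactile sample within tolerance. A long run of them is the kind of signal a validation pass could use to flag a dropped stream or a drifting clock.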

The Results

  • A reusable collection setup suitable for field conditions without a fixed power source
  • A synchronized multi-stream dataset with 3D coordinates, depth, and tactile data
  • Full-body skeleton and hand tracking in a format compatible with simulation environments
  • Documented format structure for integration into the client’s training pipeline
