Task
The client required a dataset that reflects how facial features evolve throughout childhood and early adolescence. Core requirements included:
- Accurate age verification for every image
- Diversity across ethnicity, geography, and gender
- Year-by-year continuity, allowing models to distinguish natural growth from identity mismatch
Key Challenges
- Verifying real ages without access to official identity documents
- Covering multiple regions with different cultural and photographic conditions
- Limited availability of high-quality images of children
- Ensuring each photo set belonged to the same individual and matched the declared age
Solution
01 Dataset design and methodology
- Defined the target age range and prioritized ethnic and regional groups
- Developed an age-verification approach combining visual assessment and metadata analysis
- Created clear, standardized instructions for participants and crowd platforms, including capture examples
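One way the metadata side of such an age check could work is sketched below. It is an illustration under stated assumptions, not the project's actual method: it assumes each photo carries a declared age in whole years and a capture date taken from file metadata, and it flags pairs of photos from the same participant where the time elapsed between captures disagrees with the difference in declared ages. The function name `inconsistent_pairs`, the tuple layout, and the one-year tolerance are all hypothetical.

```python
from datetime import date
from itertools import combinations


def inconsistent_pairs(photos, tolerance_years=1.0):
    """Flag photo pairs whose capture dates contradict their declared ages.

    photos: list of (photo_id, declared_age_in_years, capture_date) tuples
            for ONE participant (hypothetical layout, not from the source).
    tolerance_years: allowed mismatch; 1.0 accounts for ages being declared
            in whole years (an assumed default).
    """
    flagged = []
    for (id_a, age_a, day_a), (id_b, age_b, day_b) in combinations(photos, 2):
        # Time that actually passed between the two captures, in years.
        elapsed_years = (day_b - day_a).days / 365.25
        # Time that should have passed according to the declared ages.
        declared_gap = age_b - age_a
        if abs(elapsed_years - declared_gap) > tolerance_years:
            flagged.append((id_a, id_b))
    return flagged


photo_set = [
    ("a", 5, date(2015, 6, 1)),
    ("b", 8, date(2018, 7, 1)),
    ("c", 9, date(2016, 1, 1)),  # declared age does not fit the capture date
]
flagged = inconsistent_pairs(photo_set)
```

In this example photo "c" is the odd one out, so both pairs that include it are flagged. A check like this cannot confirm an absolute age on its own; it only surfaces sets that need visual assessment or manual review.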
02 Data collection
- Leveraged established crowd platforms and tested new sources to expand geographic coverage
- Designed simple, engaging tasks to encourage complete and high-quality photo sets
- Provided fair compensation to reduce drop-off and incomplete submissions
- Monitored incoming data in real time to address quality issues early
03 Validation and quality control
- Combined automated checks with manual expert review to confirm age and photo ownership
- Applied multi-layer validation, with multiple reviewers cross-checking each submission
- Minimized inconsistencies and labeling errors, keeping the error rate very low
- Delivered a clean, production-ready dataset suitable for model training and research
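The cross-checking step could be expressed as a simple agreement rule, sketched below. This is a hypothetical illustration rather than the workflow actually used: it assumes each reviewer independently estimates the age shown in a submission, and the submission is accepted only when enough estimates fall close to the declared age. The function name `cross_check`, the status labels, and every threshold are assumptions.

```python
def cross_check(declared_age, reviewer_estimates,
                tolerance_years=1, min_reviewers=2, min_agreement=0.75):
    """Combine independent reviewer age estimates into one decision.

    All parameter names and default thresholds are illustrative assumptions.
    Returns one of: "accept", "reject", "escalate", "needs_more_review".
    """
    if len(reviewer_estimates) < min_reviewers:
        # Too few independent opinions to cross-check anything.
        return "needs_more_review"

    # Count reviewers whose estimate is close enough to the declared age.
    agreeing = sum(
        1 for estimate in reviewer_estimates
        if abs(estimate - declared_age) <= tolerance_years
    )
    share = agreeing / len(reviewer_estimates)

    if share >= min_agreement:
        return "accept"
    if share == 0:
        return "reject"
    # Reviewers disagree with each other: send to an expert.
    return "escalate"


decisions = [
    cross_check(7, [7, 8, 6, 7]),
    cross_check(7, [7, 12, 11]),
    cross_check(7, [12, 13]),
    cross_check(7, [7]),
]
```

Routing split decisions to an expert instead of taking a plain majority vote keeps ambiguous submissions out of the final dataset until someone has looked at them closely.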
Results
- Achieved high confidence in age accuracy and metadata reliability
- Enabled training for face recognition, anti-fraud systems, and academic research
- Identified consistent patterns of facial development across diverse ethnic and regional groups