The Task
A client developing facial recognition and anti-spoofing technologies approached us with a clear objective: to collect a high-quality dataset for testing four presentation categories, namely genuine Live captures and three attack types (Print, Crop, and Display).
Due to internal resource limitations, they needed a partner who could take over the full cycle of data collection — from sourcing participants to final validation. After reviewing several providers, the client chose Unidata for our proven track record and flexible approach.
The Solution
Preparation and Technical Briefing
We began by reviewing the client’s technical specifications and validating the workflow in a pilot phase. This confirmed that our process matched their requirements before scaling.
Performer Recruitment and Photo Capture
Using our wide performer base, we organized the collection of five photographs per participant, each taken from specific distances and angles.
All images adhered to strict guidelines for positioning and proportions.
Spoofing Scenario Generation
To simulate various spoofing attacks, we produced supporting materials — printed versions of faces, cropped cutouts, and digital displays. These required precise execution to ensure background consistency and proportional accuracy across all samples.
Validation and Quality Assurance
Collected data underwent a two-stage validation process.
Images were reviewed for visual and technical compliance, and all inconsistencies were corrected before final submission.
| Stage | Input | Workflow Scope | Main Quality Checks |
|---|---|---|---|
| Project Setup | Client requirements & attack scenarios | Requirements alignment, pilot validation, workflow setup | Requirement clarity / Pilot consistency |
| Participant Recruitment | Performer pool | Sourcing participants, briefing, task distribution | Participant compliance / Instruction clarity |
| Image Capture | Participants, capture guidelines | Photo collection across angles, distances, positions | Framing accuracy / Position consistency |
| Spoofing Generation | Printed images, digital displays, cutouts | Creation of print, crop, and display attack materials | Proportional accuracy / Background consistency |
| Data Validation | Collected image sets | Visual and technical review, error correction | Image quality / Guideline compliance |
| Final Delivery | Validated dataset | Dataset structuring, final checks, submission | Dataset completeness / Attack type coverage |
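The "Attack type coverage" check in the Final Delivery stage can be illustrated with a minimal sketch. It assumes a hypothetical flat manifest of `(set_id, attack_type)` pairs; the manifest layout, the set identifiers, and the function name `find_incomplete_sets` are illustrative assumptions, not the format or tooling used on the project.

```python
from collections import defaultdict

# The four categories named in the case study.
REQUIRED_TYPES = {"live", "print", "crop", "display"}


def find_incomplete_sets(records):
    """Map each incomplete set to the categories it is missing.

    `records` is an iterable of (set_id, attack_type) pairs. This flat
    manifest is a hypothetical layout chosen for illustration only.
    """
    seen = defaultdict(set)
    for set_id, attack_type in records:
        seen[set_id].add(attack_type.lower())
    return {
        set_id: sorted(REQUIRED_TYPES - types)
        for set_id, types in seen.items()
        if not REQUIRED_TYPES <= types
    }


# Hypothetical example: set_001 is complete, set_002 lacks two categories.
manifest = [
    ("set_001", "Live"), ("set_001", "Print"),
    ("set_001", "Crop"), ("set_001", "Display"),
    ("set_002", "Live"), ("set_002", "Print"),
]
print(find_incomplete_sets(manifest))  # {'set_002': ['crop', 'display']}
```

A check of this kind would flag sets that need additional capture before submission; it says nothing about image quality, which the earlier validation stage covers.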
The Results
- Efficient Delivery: Over 10% of the client’s total dataset was completed in just one month.
- High-Quality Data: 50 validated sets totaling 2,000 photos were delivered, covering all required attack types.
- Client Endorsement: The client praised the speed, accuracy, and process clarity, emphasizing their readiness to partner with Unidata on future projects.
High-quality biometric datasets depend on strict control over capture conditions, attack simulation accuracy, and multi-stage validation. Speed is achievable only when collection and QA are tightly integrated from the start.
Kirill Meshyk, Head of Data Collection