Data Collection

Image Data Collection for a Palm Recognition Task

Collecting 20,000 standardized photos of people’s palms sounds simple until you try doing it in practice.

Prolific’s reach and filtering features made the scale achievable, while the practical work required disciplined handling of logistics, costs, data verification, and platform-specific constraints.

The Task

Biometric authentication sounds simple until you try to scale it.

The client was building an AI-powered biometric system for banking terminals, designed to authenticate customers and process payments securely through palm recognition. To train and validate the underlying biometric algorithms, the project required a large, high-quality dataset of 20,000 palm image sets.

Each set had to include six strictly standardized photos: three taken with the phone's front camera and three with its rear (main) camera. Across 20,000 sets, that comes to 120,000 images.

At this scale, broad geographic coverage and device diversity were also critical. We also had to keep costs within a fixed budget, which turned the project into an exercise in precise traffic and workflow optimization.

The Solution

Multi-platform strategy

We approached sourcing as a controlled system rather than a single-channel launch.

Data collection was distributed across several international platforms to balance speed, diversity, and cost. Prolific became the primary source due to its stable participant flow, high response quality, and flexible filtering capabilities.

At the same time, we continuously monitored platform performance and redistributed traffic when needed to avoid slowdowns or quality drops. This allowed us to maintain a consistent collection pace without overloading a single source.

Process organization and quality control

At this volume, most risks come from small inconsistencies repeated thousands of times.

To minimize this, we designed a structured capture flow:

  • detailed step-by-step instructions with visual examples
  • clear requirements for framing, positioning, and lighting
  • device-specific clarifications where necessary

We complemented this with automated validation at the upload stage:

  • format and completeness checks
  • basic quality filters such as resolution and alignment
  • instant rejection of invalid submissions
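
The upload-stage checks listed above can be sketched roughly as follows. This is a minimal illustration, not the production validator: the `Photo` record, the allowed formats, the minimum resolution, and the three front / three rear split are all assumptions made for the example, and the alignment check is left out because it needs actual image analysis.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical thresholds; the case study does not state the real values.
ALLOWED_FORMATS = {"jpeg", "png"}
MIN_WIDTH, MIN_HEIGHT = 1080, 1080
REQUIRED_PER_CAMERA = {"front": 3, "rear": 3}  # assumed split of the six photos


@dataclass(frozen=True)
class Photo:
    """Metadata for one uploaded photo (illustrative structure)."""
    filename: str
    camera: str        # "front" or "rear"
    image_format: str  # e.g. "jpeg"
    width: int
    height: int


def validate_set(photos: list[Photo]) -> list[str]:
    """Return rejection reasons; an empty list means the set passes."""
    reasons = []

    # Completeness: exactly the required number of photos per camera.
    counts = Counter(p.camera for p in photos)
    for camera, required in REQUIRED_PER_CAMERA.items():
        found = counts.get(camera, 0)
        if found != required:
            reasons.append(f"expected {required} {camera} photos, got {found}")
    for camera in sorted(set(counts) - set(REQUIRED_PER_CAMERA)):
        reasons.append(f"unknown camera type: {camera}")

    # Format and basic quality filters, checked per photo.
    for p in photos:
        if p.image_format.lower() not in ALLOWED_FORMATS:
            reasons.append(f"{p.filename}: unsupported format {p.image_format}")
        if p.width < MIN_WIDTH or p.height < MIN_HEIGHT:
            reasons.append(
                f"{p.filename}: resolution {p.width}x{p.height} below minimum"
            )
    return reasons
```

Returning every reason at once, rather than stopping at the first failure, lets the participant fix the whole set in a single retry.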

Quality control was not treated as a final step. We monitored incoming data in real time, conducted regular sampling, and tracked recurring errors.

When patterns appeared, we adjusted instructions and task logic, reducing future error rates instead of only filtering results afterward.
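
One simple way to surface such patterns is to track how often each rejection reason occurs and flag the ones that exceed a chosen share of submissions. The sketch below assumes each rejection is logged with a short reason code; the codes and the 5% threshold are hypothetical.

```python
from collections import Counter


def recurring_errors(reason_codes, total_submissions, threshold=0.05):
    """Return {reason: rate} for reasons seen in more than `threshold` of submissions."""
    if total_submissions <= 0:
        raise ValueError("total_submissions must be positive")
    counts = Counter(reason_codes)
    return {
        reason: count / total_submissions
        for reason, count in counts.items()
        if count / total_submissions > threshold
    }


# Example: out of 100 submissions, only "blurry" crosses the 5% threshold.
log = ["blurry"] * 8 + ["wrong_angle"] * 2 + ["low_light"]
flagged = recurring_errors(log, total_submissions=100)
```

A flagged reason is a signal to revise the instructions or task logic for that step, not just to keep rejecting the affected uploads.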

Audience management and filtering

Prolific’s filtering capabilities became a key control mechanism.

We used them not only to match basic criteria, but to stabilize the entire pipeline:

  • selecting participants with suitable devices
  • prioritizing users with strong task history
  • balancing geographic distribution

This helped maintain consistently high upload quality and predictable throughput, while reducing noise and rework.

| Stage | Input | Workflow Scope | Main Quality Checks |
|---|---|---|---|
| Participant Sourcing | Platform traffic, targeting rules | Multi-platform launch with Prolific as main source | Demographic balance, device diversity |
| Photo Collection | Raw palm photo sets | Collection of 6 mandatory images per participant | Angle correctness, lighting, focus |
| Primary Validation | Uploaded photo sets | Assessor review of metadata and visual criteria | Completeness, instruction compliance |
| Quality Control (QC) | Validated sets | Daily sampling, consistency checks, feedback loops | Error rate, assessor accuracy |
| Dataset Assembly | Approved photo sets + metadata | Structuring IDs, metadata files, packaging | Structural integrity, format compliance |
| Deduplication | Prepared dataset | Automated duplicate detection across releases | Uniqueness, release integrity |
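
The deduplication stage can be illustrated with a content-hash check across releases. This is a sketch of exact-duplicate detection only, assuming file bytes are available in memory; the function names are hypothetical, and catching re-encoded or slightly altered copies would need a different technique (for example perceptual hashing), which is not shown here.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the file content, used as its identity."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(
    new_files: dict[str, bytes], seen_hashes: set[str]
) -> tuple[list[str], set[str]]:
    """Check a new release against hashes from earlier releases.

    Returns the names of duplicate files and the updated hash set.
    A file counts as a duplicate if its content already appeared in an
    earlier release or earlier in the same release.
    """
    duplicates = []
    updated = set(seen_hashes)
    for name, data in new_files.items():
        digest = sha256_hex(data)
        if digest in updated:
            duplicates.append(name)
        else:
            updated.add(digest)
    return duplicates, updated
```

Persisting the returned hash set after each delivery lets every later release be checked against everything shipped before it.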

Project timeline:

  • Pilot & Sampling: 6 days
  • Guidelines & Metrics Alignment: 10 days
  • Collection & Validation: 6 weeks
  • QA & Final Dataset Delivery: 2 weeks

The Results

  • Collected 20,000 palm sets and launched the next batch of 20,000
  • Delivered a fully verified dataset ahead of schedule with consistently high quality
  • Scaled the process while keeping predictable speed and global coverage, with unified standards regardless of region, device or platform

Large-scale biometric datasets are built on process discipline, not volume alone. Stable quality emerges when sourcing, instructions, validation, and deduplication work as a single continuous pipeline rather than isolated steps.

Hanna Parkhots, Data Collection Team Lead
