Data Collection for a Video Analytics System: Children’s Laughter and Crying

We faced a challenging task: collecting 750 unique recordings of children's laughter, crying, and speech within a month, all while meeting strict quality and diversity requirements. Thanks to a flexible data collection approach, multi-level verification, and well-coordinated teamwork, we successfully met the deadline.

Industry: Development of systems that respond to children's laughter and crying
Data: 750 unique audio files featuring children's voices
Timeline: 1 month

The Task

The client requested 750 unique audio recordings of children's laughter, crying, and speech within one month. Each child could participate only once, so no participant could be recorded more than once to fill the quota. Strict quality and diversity requirements made the task harder still.

The Solution

To ensure an efficient data collection process, we divided it into several stages:

  • 01

    Defining Data Requirements:

    • Each child needed to provide five recordings: two speech samples, two laughter samples, and one crying sample.
    • Each audio file had to be 20 seconds long.
    • Participants’ ages ranged from 0 to 4 years, with specific quotas for each age group.
  • 02

    Data Collection Approach:

    • A pilot phase using the Yandex.Toloka platform proved to be too slow.
    • We switched to an in-house collection strategy, engaging parents through social media and childcare institutions.
    • To verify authenticity, we required submissions in video format. This let us confirm that the laughter, crying, and speech genuinely came from a child and that no participant appeared more than once.
  • 03

    Data Verification and Processing:

    • Initial validation by our team.
    • Audio processing: Extracting sound from video files and segmenting recordings into 20-second clips.
    • Final verification to ensure compliance with all requirements.
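
The segmentation and compliance checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline the team actually used: the `Clip` record, the `child_id` field, the function names, and the 0.5-second duration tolerance are all hypothetical. The per-child quota (two speech, two laughter, one crying) and the 20-second clip length come from the requirements listed in stage 01.

```python
from collections import Counter
from dataclasses import dataclass

CLIP_SECONDS = 20.0  # clip length stated in the requirements
REQUIRED_PER_CHILD = {"speech": 2, "laughter": 2, "crying": 1}  # five per child


@dataclass(frozen=True)
class Clip:
    child_id: str    # hypothetical participant identifier
    kind: str        # "speech", "laughter", or "crying"
    duration: float  # clip length in seconds


def segment_bounds(total_seconds: float,
                   clip_seconds: float = CLIP_SECONDS) -> list[tuple[float, float]]:
    """Return (start, end) offsets for every full clip that fits in a recording."""
    count = int(total_seconds // clip_seconds)
    return [(i * clip_seconds, (i + 1) * clip_seconds) for i in range(count)]


def check_child(clips: list[Clip], tolerance: float = 0.5) -> list[str]:
    """Return the problems found in one child's clips; an empty list means compliant.

    The tolerance is an assumed value, not one given in the case study.
    """
    problems = []
    counts = Counter(clip.kind for clip in clips)
    # Compare the number of clips of each kind with the stated quota.
    for kind, needed in REQUIRED_PER_CHILD.items():
        if counts.get(kind, 0) != needed:
            problems.append(f"{kind}: expected {needed}, got {counts.get(kind, 0)}")
    # Flag any label that is not one of the three required kinds.
    for kind in sorted(counts.keys() - REQUIRED_PER_CHILD.keys()):
        problems.append(f"unexpected kind: {kind}")
    # Flag clips whose length differs from 20 seconds by more than the tolerance.
    for clip in clips:
        if abs(clip.duration - CLIP_SECONDS) > tolerance:
            problems.append(f"{clip.kind}: duration {clip.duration:.1f}s, expected {CLIP_SECONDS:.0f}s")
    return problems
```

For example, `segment_bounds(65.0)` yields three full clips and discards the trailing 5 seconds. If every child supplied exactly the five required recordings, the 750-file target would correspond to 150 participants (750 / 5).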

The Results

  • 750 unique audio recordings were collected within the deadline.

  • The dataset met the required diversity and authenticity standards.

  • The client successfully validated the data and was fully satisfied with the outcome.
