Data Collection for Facial and Speech Recognition
Through data collection, the client improved their biometric system for facial and voice recognition by 21%.
- Industry and use case: Security
- Data: 10,000 attack videos
- Project duration: 21 days
Task:
The client asked us to collect data to improve their facial and speech recognition system. The technical requirements included collecting unique photos and videos offline, meeting strict criteria for sound and shooting format, and recruiting 1,000 participants for the recordings. The client’s team could not organize data collection at this scale and in the required format, so they turned to Unidata.
Solution:
To collect this dataset, we set up five studios and used promoters to recruit participants, who received a reward for taking part. Each studio had a manager who assisted participants with the recordings and oversaw the data collection process. Each studio recorded 8-10 participants per day.
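As a rough sanity check, the studio figures above can be multiplied out to see whether they are in line with the 1,000-participant target. The sketch below assumes that all five studios ran on every one of the 21 project days, which the case study does not state; the function name `participant_range` is ours and purely illustrative.

```python
def participant_range(studios, per_day_min, per_day_max, days):
    """Return (low, high) total participants.

    Assumption: every studio records on every project day.
    """
    low = studios * per_day_min * days
    high = studios * per_day_max * days
    return low, high


# Figures from the case study: 5 studios, 8-10 participants per day, 21 days.
low, high = participant_range(studios=5, per_day_min=8, per_day_max=10, days=21)
print(low, high)  # 840 1050
```

Under that assumption the range is 840 to 1,050 participants, so the 1,000-participant target falls inside it at the upper end of the daily rate.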
Results:
- The client improved their biometric system for facial and voice recognition by 21%.
- We collected a unique dataset with over 400,000 data points from 1,000 offline participants.
- We assembled a team of promoters and identified suitable locations for offline projects throughout the city (NDA).