The Task
Picture this: a medical company conducting advanced research on baldness approached us with an unusual request.
They needed a photo dataset of bald men. Each person's set had to consist of five photos taken from different angles: front view, left and right profiles, top view, and back view.
But it wasn’t just about collecting photos—it was about creating a database where each image was "labeled" with medical precision. The reference point was the Norwood scale, a classification system for the stages of baldness. There was no room for error.
The Solution
Preparation Stage:
- Studying the Norwood Scale: We studied the scale in depth, reviewing the available material on it.
- Guideline Creation: With the help of medical experts, we developed an extremely detailed guide for our annotators, describing each stage of baldness down to the millimeter, complete with illustrations and diagrams.
Annotator Training and Preparation:
- Every annotator had to pass a 50-example entry test on the Norwood scale before starting work.
- We conducted hands-on training on annotating photos with different degrees of baldness, analyzing ambiguous cases, and discussing evaluation criteria.
- After training, we calibrated the annotators’ work to ensure uniformity and precision.
- Annotators were also trained in using spreadsheets, data processing platforms, and annotation tools.
- To support the team, we set up a helpdesk for real-time expert assistance.
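The entry test described above can be sketched as a simple comparison of an annotator's answers against expert reference labels. This is only an illustration: the pass mark, the `score_entry_test` helper, and the plain stage labels are assumptions, since the only detail we have stated is that the test had 50 examples.

```python
from dataclasses import dataclass

# Assumption: the real pass mark is not stated; 0.9 is a placeholder.
PASS_THRESHOLD = 0.9


@dataclass
class TestResult:
    annotator: str
    correct: int
    total: int

    @property
    def accuracy(self) -> float:
        # Share of test examples labeled the same way as the experts did.
        return self.correct / self.total

    @property
    def passed(self) -> bool:
        return self.accuracy >= PASS_THRESHOLD


def score_entry_test(annotator: str, answers: list[str], gold: list[str]) -> TestResult:
    """Compare an annotator's stage labels with the expert (gold) labels."""
    if len(answers) != len(gold):
        raise ValueError("answers and gold labels must have the same length")
    correct = sum(a == g for a, g in zip(answers, gold))
    return TestResult(annotator, correct, len(gold))


# Illustrative 50-example test with made-up stage labels.
gold_labels = ["1", "2", "3", "4", "5"] * 10
answers = list(gold_labels)
answers[0], answers[7], answers[21] = "2", "4", "1"  # three mistakes

result = score_entry_test("annotator_01", answers, gold_labels)
print(result.correct, result.total, result.passed)
```

The same comparison can be rerun after training to see whether annotators have converged on the guideline.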
Data Collection:
- A major crowdsourcing platform served as the primary channel for collecting photos.
- We created a clear and engaging task for participants, complete with detailed instructions and examples. To encourage high-quality submissions, we offered competitive compensation.
- Our team closely monitored the process to promptly identify and address potential issues.
Data Validation:
- We implemented a multi-layered verification system. First, an automated check ensured that all required photos were present and met technical specifications.
- Then, experts manually annotated the images using the Norwood scale, cross-referencing them with our guidelines.
- As a final safeguard, we ran a cross-review in which one group of experts checked the work of another.
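A minimal sketch of the automated check and the expert cross-review might look like the following. The view names, the minimum resolution, and both helper functions are hypothetical; the actual technical specifications are not listed here.

```python
# The five required views of one person (names are illustrative).
REQUIRED_VIEWS = {"front", "left_profile", "right_profile", "top", "back"}

# Assumption: a placeholder technical spec, not the real one.
MIN_WIDTH, MIN_HEIGHT = 1024, 1024


def check_photo_set(photos: dict[str, tuple[int, int]]) -> list[str]:
    """Return a list of problems for one set; an empty list means it passes.

    `photos` maps a view name to the image's (width, height) in pixels.
    """
    problems = []
    present = set(photos)
    for view in sorted(REQUIRED_VIEWS - present):
        problems.append(f"missing view: {view}")
    for view in sorted(present - REQUIRED_VIEWS):
        problems.append(f"unexpected view: {view}")
    for view in sorted(present & REQUIRED_VIEWS):
        width, height = photos[view]
        if width < MIN_WIDTH or height < MIN_HEIGHT:
            problems.append(f"low resolution: {view} ({width}x{height})")
    return problems


def find_disagreements(first_pass: dict[str, str], review_pass: dict[str, str]) -> list[str]:
    """Return IDs of sets where the reviewer's stage differs from the annotator's."""
    shared = set(first_pass) & set(review_pass)
    return sorted(sid for sid in shared if first_pass[sid] != review_pass[sid])


# One set is missing the top view and has a small back-view image.
submission = {
    "front": (2000, 1500),
    "left_profile": (2000, 1500),
    "right_profile": (2000, 1500),
    "back": (640, 480),
}
print(check_photo_set(submission))

# The reviewer disagrees with the first annotator on one set.
print(find_disagreements({"set_1": "3", "set_2": "5"}, {"set_1": "3", "set_2": "4"}))
```

Sets that fail the first function never reach an annotator, and sets flagged by the second go back to the experts for a decision.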
The Result
- We met the deadline and kept annotation errors below 1%, thanks to thorough annotator training and a multi-step validation process.
- The client approved all the data and was highly satisfied with the results.