The Task
The client requested transcription of 80 hours of audio, with a strong focus on accuracy and full alignment between the transcripts and the original recordings. The files contained real conversations and calls with background noise.
A key feature of this project was the absence of pre-labeling. We worked with audio files of varying quality, including challenging cases with noise and overlapping voices.
It was also essential to keep the text synchronized with the audio, which required close attention, especially in segments where several people spoke simultaneously or where background sounds interfered.
The Solution
Preparation and workflow organization:
- The data was split into short fragments of 5-15 seconds and uploaded to Label Studio.
- Each annotator received clear instructions on working with audio and accurately capturing the spoken text.
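The fragmentation step above can be sketched in a few lines. This is a minimal illustration, not the pipeline used on the project: the function names, the 10-second target length, and the `start`/`end` keys are assumptions made for the example. It cuts a recording into equal fragments that stay inside the 5-15 second window and wraps them in the task-list JSON shape that Label Studio accepts on import (a list of objects with a `data` field). A production splitter would more likely cut at pauses in speech than at fixed offsets.

```python
import json


def split_into_fragments(duration_s, target_s=10.0, max_s=15.0):
    """Return contiguous (start, end) pairs, in seconds, covering the recording.

    Hypothetical helper: fragments are equal in length and as close to
    target_s as possible, which keeps them between 7.5 and 12.5 seconds
    for any recording longer than max_s.
    """
    if duration_s <= 0:
        return []
    if duration_s <= max_s:
        # A short recording stays in one piece (it may be under 5 seconds).
        return [(0.0, float(duration_s))]
    count = round(duration_s / target_s)
    length = duration_s / count
    fragments = [(i * length, (i + 1) * length) for i in range(count)]
    # Pin the last boundary to the exact duration to avoid float drift.
    fragments[-1] = (fragments[-1][0], float(duration_s))
    return fragments


def to_label_studio_tasks(audio_url, fragments):
    """Build an import-ready task list; key names under "data" are assumed
    and must match whatever the project's labeling config references."""
    return [
        {"data": {"audio": audio_url, "start": start, "end": end}}
        for start, end in fragments
    ]


if __name__ == "__main__":
    frags = split_into_fragments(62.0)
    tasks = to_label_studio_tasks("https://example.com/call_001.wav", frags)
    print(json.dumps(tasks[:2], indent=2))
```

Equal-length cuts keep the sketch simple and predictable; the trade-off is that a boundary can land mid-word, which is why the fragments are kept contiguous so no audio is lost between neighbors.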
Data annotation:
- Annotators carefully listened to each recording and manually transcribed the spoken words.
- Special attention was paid to segments with overlapping voices or muffled speech, where the words are hardest to make out.
Quality control:
- All data underwent validation; fragments with errors were returned to the annotators with feedback for correction.
- A feedback system was used throughout the project to improve accuracy and efficiency.
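One common way to put a number on the validation step above is word error rate (WER): the word-level edit distance between the validator's corrected transcript and the annotator's version, divided by the length of the corrected transcript. The source does not say which metric the team used, so the sketch below is an assumption for illustration only; the function names and the 5% rework threshold are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # prev[j] holds the distance between the processed reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def needs_rework(validated, submitted, threshold=0.05):
    """Flag a fragment for return to the annotator (threshold is assumed)."""
    return word_error_rate(validated, submitted) > threshold


if __name__ == "__main__":
    validated = "please confirm your account number"
    submitted = "please confirm you account number"
    print(word_error_rate(validated, submitted))
    print(needs_rework(validated, submitted))
```

Tracking this value per annotator over time would give the feedback loop something concrete to act on, although a raw score does not distinguish a harmless spelling variant from a misheard account number.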
| Stage | Input | Workflow Scope | Main Quality Checks |
|---|---|---|---|
| Secure Setup | Client internal annotation module | Access configuration, NDA compliance, infrastructure onboarding | Access control compliance / Data security adherence |
| Team Onboarding | Banking service documentation | Domain training, internal testing, terminology exams | Service understanding / Exam validation |
| Annotation | Banking call audio records | Intent classification and structured labeling inside client system | Correct category assignment / Edge-case handling |
| Calibration | Annotated samples | Twice-weekly review sessions and corner-case discussion | Label consistency / Disagreement resolution |
| Final Validation | Classified records | Parallel validation and final consolidation | Accuracy threshold compliance / Bias reduction |
The Results
- The project was delivered on time, at 80 hours of transcription per month.
- Quality was ensured first and foremost through effective training and the team’s deep understanding of the guidelines, with validation as an important second safeguard.
- The team’s high productivity allowed us to consistently handle the workload without pre-labeling.
Financial calls often contain unclear pronunciation, background noise, and overlapping speech. In such cases, experience matters more than speed. We relied on senior transcribers who could understand the context, not just the sound.
- Vladislav Barsukov
- Head of SLM&LLM Annotation