The Task
The city administration required a high-quality dataset to train a neural network capable of tracking waste bin fill levels and ensuring timely collection. To enhance the model’s accuracy, the dataset needed to include images of waste bins of various types and capacities, captured under different lighting and weather conditions—ranging from clear skies to rain and snow. The ultimate goal was to optimize waste collection vehicle logistics and reduce operational costs.
Our Solution
We implemented a comprehensive data collection strategy using two key approaches:
Crowdsourcing:
To cover a wide range of scenarios, we engaged a broad network of contributors, who captured 1,000 images of waste bins across different city areas.
This approach let us quickly gather diverse images that reflected the required variations in lighting and weather conditions.
Rapid Response Data Collection Team:
To fill in missing scenarios, we assembled a mobile team. These specialists followed predefined routes, capturing waste bin images in challenging conditions such as nighttime, heavy rainfall, and densely populated residential areas.
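As an illustration of how missing scenarios might be spotted before sending out a mobile team, here is a minimal sketch. It assumes each collected image carries metadata tags for bin type, lighting, and weather. The category names, the `find_coverage_gaps` helper, and the `min_per_scenario` threshold are all hypothetical; the case study does not specify the actual taxonomy or tooling.

```python
from collections import Counter
from itertools import product

# Hypothetical scenario axes; the project's real categories are not given in the case study.
BIN_TYPES = ["street_bin", "container"]
LIGHTING = ["day", "night"]
WEATHER = ["clear", "rain", "snow"]


def find_coverage_gaps(images, min_per_scenario=1):
    """Return every (bin_type, lighting, weather) combination with too few images."""
    counts = Counter(
        (img["bin_type"], img["lighting"], img["weather"]) for img in images
    )
    return [
        combo
        for combo in product(BIN_TYPES, LIGHTING, WEATHER)
        if counts[combo] < min_per_scenario
    ]


# Two sample crowdsourced images (made-up metadata for illustration only).
sample = [
    {"bin_type": "street_bin", "lighting": "day", "weather": "clear"},
    {"bin_type": "container", "lighting": "day", "weather": "rain"},
]

gaps = find_coverage_gaps(sample)
for combo in gaps:
    print("missing scenario:", combo)
```

With the two sample images, 10 of the 12 possible combinations are reported as gaps, including every nighttime scenario. A list like this could then be turned into targeted routes for a collection team.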
| Stage | Input | Workflow Scope | Main Quality Checks |
|---|---|---|---|
| Scenario Planning | Client requirements, urban use cases | Definition of bin types, locations, weather scenarios | Coverage completeness, scenario relevance |
| Crowdsourcing Collection | Distributed contributors | Image collection across city areas | Diversity, lighting and weather variation |
| Rapid Response Collection | Targeted scenarios | Mobile team capturing missing conditions | Edge case coverage, data consistency |
| Image Annotation | Raw bin images | Labeling fill levels and bin types | Label accuracy, guideline adherence |
| Validation | Annotated dataset | Cross-checking and error detection | Inter-annotator agreement, bias control |
| Final QA | Validated dataset | Dataset consolidation and delivery | Consistency, client acceptance |
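The Validation stage lists inter-annotator agreement as a quality check. The case study does not say which metric was used; Cohen's kappa is one common choice for two annotators, and the sketch below computes it for fill-level labels. The label names (`empty`, `half`, `full`) and the sample annotations are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical fill-level labels from two annotators for the same six images.
annotator_a = ["empty", "half", "full", "full", "half", "empty"]
annotator_b = ["empty", "half", "full", "half", "half", "empty"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

For this sample the annotators agree on five of six images and kappa comes out at 0.75. Any acceptance threshold for such a score would be a project-specific decision and is not stated in the case study.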
The Results
- 87% improvement in model accuracy for waste bin monitoring, significantly increasing system efficiency.
- Optimized waste collection logistics: garbage trucks now respond to bin fill levels in real time, reducing costs and improving city sanitation.
- The city administration gained an automated waste management tool, accelerating municipal services’ response to overfilled bins and reducing citizen complaints.
Waste detection datasets require strong diversity in environmental conditions and consistent annotation standards. Model accuracy depends on capturing real-world variability and ensuring quality control across all stages of data collection and labeling.
Roman Lukoshin, Speech and Generative Data Manager