Commercial
Spanish Speech Recognition Dataset
The dataset contains audio recordings of real-world Spanish telephone dialogues between native speakers, with detailed annotations. It is intended for training speech recognition systems and language models, and for developing speech technology and NLP applications in Spanish.
- Hours: 488
- Speakers: 600
- Word Accuracy Rate: 98%
- NLP
- LLM
- Machine Learning
- Audio Processing
- ASR
- Voice Recognition
Dataset Info
Characteristic | Data |
Description | Audio of telephone dialogues in Spanish for training NLP models in real-world conversational scenarios. |
Data types | Audio |
Tasks | Speech recognition, NLP |
Country | Spain (ESP) |
Hours of telephone dialogue | 488 |
Number of speakers | 600 |
Labeling | Annotation (text content, speaker's ID, gender, age and other attributes) |
Gender | Male (49%), Female (51%) |
Recording device | Telephone |
Statistics
- Distribution by gender (chart)
Technical Characteristics
Characteristic | Data |
Audio Format | PCM, A-law/μ-law |
Sampling Rate | 8 kHz |
Number of Channels | Mono |
Recording condition | Low background noise (indoor) |
Dataset Use Cases
What can the Spanish Speech Recognition Dataset be used for?
It can be used for training automatic speech recognition systems, language models, and NLP applications. It supports voice technology, speech-to-text solutions, and machine learning models that require authentic Spanish speech data.
In what format is the dataset provided?
The dataset is available in PCM, A-law, and μ-law formats, sampled at 8 kHz with a single (mono) channel.
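For readers working with the companded formats, the sketch below decodes G.711 μ-law bytes to 16-bit linear PCM and wraps the result in a WAV container using the 8 kHz mono settings stated above. It assumes the audio arrives as headerless (raw) μ-law bytes, which this page does not confirm. The names `ulaw_byte_to_linear`, `ulaw_to_pcm16`, and `write_wav` are illustrative and are not part of any Unidata tooling.

```python
import io
import struct
import wave

SAMPLE_RATE_HZ = 8000  # sampling rate stated for the dataset
BIAS = 0x84            # G.711 mu-law bias (132)


def ulaw_byte_to_linear(code: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    code = ~code & 0xFF              # mu-law bytes are stored bit-inverted
    sign = code & 0x80               # top bit: sign
    exponent = (code >> 4) & 0x07    # next three bits: segment number
    mantissa = code & 0x0F           # low four bits: step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Convert raw mu-law bytes to little-endian 16-bit PCM bytes."""
    samples = [ulaw_byte_to_linear(b) for b in data]
    return struct.pack(f"<{len(samples)}h", *samples)


def write_wav(pcm16: bytes, dest, sample_rate: int = SAMPLE_RATE_HZ) -> None:
    """Write mono 16-bit PCM to a path or a binary file-like object."""
    with wave.open(dest, "wb") as wav:
        wav.setnchannels(1)      # mono, as stated for the dataset
        wav.setsampwidth(2)      # 2 bytes per sample after decoding
        wav.setframerate(sample_rate)
        wav.writeframes(pcm16)


if __name__ == "__main__":
    raw = bytes([0xFF, 0x80, 0x00])  # three hypothetical mu-law code bytes
    buffer = io.BytesIO()
    write_wav(ulaw_to_pcm16(raw), buffer)
    print(len(buffer.getvalue()), "bytes of WAV data")
```

A-law uses a different companding curve, so A-law files need their own decoder; the WAV-writing step would stay the same.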
What types of annotations are provided?
The Unidata Spanish Speech Recognition Dataset includes fully labeled text transcriptions of the conversations, along with speaker metadata such as ID, gender, age, and other attributes. These annotations are essential for building high-accuracy recognition models and language processing systems.
Can I request a sample of the dataset before purchasing it?
Yes, you can request a sample of the dataset. The sample allows you to review audio recordings, transcriptions, and speaker metadata so you can confirm the dataset’s quality for your speech recognition or NLP tasks.
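One way to check a sample against the advertised 98% word accuracy rate is to compare its transcriptions with your own reference transcripts of the same recordings. The sketch below assumes word accuracy means 1 minus the word error rate (word-level edit distance divided by the number of reference words); the page does not say how its figure is measured. The helpers `word_errors` and `word_accuracy` are hypothetical names, and the example sentences are made up for illustration.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (edit distance in words, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Rolling-row Levenshtein distance over words instead of characters.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(pairs) -> float:
    """Corpus-level word accuracy over (reference, hypothesis) pairs."""
    errors = 0
    total = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        total += n
    if total == 0:
        raise ValueError("no reference words to score against")
    return 1 - errors / total


if __name__ == "__main__":
    # Made-up pairs: (your reference transcript, transcription under review)
    pairs = [
        ("hola buenos días", "hola buenas días"),
        ("gracias por llamar", "gracias por llamar"),
    ]
    print(f"word accuracy: {word_accuracy(pairs):.3f}")
```

The score depends on text normalization: casing, punctuation, and number formatting should be made consistent on both sides before comparing.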
UniData
Why Choose Us
Unidata offers unparalleled expertise in AI data solutions, delivering superior data quality and optimized workflows.
- Expertise: Our team consists of industry-leading experts in AI data solutions.
- Quality: We ensure superior data quality to maximize your AI project's potential.
- Efficiency: Our optimized workflows accelerate your model training processes.
- Proven Results: Our track record of case studies demonstrates our ability to deliver outstanding outcomes.
- Customization: Our track record of case studies demonstrates our ability to deliver outstanding outcomes.
- Support: We provide ongoing support and consultation to ensure continuous success.
- 1000+ full-time assessors