Commercial

Real vs Fake Human Voice – Deepfake Audio Dataset

Real vs Fake Human Voice – Deepfake Audio Dataset contains 5,000 audio files featuring both genuine human recordings and AI-generated voice samples. Each set includes four speakers with multiple clips across M4A and MP3 formats. The dataset supports research in deepfake detection, generated speech analysis, and real vs fake human voice recognition tasks.

Audio

5,000

Speech Analysis
ASR
Machine learning
Data generation
Audio Processing

Dataset Info

Characteristic	Data
Description	Audio for deepfake voice detection, containing genuine human speech recordings paired with multiple matching synthetic copies.
Data types	Audio
Tasks	Deepfake detection, generated speech analysis, real vs fake voice recognition
Total number of files	5,000
Number of files in a set	4 speakers × 5 clips × 4 audio files (1 original + 3 synthetic)
Labeling	Metadata (country, gender, ID, age, audio group, audio name, audio text)
Gender	Male, Female
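
The set composition in the table can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures listed above; the page does not state how many sets the dataset contains, so the set count computed here is an inference, not a published figure.

```python
# Set composition as listed in the Dataset Info table.
SPEAKERS_PER_SET = 4
CLIPS_PER_SPEAKER = 5
FILES_PER_CLIP = 4  # 1 original + 3 synthetic
TOTAL_FILES = 5_000

files_per_set = SPEAKERS_PER_SET * CLIPS_PER_SPEAKER * FILES_PER_CLIP
originals_per_set = SPEAKERS_PER_SET * CLIPS_PER_SPEAKER * 1
synthetic_per_set = SPEAKERS_PER_SET * CLIPS_PER_SPEAKER * 3

# divmod shows whether the total is a whole number of sets.
full_sets, leftover_files = divmod(TOTAL_FILES, files_per_set)

print(f"files per set: {files_per_set}")
print(f"originals / synthetic per set: {originals_per_set} / {synthetic_per_set}")
print(f"full sets in {TOTAL_FILES} files: {full_sets}, leftover: {leftover_files}")
```

Under these figures a set holds 80 files, and 5,000 files do not divide evenly into sets of 80, so the total may be rounded or some sets may deviate from the listed composition. Counting the files in the delivered copy is the reliable check.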

Statistics

Figure: Distribution by gender

Technical Characteristics

Characteristic	Data
Audio Extensions	M4A, MP3
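Data Type	Real and generated

Source and collection methodology: Synthetic samples were AI-generated with TTS models. Real recordings were collected through crowdsourcing platforms with speaker consent (see FAQs).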
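
Because the files come in two formats, a quick inventory by extension is a sensible first check on a delivered copy. The sketch below is illustrative only: the function name `count_by_extension` and the example file names are hypothetical, and the page does not describe the folder layout or say which format is used for which kind of clip.

```python
from collections import Counter
from pathlib import PurePath

# Extensions listed on the page; compared case-insensitively.
AUDIO_EXTENSIONS = {".m4a", ".mp3"}


def count_by_extension(paths):
    """Count audio files per extension, ignoring anything that is not M4A/MP3."""
    counts = Counter()
    for p in paths:
        ext = PurePath(p).suffix.lower()
        if ext in AUDIO_EXTENSIONS:
            counts[ext] += 1
    return counts


# Hypothetical file names, for illustration only.
example_paths = [
    "set_01/clip_001.m4a",
    "set_01/clip_002.mp3",
    "set_01/clip_003.MP3",
    "set_01/metadata.csv",
]
counts = count_by_extension(example_paths)
print(dict(counts))
```

For a copy on disk, the same function can be fed `Path(root).rglob("*")`, where `root` is wherever the archive was unpacked.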

Dataset Use Cases

Cybersecurity and Fraud Prevention
Developing Reliable Deepfake Voice Detection Systems

This deepfake audio dataset provides real human and AI-generated speech essential for detecting fake voices used in fraud and impersonation. With detailed metadata on gender, accent, and duration, it enables the training of accurate detection models. Applications include secure banking, telecommunication verification, and government systems, improving data protection and digital security against voice-based attacks.
Voice Authentication and Biometrics
Improving Voice-Based Identity Verification

The dataset supports voice recognition and biometric authentication by including both genuine and synthetic speech. Developers can train systems to detect spoofing, verify liveness, and enhance accuracy in voice-controlled platforms. Metadata on speaker identity and speech characteristics ensures models can differentiate real and fake voices, strengthening security in enterprise access, mobile authentication, and identity verification services.
Media Integrity and Journalism
Detecting Synthetic Voices in Broadcasts and Online Content

Researchers and forensic analysts can use this dataset to identify deepfake voices in media and social platforms. By providing real and AI-generated recordings, it enables training of models to detect manipulation, verify authenticity, and combat misinformation. Applications include journalism verification, monitoring social media, and supporting legal investigations in voice-based deception cases.
AI Ethics and Research
Exploring Responsible Use of Synthetic Voice Technology

This dataset enables research on ethical synthetic voice applications by comparing real and AI-generated speech. Experts can analyze imitation quality, emotional tone, and human-likeness to guide responsible AI use. Applications include improving accessibility, assistive technologies, and entertainment, while ensuring deepfake detection, ethical voice synthesis, and safe deployment in media and AI systems.

FAQs

What is included in this dataset?

The dataset includes 5,000 audio files containing real and synthetic speech samples from male and female speakers. Each human recording is paired with three AI-generated versions, offering direct comparison for deepfake detection and voice authenticity testing.
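
The 1-to-3 pairing means synthetic clips outnumber real ones, which matters when training a binary classifier. The sketch below estimates class counts and inverse-frequency class weights. It assumes every one of the 5,000 files belongs to a complete group of one real and three synthetic clips, which the page implies but does not state outright. The helper `inverse_frequency_weights` is a hypothetical name, and inverse-frequency weighting is one common option, not something the page prescribes.

```python
def inverse_frequency_weights(counts):
    """Return weight = total / (n_classes * count) for each class label."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


TOTAL_FILES = 5_000
FILES_PER_GROUP = 4  # 1 real recording + 3 AI-generated versions

# Assumes every file belongs to a complete 1 + 3 group.
real = TOTAL_FILES // FILES_PER_GROUP
counts = {"real": real, "synthetic": TOTAL_FILES - real}
weights = inverse_frequency_weights(counts)

print(counts)
print(weights)
```

With these assumptions the real class gets three times the weight of the synthetic class, which offsets the imbalance in a weighted loss.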

What types of annotations are provided?

Each file includes metadata annotations such as speaker ID, gender, accent, native locale, text ID, and duration in seconds. These labels help researchers track speaker variations, analyze speech patterns, and improve deepfake voice classification accuracy.
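
One practical use of the speaker ID annotation is a speaker-disjoint split, so that a detector is evaluated on voices it has not heard in training. The following is a minimal sketch, not the vendor's procedure. The field names (`speaker_id`, `gender`, `text_id`, `duration_s`) and the rows are hypothetical stand-ins for the real metadata schema, and the sketch assumes synthetic copies carry the same speaker ID as the recording they imitate.

```python
import random
from collections import defaultdict


def split_by_speaker(rows, test_fraction=0.25, seed=0):
    """Split metadata rows so that no speaker ID appears in both train and test."""
    by_speaker = defaultdict(list)
    for row in rows:
        by_speaker[row["speaker_id"]].append(row)

    # Sort first so the shuffle is reproducible for a given seed.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])

    train = [r for s in speakers if s not in test_speakers for r in by_speaker[s]]
    test = [r for s in speakers if s in test_speakers for r in by_speaker[s]]
    return train, test


# Hypothetical rows: field names and values are illustrative, not the dataset's schema.
rows = [
    {"speaker_id": f"spk{s}", "gender": g, "text_id": t, "duration_s": 3.0}
    for s, g in enumerate(["male", "female", "male", "female"])
    for t in range(5)
]
train, test = split_by_speaker(rows)
print(len(train), len(test))
```

Splitting by speaker rather than by file keeps a real recording and its synthetic versions on the same side of the split, provided the speaker ID assumption above holds.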

How was the data collected?

The real human voice recordings were collected through crowdsourcing platforms under consented conditions. The synthetic speech samples were generated using AI-based TTS models, ensuring controlled and reproducible comparisons between authentic and generated voices.

Can I request a sample of the dataset before purchasing or downloading it?

Yes. Unidata provides free sample files so you can evaluate audio quality, metadata accuracy, and synthetic generation consistency before purchase. These samples help you determine whether the dataset fits your machine learning or voice recognition project.

How are Unidata datasets licensed?

Unidata datasets follow a dual-licensing model. Free samples are available for initial testing, while the complete dataset is available for purchase to ensure full access to all files and metadata.

Do Unidata datasets follow GDPR or other data privacy regulations?

Yes. All Unidata datasets are created and distributed in compliance with GDPR and international data protection laws. Every voice recording and generated sample is handled ethically, ensuring no personal or identifiable data is included.

How are Unidata datasets stored?

All datasets are securely stored on AWS cloud infrastructure, providing high availability, data integrity, and scalability. Unidata’s storage practices comply with ISO 27001 and ISO 27701 standards, ensuring a secure and privacy-focused environment for handling audio data.

Is this a real-world dataset or synthetic data?

The dataset includes both real-world human voice recordings and synthetic deepfake audio. This combination provides a balanced foundation for training AI models to differentiate between authentic and generated speech, enhancing the performance of deepfake detection systems.
