Commercial

DeepFake Videos Dataset

The dataset contains real and AI-generated deepfake videos of diverse subjects, with detailed metadata on age, gender, and ethnicity, to help train deepfake detectors.

  • Files: 10,000+
  • People: 7,000+
  • Facial Recognition
  • Computer Vision
  • Machine learning
  • Data generation
  • Security

Dataset Info

Characteristic: Data
Description: Real video of people with AI-generated faces, where individuals turn their heads in different directions
Data types: Video
Tasks: Facial recognition, Computer Vision
Total number of files: 10,000
Number of people: 7,000
Video generation sites: aisaver.io, faceswapvideo.ai, magichour.ai
Labeling: Metadata (age, gender, ethnicity)
Gender: Male, Female
Ethnicity: Asian (30%), African (70%)
Age: Min = 18, max = 80, mean = 45

Statistics

[Charts: distribution by age, distribution of video duration, distribution by gender, distribution by ethnicity]

Technical Characteristics

Characteristic: Data
Video extension: mp4, MOV
Video resolutions: 1920 x 1080, 480 x 360, 1280 x 720, 720 x 480, 640 x 480, 1920 x 920
Video duration: Mean = 9, median = 9, min = 2, max = 34
Frames per second: Mean = 26.6
Devices: iPhone 13 (30%), Google Pixel (70%)
Source and collection methodology. Data was collected by overlaying generated faces onto real videos using the following websites: aisaver.io, faceswapvideo.ai, and magichour.ai.
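
The characteristics above can double as a sanity check on delivered files. The following is a minimal sketch, not part of the dataset's tooling: the `ClipInfo` record and `check_clip` function are hypothetical names, the duration unit is assumed to be seconds (the table does not state it), and both landscape and portrait orientation are accepted as an assumption, since the table lists only one orientation per resolution.

```python
from dataclasses import dataclass

# Values copied from the Technical Characteristics table.
ALLOWED_EXTENSIONS = {"mp4", "mov"}
ALLOWED_RESOLUTIONS = {
    (1920, 1080), (480, 360), (1280, 720),
    (720, 480), (640, 480), (1920, 920),
}
# Assumption: the table gives no unit for duration; seconds are assumed here.
MIN_DURATION, MAX_DURATION = 2, 34


@dataclass
class ClipInfo:
    """Hypothetical per-clip record; the real delivery format may differ."""
    filename: str
    width: int
    height: int
    duration: float


def check_clip(clip: ClipInfo) -> list[str]:
    """Return a list of mismatches between a clip and the documented specs."""
    problems = []

    # Compare the file extension case-insensitively ("MOV" and "mov" both pass).
    ext = clip.filename.rsplit(".", 1)[-1].lower() if "." in clip.filename else ""
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unexpected extension: {ext!r}")

    # Assumption: a portrait clip (height > width) matches its landscape listing.
    size = (clip.width, clip.height)
    if size not in ALLOWED_RESOLUTIONS and size[::-1] not in ALLOWED_RESOLUTIONS:
        problems.append(f"unexpected resolution: {clip.width} x {clip.height}")

    if not MIN_DURATION <= clip.duration <= MAX_DURATION:
        problems.append(f"duration out of range: {clip.duration}")

    return problems
```

An empty list means the clip matches the documented extension, resolution, and duration range. The width, height, and duration values would need to be read with a video tool of your choice; that step is left out of this sketch.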

Dataset Use Cases

  • Cybersecurity & Digital Forensics

    Detecting Deepfake Content and Fake Videos

    The DeepFake Videos Dataset provides training data for deepfake detectors and detection algorithms. Because it contains both real videos and fake videos generated with deepfake tools, analysts can train detection systems that identify synthetic media and help protect against identity fraud, misinformation, and other digital threats.



  • AI & Machine Learning Research

    Training Models for Deepfake Detection

    This deepfake detection dataset is used in machine learning and deep learning projects. It consists of thousands of video clips and face images, including manually labelled examples. Models trained on this data can become more accurate at spotting AI-generated videos and at distinguishing real footage from synthetic footage.



  • Media & Journalism

    Verifying Video Content Authenticity

    News organizations use such datasets to enhance video detection tools that verify YouTube videos, interviews, and shared clips. By training recognition systems on datasets containing both source videos and generated faces, journalists can validate footage, identify manipulated content, and strengthen trust in digital reporting.



  • Technology & App Development

    Building Safer Recognition and Verification Systems

    Tech companies rely on the synthetic video dataset to test facial recognition and object detection systems against deepfake content. The dataset, comprising high-quality video frames, fake images, and synthetic data, helps create more reliable detection methods for generative AI content. This improves authentication solutions and helps protect digital platforms.
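
For the model-training use case above, one practical detail is how clips are split into training and test sets. With 10,000 files and 7,000 people, the figures suggest that some people appear in more than one clip, so a random per-file split could place the same face in both sets. The sketch below shows one common way to avoid that, a deterministic split keyed on a person identifier. It is an illustration under assumptions: the `person_id` field and the record layout are hypothetical, since the page does not describe the metadata schema.

```python
import hashlib
from collections import defaultdict


def assign_split(person_id: str, test_fraction: float = 0.2) -> str:
    """Map a person to 'train' or 'test' from a hash of their ID.

    The same ID always lands in the same split, regardless of clip order.
    """
    digest = hashlib.sha256(person_id.encode("utf-8")).digest()
    # Turn the first 8 bytes into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_by_person(clips, test_fraction: float = 0.2):
    """Split clip records so that no person appears in both sets.

    `clips` is an iterable of dicts with a hypothetical 'person_id' key.
    """
    splits = defaultdict(list)
    for clip in clips:
        splits[assign_split(clip["person_id"], test_fraction)].append(clip)
    return splits["train"], splits["test"]
```

Because the split is by person rather than by file, the share of clips in the test set only approximates `test_fraction`.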



FAQs

How large is DeepFake Videos Dataset compared to other available datasets?
With 10,000 video clips and diverse demographic coverage, this collection is one of the largest datasets of its kind. Its scale allows models trained on it to achieve higher accuracy in deepfake content detection and facial recognition tasks.
What devices and resolutions are represented in the dataset?
Videos were recorded on iPhone 13 devices (30%) and Google Pixel devices (70%), then processed with deepfake technology. The dataset covers multiple resolutions, including 1080p, 720p, and 480p, supporting a wide range of video detection methods.
How was the data collected?
The dataset was built by generating fake faces with AI models and overlaying them on real videos, using aisaver.io, faceswapvideo.ai, and magichour.ai.
Is it possible to request a custom deepfake dataset?
Custom datasets can be created on request, allowing you to specify generation methods, annotation formats, or target demographics. This flexibility ensures better results for applications such as face recognition, synthetic video detection, or generative AI model training.

What our clients are saying

Paul 2025-02-21

Very Positive Experience!

The team was very responsive when requesting a specific dataset, and was able to work with us on what data we specifically needed and custom pricing for our use case. Overall a great experience, and would recommend them to others!

Thorsten 2025-01-09

Very good experience

We got in touch with UniData to buy several datasets from them. Communication was very cooperative, quick, and friendly. We were able to find contract conditions that suited both parties well. I also appreciate the team's dedication to understand and address the needs of the customer. And the datasets we bought from UniData matched with our expectations.

Max Crous 2024-10-08

Data purchase

Our team got in touch with UniData for purchasing video data. The team at UniData was transparent, timely, and pleasant to communicate and negotiate with. Their samples and descriptions aligned well with the data we received. We will certainly reach out to UniData again if we're in search of 3rd party video data.

Abhijeet Zilpelwar 2025-02-26

Data is well organized and easy to…

Data is well organized and easy to consume. We could download and use it for training within few hours of receiving the data links.

Why Choose Us

Unidata offers unparalleled expertise in AI data solutions, delivering superior data quality and optimized workflows

Expertise

Our team consists of industry-leading experts in AI data solutions

Quality

We ensure superior data quality to maximize your AI project's potential

Efficiency

Our optimized workflows accelerate your model training processes

Proven Results

Our track record of case studies demonstrates our ability to deliver outstanding outcomes

Customization

Custom datasets can be created on request, with your choice of generation methods, annotation formats, or target demographics

Support

We provide ongoing support and consultation to ensure continuous success
1,000+ full-time assessors
