Commercial

Mammography Dataset

Mammography Dataset is a comprehensive DDSM dataset containing over 600,000 digital mammography studies with pathology segmentation and study-level labels. It is designed for breast cancer detection and for training AI models for cancer diagnosis on real-world images from screening mammography exams.

  • studies with protocol
    100,000+
  • studies without protocol
    500,000+
  • Medicine
  • Computer vision
  • Machine Learning
  • Segmentation
  • Classification


Dataset Info

Characteristic: Data
Description: X-ray images of the breast (mammograms), with or without protocols
Data types: DICOM
Markup: Segmentation of pathologies
Tasks: Pathology recognition, computer vision
Number of studies: 600,000+
Labeling: Information about each study, including target pathology (1 for presence, 0 for absence)

Technical Characteristics

Characteristic: Data
File extension: DICOM
Extension of labeling file: csv
Source and collection methodology: Data was collected by a partner of Unidata
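
As a rough illustration of how the CSV labeling file might be consumed, the sketch below reads study-level labels and counts the positive studies. The column names `study_id` and `target_pathology` and the sample rows are assumptions made for this example; the actual CSV schema is not documented on this page. Reading the DICOM images themselves would need a third-party library such as pydicom, which is not shown here.

```python
import csv
import io

# Hypothetical label file contents. The real column names and study IDs
# are not documented on this page and are assumed for illustration only.
SAMPLE_CSV = """study_id,target_pathology
study_0001,1
study_0002,0
study_0003,0
"""


def load_labels(csv_file):
    """Return {study_id: 0 or 1} from a file-like object holding CSV labels."""
    labels = {}
    for row in csv.DictReader(csv_file):
        value = int(row["target_pathology"])
        if value not in (0, 1):
            raise ValueError(f"unexpected label {value!r} for {row['study_id']}")
        labels[row["study_id"]] = value
    return labels


labels = load_labels(io.StringIO(SAMPLE_CSV))
positives = sum(labels.values())
print(f"{positives} of {len(labels)} studies flagged with pathology")
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by `open(path, newline="")`, and the column names adjusted to match the delivered schema.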

Dataset Use Cases

  • Healthcare & Medical Imaging

    Enhancing Breast Cancer Screening and Diagnosis

    Mammography Dataset is a resource for improving cancer detection and diagnosis research in digital mammography. It contains hundreds of thousands of mammography studies covering both pathological cases and normal scans. It supports the development of computer-aided diagnosis tools that help radiologists interpret mammographic images with higher accuracy and confidence.

  • Artificial Intelligence & Deep Learning

    Training Models for Breast Cancer Detection

    This DDSM dataset provides training data for developing machine learning models used in cancer detection and breast imaging analysis. The dataset consists of high-resolution digital mammograms annotated with breast masses, density levels, and lesion boundaries, enabling precise image analysis and robust model performance for automated screening mammography systems.

  • Clinical Research & Diagnostics

    Benchmarking Imaging Tools for Cancer Identification

    Mammography Dataset is widely used in clinical data research to benchmark imaging tools and support systems for cancer screening. By providing standardized mammographic data, including curated breast imaging cases, it lets researchers evaluate computer-aided detection algorithms and compare model performance across different mammography databases and public studies.

  • Academic & Biomedical Research

    Supporting Breast Cancer Image Analysis Studies

    This curated mammography dataset contributes to the research community by offering high-quality breast X-ray images and related patient information. It complements public datasets such as MIAS and VinDr-Mammo, enabling new approaches to film mammography, breast density assessment, and full-field digital mammography for advanced diagnostic research and teaching.
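
For the benchmarking use case above, a minimal sketch of how a detection model's study-level predictions could be scored against the binary labels (1 for pathology present, 0 for absent). The metric choice (sensitivity and specificity) and the example label lists are assumptions for illustration, not a method prescribed by the dataset.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); NaN when a class is absent."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels and predictions for eight studies.
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting both numbers, rather than accuracy alone, matters in screening data where normal studies typically outnumber pathological ones.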

FAQs

What is included in this dataset?
The dataset contains over 600,000 mammography studies in DICOM format, representing both healthy and pathological cases. Each study includes segmentation labels and CSV files with metadata describing patient information, breast density, and target pathology indicators.
Can I request a sample of the dataset before purchasing or downloading it?
Yes. Unidata provides free sample files from Mammography Dataset for preliminary evaluation. These samples allow users to test compatibility, assess annotation structure, and examine mammogram image quality before committing to the full dataset purchase.
What types of annotations are provided?
Mammography Dataset includes segmentation masks marking regions of pathology and structured CSV annotations specifying whether a lesion is present (1) or absent (0). This labeling format supports supervised learning for tumor detection, classification, and segmentation accuracy benchmarking.
How are Unidata datasets licensed?
Unidata datasets follow a dual-licensing model: free samples are available for testing and evaluation, while full datasets are provided exclusively through purchase. This allows institutions and developers to review dataset quality before acquisition.
Do Unidata datasets follow GDPR or other data privacy regulations?
Yes. All Unidata datasets are curated in accordance with GDPR and other applicable data protection laws. Medical imaging data is collected and processed from legally authorized and anonymized sources, ensuring full ethical compliance for research and AI development.
How are Unidata datasets stored?
Unidata stores all datasets securely on AWS cloud infrastructure, offering scalable and high-availability access. The system adheres to ISO 27001 and ISO 27701 standards, ensuring a secure, privacy-focused storage environment for large-scale medical imaging datasets like mammography and radiology archives.
How long does it take to receive the dataset?
After your purchase request is submitted, Unidata will verify project details and documentation requirements. Once agreements and payments are finalized, Mammography Dataset will be securely delivered within 3–10 business days.
Is this a real-world dataset or synthetic data?
Mammography Dataset is a real-world medical imaging dataset, sourced from actual digital mammography exams. It provides authentic X-ray images of both cancerous and healthy breast tissue.
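
As a hedged illustration of the segmentation accuracy benchmarking mentioned in the annotations answer above, the sketch below computes the Dice coefficient between two binary masks. How the segmentation masks are stored is not specified on this page; nested lists of 0/1 values are assumed here purely for the example.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two same-shaped binary masks (lists of 0/1 rows)."""
    intersection = 0
    total = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += a & b  # pixel set in both masks
            total += a + b         # pixels set in either mask, counted per mask
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Made-up 3x3 masks: a reference annotation and a model prediction.
reference = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
predicted = [
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
]

dice = dice_coefficient(reference, predicted)
print(f"Dice = {dice:.3f}")
```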

