Grouping Listings into Product Cards

Thousands of listings. Different sellers. Endless naming variations.

Helping buyers navigate this chaos was the challenge facing one of the top classifieds platforms. To group similar offers under clean, easy-to-browse product cards, they needed a model trained on real, structured data. That’s where Unidata came in — providing expert annotation that cut through the clutter and made model identification not only possible, but scalable.

Industry: Major online classifieds platform
Timeline: 2 months
Data: 20,000 listings with triple annotation overlap

Client Request

The client aimed to enhance user experience by implementing a system that could automatically and accurately identify the product model mentioned in listing titles and descriptions. The ultimate goal was to help buyers quickly compare relevant offers by grouping similar listings under unified product cards.

To achieve this, the platform engaged Unidata for high-quality data annotation — and we got to work.

Our Approach

  • 01

    Technical Scope and Pilot Phase

    The client provided detailed guidelines for identifying product models from listing text.
    Our team reviewed the instructions and proposed refinements, including:

    • How to handle product attributes (e.g., color or storage capacity) when they appeared in titles but weren’t part of the actual model name
    • How to treat variations in naming conventions across different product categories

    During the pilot phase, key challenges included:

    • Multilingual listings
    • Numerous abbreviations and non-standard formatting
    • Ambiguities requiring client clarification
  • 02

    Annotation and Review Process

    Over the course of two months, our team annotated 20,000 listings, focusing on precise model identification. Key challenges we addressed included:

    • Identifying relevant model keywords in long and often cluttered titles
    • Extracting model names from product descriptions, especially in categories like fashion, where listings often contained attributes (e.g., sleeve length, material, color) irrelevant to the model itself
    • Standardizing model names across similar listings

    To ensure consistency across annotators, we:

    • Developed a set of internal rules and examples
    • Conducted training sessions to reduce subjective variation
    • Implemented continuous review and feedback throughout the annotation phase
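The source does not say how the three overlapping annotations per listing were reconciled. As a minimal sketch, assuming a simple majority vote over lightly normalized model names, the step could look like the following. The function names `normalize` and `consensus` and the `min_votes` parameter are hypothetical and only serve the illustration.

```python
from collections import Counter
from typing import List, Optional


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(label.lower().split())


def consensus(labels: List[str], min_votes: int = 2) -> Optional[str]:
    """Return the normalized label chosen by at least `min_votes` annotators.

    Returns None when no label reaches the threshold, which signals that the
    listing should go to a reviewer (an assumed escalation rule).
    """
    if not labels:
        return None
    counts = Counter(normalize(label) for label in labels)
    label, votes = counts.most_common(1)[0]
    return label if votes >= min_votes else None


# Two of three annotators agree once spacing and case are normalized.
agreed = consensus(["iPhone 13", "iphone  13", "iPhone 13 Pro"])
# All three disagree, so there is no consensus.
disputed = consensus(["Galaxy S21", "Galaxy S21 FE", "Galaxy S20"])
```

With triple overlap, a threshold of two votes means a single divergent annotator does not block a label, while a three-way disagreement is surfaced for manual review.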
  • 03

    Validation Workflow

    All annotations underwent a thorough validation process to ensure accuracy.

    Because model identification involved subjective judgment, we took the following steps:

    • Held regular sync meetings to align interpretations
    • Updated annotation guidelines based on team feedback and corner cases
    • Provided ongoing training and clarification sessions for the team

    Validator performance was monitored using analytics to:

    • Identify outliers or inconsistencies
    • Optimize review efficiency
    • Improve overall data quality
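The case study does not describe the analytics themselves. One common way to identify outliers, shown here as a sketch under that assumption, is to compute how often each validator's label matches the agreed label and flag anyone below a chosen rate. The record layout, the function names, and the 0.8 threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each record is (validator_id, validator_label, agreed_label); hypothetical layout.
Record = Tuple[str, str, str]


def agreement_rates(records: Iterable[Record]) -> Dict[str, float]:
    """Share of each validator's labels that match the agreed label."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for validator, label, agreed_label in records:
        totals[validator] += 1
        if label == agreed_label:
            hits[validator] += 1
    return {validator: hits[validator] / totals[validator] for validator in totals}


def flag_outliers(rates: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """Validators whose agreement rate falls below the (assumed) threshold."""
    return sorted(v for v, rate in rates.items() if rate < threshold)


records = [
    ("val_a", "iphone 13", "iphone 13"),
    ("val_a", "galaxy s21", "galaxy s21"),
    ("val_b", "iphone 13 pro", "iphone 13"),
    ("val_b", "galaxy s21", "galaxy s21"),
]
rates = agreement_rates(records)
outliers = flag_outliers(rates)
```

A low rate does not by itself mean a validator is wrong; it may also point at an ambiguous guideline, which is why flagged cases would feed back into the sync meetings described above.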

Results

  • The model trained on the annotated dataset was successfully deployed to the client’s production system. Listings are now automatically grouped into product cards based on the identified model.

  • The grouping logic correctly handles edge cases and non-standard listings

  • Real-user testing conducted by the client confirmed the effectiveness of the model even on complex or ambiguous examples
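The client's grouping logic is not described in the source. As a rough illustration of the idea only, once each listing carries an identified model, building product cards amounts to grouping listing IDs by that model. The listing schema (`id`, `model`) and the function name below are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, TypedDict


class Listing(TypedDict):
    """Hypothetical listing record: an ID plus the predicted model, if any."""
    id: int
    model: Optional[str]


def build_product_cards(listings: Iterable[Listing]) -> Dict[str, List[int]]:
    """Group listing IDs under the model identified for each listing.

    Listings without an identified model are left out of every card.
    """
    cards: Dict[str, List[int]] = defaultdict(list)
    for listing in listings:
        model = listing.get("model")
        if model:
            cards[model].append(listing["id"])
    return dict(cards)


listings: List[Listing] = [
    {"id": 1, "model": "iphone 13"},
    {"id": 2, "model": "galaxy s21"},
    {"id": 3, "model": "iphone 13"},
    {"id": 4, "model": None},
]
cards = build_product_cards(listings)
```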

