NLP Annotation services

Product Grouping for E-commerce

Image

Thousands of listings. Different sellers. Endless naming variations.

Helping buyers navigate this chaos was the challenge facing one of the top classifieds platforms. To group similar offers under clean, easy-to-browse product cards, they needed a model trained on real, structured data. That’s where Unidata came in — providing expert annotation that cut through the clutter and made model identification not only possible, but scalable.

Image

Client Request

The client aimed to enhance user experience by implementing a system that could automatically and accurately identify the product model mentioned in listing titles and descriptions. The ultimate goal was to help buyers quickly compare relevant offers by grouping similar listings under unified product cards.

To achieve this, the platform engaged Unidata for high-quality data annotation — and we got to work.

Our Approach

Technical Scope and Pilot Phase

The client provided detailed guidelines for identifying product models from listing text.
Our team reviewed the instructions and proposed refinements, including:

  • How to handle product attributes (e.g., color or storage capacity) when they appeared in titles but weren’t part of the actual model
  • How to treat variations in naming conventions across different product categories

During the pilot phase, key challenges included:

  • Multilingual listings
  • Numerous abbreviations and non-standard formatting
  • Ambiguities requiring client clarification

Annotation and Review Process

Over the course of two months, our team annotated 20,000 listings, focusing on precise model identification. Key challenges we addressed included:

  • Identifying relevant model keywords in long and often cluttered titles
  • Extracting model names from product descriptions, especially in categories like fashion, where listings often contained attributes (e.g., sleeve length, material, color) irrelevant to the model itself
  • Standardizing model names across similar listings

To ensure consistency across annotators, we:

  • Developed a set of internal rules and examples
  • Conducted training sessions to reduce subjective variation
  • Implemented continuous review and feedback throughout the annotation phase

Validation Workflow

All annotations underwent a thorough validation process to ensure accuracy.

Because model identification involved subjective judgment, we took the following steps:

  • Held regular sync meetings to align interpretations
  • Updated annotation guidelines based on team feedback and corner cases
  • Provided ongoing training and clarification sessions for the team

Validator performance was monitored using analytics to:

  • Identify outliers or inconsistencies
  • Optimize review efficiency
  • Improve overall data quality
StageInputWorkflow ScopeMain Quality Checks
Guidelines & PilotClient guidelines, sample listingsReview instructions, propose refinements, pilot annotationsGuideline clarity / Pilot consistency
Annotator TrainingAnnotation rules, reference examplesTrain annotators, reduce subjective variation, internal alignmentUnderstanding of rules / Annotation consistency
Full AnnotationListings across categoriesAnnotate product models, standardize names, handle multilingual/abbreviated contentModel identification accuracy / Consistency across annotators
ValidationAnnotated datasetContinuous review, feedback, sync meetings, corner case handlingAnnotation correctness / Outlier detection
Final DeliveryValidated datasetConsolidation, final QA, submission to clientDataset completeness / Usability for AI training
Guidelines & Pilot Phase
2 weeks
Annotator Training & Setup
1 week
Full Annotation Cycle
4 weeks
Validation, QA & Final Delivery
1 week

The Results

  • The model trained on the annotated dataset was successfully deployed into the production system. Listings are now automatically grouped into product cards based on the identified model
  • The grouping logic correctly handles edge cases and non-standard listings
  • Real-user testing conducted by the client confirmed the effectiveness of the model even on complex or ambiguous examples
High-quality product grouping starts with expert annotation: transforming messy, multilingual listings into structured data that AI can reliably interpret.
Vladislav Barsukov
Vladislav Barsukov
Head of SLM&LLM Annotation

Similar Cases

  • Image
    Data Collection

    Fabric Mask Dataset for Biometric Testing

    Testing biometrics with frontal-only masks hides real weaknesses. We developed fabric mask samples for true multi-angle evaluation.

    Lean more
  • Image
    Image Annotation

    Image Segmentation for Retail Applications

    How do you segment every single object in a cluttered interior photo — 30+ classes per image? We designed a multi-step annotation pipeline to handle complexity without losing precision.

    Lean more
  • Image
    Geospatial Annotation services

    Aerial Image Annotation for Urban Planning

    We annotated 132,000+ objects in 11,000 aerial images—streamlining urban planning data with scalable workflows and tailored class logic.

    Lean more
  • Image
    Image Annotation

    Image Annotation for Retail Product Classification

    How do you annotate shelves packed with thousands of ever-changing products? We built a high-speed pipeline to handle real-time updates and ensure merchandising insights stay current.

    Lean more
  • Image
    Data Collection

    Image Data Collection for a Palm Recognition Task

    Collecting 20,000 palm photos sounds easy until you try it. We managed scale, verification, and logistics to deliver a clean dataset.

    Lean more

Ready to get started?

Tell us what you need — we’ll reply within 24h with a free estimate

    What service are you looking for? *
    What service are you looking for?
    Data Labeling
    Data Collection
    Ready-made Datasets
    Human Moderation
    Medicine
    Other
    What's your budget range? *
    What's your budget range?
    < $1,000
    $1,000 – $5,000
    $5,000 – $10,000
    $10,000 – $50,000
    $50,000+
    Not sure yet
    • United States+1
    • United Kingdom+44
    • Afghanistan (‫افغانستان‬‎)+93
    • Albania (Shqipëri)+355
    • Algeria (‫الجزائر‬‎)+213
    • American Samoa+1684
    • Andorra+376
    • Angola+244
    • Anguilla+1264
    • Antigua and Barbuda+1268
    • Argentina+54
    • Armenia (Հայաստան)+374
    • Aruba+297
    • Australia+61
    • Austria (Österreich)+43
    • Azerbaijan (Azərbaycan)+994
    • Bahamas+1242
    • Bahrain (‫البحرين‬‎)+973
    • Bangladesh (বাংলাদেশ)+880
    • Barbados+1246
    • Belarus (Беларусь)+375
    • Belgium (België)+32
    • Belize+501
    • Benin (Bénin)+229
    • Bermuda+1441
    • Bhutan (འབྲུག)+975
    • Bolivia+591
    • Bosnia and Herzegovina (Босна и Херцеговина)+387
    • Botswana+267
    • Brazil (Brasil)+55
    • British Indian Ocean Territory+246
    • British Virgin Islands+1284
    • Brunei+673
    • Bulgaria (България)+359
    • Burkina Faso+226
    • Burundi (Uburundi)+257
    • Cambodia (កម្ពុជា)+855
    • Cameroon (Cameroun)+237
    • Canada+1
    • Cape Verde (Kabu Verdi)+238
    • Caribbean Netherlands+599
    • Cayman Islands+1345
    • Central African Republic (République centrafricaine)+236
    • Chad (Tchad)+235
    • Chile+56
    • China (中国)+86
    • Christmas Island+61
    • Cocos (Keeling) Islands+61
    • Colombia+57
    • Comoros (‫جزر القمر‬‎)+269
    • Congo (DRC) (Jamhuri ya Kidemokrasia ya Kongo)+243
    • Congo (Republic) (Congo-Brazzaville)+242
    • Cook Islands+682
    • Costa Rica+506
    • Côte d’Ivoire+225
    • Croatia (Hrvatska)+385
    • Cuba+53
    • Curaçao+599
    • Cyprus (Κύπρος)+357
    • Czech Republic (Česká republika)+420
    • Denmark (Danmark)+45
    • Djibouti+253
    • Dominica+1767
    • Dominican Republic (República Dominicana)+1
    • Ecuador+593
    • Egypt (‫مصر‬‎)+20
    • El Salvador+503
    • Equatorial Guinea (Guinea Ecuatorial)+240
    • Eritrea+291
    • Estonia (Eesti)+372
    • Ethiopia+251
    • Falkland Islands (Islas Malvinas)+500
    • Faroe Islands (Føroyar)+298
    • Fiji+679
    • Finland (Suomi)+358
    • France+33
    • French Guiana (Guyane française)+594
    • French Polynesia (Polynésie française)+689
    • Gabon+241
    • Gambia+220
    • Georgia (საქართველო)+995
    • Germany (Deutschland)+49
    • Ghana (Gaana)+233
    • Gibraltar+350
    • Greece (Ελλάδα)+30
    • Greenland (Kalaallit Nunaat)+299
    • Grenada+1473
    • Guadeloupe+590
    • Guam+1671
    • Guatemala+502
    • Guernsey+44
    • Guinea (Guinée)+224
    • Guinea-Bissau (Guiné Bissau)+245
    • Guyana+592
    • Haiti+509
    • Honduras+504
    • Hong Kong (香港)+852
    • Hungary (Magyarország)+36
    • Iceland (Ísland)+354
    • India (भारत)+91
    • Indonesia+62
    • Iran (‫ایران‬‎)+98
    • Iraq (‫العراق‬‎)+964
    • Ireland+353
    • Isle of Man+44
    • Israel (‫ישראל‬‎)+972
    • Italy (Italia)+39
    • Jamaica+1876
    • Japan (日本)+81
    • Jersey+44
    • Jordan (‫الأردن‬‎)+962
    • Kazakhstan (Казахстан)+7
    • Kenya+254
    • Kiribati+686
    • Kosovo+383
    • Kuwait (‫الكويت‬‎)+965
    • Kyrgyzstan (Кыргызстан)+996
    • Laos (ລາວ)+856
    • Latvia (Latvija)+371
    • Lebanon (‫لبنان‬‎)+961
    • Lesotho+266
    • Liberia+231
    • Libya (‫ليبيا‬‎)+218
    • Liechtenstein+423
    • Lithuania (Lietuva)+370
    • Luxembourg+352
    • Macau (澳門)+853
    • Macedonia (FYROM) (Македонија)+389
    • Madagascar (Madagasikara)+261
    • Malawi+265
    • Malaysia+60
    • Maldives+960
    • Mali+223
    • Malta+356
    • Marshall Islands+692
    • Martinique+596
    • Mauritania (‫موريتانيا‬‎)+222
    • Mauritius (Moris)+230
    • Mayotte+262
    • Mexico (México)+52
    • Micronesia+691
    • Moldova (Republica Moldova)+373
    • Monaco+377
    • Mongolia (Монгол)+976
    • Montenegro (Crna Gora)+382
    • Montserrat+1664
    • Morocco (‫المغرب‬‎)+212
    • Mozambique (Moçambique)+258
    • Myanmar (Burma) (မြန်မာ)+95
    • Namibia (Namibië)+264
    • Nauru+674
    • Nepal (नेपाल)+977
    • Netherlands (Nederland)+31
    • New Caledonia (Nouvelle-Calédonie)+687
    • New Zealand+64
    • Nicaragua+505
    • Niger (Nijar)+227
    • Nigeria+234
    • Niue+683
    • Norfolk Island+672
    • North Korea (조선 민주주의 인민 공화국)+850
    • Northern Mariana Islands+1670
    • Norway (Norge)+47
    • Oman (‫عُمان‬‎)+968
    • Pakistan (‫پاکستان‬‎)+92
    • Palau+680
    • Palestine (‫فلسطين‬‎)+970
    • Panama (Panamá)+507
    • Papua New Guinea+675
    • Paraguay+595
    • Peru (Perú)+51
    • Philippines+63
    • Poland (Polska)+48
    • Portugal+351
    • Puerto Rico+1
    • Qatar (‫قطر‬‎)+974
    • Réunion (La Réunion)+262
    • Romania (România)+40
    • Russia (Россия)+7
    • Rwanda+250
    • Saint Barthélemy+590
    • Saint Helena+290
    • Saint Kitts and Nevis+1869
    • Saint Lucia+1758
    • Saint Martin (Saint-Martin (partie française))+590
    • Saint Pierre and Miquelon (Saint-Pierre-et-Miquelon)+508
    • Saint Vincent and the Grenadines+1784
    • Samoa+685
    • San Marino+378
    • São Tomé and Príncipe (São Tomé e Príncipe)+239
    • Saudi Arabia (‫المملكة العربية السعودية‬‎)+966
    • Senegal (Sénégal)+221
    • Serbia (Србија)+381
    • Seychelles+248
    • Sierra Leone+232
    • Singapore+65
    • Sint Maarten+1721
    • Slovakia (Slovensko)+421
    • Slovenia (Slovenija)+386
    • Solomon Islands+677
    • Somalia (Soomaaliya)+252
    • South Africa+27
    • South Korea (대한민국)+82
    • South Sudan (‫جنوب السودان‬‎)+211
    • Spain (España)+34
    • Sri Lanka (ශ්‍රී ලංකාව)+94
    • Sudan (‫السودان‬‎)+249
    • Suriname+597
    • Svalbard and Jan Mayen+47
    • Swaziland+268
    • Sweden (Sverige)+46
    • Switzerland (Schweiz)+41
    • Syria (‫سوريا‬‎)+963
    • Taiwan (台灣)+886
    • Tajikistan+992
    • Tanzania+255
    • Thailand (ไทย)+66
    • Timor-Leste+670
    • Togo+228
    • Tokelau+690
    • Tonga+676
    • Trinidad and Tobago+1868
    • Tunisia (‫تونس‬‎)+216
    • Turkey (Türkiye)+90
    • Turkmenistan+993
    • Turks and Caicos Islands+1649
    • Tuvalu+688
    • U.S. Virgin Islands+1340
    • Uganda+256
    • Ukraine (Україна)+380
    • United Arab Emirates (‫الإمارات العربية المتحدة‬‎)+971
    • United Kingdom+44
    • United States+1
    • Uruguay+598
    • Uzbekistan (Oʻzbekiston)+998
    • Vanuatu+678
    • Vatican City (Città del Vaticano)+39
    • Venezuela+58
    • Vietnam (Việt Nam)+84
    • Wallis and Futuna (Wallis-et-Futuna)+681
    • Western Sahara (‫الصحراء الغربية‬‎)+212
    • Yemen (‫اليمن‬‎)+967
    • Zambia+260
    • Zimbabwe+263
    • Åland Islands+358
    Where did you hear about Unidata? *
    Where did you hear about Unidata?
    Andrew
    Head of Client Success

    — I'll guide you through every step, from your first
    message to full project delivery

    Thank you for your
    message

    It has been successfully sent!

    We use cookies to enhance your experience, personalize content, ads, and analyze traffic. By clicking 'Accept All', you agree to our Cookie Policy.