wargoninnovation committed
Commit 972befb · verified · 1 Parent(s): 53c286d

Initial upload of Wargon Clothing Classifier v1.0

Files changed (5)
  1. README.md +229 -0
  2. class_mappings.json +122 -0
  3. config.json +83 -0
  4. model.safetensors +3 -0
  5. preprocessor_config.json +31 -0
README.md ADDED
@@ -0,0 +1,229 @@
+ ---
+ license: apache-2.0
+ base_model: google/vit-base-patch16-224
+ tags:
+ - image-classification
+ - vision
+ - clothing
+ - fashion
+ - vit
+ - pytorch
+ datasets:
+ - wargoninnovation/clothingdatasetsecondhand
+ metrics:
+ - accuracy
+ - f1
+ pipeline_tag: image-classification
+ widget:
+ - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/tiger.jpg
+   example_title: Tiger
+ ---
+
+ # Wargon Clothing Classifier
+
+ A Vision Transformer (ViT)-based model for clothing classification, trained on secondhand clothing images. The model classifies 27 types of clothing items with 73% validation accuracy.
+
+ ## Model Details
+
+ ### Model Description
+
+ This is a Vision Transformer fine-tuned for clothing classification, developed to solve real-world clothing categorization challenges in secondhand fashion applications.
+
+ - **Developed by:** Wargon Innovation
+ - **Model type:** Image Classification
+ - **Language(s):** N/A (vision model)
+ - **License:** Apache 2.0
+ - **Finetuned from model:** google/vit-base-patch16-224
+
+ ### Model Sources
+
+ - **Training Dataset:** [Wargon Innovation Clothing Dataset](https://huggingface.co/datasets/wargoninnovation/clothingdatasetsecondhand)
+ - **Base Model:** [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224)
+
+ ## Uses
+
+ ### Direct Use
+
+ This model can be used for (see the `pipeline` sketch after this list):
+ - Automatic clothing categorization in e-commerce
+ - Fashion inventory management
+ - Secondhand clothing marketplaces
+ - Fashion recommendation systems
+
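+ The quickest route for these use cases is the Transformers `pipeline` API. A minimal sketch (note: without remapping, labels come back as the generic `LABEL_N` names from `config.json`; `class_mappings.json` holds the human-readable names):
+
+ ```python
+ from transformers import pipeline
+
+ # Build an image-classification pipeline from this repo's weights.
+ classifier = pipeline(
+     "image-classification",
+     model="wargoninnovation/wargon-clothing-classifier",
+ )
+
+ # Returns a list of {"label": ..., "score": ...} dicts, highest score first.
+ print(classifier("path_to_clothing_image.jpg")[:3])
+ ```
+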
53
+ ### Downstream Use
54
+
55
+ The model can be fine-tuned for:
56
+ - Specific clothing brand recognition
57
+ - Size estimation from images
58
+ - Style classification
59
+ - Multi-label clothing attribute detection
60
+
61
+ ## How to Get Started with the Model
62
+
63
+ ```python
64
+ from transformers import AutoImageProcessor, AutoModelForImageClassification
65
+ from PIL import Image
66
+ import torch
67
+
68
+ # Load model and processor
69
+ processor = AutoImageProcessor.from_pretrained("wargoninnovation/wargon-clothing-classifier")
70
+ model = AutoModelForImageClassification.from_pretrained("wargoninnovation/wargon-clothing-classifier")
71
+
72
+ # Load and preprocess image
73
+ image = Image.open("path_to_clothing_image.jpg")
74
+ inputs = processor(image, return_tensors="pt")
75
+
76
+ # Make prediction
77
+ with torch.no_grad():
78
+ outputs = model(**inputs)
79
+ predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
80
+
81
+ # Get top prediction
82
+ predicted_class_id = predictions.argmax().item()
83
+ ```
84
+
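+ The predicted ids map to generic `LABEL_N` names in `config.json`. A sketch for resolving them to human-readable class names via the `class_mappings.json` shipped in this repo:
+
+ ```python
+ import json
+
+ from huggingface_hub import hf_hub_download
+
+ # Fetch class_mappings.json from this repo (cached locally after the first call).
+ mappings_path = hf_hub_download(
+     "wargoninnovation/wargon-clothing-classifier", "class_mappings.json"
+ )
+ with open(mappings_path) as f:
+     id_to_class = json.load(f)["id_to_class"]
+
+ # Keys in id_to_class are strings, so convert the predicted id.
+ print(id_to_class[str(predicted_class_id)])  # e.g. "T-shirt" for id 17
+ ```
+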
+ ## Training Details
+
+ ### Training Data
+
+ The model was trained on the [wargoninnovation/clothingdatasetsecondhand](https://huggingface.co/datasets/wargoninnovation/clothingdatasetsecondhand) dataset, which contains over 30,000 images of secondhand clothing items across 34+ categories.
+
+ **Data Preprocessing** (see the split sketch after this list):
+ - Filtered out classes with fewer than 10 samples to ensure robust train/validation splits
+ - Final dataset contains 27 clothing categories
+ - Images resized to 224x224 pixels
+ - Stratified train/validation split (80/20)
+
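+ A minimal sketch of the filtering and stratified split described above, using the `datasets` library. The label column name (`label`) and the seed are assumptions; neither is stated in this card:
+
+ ```python
+ from collections import Counter
+
+ from datasets import load_dataset
+
+ ds = load_dataset("wargoninnovation/clothingdatasetsecondhand", split="train")
+
+ # Drop classes with fewer than 10 samples.
+ counts = Counter(ds["label"])
+ keep = {label for label, n in counts.items() if n >= 10}
+ ds = ds.filter(lambda ex: ex["label"] in keep)
+
+ # Stratified 80/20 split (stratify_by_column requires a ClassLabel feature).
+ splits = ds.train_test_split(test_size=0.2, stratify_by_column="label", seed=42)
+ train_ds, val_ds = splits["train"], splits["test"]
+ ```
+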
+ ### Training Procedure
+
+ #### Preprocessing
+
+ - **Image Size:** 224x224 pixels
+ - **Normalization:** mean 0.5, std 0.5 per channel (per `preprocessor_config.json`, matching the ViT base preprocessor)
+ - **Data Augmentation:** Standard transformations applied
+
+ #### Training Hyperparameters
+
+ The fine-tuning run used the following settings (see the `TrainingArguments` sketch after this list):
+
+ - **Training regime:** Mixed precision (fp16)
+ - **Learning Rate:** 2e-5
+ - **Batch Size:** 16
+ - **Epochs:** 6
+ - **Optimizer:** AdamW
+ - **Weight Decay:** 0.01
+ - **Warmup Steps:** 500
+ - **Label Smoothing:** 0.1
+
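+ A hedged sketch of how these settings map onto the Transformers `Trainer` API; `output_dir` and the W&B reporting flag are illustrative assumptions, not confirmed details of the original run:
+
+ ```python
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="wargon-clothing-classifier",  # assumed
+     learning_rate=2e-5,
+     per_device_train_batch_size=16,
+     num_train_epochs=6,
+     weight_decay=0.01,            # AdamW is the Trainer's default optimizer
+     warmup_steps=500,
+     label_smoothing_factor=0.1,
+     fp16=True,                    # mixed-precision training regime
+     report_to="wandb",            # W&B logging, per the Software section
+ )
+ ```
+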
+ #### Hardware
+
+ - **GPU:** NVIDIA RTX 3060 (12GB VRAM)
+ - **Training Time:** ~1.5 hours
+
+ ## Evaluation
+
+ ### Testing Data, Factors & Metrics
+
+ The model was evaluated on a stratified validation set (20% of the filtered dataset); a sketch for recomputing the metrics follows the list below.
+
+ #### Metrics
+
+ - **Validation Accuracy:** 73.0%
+ - **F1 Score:** 72.7%
+ - **Precision:** 72.8%
+ - **Recall:** 73.0%
+
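+ A sketch of recomputing these metrics with the `evaluate` library. The averaging mode is not stated in this card; `"weighted"` is assumed here (consistent with recall matching overall accuracy):
+
+ ```python
+ import evaluate
+
+ # preds / labels: predicted and true class ids over the validation set.
+ acc = evaluate.load("accuracy").compute(predictions=preds, references=labels)
+ f1 = evaluate.load("f1").compute(predictions=preds, references=labels, average="weighted")
+ prec = evaluate.load("precision").compute(predictions=preds, references=labels, average="weighted")
+ rec = evaluate.load("recall").compute(predictions=preds, references=labels, average="weighted")
+
+ print(acc["accuracy"], f1["f1"], prec["precision"], rec["recall"])
+ ```
+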
+ ### Results
+
+ The model achieves balanced performance across major clothing categories, with particular strength in:
+ - Common items (T-shirts, Jeans, Dresses)
+ - Well-represented categories in the training data
+ - Clean product photography (as in the training dataset)
+
+ ## Clothing Categories
+
+ The model can classify the following 27 clothing types:
+
+ 1. Blazer
+ 2. Blouse
+ 3. Cardigan
+ 4. Dress
+ 5. Hoodie
+ 6. Jacket
+ 7. Jeans
+ 8. Nightgown
+ 9. Outerwear
+ 10. Pajamas
+ 11. Rain jacket
+ 12. Rain trousers
+ 13. Robe
+ 14. Shirt
+ 15. Shorts
+ 16. Skirt
+ 17. Sweater
+ 18. T-shirt
+ 19. Tank top
+ 20. Tights
+ 21. Top
+ 22. Training top
+ 23. Trousers
+ 24. Tunic
+ 25. Vest
+ 26. Winter jacket
+ 27. Winter trousers
+
+ ## Limitations and Bias
+
+ ### Limitations
+
+ - **Image Quality:** Best performance on clean, well-lit product photos similar to the training data
+ - **Background:** Optimized for images with minimal background distractions
+ - **Viewpoint:** Trained primarily on front-facing clothing images
+ - **Categories:** Limited to the 27 categories present in the training data
+
+ ### Bias
+
+ - **Data Source:** Trained on secondhand clothing, so it may not generalize well to new or luxury items
+ - **Cultural Bias:** The dataset may reflect specific regional fashion preferences
+ - **Class Imbalance:** Some categories had limited representation even after filtering
+
+ ## Environmental Impact
+
+ - **Hardware Type:** NVIDIA RTX 3060
+ - **Hours Used:** ~1.5 hours training time
+ - **Cloud Provider:** N/A (local training)
+ - **Compute Region:** Local
+
+ ## Technical Specifications
+
+ ### Model Architecture
+
+ - **Base:** Vision Transformer (ViT-Base/16)
+ - **Parameters:** ~86M (see the check after this list)
+ - **Input Size:** 224x224x3
+ - **Patch Size:** 16x16
+ - **Number of Classes:** 27
+
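+ A quick sketch for verifying the parameter and class counts from the published checkpoint:
+
+ ```python
+ from transformers import AutoModelForImageClassification
+
+ model = AutoModelForImageClassification.from_pretrained(
+     "wargoninnovation/wargon-clothing-classifier"
+ )
+
+ # Sum element counts across all weight tensors.
+ n_params = sum(p.numel() for p in model.parameters())
+ print(f"{n_params / 1e6:.1f}M parameters, {model.config.num_labels} classes")
+ ```
+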
+ ### Software
+
+ - **Framework:** PyTorch
+ - **Libraries:** Hugging Face Transformers, Datasets
+ - **Experiment Tracking:** Weights & Biases (W&B)
+
+ ## Citation
+
+ ```bibtex
+ @misc{wargon_clothing_classifier_2024,
+   title={Wargon Clothing Classifier: A Vision Transformer for Secondhand Fashion Classification},
+   author={Wargon Innovation},
+   year={2024},
+   publisher={Hugging Face},
+   howpublished={\url{https://huggingface.co/wargoninnovation/wargon-clothing-classifier}},
+ }
+ ```
+
+ ## Model Card Authors
+
+ Wargon Innovation Team
+
+ ## Model Card Contact
+
+ For questions and feedback, please open an issue in the model repository or contact the Wargon Innovation team.
class_mappings.json ADDED
@@ -0,0 +1,122 @@
+ {
+   "class_to_id": {
+     "Blazer": 0,
+     "Blouse": 1,
+     "Cardigan": 2,
+     "Dress": 3,
+     "Hoodie": 4,
+     "Jacket": 5,
+     "Jeans": 6,
+     "Nightgown": 7,
+     "Outerwear": 8,
+     "Pajamas": 9,
+     "Rain jacket": 10,
+     "Rain trousers": 11,
+     "Robe": 12,
+     "Shirt": 13,
+     "Shorts": 14,
+     "Skirt": 15,
+     "Sweater": 16,
+     "T-shirt": 17,
+     "Tank top": 18,
+     "Tights": 19,
+     "Top": 20,
+     "Training top": 21,
+     "Trousers": 22,
+     "Tunic": 23,
+     "Vest": 24,
+     "Winter jacket": 25,
+     "Winter trousers": 26
+   },
+   "id_to_class": {
+     "0": "Blazer",
+     "1": "Blouse",
+     "2": "Cardigan",
+     "3": "Dress",
+     "4": "Hoodie",
+     "5": "Jacket",
+     "6": "Jeans",
+     "7": "Nightgown",
+     "8": "Outerwear",
+     "9": "Pajamas",
+     "10": "Rain jacket",
+     "11": "Rain trousers",
+     "12": "Robe",
+     "13": "Shirt",
+     "14": "Shorts",
+     "15": "Skirt",
+     "16": "Sweater",
+     "17": "T-shirt",
+     "18": "Tank top",
+     "19": "Tights",
+     "20": "Top",
+     "21": "Training top",
+     "22": "Trousers",
+     "23": "Tunic",
+     "24": "Vest",
+     "25": "Winter jacket",
+     "26": "Winter trousers"
+   },
+   "num_classes": 27,
+   "valid_classes": [
+     0,
+     1,
+     2,
+     3,
+     4,
+     5,
+     6,
+     7,
+     8,
+     9,
+     10,
+     11,
+     12,
+     13,
+     14,
+     15,
+     16,
+     17,
+     18,
+     19,
+     20,
+     21,
+     22,
+     23,
+     25,
+     26,
+     27,
+     30,
+     31,
+     32
+   ],
+   "class_weights": [
+     3.2049648761749268,
+     0.7775523066520691,
+     0.9295064210891724,
+     0.4611579179763794,
+     1.5798324346542358,
+     1.1890760660171509,
+     0.7341421842575073,
+     5.0,
+     3.072801351547241,
+     5.0,
+     5.0,
+     5.0,
+     5.0,
+     0.5360822677612305,
+     0.9422394037246704,
+     1.1290216445922852,
+     0.4077451825141907,
+     0.29895859956741333,
+     0.8248940706253052,
+     1.6935325860977173,
+     0.2645518183708191,
+     5.0,
+     0.3766576051712036,
+     4.585565090179443,
+     4.609201908111572,
+     5.0,
+     5.0
+   ]
+ }
config.json ADDED
@@ -0,0 +1,83 @@
+ {
+   "architectures": [
+     "ViTForImageClassification"
+   ],
+   "attention_probs_dropout_prob": 0.0,
+   "encoder_stride": 16,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.0,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0",
+     "1": "LABEL_1",
+     "2": "LABEL_2",
+     "3": "LABEL_3",
+     "4": "LABEL_4",
+     "5": "LABEL_5",
+     "6": "LABEL_6",
+     "7": "LABEL_7",
+     "8": "LABEL_8",
+     "9": "LABEL_9",
+     "10": "LABEL_10",
+     "11": "LABEL_11",
+     "12": "LABEL_12",
+     "13": "LABEL_13",
+     "14": "LABEL_14",
+     "15": "LABEL_15",
+     "16": "LABEL_16",
+     "17": "LABEL_17",
+     "18": "LABEL_18",
+     "19": "LABEL_19",
+     "20": "LABEL_20",
+     "21": "LABEL_21",
+     "22": "LABEL_22",
+     "23": "LABEL_23",
+     "24": "LABEL_24",
+     "25": "LABEL_25",
+     "26": "LABEL_26"
+   },
+   "image_size": 224,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0,
+     "LABEL_1": 1,
+     "LABEL_10": 10,
+     "LABEL_11": 11,
+     "LABEL_12": 12,
+     "LABEL_13": 13,
+     "LABEL_14": 14,
+     "LABEL_15": 15,
+     "LABEL_16": 16,
+     "LABEL_17": 17,
+     "LABEL_18": 18,
+     "LABEL_19": 19,
+     "LABEL_2": 2,
+     "LABEL_20": 20,
+     "LABEL_21": 21,
+     "LABEL_22": 22,
+     "LABEL_23": 23,
+     "LABEL_24": 24,
+     "LABEL_25": 25,
+     "LABEL_26": 26,
+     "LABEL_3": 3,
+     "LABEL_4": 4,
+     "LABEL_5": 5,
+     "LABEL_6": 6,
+     "LABEL_7": 7,
+     "LABEL_8": 8,
+     "LABEL_9": 9
+   },
+   "layer_norm_eps": 1e-12,
+   "model_type": "vit",
+   "num_attention_heads": 12,
+   "num_channels": 3,
+   "num_hidden_layers": 12,
+   "patch_size": 16,
+   "pooler_act": "tanh",
+   "pooler_output_size": 768,
+   "problem_type": "single_label_classification",
+   "qkv_bias": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.55.3"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e24861295646225ec60be0f8e195e01bc20144fc20cf637fcd92acde23d5bea
+ size 343300876
preprocessor_config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "crop_size": null,
+   "data_format": "channels_first",
+   "default_to_square": true,
+   "device": null,
+   "disable_grouping": null,
+   "do_center_crop": null,
+   "do_convert_rgb": null,
+   "do_normalize": true,
+   "do_rescale": true,
+   "do_resize": true,
+   "image_mean": [
+     0.5,
+     0.5,
+     0.5
+   ],
+   "image_processor_type": "ViTImageProcessorFast",
+   "image_std": [
+     0.5,
+     0.5,
+     0.5
+   ],
+   "input_data_format": null,
+   "resample": 2,
+   "rescale_factor": 0.00392156862745098,
+   "return_tensors": null,
+   "size": {
+     "height": 224,
+     "width": 224
+   }
+ }