This model is a fine-tuned version of dbmdz/bert-base-turkish-cased specifically trained for aspect term extraction from Turkish e-commerce product reviews.
The model uses the BIO tagging scheme:

- `B-ASPECT`: Beginning of an aspect term
- `I-ASPECT`: Inside/continuation of an aspect term
- `O`: Outside (not part of an aspect term)

A short sketch of how these tags are decoded into aspect terms follows the training table below.

The model showed consistent improvement across epochs:
| Epoch | Loss | 
|---|---|
| 1 | 0.1758 | 
| 2 | 0.1749 | 
| 3 | 0.1217 | 
| 4 | 0.1079 | 
| 5 | 0.0699 | 
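As a concrete illustration of the tagging scheme, the hypothetical helper below groups BIO-tagged tokens into aspect term spans. The function name and the example token/label lists are illustrative only, not part of the model's API:

```python
def bio_to_spans(tokens, labels):
    """Group BIO-tagged tokens into aspect term spans (illustrative helper)."""
    spans, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B-ASPECT":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif label == "I-ASPECT" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
                current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["Bu", "telefonun", "kamerası", "çok", "iyi", "ama", "bataryası", "yetersiz", "."]
labels = ["O", "O", "B-ASPECT", "O", "O", "O", "B-ASPECT", "O", "O"]
print(bio_to_spans(tokens, labels))  # ['kamerası', 'bataryası']
```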
Quick start with the `token-classification` pipeline:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-aspect-extraction")
model = AutoModelForTokenClassification.from_pretrained("opdullah/bert-turkish-ecomm-aspect-extraction")

# Create pipeline
aspect_extractor = pipeline("token-classification",
                            model=model,
                            tokenizer=tokenizer,
                            aggregation_strategy="simple")

# Example usage
text = "Bu telefonun kamerası çok iyi ama bataryası yetersiz."
results = aspect_extractor(text)
print(results)
```
Expected Output:

```
[{'entity_group': 'ASPECT', 'score': 0.99498886, 'word': 'kamerası', 'start': 13, 'end': 21},
 {'entity_group': 'ASPECT', 'score': 0.9970175, 'word': 'bataryası', 'start': 34, 'end': 43}]
```
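If only the aspect words themselves are needed, the pipeline output above can be filtered by score; the 0.5 threshold below is an arbitrary illustrative choice:

```python
# Keep only confidently predicted aspect words (threshold chosen for illustration)
aspects = [r["word"] for r in results if r["score"] > 0.5]
print(aspects)  # ['kamerası', 'bataryası']
```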
Manual inference without the pipeline:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-aspect-extraction")
model = AutoModelForTokenClassification.from_pretrained("opdullah/bert-turkish-ecomm-aspect-extraction")

# Example text
text = "Bu telefonun kamerası çok iyi ama bataryası yetersiz."

# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_ids = predictions.argmax(dim=-1)

# Convert predictions to labels
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predicted_labels = [model.config.id2label[class_id.item()] for class_id in predicted_class_ids[0]]

# Display results
for token, label in zip(tokens, predicted_labels):
    if token not in ['[CLS]', '[SEP]', '[PAD]']:
        print(f"{token}: {label}")
```
Expected Output:

```
Bu: O
telefonun: O
kamerası: B-ASPECT
çok: O
iyi: O
ama: O
batarya: B-ASPECT
##sı: I-ASPECT
yetersiz: O
.: O
```
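Because the tokenizer splits some words into WordPiece subwords (`batarya` + `##sı` above), a small merging step is needed to recover whole aspect terms. The helper below is a sketch that assumes the standard `##` continuation prefix of BERT tokenizers and reuses `tokens` and `predicted_labels` from the previous block:

```python
def merge_subwords(tokens, labels):
    """Merge ##-prefixed subwords back into whole words, keeping the first subword's label."""
    words, word_labels = [], []
    for token, label in zip(tokens, labels):
        if token in ("[CLS]", "[SEP]", "[PAD]"):
            continue
        if token.startswith("##") and words:
            words[-1] += token[2:]
        else:
            words.append(token)
            word_labels.append(label)
    return words, word_labels

words, word_labels = merge_subwords(tokens, predicted_labels)
aspect_terms = [w for w, l in zip(words, word_labels) if l.startswith("B-")]
print(aspect_terms)  # expected: ['kamerası', 'bataryası']
```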
Batch processing of multiple reviews:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-aspect-extraction")
model = AutoModelForTokenClassification.from_pretrained("opdullah/bert-turkish-ecomm-aspect-extraction")

# Example texts for batch processing
texts = [
    "Bu telefonun kamerası çok iyi ama bataryası yetersiz.",
    "Ürünün fiyatı uygun ancak kalitesi düşük.",
    "Teslimat hızı mükemmel, ambalaj da gayet sağlam."
]

# Tokenize all texts
inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True)

# Get predictions for all texts
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_ids = predictions.argmax(dim=-1)

# Process results for each text
for i, text in enumerate(texts):
    print(f"\nText {i+1}: {text}")
    print("-" * 50)

    # Get tokens for this specific text
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][i])
    predicted_labels = [model.config.id2label[class_id.item()] for class_id in predicted_class_ids[i]]

    # Display results
    for token, label in zip(tokens, predicted_labels):
        if token not in ['[CLS]', '[SEP]', '[PAD]']:
            print(f"{token}: {label}")
```
Expected Output:

```
Text 1: Bu telefonun kamerası çok iyi ama bataryası yetersiz.
Bu: O
telefonun: O
kamerası: B-ASPECT
çok: O
iyi: O
ama: O
batarya: B-ASPECT
##sı: I-ASPECT
yetersiz: O
.: O

Text 2: Ürünün fiyatı uygun ancak kalitesi düşük.
Ürünün: O
fiyatı: B-ASPECT
uygun: O
ancak: O
kalitesi: B-ASPECT
düşük: O
.: O

Text 3: Teslimat hızı mükemmel, ambalaj da gayet sağlam.
Teslim: B-ASPECT
##at: I-ASPECT
hızı: I-ASPECT
mükemmel: O
,: O
ambalaj: B-ASPECT
da: O
gayet: O
sağlam: O
.: O
```
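Note that the `token-classification` pipeline from the quick-start example also accepts a list of texts directly and handles batching and subword aggregation for you. The snippet below assumes `aspect_extractor` and `texts` as defined above:

```python
# Run the pipeline on all reviews at once (returns one list of entities per text)
batch_results = aspect_extractor(texts)

for text, entities in zip(texts, batch_results):
    print(text, "->", [e["word"] for e in entities])
```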
The label mappings used by the model:

```python
id2label = {
    0: "O",
    1: "B-ASPECT",
    2: "I-ASPECT"
}

label2id = {
    "O": 0,
    "B-ASPECT": 1,
    "I-ASPECT": 2
}
```
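For reference, these mappings are typically attached to the model configuration when fine-tuning the base checkpoint. The snippet below is a minimal sketch of the standard Hugging Face pattern, not the exact training script used for this model:

```python
from transformers import AutoModelForTokenClassification

# Initialize the base Turkish BERT with the three aspect labels (sketch, not the original training code)
model = AutoModelForTokenClassification.from_pretrained(
    "dbmdz/bert-base-turkish-cased",
    num_labels=3,
    id2label=id2label,
    label2id=label2id,
)
```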
This model is designed for extracting aspect terms (product features such as the camera, battery, price, or delivery mentioned in the examples above) from Turkish e-commerce reviews, for instance as the extraction stage of an aspect-based sentiment analysis pipeline.
If you use this model, please cite:

```bibtex
@misc{turkish-bert-aspect-extraction,
  title={Turkish BERT for Aspect Term Extraction},
  author={Abdullah Koçak},
  year={2025},
  url={https://huggingface.co/opdullah/bert-turkish-ecomm-aspect-extraction}
}
```

Please also cite the base model:

```bibtex
@misc{schweter2020bertbase,
  title={BERTurk - BERT models for Turkish},
  author={Stefan Schweter},
  year={2020},
  publisher={Hugging Face},
  url={https://huggingface.co/dbmdz/bert-base-turkish-cased}
}
```