Hotel Booking Cancellation Predictor

Predicts the probability that a hotel booking will be cancelled (Sri Lankan hospitality context). The champion model is XGBoost; threshold-based decisions currently use a threshold of 0.35 (see champion_meta.json).

Last updated: 2025-10-05 16:43 UTC

Key Metrics (Holdout)

Metric      Value
F1          0.8046506137865911
ROC-AUC     0.9384035807110922
Precision   0.841708852944808
Recall      0.7707179197286602
Accuracy    0.8613786749308987
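
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two figures. The sketch below uses only the holdout precision and recall listed above; the variable names are illustrative.

```python
# Recompute F1 from the reported holdout precision and recall.
precision = 0.841708852944808
recall = 0.7707179197286602

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.8047
```

The result agrees with the reported F1, which suggests the three figures come from the same set of predictions at the same threshold.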

Top Features (SHAP importance)

  • deposit_type
  • country__te
  • market_segment
  • total_of_special_requests
  • lead_time
  • required_car_parking_spaces
  • assigned_room_type
  • customer_type_target_encoded
  • reserved_room_type
  • previous_cancellations
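
This card does not say how the SHAP values were aggregated into a ranking. A common convention, assumed here, is to rank features by their mean absolute SHAP value across rows. The sketch below illustrates that convention on a made-up matrix; the numbers, the three-feature subset, and the helper name `mean_abs_importance` are hypothetical and do not come from the model's artifacts.

```python
# Toy SHAP matrix: rows are bookings, columns are features.
# The numbers are made up for illustration only.
feature_names = ["deposit_type", "lead_time", "adr"]
shap_values = [
    [0.40, -0.10, 0.02],
    [-0.35, 0.20, -0.01],
    [0.50, -0.15, 0.03],
]


def mean_abs_importance(values, names):
    """Return (feature, mean |SHAP|) pairs sorted from most to least important."""
    n_rows = len(values)
    scores = {
        name: sum(abs(row[j]) for row in values) / n_rows
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = mean_abs_importance(shap_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value first means a feature that pushes some predictions up and others down still counts as important, rather than cancelling itself out.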

Quickstart

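import json

import joblib
import pandas as pd
from huggingface_hub import snapshot_download

# Download the model repository (cached locally after the first call).
local_dir = snapshot_download(repo_id="j2damax/hotel-cancel-model")
model = joblib.load(f"{local_dir}/champion_model.pkl")
preprocessor = joblib.load(f"{local_dir}/preprocessor.pkl")
with open(f"{local_dir}/champion_meta.json") as f:
    meta = json.load(f)

# Illustrative booking. The preprocessor may require every column it was
# fitted on, so add any missing fields before calling transform().
sample = pd.DataFrame([{
    'lead_time': 45, 'arrival_month': 7, 'adults': 2, 'children': 0, 'adr': 110.0
}])

X = preprocessor.transform(sample)
# Column 1 of predict_proba is taken as the cancellation probability.
proba = float(model.predict_proba(X)[0, 1])
print('Cancellation probability:', round(proba, 4))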
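
To turn a probability into a cancel / no-cancel decision, it can be compared against the threshold quoted at the top of this card. The sketch below is a minimal illustration under two assumptions that this card does not confirm: that the metadata stores the value under a key named `threshold`, and that a probability equal to the threshold counts as a cancellation. The `example_meta` dict stands in for the loaded champion_meta.json.

```python
# Hypothetical metadata dict; the real key names in champion_meta.json may differ.
example_meta = {"threshold": 0.35}


def will_cancel(proba, threshold):
    """Flag a booking as a likely cancellation when proba reaches the threshold."""
    return proba >= threshold


# Fall back to the value quoted in this card if the key is missing.
threshold = example_meta.get("threshold", 0.35)
for proba in (0.12, 0.35, 0.80):
    print(proba, will_cancel(proba, threshold))
```

A threshold below 0.5 flags more bookings as likely cancellations, trading some precision for recall.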

Files

  • champion_model.pkl – serialized champion estimator
  • preprocessor.pkl – unified preprocessing / feature pipeline
  • champion_meta.json – metrics & threshold
  • Optional SHAP / feature importance JSON artifacts

Notes

The model was trained with stratified 5-fold cross-validation. Candidates were selected primarily on F1, with ROC-AUC as the tie-breaker. Class imbalance was handled with class weights.
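
The selection rule and the class weighting can be sketched in plain Python. Everything in the block is illustrative: the candidate names and scores are invented, and the "balanced" weighting formula (n_samples / (n_classes * class_count)) is one common choice, not necessarily the one used to train this model.

```python
from collections import Counter

# Hypothetical mean cross-validation scores per candidate model.
cv_scores = {
    "logistic_regression": {"f1": 0.74, "roc_auc": 0.88},
    "random_forest": {"f1": 0.80, "roc_auc": 0.92},
    "xgboost": {"f1": 0.80, "roc_auc": 0.94},
}


def select_champion(scores):
    """Pick the candidate with the highest F1, breaking ties on ROC-AUC."""
    # Tuples compare element by element, so ROC-AUC only matters when F1 is equal.
    return max(scores, key=lambda name: (scores[name]["f1"], scores[name]["roc_auc"]))


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


print(select_champion(cv_scores))
print(balanced_class_weights([0] * 6 + [1] * 4))
```

With these toy scores the two tree models tie on F1, so the higher ROC-AUC decides the winner. The weighting gives the rarer class the larger weight, so errors on it cost more during training.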

Citation

Academic coursework (NIB 7072): Sri Lankan tourism cancellation risk analysis.
