HW2 Airline Classification with AutoGluon
Model Description
This repository contains an AutoML model trained with AutoGluon Tabular to classify airlines based on flight features. The model was trained as part of Homework 2 in CMU 24-679 (Designing and Deploying AI/ML).
- Framework: AutoGluon Tabular
- Best model: WeightedEnsemble_L2 (ensemble of NeuralNetTorch, XGBoost, LightGBM, FastAI nets)
- Task: Multiclass classification (Airline)
- Classes: Spirit, Frontier, American, Southwest, Allegiant, Breeze, Air Canada, United
Results
- Validation Accuracy: 0.8792
- Original Data Accuracy: 1.0000 (30 samples)
- Original Data Weighted F1: 1.0000
Classification Report (Original Data)
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Air Canada | 1.00 | 1.00 | 1.00 | 1 |
| Allegiant | 1.00 | 1.00 | 1.00 | 2 |
| American | 1.00 | 1.00 | 1.00 | 6 |
| Breeze | 1.00 | 1.00 | 1.00 | 1 |
| Frontier | 1.00 | 1.00 | 1.00 | 7 |
| Southwest | 1.00 | 1.00 | 1.00 | 3 |
| Spirit | 1.00 | 1.00 | 1.00 | 9 |
| United | 1.00 | 1.00 | 1.00 | 1 |
- Overall Accuracy: 1.00
- Macro Avg F1: 1.00
- Weighted Avg F1: 1.00
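The macro and weighted F1 averages above coincide only because every class scores 1.00. The sketch below is a minimal, dependency-free illustration of how the three summary numbers relate to per-class F1 and support. The function names (`f1_per_class`, `summarize`) and the toy labels are hypothetical and are not taken from the homework code, which may well have used a library report instead.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {class: F1} for every class seen in the labels or the predictions."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(y_true, y_pred):
    """Compute accuracy, macro-averaged F1, and support-weighted F1."""
    f1 = f1_per_class(y_true, y_pred)
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "macro_f1": sum(f1.values()) / len(f1),  # every class counts equally
        "weighted_f1": sum(f1[c] * support[c] for c in f1) / n,  # weighted by support
    }


# Toy labels (made up): the rare class is missed, so macro F1 drops more than weighted F1.
toy_true = ["Spirit", "Spirit", "Spirit", "United"]
toy_pred = ["Spirit", "Spirit", "Spirit", "Spirit"]
print(summarize(toy_true, toy_pred))
```

With class supports as uneven as in the table (9 samples for Spirit, 1 for United), macro F1 is the more sensitive of the two averages to errors on the rare classes.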
How to Use
Install requirements
pip install autogluon==1.4.0 huggingface_hub cloudpickle
# Download the pickled predictor from the Hugging Face Hub and load it
import cloudpickle
from huggingface_hub import hf_hub_download

pkl_path = hf_hub_download(
    repo_id="cassieli226/hw1-airline-automl",
    filename="autogluon_predictor.pkl",
    repo_type="model",
)
with open(pkl_path, "rb") as f:
    predictor = cloudpickle.load(f)

# Predict the airline for a single example flight
import pandas as pd

X_test = pd.DataFrame({
    "Stops": [1],
    "Days from Departure": [30],
    "Flight_Time_Minutes": [120],
    "Price": [150],
    "Day of the Week": [3],
    "Destination": ["MCO"],
})
print(predictor.predict(X_test))
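Because the column names mix spaces and underscores ("Days from Departure" vs. "Flight_Time_Minutes"), a misspelled column is an easy mistake to make. The sketch below is one possible pre-flight check. It assumes the six columns in the example above are the full feature set, which this card does not state explicitly, and `EXPECTED_COLUMNS` and `check_columns` are hypothetical names introduced here for illustration.

```python
# Assumption: copied from the example input above; the trained predictor's
# own feature list, where available, is the authoritative source.
EXPECTED_COLUMNS = [
    "Stops",
    "Days from Departure",
    "Flight_Time_Minutes",
    "Price",
    "Day of the Week",
    "Destination",
]


def check_columns(columns, expected=EXPECTED_COLUMNS):
    """Return (missing, unexpected) column names, compared case-sensitively."""
    cols = list(columns)
    missing = [c for c in expected if c not in cols]
    unexpected = [c for c in cols if c not in expected]
    return missing, unexpected


# Example: a lowercase "price" is reported as unexpected, and "Price" as missing.
missing, unexpected = check_columns(["Stops", "price", "Destination"])
print("missing:", missing)
print("unexpected:", unexpected)
```

With a pandas DataFrame, the same check can be run as `check_columns(X_test.columns)` before calling `predictor.predict`.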
# Alternative: download the zipped native AutoGluon predictor directory and load it
import pathlib
import shutil
import zipfile

import autogluon.tabular as ag
from huggingface_hub import hf_hub_download

zip_path = hf_hub_download(
    repo_id="cassieli226/hw1-airline-automl",
    filename="autogluon_predictor_dir.zip",
    repo_type="model",
)

# Start from a clean directory so files from an earlier extraction do not linger
extract_dir = pathlib.Path("predictor_dir")
if extract_dir.exists():
    shutil.rmtree(extract_dir)
with zipfile.ZipFile(zip_path, "r") as zf:
    zf.extractall(str(extract_dir))

predictor = ag.TabularPredictor.load(str(extract_dir))
print(predictor.leaderboard(silent=True))