Model Transparency
We publish our prediction model's methodology and its measured accuracy. No black boxes.
How It Works
Every UFC fight since 1994, scraped from UFCStats.com. Fighter stats computed as point-in-time snapshots to prevent data leakage.
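As a rough illustration of a point-in-time snapshot, the sketch below walks fights in date order and reads each fighter's running totals before folding in the current result. The record layout and the `build_snapshots` helper are hypothetical, and it assumes every fight has a winner; the real pipeline tracks far more than win counts.

```python
from collections import defaultdict

def build_snapshots(fights):
    """Return one feature row per fight using only information from earlier fights.

    `fights` is a list of dicts with keys: date (ISO string), fighter_a, fighter_b, winner.
    """
    record = defaultdict(lambda: {"wins": 0, "fights": 0})
    rows = []
    for fight in sorted(fights, key=lambda f: f["date"]):
        a, b = fight["fighter_a"], fight["fighter_b"]
        # Snapshot BEFORE updating, so a fight's own result never leaks into its features.
        rows.append({
            "date": fight["date"],
            "a_prior_fights": record[a]["fights"],
            "a_prior_wins": record[a]["wins"],
            "b_prior_fights": record[b]["fights"],
            "b_prior_wins": record[b]["wins"],
            "a_won": int(fight["winner"] == a),
        })
        # Only now fold the result into the running totals.
        for name in (a, b):
            record[name]["fights"] += 1
        record[fight["winner"]]["wins"] += 1
    return rows

# Made-up fights, for illustration only.
fights = [
    {"date": "2021-01-01", "fighter_a": "A", "fighter_b": "C", "winner": "C"},
    {"date": "2020-01-01", "fighter_a": "A", "fighter_b": "B", "winner": "A"},
]
rows = build_snapshots(fights)
```

Computing the same stats over the full dataset and joining them back onto each fight would leak future results into past rows, which is what the ordering above avoids.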
LightGBM + CatBoost ensemble (65/35 blend), tuned with Optuna (50 trials each). Trained on pre-2022 data, validated on 2022, tested on 2023.
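The split and the blend can be sketched in a few lines. This assumes the 65% weight sits on LightGBM (following the order the models are listed in) and that the blend is a plain weighted average of the two predicted probabilities; the `year` field and both function names are illustrative.

```python
def temporal_split(fights):
    """Split fights by calendar year: train before 2022, validate on 2022, test on 2023."""
    train = [f for f in fights if f["year"] < 2022]
    valid = [f for f in fights if f["year"] == 2022]
    test = [f for f in fights if f["year"] == 2023]
    return train, valid, test

def blend(p_lightgbm, p_catboost, w_lightgbm=0.65):
    """Weighted average of two win probabilities for the same fighter (weight assignment assumed)."""
    return w_lightgbm * p_lightgbm + (1.0 - w_lightgbm) * p_catboost

# Made-up inputs, for illustration only.
fights = [{"year": 2019}, {"year": 2021}, {"year": 2022}, {"year": 2023}]
train, valid, test = temporal_split(fights)
p_blend = blend(0.70, 0.60)
```

Splitting by time rather than at random keeps every training fight earlier than every evaluation fight, which mirrors how the model is used on upcoming cards.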
45 features including Elo ratings, rolling fight stats, defensive metrics, style matchups, and market odds when available.
Confidence Scaling
The model's confidence labels reflect real accuracy. When we say “Strong”, we mean it.
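One way to check that a label means what it says is to group past predictions by label and measure how often each group was right. The sketch below does that; the cutoff values are placeholders chosen for illustration, not the published tier boundaries.

```python
import bisect

# Hypothetical cutoffs on the favored fighter's probability (assumed, not the real boundaries).
CUTOFFS = [0.55, 0.62, 0.72]
LABELS = ["Toss-up", "Lean", "Confident", "Strong"]

def confidence_tier(p_win):
    """Label a prediction by the probability of whichever side the model favors."""
    favored = max(p_win, 1.0 - p_win)
    return LABELS[bisect.bisect_right(CUTOFFS, favored)]

def tier_accuracy(predictions):
    """Observed accuracy per tier from (p_win, won) pairs, where won is 1 or 0."""
    hits, totals = {}, {}
    for p_win, won in predictions:
        tier = confidence_tier(p_win)
        correct = (p_win >= 0.5) == bool(won)
        totals[tier] = totals.get(tier, 0) + 1
        hits[tier] = hits.get(tier, 0) + int(correct)
    return {tier: hits[tier] / totals[tier] for tier in totals}

# Made-up predictions, for illustration only.
by_tier = tier_accuracy([(0.80, 1), (0.20, 0), (0.90, 0), (0.60, 1)])
```

If the labels are honest, accuracy should rise from one tier to the next when this is run on held-out fights.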
Calibration
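When the model says 70%, does that fighter actually win ~70% of the time? The closer each range's actual win rate is to its mean predicted probability (the diagonal on a calibration plot), the better calibrated the model. Difference is actual win rate minus mean predicted, in percentage points (pp).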
| Predicted Range | Mean Predicted | Actual Win Rate | Fights | Difference |
|---|---|---|---|---|
| 10-20% | 17.7% | 23.7% | 38 | +6.0pp |
| 20-30% | 25.6% | 25.7% | 171 | +0.1pp |
| 30-40% | 35.1% | 32.2% | 239 | -2.9pp |
| 40-50% | 44.8% | 48.3% | 331 | +3.5pp |
| 50-60% | 54.8% | 57.6% | 304 | +2.7pp |
| 60-70% | 64.9% | 64.1% | 265 | -0.8pp |
| 70-80% | 74.5% | 75.0% | 152 | +0.5pp |
| 80-90% | 84.7% | 93.8% | 96 | +9.1pp |
| 90-100% | 91.0% | 88.9% | 9 | -2.1pp |
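A table like the one above can be built by bucketing predictions into equal-width ranges and comparing each bucket's mean prediction with its observed win rate. The following is a minimal sketch with hypothetical function names and made-up inputs; the fight-weighted average gap it reports is one common single-number summary, not necessarily the metric used internally.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Bin predictions into equal-width ranges and compare predicted vs. actual win rates."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for idx, members in enumerate(bins):
        if not members:
            continue  # skip empty ranges, as the table above does for 0-10%
        mean_pred = sum(p for p, _ in members) / len(members)
        win_rate = sum(y for _, y in members) / len(members)
        rows.append({
            "range": f"{idx * 100 // n_bins}-{(idx + 1) * 100 // n_bins}%",
            "mean_predicted": mean_pred,
            "actual_win_rate": win_rate,
            "fights": len(members),
            "difference_pp": 100 * (win_rate - mean_pred),  # actual minus predicted
        })
    return rows

def weighted_abs_gap(rows):
    """Fight-weighted mean absolute calibration gap, in percentage points."""
    total = sum(r["fights"] for r in rows)
    return sum(r["fights"] * abs(r["difference_pp"]) for r in rows) / total

# Made-up predictions and outcomes, for illustration only.
rows = calibration_table([0.72, 0.78, 0.31, 0.35], [1, 1, 0, 1])
gap = weighted_abs_gap(rows)
```

Weighting by fight count matters here: a sparsely populated range such as 90-100% (9 fights) can show a large gap from noise alone.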
Performance by Weight Class
What Drives Predictions
Top features ranked by SHAP importance (mean absolute impact on predictions).
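For readers unfamiliar with the metric: given a matrix of per-prediction SHAP values (one row per prediction, one column per feature), the ranking is the column-wise mean of absolute values. The sketch below assumes the SHAP values are already computed; the feature names and numbers are invented for illustration.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by mean |SHAP| across all predictions, largest first."""
    n_rows = len(shap_values)
    importance = [
        sum(abs(row[j]) for row in shap_values) / n_rows
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, importance), key=lambda pair: pair[1], reverse=True)

# Hypothetical feature names and made-up SHAP values for two predictions.
names = ["elo_diff", "reach_diff", "closing_odds"]
values = [
    [0.30, -0.05, 0.10],
    [-0.20, 0.01, -0.30],
]
ranking = rank_by_mean_abs_shap(values, names)
```

Taking absolute values first means a feature that pushes predictions strongly in both directions still ranks high, instead of averaging out to zero.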
Honest Assessment: Model vs Market
On the test fights that have betting odds (1,080 of 1,605), the closing line achieves 68.0% accuracy, and our model matches it at 68.1% on the same subset. Overall test accuracy across all 1,605 fights is 65.2%.
Where our model adds value:
- Fights without odds data: the model still predicts at 59.0% accuracy
- Structural analysis: Elo, style matchups, and stat differentials give context the line doesn't
- Early line detection: spots value before lines sharpen
- High-confidence picks (>72% model probability) hit at 81.3%
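A comparison against the market needs the closing line expressed as a probability. The sketch below assumes American moneyline odds and removes the bookmaker's margin by simple proportional normalization; both are assumptions, since the odds format and de-vigging method used for the figures above are not stated here.

```python
def implied_probability(american_odds):
    """Convert American moneyline odds to the implied probability (bookmaker margin included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

def no_vig_probabilities(odds_a, odds_b):
    """Rescale both sides so the two probabilities sum to 1 (proportional method, assumed)."""
    raw_a, raw_b = implied_probability(odds_a), implied_probability(odds_b)
    total = raw_a + raw_b
    return raw_a / total, raw_b / total

def accuracy(probs_a, a_won):
    """Share of fights where the side given probability >= 0.5 actually won."""
    hits = sum((p >= 0.5) == bool(w) for p, w in zip(probs_a, a_won))
    return hits / len(a_won)

# Made-up closing line: fighter A at -200, fighter B at +170.
p_a, p_b = no_vig_probabilities(-200, 170)
market_acc = accuracy([p_a, 0.40, 0.55], [1, 0, 0])
```

Running `accuracy` on the market's probabilities and on the model's probabilities over the same fights gives a like-for-like comparison.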
Training Details
What Changed in v2
- Dropped team features: removed gym/team encodings that added noise without improving accuracy.
- Added interaction features: offensive efficiency (strikes landed per strike absorbed) captures two-way matchup dynamics better than raw stat differentials.
- Adaptive Elo K-factor: Elo ratings now update faster for early-career fighters and more slowly for veterans, which makes ratings more responsive.
- Data-driven confidence thresholds: tier boundaries (toss-up / lean / confident / strong) are now set from observed accuracy curves instead of arbitrary cutoffs.
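The adaptive K-factor idea can be sketched with the standard Elo update plus a K that shrinks as a fighter accumulates fights. The decay schedule and the constants (`k_max`, `k_min`, `decay`) are hypothetical and chosen only to show the shape; they are not the tuned values.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def adaptive_k(prior_fights, k_max=48.0, k_min=16.0, decay=0.15):
    """K shrinks from k_max toward k_min as prior fights accumulate (hypothetical schedule)."""
    return k_min + (k_max - k_min) / (1.0 + decay * prior_fights)

def update_elo(rating_a, rating_b, fights_a, fights_b, a_won):
    """Return both fighters' new ratings; each side moves by its own K-factor."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + adaptive_k(fights_a) * (score_a - exp_a)
    new_b = rating_b + adaptive_k(fights_b) * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# A debuting fighter beats a 20-fight veteran; both start at 1500 for illustration.
new_rookie, new_veteran = update_elo(1500.0, 1500.0, 0, 20, a_won=True)
```

Because each fighter uses their own K, the newcomer's rating moves further than the veteran's after the same fight, so the update is no longer zero-sum.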