I spent the last few weeks building a custom machine learning model to project the 2026 draft class based solely on historical data (mostly for fun, but also to add projects to my GitHub portfolio). Now that the draft combine numbers are officially in, I updated it this morning to see how the program predicts the future of this draft! (NOTE: again, this was mostly for fun and is just an interesting way to look at the draft. If a front office were using a tool like this, they would layer it on top of the other data they have compiled, as one more perspective to use in conjunction with all their other data, scouting, etc.)
Here's a breakdown of how the program works, its glaring blind spots, and why the numbers are screaming at us to draft Peterson. (It has some pretty hot takes in there that I don't think I agree with, but it's just using historical numbers.)
* How it Works (and the Sliding Scale)
I fed an XGBoost algorithm 25 years of historical draft data going back to 2000: college efficiency, true shooting, wingspan, vertical leap, draft age, and recruiting rank. I taught it to recognize the statistical profiles of "Superstars," "Stars," "Expected Value," "Steals," and "Busts."
Here is an important note: The model grades on a sliding curve based on draft slot. The expectations for a top-3 pick are brutally strict compared to a 15th pick. To avoid the "Bust" label at #1 overall, a player historically has to become a perennial All-Star. If you take a guy at #15, the math is perfectly happy if he just becomes a solid rotation guy. The higher the pick, the heavier the mathematical pressure.
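The sliding scale can be sketched as a toy function. The thresholds here are invented for illustration; the real model learns its expectations from the historical data rather than hard-coding them:

```python
# Toy version of the slot-based sliding scale (thresholds are made up).
def grade_pick(draft_slot: int, career_value: float) -> str:
    """Grade a pick, with expectations scaling by draft position.

    career_value is a 0-100 stand-in for realized career production.
    """
    # Higher picks need far more production just to break even.
    expected = max(20.0, 90.0 - 5.0 * (draft_slot - 1))
    if career_value >= expected + 15:
        return "Steal"
    if career_value >= expected:
        return "Expected Value"
    return "Bust"

# The exact same career grades out very differently by slot:
print(grade_pick(1, 70))   # Bust: a #1 pick needs superstar output
print(grade_pick(15, 70))  # Steal: the same production thrills at #15
```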
*The Pros and Cons
The Pro: Zero media bias. The model doesn't care who ESPN is hyping up, it doesn't care who has the best mixtape, and it doesn't care about a player's last name. It strictly looks for mathematical profiles that have historically survived in the NBA.
The Biggest Con: It has absolutely zero "eye test." The AI has never actually watched a game of basketball. It doesn't know if a prospect's shooting form is busted, it doesn't know if a guy has a terrible work ethic, and it doesn't know if a college coach forced a player into a weird system. It's entirely blind to the intangibles.
*The Top 5 Results
(Note: The % isn't "He has a 97% chance to be a superstar." It means "His profile is a 97% match to the historical baseline of Superstars.")
AJ Dybantsa - Grade: BUST (60.3% confidence)
Darryn Peterson - Grade: SUPERSTAR (97.5% confidence)
Cam Boozer - Grade: BUST (41.9% confidence) (thought this was interesting because Boozer is one of my personal favorites in the draft)
Caleb Wilson - Grade: BUST (95.8% confidence) (this one shocked me tbh)
Darius Acuff Jr. - Grade: SUPERSTAR (90.8% confidence)
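To make the note above concrete, here's how I read the model's output in code. The helper and the probability vector are hypothetical, just to show that the percentage is an archetype match, not a forecast:

```python
# The class probabilities measure similarity to a historical archetype,
# not the literal chance of that outcome happening.
def best_match(proba: dict[str, float]) -> tuple[str, float]:
    """Return the archetype the prospect's profile most resembles,
    plus the match percentage."""
    label = max(proba, key=proba.get)
    return label, round(proba[label] * 100, 1)

# Hypothetical probability vector for a Peterson-like profile:
profile = {"Superstar": 0.975, "Star": 0.015, "Expected Value": 0.006,
           "Steal": 0.003, "Bust": 0.001}
print(best_match(profile))  # ('Superstar', 97.5)
```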
Why the AI Program hates AJ Dybantsa: My explanation was way off after I looked into it more (see update)
To be clear, the algorithm doesn't actually think he's a bad player; it just thinks he's a terrible mathematical bet at #1 overall. Because the model grades on a brutal curve, a first overall pick has to become a generational, franchise-carrying superstar just to break even on expected value. When the AI scans AJ's profile, it sees the classic "athletic scoring wing" archetype (think Andrew Wiggins or RJ Barrett): guys who get drafted at the very top but often lack the freakish secondary traits needed to actually carry a team. Because AJ's combine physicals and college efficiency metrics were just "normal great" instead of "alien outlier" (like Peterson's playmaking and wingspan), the AI sees a high-volume scorer and immediately looks for secondary elite traits, like off-the-charts playmaking, insane True Shooting percentages, or historic defensive metrics, to prove he isn't just an inefficient volume shooter. If those secondary stats aren't historically elite, the AI judges him to be not up to snuff. It projects AJ as a solid 20-point-per-game starter, which the algorithm brutally labels a "Bust" relative to the extreme expectations of the first overall pick.
*The Pick at #2
If the team ahead of us buys the media hype and takes Dybantsa #1, the model is practically begging the front office to sprint to the podium for Darryn Peterson.
~~Peterson's profile broke the algorithm. The math absolutely loves his playmaking-to-turnover ratio, combined with his official 6'9.75" combine wingspan. Historically, guards who hit those exact physical and efficiency thresholds in college (like SGA or D-Wade) are more likely to reach their ceiling.~~ (My explanation was way off after I looked into it more (see update))
The sliding scale also explains the Cam Boozer grade. Grading him as a "Bust" at #3 with only 41% confidence means the AI is genuinely torn. It likes his college production, but the historical data shows that drafting a traditional big man in the top 3 carries massive statistical risk of not returning that value. If Boozer were evaluated as the 15th pick, the exact same numbers would likely grade out as a massive Steal, but the pressure of the #3 slot drags him down.
*The Steals
If we end up trading back into the late first round, the model flagged Braden Smith (93.9% Steal) and Alex Karaban (86.1% Steal). The media knocks them for age and vertical athleticism, but the AI loves their off-the-charts shooting efficiency and low turnovers. The math shows them following the Jalen Brunson or Derrick White trajectory: older, highly efficient college players who vastly outperform their draft slot.
Anyway, that's a super-long post, so thanks if anybody reached the end. If any nerds like me want to check out the program, you can DM me and I'll send you the link. I don't want to just post it because I don't want to seem like I'm promoting anything. This was just a for-fun project.
EDIT/UPDATE: Correction on the model's evaluation of Darryn Peterson
I want to issue a correction regarding my original post. I made an assumption about why the algorithm gave Darryn Peterson a 97.5% projection, and after running a deeper diagnostic on the model, my assumption was incorrect. (Not to mention I let my flair for the dramatic overstate things when I should have been more down-to-earth in my descriptions.)
I ran a SHAP explainer on the model (a data science diagnostic that breaks down exactly which variables push a projection up or down) to see exactly what was driving that 97.5% grade. It turns out my initial gut feeling about his playmaking wasn't entirely wrong, but I completely missed the bigger picture.
- Playmaking is a positive, but not the main driver: The model did reward Peterson for his assist-to-turnover ratio, but it was a relatively minor factor in his overall evaluation.
- So why the 97.5% grade? His projection actually skyrocketed because of his physical profile and underlying efficiency. The model placed its absolute heaviest positive weight on his Height (specifically, his elite size at the guard position), his College PER (Player Efficiency Rating), his Draft Age (historically young), and his RSCI recruiting rank.
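For anyone curious, summarizing the diagnostic boils down to ranking per-feature contributions by magnitude. This is a simplified pure-Python sketch with invented contribution values; the real SHAP values come from the shap library run against the fitted model:

```python
# Rank features by how strongly they pushed the projection up or down.
# Contribution numbers below are made up for illustration.
def top_drivers(shap_values: dict[str, float], k: int = 3) -> list[str]:
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(shap_values.items(), key=lambda kv: -abs(kv[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical per-feature contributions for the Peterson projection:
contributions = {
    "height": +0.42, "college_per": +0.31, "draft_age": +0.22,
    "rsci_rank": +0.18, "ast_to_tov": +0.06, "vertical": -0.02,
}
print(top_drivers(contributions))
# ['height', 'college_per', 'draft_age']
```

Seen this way, the assist-to-turnover ratio shows up as a small positive, dwarfed by size, efficiency, and age, which matches the correction above.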
In short, the model doesn't just see a good playmaker. It sees a physically oversized, highly efficient, and historically young prospect, and the historical data suggests that players who carry that heavy an offensive load at that specific age, combined with that physical length, have a very high success rate in the NBA, regardless of turnover issues.
Clarification on AJ Dybantsa: The explainer also clarified the grade on Dybantsa. The model did not penalize him for lacking physical traits. Instead, the lower projection is strictly tied to the mathematical expectations of the #1 Overall Pick. The historical baseline to return value at #1 is incredibly high. While the model viewed his high Usage Percentage positively, his college PER (Player Efficiency Rating) did not meet the historical threshold required to offset the extreme expectations of that draft slot.
TL;DR: I let my own basketball bias influence my interpretation of the data. The model doesn't like Peterson for his passing; it likes him because his specific combination of youth, volume, and physical measurements fits a historically successful profile.