Giant Killers: Going out on a limb with our decision tree analysis
This story appears in ESPN The Magazine's March 27 Analytics Issue.
Decisions, decisions. For 11 years, as we have prospected for March Madness Cinderellas, our Giant Killers project has followed one principle: Isolate the statistical elements common to big favorites and deep underdogs from the past, then screen new contestants for those traits to predict the future. Works pretty well, but now, thanks to math professors Liz Bouzarth, John Harris and Kevin Hutson, our Giant Killer colleagues at Furman University, we've added a wrinkle, one that reflects more intuitively how humans make tough choices. It's called decision tree analysis, and it allows us to look at all the Giant vs. Killer contests in our database (which dates to 2007), then ask a series of questions to split the games into purer and purer groups of matchups with similar outcomes: upsets and non-upsets. "In creating the decision tree, we rely on the power of statistics to help identify the things we wish we could identify by eye: natural groupings of complicated data," Bouzarth says.
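For readers who want to see what "purer and purer groups" can mean in practice, here is a minimal sketch of how a single question might be chosen. It assumes a standard purity measure (Gini impurity), which the article does not confirm as the Furman team's choice, and the feature names and toy games are made up for illustration; they are not Giant Killers data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (1 = upset, 0 = no upset)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def split_impurity(games, feature, threshold):
    """Weighted impurity of the two groups made by asking
    'is this feature <= threshold?'."""
    left = [g["upset"] for g in games if g[feature] <= threshold]
    right = [g["upset"] for g in games if g[feature] > threshold]
    n = len(games)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_question(games, features):
    """Try every observed value of every feature as a threshold and
    return the (feature, threshold, impurity) with the purest split."""
    candidates = [
        (split_impurity(games, f, g[f]), f, g[f])
        for f in features
        for g in games
    ]
    impurity, feature, threshold = min(candidates)
    return feature, threshold, impurity


# Hypothetical toy games: invented numbers, NOT real tournament data.
toy_games = [
    {"killer_tempo": 62, "giant_oreb_pct": 28, "upset": 1},
    {"killer_tempo": 63, "giant_oreb_pct": 30, "upset": 1},
    {"killer_tempo": 64, "giant_oreb_pct": 37, "upset": 0},
    {"killer_tempo": 70, "giant_oreb_pct": 29, "upset": 0},
    {"killer_tempo": 71, "giant_oreb_pct": 38, "upset": 0},
    {"killer_tempo": 72, "giant_oreb_pct": 36, "upset": 0},
]

result = best_question(toy_games, ["killer_tempo", "giant_oreb_pct"])
print(result)
```

In this toy set, asking whether the underdog's tempo is 63 or lower separates the upsets from the non-upsets perfectly, so that question wins. A full tree would repeat the same search inside each resulting group.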
For example, in 2014, 12-seed Stephen F. Austin was a strong underdog, playing at a slow tempo with a good shooting touch, facing a Giant in VCU that didn't dominate the offensive boards. Which meant SFA-VCU "branched" to a final "leaf" shared by a bunch of notable killings, such as Davidson over Georgetown in 2008. Lo and behold, SFA sprang an upset. On the other hand, the next season, Buffalo was a fast-paced 12-seed playing a favorite that crushed the offensive boards. In our decision tree, that path ends with Giants going 26-0! (Buffalo lost.)
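The two paths just described can be sketched as a walk down a tiny hand-built tree. This is an illustration only: the order of the questions, the question names and the yes/no simplification are assumptions, and leaves the article does not describe are marked as such rather than filled in.

```python
# Each internal node asks one yes/no question; each leaf is a description.
# The question order and names are hypothetical, chosen to mirror the text.
TREE = {
    "question": "giant_dominates_offensive_boards",
    "yes": {
        "question": "killer_plays_fast",
        "yes": "Giants 26-0 on this path",
        "no": "not described in the article",
    },
    "no": {
        "question": "killer_plays_slow_and_shoots_well",
        "yes": "upset-prone leaf (shared with Davidson over Georgetown, 2008)",
        "no": "not described in the article",
    },
}


def find_leaf(node, matchup):
    """Follow yes/no answers from the root until a leaf is reached."""
    while isinstance(node, dict):
        answer = matchup[node["question"]]
        node = node["yes"] if answer else node["no"]
    return node


# Answers taken from the article's descriptions of the two games.
sfa_vs_vcu_2014 = {
    "giant_dominates_offensive_boards": False,
    "killer_plays_slow_and_shoots_well": True,
}
buffalo_2015 = {
    "giant_dominates_offensive_boards": True,
    "killer_plays_fast": True,
}

print(find_leaf(TREE, sfa_vs_vcu_2014))
print(find_leaf(TREE, buffalo_2015))
```

The point of the structure is that a matchup only answers the questions on its own path, which is why two 12-seeds can end up in very different company.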
This year, in advance of Selection Sunday, we applied decision tree analysis to supplement our metrics and see where the top Giants' branches could lead. We'll go out on a limb and claim that the following five cases are particularly interesting.
Villanova
Giant Rating* 93.9
Potential Nightmare Opponent: Miami (Giant Killer Rating**: 30.2)
You don't need an advanced abacus to realize Villanova is once again a terrific team, rated tops in the nation by BPI. Over the past four seasons, the Wildcats have embraced the small-ball lessons of analytics: They typically start four players under 6-foot-6 who constantly threaten to shoot deep 3s, which creates space for open looks inside. Conversely, they switch and occasionally press at the perimeter to harass opponents' long-range shooting. The results have been outstanding: After adjusting for strength of opponents, Villanova scores 123.2 points per 100 possessions (third in the country) and allows 93 (ranking 16th). However, the Wildcats' high-risk/high-reward style resembles successful Killers of the past rather than safe Giants. Our statistical model is suspicious of Villanova's reliance on 3s, its average offensive rebounding and its slow tempo. Assuming the Wildcats get by their first-round opponent, half a dozen branches on our decision tree lead to trouble in later rounds.
Miami could be particularly tough. Jim Larranaga's crew plays even slower than Butler, which dragged Villanova into two losses. The Hurricanes force opponents into low-efficiency terrain, and they rank 45th in the nation in offensive rebounding, nabbing 33.8 percent of their own missed shots. So watch out. Through tactics such as deploying long-spanned but wing-sized Eric Paschall at center, maybe Villanova is redefining what it means to play as a top seed. But while these Wildcats are statistically similar to last year's national champs, they also resemble the 2014-15 team that earned a 1-seed in the NCAA tournament but lost in the third round.
Gonzaga
Giant Rating 92.1
Potential Nightmare Opponent: South Carolina (Giant Killer Rating: 31.2)
Gonzaga is ridiculously efficient. After adjusting for strength of opponents, the Bulldogs score 120.9 points per 100 possessions and allow 88, making them the only program in the country to rate in the top 10 in both categories. But while they rank No. 2 in the NCAA according to BPI, they are just the 12th-safest Giant according to our model. Their strengths lie in hitting and defending shots; when it comes to creating or snuffing out chances, through rebounding or turnovers, they're slightly weaker. So Gonzaga is in a statistical zone where overdogs have to start worrying about Killers: Since 2007, Giants rated above .950 by our model have lost twice in 38 NCAA tournament games; teams from .900 to .950 have lost 17 of 101 contests, mostly to opponents beyond the first round.
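To put those two records on the same footing, the arithmetic can be sketched in a few lines. The win-loss counts come straight from the paragraph above; the function and variable names are just illustrative.

```python
def upset_rate(losses, games):
    """Share of games in which the Giant lost."""
    return losses / games


# Records quoted in the text, for games since 2007.
top_tier = upset_rate(2, 38)    # Giants rated above .950
next_tier = upset_rate(17, 101)  # Giants rated from .900 to .950

print(f"Above .950:   {top_tier:.1%}")   # 5.3%
print(f".900 to .950: {next_tier:.1%}")  # 16.8%
print(f"Ratio: {next_tier / top_tier:.1f}x")  # 3.2x
```

In other words, Giants in Gonzaga's tier have been upset roughly three times as often as those in the safest tier.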
South Carolina is an immovable object the Zags should be wary of. The Gamecocks space the perimeter to disrupt passing lanes and allow 87.9 adjusted points per 100 possessions, second fewest in D1. They also grab 34.2 percent of their missed shots (ranking 35th), which is particularly helpful for Killers facing good but generic Giants like Gonzaga.
UCLA
Giant Rating 83.7
Potential Nightmare Opponent: UNC Wilmington (Giant Killer Rating: 24.8)
Yes, we know: Lonzo Ball is lots of fun, and UCLA can shoot lights-out from any ZIP code. The Bruins' 60.4 percent effective field goal percentage is the highest of any NCAA team in at least the past 16 seasons. But it's our job to tell you that Giants bulletproof themselves by building possessions, in case they lose their touch for a night or run into a dangerous or unfamiliar Killer. The Bruins don't nab offensive rebounds (ranking 162nd) and are terrible at forcing turnovers (ranking 307th), which makes them similar to vulnerable Giants from past seasons.
UCLA is also allowing opponents to shoot 35.9 percent from long distance (ranking 221st), which could hurt against a Killer such as UNC Wilmington, a team that hoists 3s on 41.8 percent of attempts (ranking 47th). The Seahawks play trappy defense, protect the ball better than any other team and happen to have a monstrous dunker named Devontae Cacok. They're a sharpshooting underdog with a diverse arsenal, leading the Bruins to particularly dicey territory on our decision tree. If the two teams land in this matchup, our analysis sees UCLA-Wilmington as similar to notable upsets such as that VCU vs. Stephen F. Austin game in 2014 and Vanderbilt vs. Richmond in 2011.
Baylor
Giant Rating 95.5
Potential Nightmare Opponent: East Tennessee State (Giant Killer Rating: 18.8)
The most important thing a Giant can do to hold off Killers is dominate the offensive boards: Superior talent plus extra possessions usually adds up to safety.
Baylor routinely ranks as a top-five team in offensive rebounding percentage but has gone down in huge first-round upsets in each of the past two NCAA tournaments. So here's where matchups come into play: Our decision tree analysis says a Power Giant such as Baylor was due for a hard time against Georgia State in 2015 and Yale in 2016. Those underdogs were strong, slow-paced teams, plus the Panthers forced scads of turnovers and the Bulldogs went after offensive rebounds as if they were extra points on the LSAT.
This time around, even with the Bears falling to the 2-line, no likely opponent branches in those directions. Most very low seeds are weak or fast and lack much ability to create extra possessions. Among them, East Tennessee State would make Baylor's most formidable foe: The Bucs are better than typical 14- (or even 13-) seeds; their defense is 11th in the country in steal percentage, and they can hit shots from anywhere. But even in that worst-case scenario, our model says Baylor would have nearly an 80 percent chance of winning.
West Virginia
Giant Rating 98.8
Potential Nightmare Opponent: Princeton (Giant Killer Rating: 22.3)
Two years ago, smoke started puffing out of our spreadsheets as they analyzed West Virginia. Bob Huggins' Mountaineers rank sixth in the country by hauling in 38.3 percent of their missed shots. They also press more often than any other team, forcing turnovers on 28.8 percent of opponent possessions. It's an extremely valuable combination in the NCAA tournament: If you constantly give yourself second chances and deny opponents first chances, it's nearly impossible for a weaker foe to build a run against you. Our model estimates that West Virginia's statistical profile will boost its strength against an average March Madness underdog by 14.9 points per 100 possessions, the biggest gain we have ever seen. Lacking a lights-out shooter, the Mountaineers have struggled in close Big 12 games. But their ability to seize the ball is discombobulating: After watching his Wildcats' 34 turnovers against WVU, New Hampshire coach Bill Herrion said, "I don't want to go through that again."
If anyone can give the Mountaineers trouble, our analysis says it's Princeton. The Tigers are patient and are well-versed in passing and cutting to create open 3-point looks. Even so, our model gives them only an 11.6 percent chance of taking down West Virginia.
*Meaning an estimated 93.9 percent chance of defeating an average NCAA tournament Killer.
**Meaning an estimated 30.2 percent chance of beating an average tournament Giant.
Stats through March 8, courtesy KenPom.com.