The American college basketball tournaments known as “March Madness” begin this week. College basketball, or National Collegiate Athletic Association (N.C.A.A.) basketball, is very popular in the United States. In parts of the country it is even more popular than professional basketball.
Many people like to try to guess who will win the many games played over the next few weeks of competition. Sixty-seven games will be played in each of the men’s and women’s tournaments.
A chart that shows the sequence of games is called a bracket. Thousands of fans in the U.S. compete with each other to correctly predict the outcomes of the most games.
Today, more people are using artificial intelligence, or AI, to help them fill out their brackets. Using AI for bracketing in the tournament is not so new. Even so, the yearly bracket competitions still provide many surprises for computer science experts who have spent years creating their models using past tournament results.
The researchers have found that machine learning alone cannot quite solve for the limited data and unpredictable human elements of the tournament.
A normal fan may spend a few days this week deciding which team might win a few games in the tournament. But some computer experts are going after even more detailed information. They are using complex math to find the best model for predicting success in the tournament. Some are using AI to perfect their code or to decide which qualities of a team best predict its success.
The chances of creating a perfect bracket are extremely low for any competitor, however advanced their tools may be. An “informed fan” making choices based on past results has a 1 in 2 billion chance at perfection, says Ezra Miller. He is a mathematics professor at Duke University.
Artificial intelligence is likely very good at determining the probability that a team wins, Miller said. But even with the models, he added that the “random choice of who’s going to win a game that’s evenly matched” is still a random choice.
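The size of these odds can be checked with a little arithmetic. The short Python sketch below is only an illustration, not Miller’s own calculation. It assumes a bracket of 63 games (the main 64-team field, leaving out the play-in games) and assumes that every game is guessed independently with the same accuracy. The function names are made up for this example.

```python
import math

GAMES = 63  # assumption: main 64-team bracket, play-in games left out


def perfect_bracket_odds(per_game_accuracy: float, games: int = GAMES) -> float:
    """Return N, where the chance of a perfect bracket is 1 in N."""
    return 1.0 / (per_game_accuracy ** games)


def implied_accuracy(one_in: float, games: int = GAMES) -> float:
    """Per-game accuracy that would give a 1-in-`one_in` chance of perfection."""
    return math.exp(-math.log(one_in) / games)


# Picking every game by coin flip (50 percent accuracy per game).
coin_flip = perfect_bracket_odds(0.5)

# Accuracy per game that matches the "1 in 2 billion" figure for an informed fan.
informed = implied_accuracy(2e9)

print(f"Coin flips: 1 in {coin_flip:.2e}")
print(f"Per-game accuracy implied by 1 in 2 billion: {informed:.1%}")
```

Under these assumptions, a 1 in 2 billion chance matches a fan who picks each game correctly about 71 percent of the time. Even that level of accuracy, repeated over 63 games, leaves almost no chance of a perfect bracket.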
For the 10th straight year, the data science community Kaggle is hosting “Machine Learning Madness.” In traditional bracket competitions, people simply write each team they think will win. But “Machine Learning Madness” requires users to enter a percentage representing their level of confidence that a team will advance.
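To see why a confidence percentage changes the game, it helps to look at how such forecasts can be scored. The Python sketch below uses the Brier score, a standard way to grade probability forecasts. It is used here only as an example and is not claimed to be the scoring rule Kaggle applies. The three games and all the numbers are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Average squared gap between forecasts and results (1 = win, 0 = loss).

    Lower is better: 0.0 is a perfect forecast.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical results for three games: the favored team won, won, then lost.
outcomes = [1, 1, 0]

# A traditional bracket is all-or-nothing: 100 percent confidence in each pick.
all_or_nothing = [1.0, 1.0, 1.0]

# A probability entry can admit doubt about the closer games.
hedged = [0.8, 0.6, 0.5]

print(f"All-or-nothing picks: {brier_score(all_or_nothing, outcomes):.3f}")
print(f"Hedged probabilities: {brier_score(hedged, outcomes):.3f}")
```

In this made-up example the hedged entry gets the better (lower) score, because it loses less on the game it got wrong. A scoring rule like this rewards honest confidence levels rather than bold guesses.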
Kaggle provides a large data set from past results for people to develop their algorithms. That includes information on a team’s free-throw percentage, turnovers and assists. Users can then turn that information over to an algorithm to find the statistics most predictive of tournament success.
“It’s a fair fight. There’s people who know a lot about basketball and can use what they know,” said Jeff Sonas. He is a statistical chess analyst who helped found the competition. “It is also possible for someone who doesn’t know a lot about basketball but is good at learning how to use data to make predictions.”
No method will include every element at play on the court. There is a balance between modeling and intuition, said Tim Chartier, a Davidson College bracket expert.
Chartier has studied brackets since 2009. He developed a method that largely depends on team success on home court and away, performance in the second half of the season and difficulty of schedule. But he said the NCAA Tournament’s historical results provide an unpredictable and small sample size. That is a difficulty for machine learning models, which need large sample sizes.
Chartier’s goal is never for his students to reach perfection in their brackets. His own model still cannot account for Davidson’s unexpected 2008 run to the “Elite Eight” level of the tournament.
In that mystery, Chartier finds a useful reminder from March Madness: “The beauty of sports, and the beauty of life itself, is the randomness that we can’t predict.”
I’m Dan Novak.
Dan Novak adapted this story for VOA Learning English based on reporting by The Associated Press.
_______________________________________________
Words in This Story
chart — n. information in the form of a table, diagram, etc.
sequence — n. the order in which things happen or should happen
advanced — adj. beyond the basic level
probability — n. the chance that something will happen
random — adj. chosen, done, etc., without a particular plan or pattern
algorithm — n. a set of steps that are followed in order to solve a mathematical problem or to complete a computer process
turnover — n. (in basketball) a loss of possession of the ball to the other team
intuition — n. a natural ability or power that makes it possible to know something without any proof or evidence
schedule — n. a plan of things that will be done and the times when they will be done
sample — n. a group of people or things that are taken from a larger group and studied, tested, or questioned to get information