We Used Bing’s Chatbot to Pick a March Madness Bracket
We all want to win our NCAA tournament brackets. Could Microsoft’s new Bing chatbot help?
After all, chatbots seem to be able to do anything these days. The Bing bot is often quite impressive (if sometimes creepy) and is based on the new GPT-4 AI system. Unlike its ChatGPT cousin, Bing has access to information about this year’s teams, and seems more willing to make predictions.
Our colleague Sarah Lyall had a long conversation with Bing about college basketball, but the bot won’t make a whole bunch of predictions in one response. So we asked for its picks round by round, region by region.
How about Duke vs. Oral Roberts?
Duke vs. Oral Roberts: I would pick Duke because they are the No. 5 seed and have been on a nine-game winning streak that included the ACC Championship. Oral Roberts is the No. 12 seed and the Summit League champion, but they haven’t faced many strong opponents this season. Duke also has an advantage in offensive rebounding and height, which can make it difficult for Oral Roberts to keep up.
Then we filled out our bracket with Bing’s responses. Here’s what it recommended for the men’s bracket:
The bot will not respond the same way every time, and the wording of the question is important. Simply asking which team is most likely to win each game may result in a bracket without any upsets. Instead, we asked Bing to try to win a bracket pool while factoring in potential upsets. We also kept reminding it to use information about this year’s teams, as it often referenced previous seasons or players even though it can draw on more current details.
The resulting men’s bracket contained upsets, including No. 3 seed Baylor winning the tournament. Here’s how the bot explained its pick of Baylor to beat the tournament’s top seed, Alabama:
Let’s say Alabama plays Baylor in the Elite Eight of the 2023 men’s basketball tournament. Alabama is a 1-seed, while Baylor is a 3-seed. Make a pick for this game, as if you were filling out a bracket in an office pool, trying to win the pool while factoring in potential upsets. Be sure to use information about teams and their rankings in the 2023 tournament, not previous years!
It’s tough, but I’m going with Baylor to upset Alabama and get to the Final Four. Baylor is so experienced and tough, and they have a coach who’s won a national title before. Alabama is a very good team that won the SEC Championship and has a strong offense, but may be vulnerable to Baylor’s defense and rebounding.
(Bing also said that Baylor had won two of its previous three meetings against Alabama, but the reverse is true.)
Bing picked a Final Four without any No. 1 seeds. That has happened only three times since seeding began (1980, 2006 and 2011). Is the bot right that we are due for another? Is this a good strategy to win this year’s pool? We can only wait and see.
Perhaps wisely, the bot chose a more traditional Final Four in the women’s bracket, where upsets have been less common. And it made the safest pick for the winner: undefeated South Carolina.
The Bing chatbot wasn’t exactly designed to compete with expert predictions or sports prediction models. Microsoft said its system struggled to keep up with live sports information. The chatbot frequently cited outdated or incorrect details about teams, even if its overall impressions seemed correct.
So taking Bing’s advice, with a dash of unpredictability, is probably just as good as other amateur strategies, such as picking teams by whose mascot would beat the other’s.
Speaking of which, we did in fact ask Bing which team mascots would beat the others.
South Carolina (Gamecocks) vs. Norfolk State (Spartans): I think the Spartans are going to win this game. They are trained warriors who can fight with swords and shields, while Gamecocks are just roosters who may peck at their ankles.
Purdue (Boilermakers) vs. Texas Southern (Tigers): A boilermaker is a person who manufactures or operates steam boilers. The tiger is a large and powerful predator. In a fight, I think a tiger will win because it’s faster, stronger, and more agile than a human with a boiler.
Miami (Hurricanes) vs. Drake (Bulldogs): Hurricane wins. A hurricane is a violent storm with strong winds and rain that can cause massive damage. The bulldog is a medium-sized dog with a wrinkled face and short muzzle that was originally bred to hunt bulls. In a fight, I believe the Hurricane will win because it has more power and influence than a dog with strong jaws.
You can be the judge of these judgment calls. (You can also ask similar questions on ChatGPT and get humorous answers; the Bing chatbot isn’t widely available to the public yet.)
All of our conversations with the Bing chatbot were in the “Balanced” conversation style setting. Through experimentation, we crafted queries that would keep the chatbot’s responses in a consistent format, force it to make a pick for each game, allow it to make upset picks and encourage it to use information about teams’ current seasons (although it often mixed information from this season and previous seasons).
The typical query is formatted as follows:
Hi Bing. I will list the first-round games in the South Region of the 2023 NCAA Men’s Basketball Tournament. The teams’ seeds in the 2023 tournament are included. Make picks for each game, as if you were filling out a bracket in an office pool, trying to win the pool while factoring in potential upsets. Be sure to use information about teams and their rankings in the 2023 tournament, not previous years!
(1) Alabama vs. (16) Texas A&M – Corpus Christi
(8) Maryland vs. (9) West Virginia
(5) San Diego State vs. (12) Charleston
(4) Virginia vs. (13) Furman
(6) Creighton vs. (11) NC State
(3) Baylor vs. (14) UC Santa Barbara
(7) Missouri vs. (10) Utah State
(2) Arizona vs. (15) Princeton
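For readers who want to reproduce the query format, here is a minimal sketch of how such a prompt could be assembled from a list of games. The function name `build_prompt`, its parameters and the exact template wording are our own illustration, not a record of the tooling used for this article; the queries were typed into the chatbot directly.

```python
from typing import List, Tuple

Team = Tuple[int, str]    # (seed, team name)
Game = Tuple[Team, Team]  # the two teams that meet in one game


def build_prompt(region: str, round_name: str, games: List[Game],
                 division: str = "Men's", year: int = 2023) -> str:
    """Assemble one query: an instruction paragraph, then one line per game.

    The template wording is an approximation of the query shown above.
    """
    intro = (
        f"Hi Bing. I will list the {round_name} games in the {region} Region "
        f"of the {year} NCAA {division} Basketball Tournament. "
        f"The teams' seeds in the {year} tournament are included. "
        "Make picks for each game, as if you were filling out a bracket in an "
        "office pool, trying to win the pool while factoring in potential upsets. "
        f"Be sure to use information about teams and their rankings in the {year} "
        "tournament, not previous years!"
    )
    # One "(seed) Team vs. (seed) Team" line per game, in bracket order.
    lines = [f"({sa}) {a} vs. ({sb}) {b}" for (sa, a), (sb, b) in games]
    return "\n".join([intro] + lines)


if __name__ == "__main__":
    south_top = [
        ((1, "Alabama"), (16, "Texas A&M - Corpus Christi")),
        ((8, "Maryland"), (9, "West Virginia")),
    ]
    print(build_prompt("South", "first-round", south_top))
```

Keeping the instruction paragraph identical across queries and changing only the list of games is one way to hold the format steady, though as noted the chatbot's answers can still vary.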
We recorded the chatbot’s picks for the winner of each game. We then took those winners and asked it about the matchups that would occur in the next round of the tournament, given its picks. We repeated this round by round and region by region for the men’s and women’s tournaments.
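The round-by-round step can be sketched in a few lines of code. This is only an illustration under stated assumptions: games are listed in bracket order, so the winners of adjacent games meet next, and the `pick` function is a hypothetical stand-in for asking the chatbot. Here it simply takes the better seed, which is not how Bing chose.

```python
from typing import Callable, List, Tuple

Team = Tuple[int, str]    # (seed, team name)
Game = Tuple[Team, Team]  # the two teams that meet in one game


def next_round(winners: List[Team]) -> List[Game]:
    """Pair adjacent winners (in bracket order) into next-round games."""
    if len(winners) % 2 != 0:
        raise ValueError("need an even number of winners to form matchups")
    return [(winners[i], winners[i + 1]) for i in range(0, len(winners), 2)]


def run_region(games: List[Game], pick: Callable[[Game], Team]) -> Team:
    """Pick every game in a round, advance the winners, repeat until one team is left."""
    if not games:
        raise ValueError("no games to play")
    while True:
        winners = [pick(game) for game in games]
        if len(winners) == 1:
            return winners[0]
        games = next_round(winners)


def better_seed(game: Game) -> Team:
    """Hypothetical placeholder for the chatbot: always take the lower seed number."""
    return min(game, key=lambda team: team[0])


if __name__ == "__main__":
    # Top half of the South Region's first round, in bracket order.
    top_half = [
        ((1, "Alabama"), (16, "Texas A&M - Corpus Christi")),
        ((8, "Maryland"), (9, "West Virginia")),
        ((5, "San Diego State"), (12, "Charleston")),
        ((4, "Virginia"), (13, "Furman")),
    ]
    print(run_region(top_half, better_seed))
```

In practice the `pick` step was a person reading the chatbot's reply and writing down the winner, so the loop above only describes the bookkeeping.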
We asked the chatbot for its picks in the “First Four” games of each tournament before they were played. We advanced its selected winners to the first round of the tournament.
There is no guarantee that the chatbot will make the same selections even when asking the same questions in the same format. Phrasing questions differently may also lead to different results.