
If artificial intelligence is going to take over the world, why can’t it solve the spelling bee?

My request to the AI overlords was simple: help me crack the New York Times Spelling Bee.

I spent most of Saturday night trying to shape the letters G, Y, A, L, P, O and N into as many words as possible. But three hours, 141 points and 37 words later, including “nonapology,” “lagoon” and “analogy,” I hit a wall. I needed only a few more words to propel myself into the “genius” echelon of the Spelling Bee, a title reserved for those who decipher 70 percent of the possible words using the given letters, and the point at which the puzzle considers itself, effectively, solved. (Solvers who find 100 percent of the words earn the title “Queen Bee,” but even I have my limits.)

My human mind was clearly struggling, but the task seemed like a piece of cake for artificial intelligence, so I fired up ChatGPT, told it I was trying to win the Spelling Bee, gave it my letters and laid out the rules. “Give me a list of every word that can be formed with them,” I typed. “You can repeat letters, but each word must contain an ‘N.’”

When I hit the enter key, ChatGPT replied with the following:

Here is a list of words formed from the letters “G, Y, A, L, P, O, N” that must contain the letter “N”:
GLN Paio
propylene glycol
Since a large number of permutations are possible, this is only a sample list.

This is not a good start.

“These words need to be in the dictionary,” I replied. “Of course,” came the response. “I will provide a list of valid English words formed from the given letters”: angeli, gallon, long.

Obviously, this wasn’t working, so I turned to Microsoft’s Copilot (yang, playing, planning, annoying), Google’s Gemini (Gabon, tribute, Ji’an) and Anthropic’s Claude (mango, work in progress, lawn). Meta AI helpfully told me that it had made sure to include only dictionary-recognized words in a list that contained “nalyp” and “najib,” and Perplexity, a chatbot with ambitions to kill Google Search, simply wrote “galan” hundreds of times before abruptly freezing.

AI sucks at solving Spelling Bee puzzles

Perplexity, a chatbot with ambitions to eliminate Google searches, collapsed when asked to form words from a set of letters. (Screenshot by Pranav Dixit/Engadget)

Artificial intelligence can now create images, video and audio as quickly as you can type a description of what you want. It can write poetry, prose and term papers. It can even be a pale imitation of your girlfriend, your therapist or your personal assistant. Many believe it will put humans out of work and change the world in ways we can barely imagine. So why is a simple word puzzle so hard for it?

The answer lies in how large language models, the underlying technology driving the modern AI boom, work. Traditional computer programming is logical and rule-based: you type commands that a computer follows according to a set of instructions, and it provides valid output. But machine learning, of which generative AI is a subset, is different.
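The rule-based style of programming described above handles the Spelling Bee trivially, which is what makes the chatbots’ failures so striking. Here is a minimal sketch of a deterministic solver; the inline word list and the function name `solve_bee` are illustrative stand-ins for a real dictionary file:

```python
# Deterministic Spelling Bee solver: pure logic, no statistics.
# A real run would load a full dictionary; this short list is a stand-in.
WORDS = ["lagoon", "analogy", "gallon", "apology", "plan", "pony", "glop", "table"]

def solve_bee(words, letters, required, min_len=4):
    """Return words buildable from `letters` (repeats allowed) that contain `required`."""
    allowed = set(letters)
    return sorted(
        w for w in words
        if len(w) >= min_len       # puzzle's minimum word length
        and required in w          # must use the required letter
        and set(w) <= allowed      # only the given letters, repeats OK
    )

print(solve_bee(WORDS, "gyalpon", "n"))
# ['analogy', 'gallon', 'lagoon', 'plan', 'pony']
```

Note that “apology” and “glop” are correctly rejected for lacking an N, with no training data involved, just three filter conditions.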

“It’s purely statistics,” Noah Giansiracusa, a professor of mathematics and data science at Bentley University, told me. “It’s really about extracting patterns from the data and then pushing out new data that largely fits those patterns.”
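Giansiracusa’s description can be boiled down to a toy example: a bigram model that counts which word follows which in a corpus, then “pushes out” the statistically most likely continuation. The tiny corpus here is invented purely for illustration; real models learn patterns from trillions of tokens:

```python
from collections import Counter, defaultdict

# Toy training corpus; a real model trains on trillions of tokens.
corpus = "the bee solves the puzzle and the bee wins".split()

# Extract patterns: count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(word):
    """Emit the statistically most common next word."""
    return follows[word].most_common(1)[0][0]

print(predict("the"))  # "bee" follows "the" more often than any other word
```

There is no rule anywhere in this program saying what a valid word is; it only reproduces whatever pattern the data happened to contain, which is the point Giansiracusa is making.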

OpenAI declined to comment on the record, but a company spokesperson told me that this type of “feedback” helps OpenAI improve its models’ ability to understand and respond to problems. “Things like word structures and anagrams are not a common use case for Perplexity, so our model is not optimized for it,” Perplexity spokesperson Sara Platnick told me. “As a daily Wordle/Connections/Mini Crossword player, I’m excited to see how we do!” Microsoft and Meta declined to comment. Google and Anthropic had not responded as of press time.

At the heart of large language models are transformers, a technical breakthrough made by Google researchers in 2017. When you type in a prompt, the model breaks your words down into tokens, which are mathematical units. Transformers are able to analyze each token in the context of the larger dataset that a model is trained on to understand how tokens are connected to each other. Once a transformer understands these relationships, it can respond to your prompt by guessing the next likely token in a sequence. The Financial Times has a great animated explainer that breaks all this down if you’re interested.
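That token step is also why letter puzzles are awkward for these models: they never see individual letters, only numeric IDs for whole chunks of text. Here is a sketch using an invented subword vocabulary; real tokenizers such as GPT’s byte-pair encoding work on the same principle but differ in detail:

```python
# Invented toy vocabulary; real LLM vocabularies hold ~100k entries.
VOCAB = {"gall": 17, "on": 4, "an": 9, "alogy": 23, "lag": 31, "oon": 8}

def tokenize(word):
    """Greedy longest-match split of a word into (token, id) pairs."""
    out, i = [], 0
    while i < len(word):
        # Try the longest possible piece first, shrinking until one matches.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                out.append((piece, VOCAB[piece]))
                i = j
                break
        else:
            raise ValueError(f"no token for {word[i:]!r}")
    return out

# "gallon" becomes two opaque IDs; the model sees [17, 4], not G-A-L-L-O-N.
print(tokenize("gallon"))  # [('gall', 17), ('on', 4)]
```

Asked whether “gallon” contains an N, a model reasons over the IDs 17 and 4, where the letter N is not directly visible anywhere.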

Meta AI also performs poorly in spelling bees

I typed “sure” wrong, but Meta AI thought I was suggesting it as a word and told me I was right. (Screenshot by Pranav Dixit/Engadget)

I thought I was giving the chatbots precise instructions to generate my Spelling Bee words, but all they did was convert my words into tokens and use transformers to spit out plausible-looking responses. “It’s not the same as computer programming or typing a command into a DOS prompt,” Giansiracusa said. “Your words got translated to numbers and those were processed statistically.” A purely logic-based query, it seems, is the worst fit for AI’s skills, like using a very resource-intensive hammer to turn a screw.

The success of an AI model also depends on the data it’s trained on. That’s why AI companies are frantically striking deals with news publishers right now: the fresher the training data, the better the responses. Generative AI, for instance, is notoriously bad at suggesting chess moves, but it is at least marginally better at that task than at solving word puzzles. Giansiracusa points out that the glut of chess games available on the internet is almost certainly included in the training data of existing AI models. “I would suspect that there just aren’t enough annotated Spelling Bee games online for AI to train on the way there are chess games,” he said.

“If your chatbot seems more confused by a word game than a cat with a Rubik’s Cube, it’s because it wasn’t specially trained to play complex word games,” said Sandi Besen, an AI researcher at Neudesic, an IBM-owned AI company. “Word games have specific rules and constraints that a model would struggle to adhere to unless specifically instructed to during training, fine-tuning or prompting.”


None of this has stopped the world’s leading AI companies from marketing the technology as a panacea, often grossly exaggerating its capabilities. In April, both OpenAI and Meta announced that their new AI models would be capable of “reasoning” and “planning.” In an interview, OpenAI COO Brad Lightcap told the Financial Times that the next generation of GPT, the AI model that powers ChatGPT, would show progress on solving “hard problems” such as reasoning. Joelle Pineau, Meta’s vice president of AI research, told the publication that the company was “hard at work in figuring out how to get these models not just to talk, but actually to reason, to plan… to have memory.”

I tried multiple times to crack the Spelling Bee with GPT-4o and Llama 3, but failed. When I told ChatGPT that “gallon,” “long” and “angeli” were not in the dictionary, the chatbot said it agreed with me and suggested “plating” instead. When I mistakenly typed the word “sure” as “sur” in my reply to Meta AI’s request for more words, the chatbot told me that “sur” was indeed another word that can be formed with the letters G, Y, A, L, P, O and N.

Clearly, we are still a long way from artificial general intelligence, the nebulous concept describing the moment when machines can do most tasks as well as or better than humans. Some experts, such as Yann LeCun, Meta’s chief AI scientist, have been outspoken about the limitations of large language models, claiming that they will never reach human-level intelligence because they don’t really use logic. At an event in London last year, LeCun said the current generation of AI models “just don’t understand how the world works. They’re not capable of planning. They’re not capable of real reasoning,” adding: “We do not have completely autonomous self-driving cars that can train themselves to drive in about 20 hours of practice, something a 17-year-old can do.”

Giansiracusa, however, struck a more cautious tone. “We don’t really know how humans reason, right? We don’t know what intelligence actually is. I don’t know if my brain is just a big statistical calculator, kind of like a more efficient version of a large language model.”

Perhaps the key to coexisting with generative AI without succumbing to either the hype or the anxiety is to simply understand its inherent limitations. “These tools are not actually designed for a lot of the things people are using them for,” said Chirag Shah, a professor of artificial intelligence and machine learning at the University of Washington, who co-wrote a high-profile 2022 research paper criticizing the use of large language models in search engines. Shah thinks tech companies could do a much better job of being transparent about what AI can and can’t do before foisting it on us. That ship may have already sailed, though. Over the past few months, the world’s largest tech companies, including Microsoft, Meta, Samsung, Apple and Google, have announced plans to tightly weave AI into their products, services and operating systems.

“These bots are bad because they’re not designed for this,” Shah said of my wordplay conundrum. Whether they can solve all the other problems tech companies throw at them remains to be seen.

Where else have AI chatbots disappointed you? Send me an email at pranav.dixit@engadget.com and let me know!

Update, June 13, 2024, 4:19 PM ET: This story has been updated to include a statement from Perplexity.


