In a separate dialog, when queried in English, Bing chat correctly identified Thailand as the rumored setting for the next season of the TV show White Lotus, but offered only “somewhere in Asia” when the query was posed in Spanish, says Solis, who runs a consultancy called Orainti that helps websites increase visits from search engines.

Executives at Microsoft, OpenAI, and Google working on chatbots have said users can counteract poor responses by adding more detailed instructions to their queries. Without explicit guidance, chatbots’ tendency to fall back on English speech and English-speaking perspectives can be strong. Just ask Veruska Anconitano, another search engine optimization expert, who splits her time between Italy and Ireland. She found that asking Bing chat questions in Italian drew answers in English unless she specified “Answer me in Italian.” In a different chat, Anconitano says, Bing assumed she wanted the Japanese prompt 元気ですか (“How are you?”) rendered into English rather than continuing the conversation in Japanese.

Recent research papers have validated the anecdotal findings of people running into the limits of Bing chat and its brethren. Zheng-Xin Yong, a doctoral student at Brown University also studying multilingual language models, says he and his collaborators found in one study that models generated better answers to Chinese questions when those questions were asked in English rather than in Chinese.

When Fung at Hong Kong and her collaborators asked ChatGPT to translate 30 sentences, it correctly rendered 28 from Indonesian into English, but only 19 in the other direction, suggesting that monoglot Americans who turn to the bot to make deals with Indonesian merchants would struggle. The same lopsided, one-way fluency appeared across at least five other languages.

Large language models’ language problems make them difficult to trust for anyone venturing past English, and maybe Chinese. When I sought to translate ancient Sanskrit hymns through ChatGPT as part of an experiment in using AI to accelerate wedding planning, the results seemed plausible enough to add to a ceremony script. But I had no idea whether I could rely on them or would be laughed off the stage by elders.

Researchers who spoke to WIRED do see some signs of improvement. When Google created its PaLM 2 language model, released this month, it made an effort to increase the non-English training data for over 100 languages. The model recognizes idioms in German and Swahili, jokes in Japanese, and cleans up grammar in Indonesian, Google says, and it recognizes regional variations better than prior models.

But in consumer services, Google is keeping PaLM 2 caged. Its chatbot Bard is powered by PaLM 2 but works only in US English, Japanese, and Korean. A writing assistant for Gmail that uses PaLM 2 supports only English. Officially supporting a language takes time, because the system must be tested and fitted with filters to ensure it isn’t generating toxic content. Google did not make an all-out investment to launch many languages from the beginning, though it’s working to rapidly add more.