“In this age of machine learning, it isn’t the algorithms, it’s the data,” says David Karger, a professor and computer scientist at MIT. Karger says Musk could improve Twitter by making the platform more open, so that others can build on top of it in new ways. “What makes Twitter important is not the algorithms,” he says. “It’s the people who are tweeting.”
A deeper picture of how Twitter works would also mean opening up more than just the handwritten algorithms. “The code is fine; the data is better; the code and data combined into a model could be best,” says Alex Engler, a fellow in governance studies at the Brookings Institution who studies AI’s impact on society. Engler adds that understanding the decisionmaking processes that Twitter’s algorithms are trained to make would also be crucial.
The machine learning models that Twitter uses are still only part of the picture, because the entire system also reacts to real-time user behavior in complex ways. If users are particularly interested in a certain news story, then related tweets will naturally get amplified. “Twitter is a socio-technical system,” says a second Twitter source. “It is responsive to human behavior.”
This fact was illustrated by research that Twitter published in December 2021 showing that right-leaning posts received more amplifications than left-leaning ones, although the dynamics behind this phenomenon were unclear.
“That’s why we audit,” says Ethan Zuckerman, a professor at the University of Massachusetts Amherst who teaches public policy, communication, and information. “Even the people who build these tools end up discovering surprising shortcomings and flaws.”
One irony of Musk’s professed motives for acquiring Twitter, Zuckerman says, is that the company has been remarkably transparent about the way its algorithm works of late. In August 2021, Twitter launched a contest that gave outside researchers access to an image-cropping algorithm that had exhibited biased behavior. The company has also been working on ways to give users greater control over the algorithms that surface content, according to those with knowledge of the work.
Releasing some Twitter code would provide greater transparency, says Damon McCoy, an associate professor at New York University who studies security and privacy of large, complex systems including social networks, but even those who built Twitter may not fully understand how it works.
A concern for Twitter’s engineering team is that, amid all this complexity, some code may be taken out of context and highlighted as a sign of bias. Revealing too much about how Twitter’s recommendation system operates might also result in security problems. Access to a recommendation system would make it easier to game the system and gain prominence. It may also be possible to exploit machine learning algorithms in ways that might be subtle and hard to detect. “Bad actors right now are probing the system and testing,” McCoy says. Access to Twitter’s models “may well help outsiders understand some of the principles used to elevate some content over others.”
On April 18, as Musk was escalating his efforts to acquire Twitter, someone with access to Twitter’s Github, where the company already releases some of its code, created a new repository called “the algorithm”—perhaps a developer’s dig at the idea that the company could easily release details of how it works. Shortly after Musk’s acquisition was announced, it disappeared.
Additional reporting by Tom Simonite.