
Data Access Needed To Tackle Online Misinformation


The international journal of science / 6 June 2024

Data access needed to tackle online misinformation

Online platforms must help researchers to fill the gaps in our understanding of how false information propagates on the Internet.

“The Holocaust did happen. COVID-19 vaccines have saved millions of lives. There was no widespread fraud in the 2020 US presidential election.” These are three statements of indisputable fact. Indisputable, and yet, in some quarters of the Internet, hotly disputed.

They appear in a Comment article (ref. 1) by cognitive scientist Ullrich Ecker at the University of Western Australia in Perth and his colleagues, one of a series of articles in this issue of Nature dedicated to online misinformation. It is a crucial time to highlight this subject. With more than 60% of the world’s population now online, false and misleading information is spreading more easily than ever, with consequences such as increased vaccine hesitancy (ref. 2) and greater political polarization (ref. 3). In a year in which countries home to some four billion people are holding major elections, sensitivities around misinformation are only heightened.

Yet common perceptions about misinformation and what well-grounded research tells us don’t always agree, as Ceren Budak at the University of Michigan School of Information in Ann Arbor and her colleagues point out in a Perspective article (ref. 4). The degree to which people are exposed tends to be overestimated, as does the influence of algorithms in dictating this exposure. And a focus on social media often means that wider societal and technological trends that contribute to misinformation are ignored.

The message from researchers is that misinformation can be curbed. But for that to happen, platforms and regulators must take action, and evidence on how misinformation spreads and why must be collected from diverse societies across the world.

The role of social-media platforms in abetting the spread of misinformation is shown in a research article (ref. 5) by David Lazer, a political and computer scientist at Northeastern University in Boston, Massachusetts, and his colleagues. They analysed the activity of more than 550,000 Twitter users during the 2020 US presidential election cycle. Their findings fit with the view that overall exposure to misinformation is overstated: only 7.5% of users shared one or more pieces of misinformation during the study period.

This period included the attack on the US Capitol on 6 January 2021, after which Twitter deplatformed 70,000 users deemed to be trafficking misinformation. The authors show that this move coincided with a huge drop in the sharing of misinformation. It is hard to know whether the ban modified user behaviour directly, or whether the violence at the Capitol had an indirect effect on the likelihood of users sharing narratives about the ‘stolen’ 2020 presidential election that fuelled it. Either way, the team writes, the circumstances constituted a natural experiment that shows how misinformation can be countered by social-media platforms enforcing their terms of use.

Such experiments currently look unlikely to be repeated. As Lazer tells Nature, he and his colleagues were lucky to be collecting data ahead of and during the attack, and to be doing so at a time when Twitter was permissive in the data it allowed scientists to extract. Since its takeover by entrepreneur Elon Musk, the platform, now rebranded X, has not only reduced content moderation and enforcement, but also limited researchers’ access to its data.

There is a similar opacity in another, even less-well-studied part of the misinformation ecosystem: its funding. Misinformation wasn’t created by the Internet or social media, but the advert-funded model of much of the web has boosted its production. For example, automated advertising exchanges auction off ad space to companies according to which sites (including misinformation sites) people are looking at, and the sites receive a cut if users look at and click on ads.

In a second research article (ref. 6), Wajeeha Ahmad, a doctoral candidate at Stanford University in California, and her colleagues show that companies are ten times more likely to wind up advertising on misinformation sites if they advertise using exchanges. Although firms can follow up on where their ads are placed, most advertising decision makers underestimate their involvement with misinformation, and consumers are similarly unaware.

In Ahmad’s words, online advertising is “happening kind of in the dark”: as with social media, the bulk of the data needed to understand how misinformation spreads is held by online platforms. If platforms are conducting interventions of their own to try to curb the spread of misinformation, that is happening away from public scrutiny.

The first step to tackling misinformation must be for companies to engage more with researchers. The studies that have already been performed show that it is possible to collaborate ethically on data while ensuring people’s privacy. Moreover, taking steps against the spread of provable falsehoods does not amount to a curb on freedom of speech if it is done transparently. If companies are not willing to share data, regulators should compel them to do so.

The rise of generative artificial-intelligence (AI) applications, which reduce the barrier to producing dubious content, is another reason to tackle the issue urgently. As Kiran Garimella at Rutgers University in New Brunswick, New Jersey, and Simon Chauchard at University Carlos III in Madrid point out in a Comment article (ref. 7), their studies of users of the WhatsApp messaging app in India indicate that generative AI content does not seem to be prevalent in the misinformation mix as yet, but from what we know about how the use of technology evolves, it seems likely that it is only a matter of time.

The world has a shared interest in curbing the spread

Nature | Vol 630 | 6 June 2024 | 7


Editorials

of misinformation and keeping public debate focused on issues of evidence and fact. Which curbing measures work and for whom must be tested, and first, independent researchers need access to the data that will allow society to make informed choices.

1. Ecker, U. et al. Nature 630, 29–32 (2024).
2. Borges do Nascimento, I. J. et al. Bull. World Health Organ. 100, 544–561 (2022).
3. Lorenz-Spreen, P., Oswald, L., Lewandowsky, S. & Hertwig, R. Nature Hum. Behav. 7, 74–101 (2023).
4. Budak, C., Nyhan, B., Rothschild, D. M., Thorson, E. & Watts, D. J. Nature 630, 45–53 (2024).
5. McCabe, S. D., Ferrari, D., Green, J., Lazer, D. M. J. & Esterling, K. M. Nature 630, 132–140 (2024).
6. Ahmad, W., Sen, A., Eesley, C. & Brynjolfsson, E. Nature 630, 123–131 (2024).
7. Garimella, K. & Chauchard, S. Nature 630, 32–34 (2024).

Two cheers for translator bots

Automated translation software can boost neglected languages, but companies must engage with the people who speak them.

In this week’s Nature, a team that includes researchers at the technology company Meta describes a method of scaling up machine translation of ‘low-resourced’ languages for which there are few readily available digital sources (ref. 1). The company’s automated translation systems will now include more than 200 languages, many of them not currently served by machine-translation software. These include the southern African language Tswana; Dari, a type of Persian spoken in Afghanistan; and the Polynesian language Samoan.

It’s an important step that helps to close the digital gap between such neglected languages and languages that are more prevalent online, such as English, French and Russian. It could allow speakers of lower-resourced languages to access knowledge online in their first language, and possibly stave off the extinction of these languages by shepherding them into the digital era.

But machine-learning models are only as good as the data that they are fed, which are mainly created by humans. As machine-translation tools develop, the companies behind them must continue to engage with the communities they aim to serve, or risk squandering the technology’s promise.

Of the almost 7,000 languages spoken worldwide, about half are considered to be in danger of going extinct. A 2022 study (ref. 2) predicts that the rate of language loss could triple within 40 years. The dominance of just a few languages on the Internet is one driving force: it’s estimated that more than half of all websites are in English, and the top ten languages account for more than 80% of Internet content.

The researchers, based at Meta AI, Meta’s research division in New York City, the University of California, Berkeley, and Johns Hopkins University in Baltimore, Maryland, set out to expand the number of low-resource languages that their model translates as part of Meta AI’s ‘No Language Left Behind’ programme. They selected languages that were present in Wikipedia articles, but had fewer than 1 million sentences of example translations available online.

This work doubles the number of languages made available by a previous iteration (ref. 3), and makes improvements to translation quality. The researchers employed professional translators and reviewers to create a ‘seed’ data set in 39 of the languages, and developed a technique that allowed them to mine web data to create parallel data sets in the remaining languages. They also generated a list of some 200 ‘toxic’ words for each language, to identify translations that could, for example, constitute hate speech.

The involvement of human specialists is time-consuming and expensive, but crucial. Without them, algorithms would be trained on poor-quality data generated by artificial intelligence (AI), creating more errors. Models would then harvest this content and create even more poor-quality text. William Lamb, a linguist and ethnographer at the University of Edinburgh, UK, who was not involved in Meta AI’s programme, says that this is already happening for Scottish Gaelic, for which most online content is generated by AI. Scottish Gaelic is one of the low-resourced languages in the Meta programme for which the content was professionally translated. Human expertise is also important for languages that lack certain vocabulary. For example, many African languages do not have bespoke terms for scientific concepts. The research project Decolonise Science employed professional translators to translate 180 scientific papers into 6 African languages. It was initiated by Masakhane, a grassroots organization of researchers interested in natural language processing.

Such specialists are in short supply, however. This is one reason why researchers and technology companies must include communities that speak these languages, not just in the process of creating their machine-translation systems, but also as those systems are used, to reflect how real people use those languages. Researchers who Nature spoke to say that they are concerned that not doing so will hasten the demise of the languages and, by extension, their associated cultures. Without continued engagement, working on machine translation could become another form of ‘parachute science’, in which researchers in high-income countries exploit communities in low-income countries.

“The words, the sentences, the communication, are void of the values and beliefs encoded in the languages,” says Sara Child, a specialist in language revitalization at North Island College on Vancouver Island in Canada and a member of the Kwakwaka’wakw people. As AI propels more languages into the digital space, “I worry that we lose even more of ourselves”. This human element must not be ignored in the rush towards a universal translation system.

1. NLLB Team. Nature https://doi.org/10.1038/s41586-024-07335-x (2024).
2. Bromham, L. et al. Nature Ecol. Evol. 6, 163–173 (2022).
3. Goyal, N. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2106.03193 (2021).

