Summary:
In Syria, where the government arrested and tortured citizens and journalists for reporting on the civil war, social media content became vital for human rights investigations. However, this content also posed a moderation challenge for platforms like YouTube and Facebook, which had to differentiate between the documentation of abuses and the promotion of terrorism. While machine learning classifiers have become part of the solution, a 2017 implementation by YouTube led to improper removals of content and accounts, resulting in an outcry from human rights advocates.
Background
The Syrian civil war began in 2011, when mass protests against President Bashar al-Assad’s dictatorship were met with arrests, torture, and the killing of protesters. The war quickly escalated into a complex armed conflict involving several international governments and factions. Investigations subsequently found that Assad’s regime systematically committed a wide range of war crimes and human rights abuses, including the targeting of civilians with chemical weapons, summary executions, enforced disappearances, and torture.
The Syrian government also severely suppressed freedom of speech: security forces arrested, detained, and tortured citizens and journalists for their online activity. Because of this, on-the-ground information that Syrians managed to upload to internet platforms has proved especially valuable to human rights investigations outside the country. Evidence uploaded to social media platforms has been used by international advocacy organizations, some European courts, and United Nations investigators to build cases against human rights abusers.
But the violent nature of the documented abuses uploaded by Syrians presented a challenge for internet platforms, which found themselves trying to distinguish between users documenting abuses and users promoting terrorism, an explicitly prohibited category of content on many platforms. Some platforms make exceptions for graphic or disturbing content; YouTube, for example, has defined exceptions for content that is “Educational, Documentary, Scientific, and Artistic.” Deciding whether content fits within such an exception is largely a matter of understanding why it was posted. However, the sheer volume of content, limited contextual information, and resource constraints make that determination a delicate exercise, which is why specific and rigorously defined policies are critical to fair and consistent content moderation at scale. Machine learning methods for content classification have emerged as an essential tool for overcoming these limitations. These tools are often used to automatically flag content for human review, and in some circumstances they are designed to remove content automatically. But these methods are imperfect; if classifiers are trained on incomplete, biased, or inaccurate data, their accuracy suffers.
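To make the distinction between automatic flagging and automatic removal concrete, the sketch below shows one common way such a pipeline can be wired together: a classifier produces a confidence score, and thresholds decide whether content is removed automatically, queued for human review, or left alone. This is a minimal, hypothetical illustration; the `score_extremism` function, the threshold values, and the routing rules are assumptions made for the example, not a description of YouTube’s actual system.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_REMOVE = "auto_remove"    # high confidence: removed without human review
    HUMAN_REVIEW = "human_review"  # uncertain: queued for a human moderator
    NO_ACTION = "no_action"        # low confidence: left up


@dataclass
class Upload:
    video_id: str
    description: str


def score_extremism(upload: Upload) -> float:
    """Hypothetical classifier returning a 0-1 'violent extremism' score.

    In a real system this would be a trained model; its accuracy depends
    heavily on the quality and balance of its training data.
    """
    # Placeholder heuristic purely for illustration.
    return 0.9 if "execution" in upload.description.lower() else 0.1


def route(upload: Upload,
          remove_threshold: float = 0.95,
          review_threshold: float = 0.60) -> Action:
    """Route an upload based on classifier confidence.

    A score alone cannot tell documentation of abuses apart from promotion
    of terrorism, so anything above the review threshold but below the
    removal threshold is sent to a human who can weigh context.
    """
    score = score_extremism(upload)
    if score >= remove_threshold:
        return Action.AUTO_REMOVE
    if score >= review_threshold:
        return Action.HUMAN_REVIEW
    return Action.NO_ACTION


if __name__ == "__main__":
    clip = Upload("abc123", "Footage documenting an execution by regime forces")
    print(route(clip))  # -> Action.HUMAN_REVIEW with the default thresholds
```

The band between the two thresholds exists precisely because a score alone cannot capture intent, which is the contextual judgment this case turns on; where a platform sets those thresholds determines how much graphic documentation is removed without any human ever seeing it.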
The Case
In 2016, increased awareness of terrorist recruiting on social media brought regulatory pressure on social media platforms. In June 2017, weeks after a terrorist bombing in Manchester killed 22 concertgoers, Prime Minister Theresa May announced at a counter-terrorism event that she would consider measures to “encourage corporations to do more and abide by their social responsibility,” measures which might include “creating a new legal liability for tech companies if they fail to remove unacceptable content.” The following week, Google, the parent company of YouTube, announced it would bolster resources for machine learning classification of terrorist content and nearly double the size of its trusted flagger program by adding 50 NGOs with subject matter expertise.
In an August blog update, which has since been overwritten (but is quoted here), YouTube announced that it had begun “developing and implementing cutting-edge machine learning technology designed to help us identify and remove violent extremism and terrorism-related content in a scalable way.” The post continued: “The accuracy of our systems has improved dramatically due to our machine learning technology,” adding, “While these tools aren’t perfect, and aren’t right for every setting, in many cases our systems have proven more accurate than humans at flagging videos that need to be removed.”
In the weeks following YouTube’s August update, it was reported that about 900 channels and thousands of videos documenting the Syrian conflict were removed from the platform. News and advocacy groups like Airwars and Bellingcat complained of videos being removed. Meanwhile, other organizations like the Aleppo Media Center, Sham News Agency, and the Violations Documentation Center in Syria had their accounts terminated for publishing “violent/graphic content” and “content that incites violence or encourage[s] dangerous activities,” according to YouTube notices. Mistaken enforcement actions of this kind were already an ongoing issue, and organizations like the Syrian Archive had been set up to copy and store such videos before they were removed. But the 2017 introduction of YouTube’s new terrorism classification system caused an unprecedented wave of video takedowns and account terminations.
The Response
Following an outcry from advocates, YouTube worked with affected users to restore many of the channels and videos, but was criticized for doing so clumsily. Some channels were not fully restored, coming back with many of their videos missing. In one case, the Sham News Agency account was removed and restored at least four times after activists complained. Advocates also expressed frustration at the lack of insight into how the classification algorithm worked, transparency that could have helped them avoid mistaken removals in the future.
In October 2017, two months after first implementing its classifier, YouTube published an update acknowledging that it had made mistakes, which it attributed to the challenging volume of content and to imperfect machine and human processes. It also affirmed that it would make improvements to the system and offer information about how users can signal the intent of their videos.
Insights
This case study demonstrates that moderating content at scale is especially difficult, and especially consequential, in the midst of an ongoing crisis. For YouTube, sorting through millions of videos and determining whether users were uploading important evidence of atrocities or promoting terrorism proved challenging. The case also illustrates two key principles for moderating content: anticipating tradeoffs and balancing priorities. Motivated in part by government pressure, YouTube moved quickly in an attempt to radically improve its moderation of terrorism content with machine learning classifiers, but found that its efforts came at the cost of removing important evidence of human rights abuses. In response to this unexpected outcome, YouTube refocused its efforts on improving its combined classifier and human moderation system.
Company considerations
- What systems can be put in place to evaluate machine learning classifiers before deployment?
- How might automatic flagging of content influence the decisions human reviewers make?
- What level of transparency is enough to help prevent mistaken removals without teaching bad actors how to circumvent policy enforcement?
- What are the elements of an effective appeals system?
Issue considerations
- In what ways can researchers and civil society contribute to the preservation of digital evidence in conflict zones?
- What would better collaboration between human rights researchers and technology companies look like? What other ideas might be helpful?
- How comfortable is the public with the availability of human rights abuse material on platforms they typically use for other purposes?
- How can social media users contribute to ensuring that human rights documentation is not mistakenly removed?
Considerations for policymakers
- How can policymakers become more aware of the tradeoffs and dynamics of online content issues?
- What steps can policymakers take to address urgent online crises while ensuring they are handled carefully?
Written by Alan P. Kyle, July 2024.