Tech Companies Promise to Try to Do Something About All the AI CSAM They’re Enabling

An image from Thorn's announcement showing how some of these images are made. Image: Thorn
Last week, Thorn, the anti-human trafficking and child sexual exploitation organization founded by Demi Moore and Ashton Kutcher, announced it had partnered with the responsible tech organization All Tech Is Human, and all of the biggest tech and AI companies in the world, to publicly commit to “safety by design” principles to “guard against the creation and spread of AI-generated child sexual abuse material (AIG-CSAM).”
Amazon, Anthropic, Google, Meta, Microsoft, Mistral AI, OpenAI, Hugging Face, and Stability AI are all part of the collaboration, which at this point amounts to a whitepaper that boils down to “responsibly” training and hosting AI models, proactively guarding against CSAM, and pledging to hold to these principles and do their best to minimize harm.
It is a grand gesture to stop one of the ugliest outcomes of the rapid development and deployment of generative AI tools, and a woefully inadequate response to the crisis this technology has created. Ultimately, the initiative allows tech companies to say that they are doing something to address the problem while making clear that they will pursue new revenue streams no matter the human cost.
Many of the companies included in this initiative have been implicated in the spread of AI-generated abusive images to some degree, which is not surprising given the size of their platforms or the online infrastructure they provide. The National Center for Missing & Exploited Children (NCMEC), an organization for reporting child sexual abuse material, has received millions of reports for suspected incidents of CSAM on Meta’s platforms. We recently reported that Meta is profiting from ads that promote “nudify” apps that are hosted on the Google and Apple app stores. The use of these apps has spread to schools across the country, with minors creating nude images of their classmates, and recently resulted in the arrest of two middle school students in Florida. Microsoft’s AI image generation tool was used to make nonconsensual sexual images of Taylor Swift viewed by millions of users on Twitter and other platforms.
The initiative also includes Teleperformance, a giant company that provides content moderation services, and two notably smaller companies: Metaphysic and Civitai. Metaphysic, a company that primarily provides deepfake special effects to the entertainment industry, has not been involved in generating any nonconsensual content that I am aware of, but as I reported in 2022, was co-founded by a deepfake creator with ties to the biggest deepfake porn site on the internet. That site recently blocked access to all users in the United Kingdom because of new legislation there.
Civitai, as dedicated 404 Media readers know by now, is a site for generating AI images and sharing custom text-to-image Stable Diffusion models. In August, I reported about how the site was used as a resource for creating nonconsensual AI-generated porn, and in December we revealed that Civitai’s cloud computing provider, OctoML, thought Civitai users were generating images that “could be categorized as child pornography.”
Civitai has introduced several safeguards against abuse since December, including better detection of prompts that attempt to generate nonconsensual images of celebrities, and measures intended to prevent users from generating CSAM.
However, when I saw the announcement from Thorn I went to Civitai’s website and looked at the stream of AI-generated images that are uploaded there every day. In just a few minutes, I saw:
  • An image of a crying young anime character who looks like a child being penetrated from behind by a large adult man while another man ejaculates on her face. The “prompt” for this image, meaning what the user wrote in order to generate it, obviously includes a bunch of sexual terms, but also the terms: “small boobs,” “flat chest,” “slender body,” “small body,” “short girl,” “gangrape,” and “nahida, (genshin impact).” The latter is a reference to a character from the video game Genshin Impact, who looks like a child. To be clear, these are cartoon characters.
  • An image of the Addams Family character Wednesday Addams, in her underwear, between two men, with their large penises out and resting on her shoulders. This rendition of the character also seems to be in a cartoon style, but bears resemblance to the real actor Jenna Ortega, who portrayed the character in Netflix’s recent popular show Wednesday. This character has always been canonically a minor, and Jenna Ortega’s Wednesday is a high school student in the show. The “negative prompt” for this image, meaning what the user who generated it wrote they didn’t want the image to look like, included the term: “mature body.”
  • An image of a young but seemingly adult couple having sex, with children in the background looking at the couple in shock. One of the user-created models uploaded to Civitai and used to generate this image is called “All Disney Princess XL LoRA Model from Ralph Breaks the Internet,” which, according to the model’s Civitai page, was “trained on screen capture images featuring beloved Disney princesses,” and is often used to generate pornographic images.
These are all images I saw in just a few minutes of browsing the site, but earlier this month, I was browsing the newly uploaded models on Civitai when I noticed that someone had uploaded a model trained on the likeness of a popular singer, and that this model was used to create nonconsensual nudes of her. This model is still live on the site, as are the images. The same morning, I also looked at a Telegram community dedicated to creating nonconsensual AI-generated images, and saw that someone explained how they created an AI-generated nude of a female Twitch streamer by sharing a model of her hosted on Civitai.
“As early as June of 2023, we specifically flagged Civitai as a popular hub for hosting models that are used to generate AI generated Child Sexual Abuse material,” Thorn VP of data science and co-author of a Stanford paper tracking an uptick in AI-generated CSAM, Rebecca Portnoff, told me in a call. “So it is precisely because of the misuse of these types of models hosted on Civitai that we engaged with them in order to secure their commitments to these principles.”
When I asked Portnoff how Thorn feels about including Civitai in this initiative given how easy it is to find harmful content on the site, she said that one of the commitments companies involved agreed to is providing regular updates to the public about their progress. In a press release, Civitai said it agreed to release a progress update every three months.
“I don't want to be used as cover for anybody who makes a promise and then does not keep that promise,” Portnoff said. “To me a core tenet of this project was you are going to share back with the public your progress on this.”
Portnoff also said she appreciated reporters shining a light on this issue, and hoped that I would reach out to all the companies involved in the initiative, which I did. (Thorn itself has been criticized by privacy experts for providing police with a tool that scrapes sex workers’ advertisements into a database; Thorn’s founder Ashton Kutcher stepped down last year after supporting convicted rapist Danny Masterson.)
“Despite being a small team, more than 30% of our headcount is focused on moderation and upholding our Terms of Service (ToS),” a Civitai spokesperson told me in an email. “While we've implemented many automated protections for on-site content, content generated elsewhere without those safeguards and uploaded to the platform still requires human review in many cases despite automated systems. We have 80,000+ images uploaded to the site daily. Our automation continues improving, but we rely on the community to report ToS breaking content. We reward our members for reporting this content, and this month, when we started charging Buzz to generate images on-site, we saw a 10x increase in the number of reports made.”
Buzz is an on-site Civitai currency that users earn for performing certain tasks, which they can then spend on image generation or training new models. Users can also spend and earn Buzz by posting or completing “bounties” asking other Civitai users to create certain AI models that don’t yet exist. In November, I reported that some Civitai users were posting bounties for lesser-known influencers and models as well as normal, non-public people.
To a question about whether Civitai thinks it is upholding the “safety by design” principles outlined by Civitai, Thorn, and the other companies involved, the Civitai spokesperson said the company is “doing more than anyone else.”
“Before AI-generated imagery, potential CSAMs were cut and dry; they were real people with an age and birth date,” the spokesperson said. “There was no need to solve questions like ‘What defines a child?’ because the answer was self-evident. AIG-CSAM has presented a novel issue: how do you define age in a pattern-recognizable way across multiple styles and mediums? It's a question that even organizations like NCMEC still need to answer. This technology is evolving, and by signing on to these principles, we agree to evolve our policies, enforcement, and toolsets along with it at great personal effort and cost while remaining committed to providing a space for education and advancement of generative AI.”
The Civitai spokesperson also mentioned the company’s recently announced “Semi-Permeable Membrane” (SPM) technology, which alters existing AI image generation models and replaces specific “concepts.” Ideally, as Civitai explained in a press release about SPM, these AI models would not be able to produce CSAM because such images wouldn’t be included in the datasets that power them to begin with. But because the most popular AI image generation technology, Stable Diffusion, was trained on LAION-5B, a dataset that contained thousands of instances of child sexual abuse material, it’s too late for that solution. Instead, SPM attempts to “unlearn” the ability to generate CSAM, a method that has been attempted with large language models with imperfect results.
SPM, Thorn’s initiative, and other methods to moderate harmful AI-generated content seem well intentioned but also far from perfect, and this is the paradigm we’ve been conditioned to accept: Generative AI is expected to plow ahead at full speed, tech companies say they will do their best to reduce harm, but some bad things are going to fall through the cracks. People will get hurt, but this is the price we all pay for technological progress. Someone is going to develop these technologies, so it might as well be these companies who are trying to reduce harm, as opposed to someone more reckless—or so we’re meant to believe.
Perhaps that is true, but these are the human-led companies that are developing and powering this technology, and it is possible for them to make different choices. Instagram and Apple do not have to wait for me to flag ads for “nudify” apps before taking them down. They could invest more resources in manually reviewing them, but the scale of their platforms and their business models rely on processing millions of ads and apps without the scrutiny required to prevent the harm they can cause. Microsoft doesn’t have to deploy an image generation tool before properly testing it to prevent abuse. Civitai doesn’t have to allow its users to upload 80,000 AI-generated images and many models to the site every day, but it’s that content that brings users to the site, and it has investors like Andreessen Horowitz who need those users in order to get a return on their investment.
The harm we see from the rapid deployment of generative AI tools is not inevitable. It is a direct result of how that technology is developed, released, and monetized. Similarly, Thorn’s initiative, while a huge improvement over nothing, is not the best solution we can come up with, or one that the people who are victimized by this technology deserve. It is simply the one that the tech companies who cosigned it agreed to.
"Stability AI is committed to preventing the misuse of AI. We prohibit the use of our image models and services for unlawful activity,” Stability AI said in an email. “Any company using our products is required to adhere to our Acceptable Use Policy. We investigate any and all reports of misuse of our products. To date, we have not received reports of misuse but will investigate the information that has been provided here."
“As an organization, we are entering this alliance clear-eyed about its challenges and the overall recognition that tech companies play an essential role in reducing foreseeable misuse of their products and technologies - they have the opportunity and the responsibility to do so,” David Polgar, Founder and President of All Tech Is Human, said in an email. “The established principles and company commitments, along with the recommended mitigations outlined in the accompanying white paper, are one positive step in an ongoing battle to reduce the dissemination of highly offensive material and harms against children.”
Amazon referred us to Thorn.
Microsoft declined to comment.
OpenAI, Mistral, Anthropic, Hugging Face, and Google did not respond to our request for comment.