On October 30, 2023, the same day President Biden issued the Executive Order establishing AI safeguards, I spoke on a panel at the Practicing Law Institute (PLI) conference, Online Platforms and Popular Technologies 2023: Legal and Regulatory Responses to Technology Innovations and Uses.
My session, Algorithms and Artificial Intelligence: A Deep Dive, covered a variety of legal issues emerging from the use of generative AI. A recurring topic was the threat to safety posed by the increasing prevalence of LLMs and similar architectures. As the only panelist who was not a lawyer, I focused on providing insights from my own perspective and experience as an engineer and technology professional working in the online abuse mitigation space.
The panel offered me an opportunity to organize my thoughts on the concerns surrounding generative AI and to speculate about the effectiveness of proposed regulations. I have summarized those thoughts below.
Should platforms be required to have services to identify and flag generative AI content?
The identification of generative AI content is only useful insofar as it mitigates harm. A more pertinent question is: what new risks and potential for harm does generative AI content create, and how can those risks be mitigated? If these novel risks require disclosing what is AI-generated, then yes, identification is necessary, but even that requires platform- and context-specific considerations.
For example, if a video posted to YouTube contains sexually explicit content, is the problem that it was created by generative AI, or that the content violates policy? In this case, the content is problematic either way. However, suppose a YouTube video containing deepfakes is spreading misinformation, or suppose the fact that sexually explicit content is AI-generated is precisely what increases its harm (for example, if the generation process gives the content unique characteristics that make it better at evading traditional detection methods). If knowing that content is AI-generated improves detection and mitigation capabilities, then it is probably useful to demarcate it.
I also don’t think moralizing about generative AI content and enforcing identification on all platforms is appropriate, because the way these risks manifest depends entirely on the context and policies specific to each platform. Ultimately, LLMs, diffusion models, and other generative AI architectures are just a new set of interfaces, much as innovations in video production or media provide new interfaces or abstraction layers that engineers, innovators, technologists, and content creators can use to generate new digital artifacts.
How do you reduce the threat to safety posed by generative AI?
This is a big question. One framework to apply to this problem may be to think about generative AI as an abstraction. As in other areas of computing, such as networking or programming language design, abstraction layers provide boundaries between a system’s constituent parts, making the system easier to understand and reason about. In the context of abuse mitigation, evaluating the risk of generative AI applications across different abstraction layers can help clearly define where full automation is acceptable and harmless, as compared with situations in which human oversight is necessary. For example, generative AI poses little risk when removing the grunt work of tedious tasks such as data entry or video editing, whereas the risk is significantly greater for AI healthcare systems providing critical diagnoses.
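This layered framing can be made concrete. The sketch below is a hypothetical illustration of the idea (the task names, tiers, and mapping are mine, not a standard or any platform's policy): applications are classified by the stakes of their outputs, and the required level of human oversight falls out of that classification.

```python
from enum import Enum

class Oversight(Enum):
    FULL_AUTOMATION = "full automation acceptable"
    HUMAN_REVIEW = "human review of outputs"
    HUMAN_IN_LOOP = "human must approve every decision"

# Hypothetical mapping of application types to required oversight,
# illustrating the abstraction-layer risk framing described above.
RISK_POLICY = {
    "data_entry": Oversight.FULL_AUTOMATION,       # tedious, low stakes, reversible
    "video_editing": Oversight.FULL_AUTOMATION,    # low stakes, reversible
    "content_moderation": Oversight.HUMAN_REVIEW,  # moderate stakes, appeals exist
    "medical_diagnosis": Oversight.HUMAN_IN_LOOP,  # critical, irreversible outcomes
}

def required_oversight(task: str) -> Oversight:
    # Default to the strictest tier when a task is unclassified.
    return RISK_POLICY.get(task, Oversight.HUMAN_IN_LOOP)
```

The design choice worth noting is the default: an unclassified application falls into the strictest tier, so the burden is on demonstrating that automation is safe rather than on proving it is dangerous.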
How easy is it to identify AI-generated content using AI? How far along are we in fighting AI with AI?
In cases where it is appropriate to identify and flag generative AI content, it is a difficult undertaking for the following reasons:
- Complexity: Detection is a non-trivial technical task whether performed by humans or AI, and using AI to detect generative AI content produces high false-positive and false-negative rates, which can lead to censorship of legitimate content or failure to flag true violations. Moreover, since every abuse mitigation problem is adversarial, the problem would constantly evolve to evade mitigation tactics, requiring new ones to be developed and enforced.
- Computational expense: It is resource-intensive to train and deploy models capable of flagging generative AI content at scale, meaning that even if reliable detection were possible, the associated compute costs would be substantial.
- Privacy concerns: In order to identify AI-generated content, you often have to corroborate content analysis with analysis of user metadata, which can diminish user privacy protections. Most large-scale Trust and Safety teams, such as those at Google and GitHub, examine potential harm through two lenses: (1) content analysis, and (2) behavioural analysis. Although most companies have elaborate policies about what content is and is not acceptable, reflected in their community guidelines, it is often insufficient to examine content alone when deciding whether an account is malicious. Analysis of metadata describing user activity provides deeper insight into anomalies that can signal whether someone is a bad actor. This necessitates examining user data, and the collection and retention of that data would be required to train such detection systems, which could lead to user profiling and other privacy concerns.
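The two-lens approach above can be sketched as a decision function. This is a minimal, hypothetical illustration (the signal names and thresholds are invented for the example, not any platform's actual policy): a high-confidence content score can flag on its own, while a weaker content signal requires corroborating behavioural anomalies, reducing false positives from content analysis alone.

```python
# Hypothetical sketch of corroborating a content classifier's score with
# behavioural signals, as described above. Thresholds and signal names
# are illustrative only.
def flag_account(content_score: float, behavior_signals: dict) -> bool:
    """content_score: estimated probability [0, 1] that content violates policy.
    behavior_signals: metadata-derived anomaly indicators (booleans)."""
    anomalies = sum(
        1 for signal in ("burst_posting", "new_account", "shared_fingerprint")
        if behavior_signals.get(signal, False)
    )
    if content_score >= 0.95:
        return True   # high-confidence content evidence stands alone
    if content_score >= 0.6 and anomalies >= 2:
        return True   # weaker content evidence needs behavioural corroboration
    return False      # avoid acting on ambiguous content signals alone
```

This also makes the privacy trade-off visible in code: the corroboration step only works if the platform collects and retains the behavioural metadata that feeds `behavior_signals`.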
If you were being tried for shoplifting, would you be comfortable with a generative AI judge/jury?
To answer this question, we must remember that these models inherit the biases present in their training sets: if the training set correlated ethnic minorities with shoplifting, the model could produce discriminatory outcomes. However, a human judge or jury could carry the same, if not stronger, biases. Given this reliance on data, there are two points I want to make:
- When you have an AI hammer, everything looks like a nail. I do not think generating verdicts is a good application of AI. Prioritizing automation, via AI or otherwise, within an inherently flawed criminal justice system not only exacerbates existing systemic problems but also diverts attention and resources away from addressing underlying issues in urgent need of reform. I am one of the less legally minded panelists here, but I feel the priority should be to re-evaluate and rectify problems such as inequality, mass incarceration, police misconduct and militarization, the cash-bail system, and other systemic biases. Without repairing this foundation, any technological advancement only entrenches the status quo and could lead to greater miscarriages of justice.
- The trade-off between data quality and data privacy. Speaking as an engineer, I can say that for any AI system generating critical outcomes, I would want high confidence in the representativeness, volume, and quality of the training data, in addition to the training methods, to trust the reliability of its predictions. However, obtaining such massive and representative datasets would likely require significant privacy violations. Data collection has been normalized as a necessary evil, but it is difficult to forecast the full range of ways that data will be used. As Shoshana Zuboff describes in her book, The Age of Surveillance Capitalism, “it is no longer enough to automate information flows about us; the goal now is to automate us […] Industrial capitalism transformed nature’s raw materials into commodities, and surveillance capitalism lays its claims to the stuff of human nature for a new commodity invention. Now it is human nature that is scraped, torn, and taken for another century’s market project. It is obscene to suppose that this harm can be reduced to the obvious fact that users receive no fee for the raw material they supply. That critique is a feat of misdirection that would use a pricing mechanism to institutionalize and therefore legitimate the extraction of human behavior for manufacturing and sale. It ignores the key point that the essence of the exploitation here is the rendering of our lives as behavioral data for the sake of others’ improved control of us.
The remarkable questions here concern the facts that our lives are rendered as behavioral data in the first place; that ignorance is a condition of this ubiquitous rendition; that decision rights vanish before one even knows that there is a decision to make; that there are consequences to this diminishment of rights that we can neither see nor foretell; that there is no exit, no voice, and no loyalty, only helplessness, resignation, and psychic numbing; and that encryption is the only positive action left to discuss when we sit around the dinner table and casually ponder how to hide from the forces that hide from us.”
Another point to acknowledge when considering generative AI applications is that LLMs do not replicate the nuance of human reasoning. In an interview with Tyler Cowen on an episode of the podcast Conversations with Tyler, Noam Chomsky was asked why he has been critical of large language models. I have captured the part of the transcript that addresses this question, because I think Chomsky clearly points out structural issues with LLMs that can provide a foundation for thinking through broader unintended consequences:
COWEN: Now, you’ve been very critical of large language models. There’s a recent essay by Stephen Wolfram, where he argues the success of those models is actually evidence for your theory of language — that they must, in some way, be picking up or detecting an underlying structure to language because their means are otherwise too limited to be successful. What’s your response to that view?
CHOMSKY: He’s a brilliant scientist. I’ve talked to him sometimes. I think this is partly true but partly misleading. The large language models have a fundamental property which demonstrates that they cannot tell you anything about language and thought. Very simple property: its built-in principle can’t be modified, namely, they work just as well for impossible languages as for possible languages. It’s as if somebody came along with a new periodic table of the elements which included all the elements and all impossible elements and couldn’t make any distinction on them. It would tell us nothing about chemistry.
That’s what large language models are. You give them a data set that violates all the principles of language, it will do fine, doesn’t make any distinction. What the systems do, basically, is scan an astronomical amount of data, find statistical regularities, string things together. And using these regularities, they can make a pretty good prediction about what word is likely to come next after a sequence of words.
A lot of very clever programming, a lot of massive computer power, and of course, unbelievable amounts of data, but as I say, it does exactly as well with impossible systems as with languages. Therefore, in principle, it’s telling you nothing about language.
Should articles/news/posts written by AI require an explicit identifier? (This may already exist in some places, and it would likely be very hard to enforce.)
I believe Amazon requires authors selling books to disclose whether they used AI, and to ensure that AI-generated content does not infringe copyright.
Versions of this self-attestation may emerge, but it is difficult to enforce.
An analogous issue I experienced whilst working at GitHub was in observing how the platform enables both security researchers and criminals to host malware. One complies with the acceptable use policy, the other doesn’t. However, nothing is stopping a criminal from saying that they are a security researcher. Moreover, collecting evidence that a given bad actor is lying about their intent is a difficult undertaking.
President Biden’s Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence
The Biden Administration has proposed several safeguards to mitigate concerns around AI-generated content. One such protection proposes instituting watermarking and similar standards to denote the authenticity of content, with the intent of preventing deception and fraud. While it sounds helpful for citizens to have some verifiable signature confirming that communications are from Federal agencies, applying such regulations to the private sector, and to all content online, implies that the government should maintain control over everything. If the government ultimately controls what content is considered factual, citizens must place complete trust in the government. But all entities, be they corporations or governments, have their own vested interests.
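For the narrow case of verifying that a communication came from a known agency, the mechanics are well understood. The sketch below is a deliberately simplified illustration, not the scheme any standard mandates: it uses an HMAC with a shared secret, whereas real provenance systems (such as C2PA-style content credentials) rely on public-key signatures so that anyone can verify without holding the signing key.

```python
import hmac
import hashlib

# Hypothetical signing key; a real deployment would use asymmetric keys,
# not a shared secret embedded in code.
AGENCY_KEY = b"hypothetical-agency-signing-key"

def sign_content(message: bytes) -> str:
    """Produce a provenance signature for an official communication."""
    return hmac.new(AGENCY_KEY, message, hashlib.sha256).hexdigest()

def verify_content(message: bytes, signature: str) -> bool:
    """Check that a message carries a valid signature from the agency."""
    # compare_digest avoids timing side channels in the comparison.
    return hmac.compare_digest(sign_content(message), signature)
```

The technical simplicity is the point: verifying the origin of a specific sender's content is tractable, while the harder questions the Order raises are about who holds the keys and what content falls under the regime.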
One of the most powerful roles the internet has played has been in democratizing the creation and propagation of information. While this carries the risk of misinformation, filtering content through a centralized institution does not remove this risk. In fact, entrusting the government to be some objective moral purveyor of AI-generated content creates greater risks of censorship. How do we know the government, or corporations verified by the government, will not engage in misinformation en masse? Furthermore, what does “safety” mean, and whose safety is being protected?
A more principled approach may be to start by aligning on what safety and security mean. In her paper, Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems, Heidy Khlaaf does a great job of establishing fundamental terminology in this context that can provide the foundation for more nuanced conversations going forward.