policy report

State of the AI Regulatory Landscape

Published by Convergence Analysis, this series is designed to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current state of AI regulation.

AI Disclosures

Elliot McKernon

Writer-Researcher

Last updated Mar 29, 2024

Author's Note

This report is one in a series of ~10 posts comprising a State of the AI Regulatory Landscape in 2024 Review, conducted by the Governance Research Program at Convergence Analysis. Each post will cover a specific domain of AI governance. We’ll provide an overview of existing regulations, focusing on the US, EU, and China as the leading governmental bodies currently developing AI legislation. Additionally, we’ll discuss the relevant context behind each domain and conduct a short analysis.

This series is intended to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current AI governance space. We’ll be releasing a comprehensive report at the end of this series.

What situations do disclosure requirements for AI systems cover?

The public and regulators have legal rights to understand goods and services. For example, food products must have clear nutritional labels; medications must disclose their side effects and contraindications; and machinery must come with safety instructions.

In the case of AI, these legally mandated disclosures can cover several topics, such as:

Clearly labeling AI-generated content: This allows people to immediately recognize that the image (or text or audio etc) they’re looking at was AI-generated. For example, the proposed AI Disclosure Act would require all generative AI content to include the text “Disclaimer: this output has been generated by artificial intelligence.”

Watermarking content generated by AI: This involves adding some detectable but not necessarily obvious mark. Watermarking has several purposes, for example letting us identify the provenance or source of AI-generated content.

Disclosure of training data: Since models are trained on huge amounts of data, but this data isn’t identifiable or reconstructable from the final model, some regulators require AI developers to disclose information about the data used to train models. For example, the EU AI Act requires AI developers to publicly disclose any copyrighted material used in their training data.

Notifying people that they’re being processed by an AI: For example, if video footage is analyzed by an AI to identify people’s age, the EU AI Act requires those people to be informed.

How do labels and watermarks work for AI-generated content?

Labels and watermarks vary in design: some are subtle, some conspicuous; some are easy to remove, some difficult. For example, Dall-E 2 images have five coloured squares in their bottom-right corner, a conspicuous label that’s easy to remove.

However, Dall-E 3 will add invisible watermarks to generated images, which are much harder to remove. Watermarking techniques are less visible than labels, and are evaluated on criteria such as perceptibility and robustness. A technique is considered robust if the resulting watermark resists both benign and malicious modifications; semi-robust if it resists benign modifications; and fragile if the watermark isn’t detectable after any minor transformation. Note that fragile and semi-robust techniques are still useful, for example in detecting tampering.

Imperceptible watermarking methods might embed a signal in the “noise” of the image such that it isn’t detectable to the human eye, and is difficult to fully remove, while still being clearly identifiable to a machine. This is part of steganography, the field of “representing information within another message or physical object”.

One example is the Least Significant Bit (LSB) technique, which adjusts the least important bits in data. In the binary number 1001001, the leftmost “1” represents 2⁶, while the rightmost “1” represents just 1, so it can be changed to carry part of a message with little disruption. LSB is relatively fragile. Other techniques, such as those based on the Discrete Cosine Transform (DCT), a Fourier-related transform, subtly adjust images (and other data) at a more fundamental level, hiding signals in the higher frequency components of the image. These are more robust against simple attacks such as adding noise, compressing the image, or adding filters, making it more difficult to remove the watermark without significantly disrupting the image. Other popular techniques include the Discrete Wavelet Transform (DWT), Singular Value Decomposition (SVD), and hybrids of multiple techniques.
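To make the LSB idea concrete, here is a minimal Python sketch. It treats a short list of integers as stand-in pixel values (real images would be loaded with an imaging library), and the function names `embed_lsb` and `extract_lsb` are hypothetical names chosen for illustration. It is not how any particular AI lab watermarks its output.

```python
def embed_lsb(pixels, message_bits):
    """Return a copy of pixels with one message bit stored in each lowest bit."""
    if len(message_bits) > len(pixels):
        raise ValueError("message is longer than the cover data")
    marked = list(pixels)
    for i, bit in enumerate(message_bits):
        # Clear the least significant bit, then set it to the message bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(pixels, n_bits):
    """Read the lowest bit of the first n_bits values."""
    return [p & 1 for p in pixels[:n_bits]]


# Stand-in "pixel" values; the first is the 1001001 example from the text.
cover = [0b1001001, 200, 13, 254, 97, 128, 55, 42]
bits = [0, 1, 1, 0, 1, 0, 0, 1]

marked = embed_lsb(cover, bits)
recovered = extract_lsb(marked, len(bits))

# Each value changes by at most 1, which is why the mark is hard to see.
max_change = max(abs(a - b) for a, b in zip(cover, marked))

# Fragility: a crude requantisation (rounding down to multiples of 4)
# overwrites every lowest bit and destroys the message.
degraded = [(p // 4) * 4 for p in marked]
after_attack = extract_lsb(degraded, len(bits))
```

Here `recovered` equals `bits` and no value moves by more than 1, while `after_attack` no longer matches the message. The requantisation step is only a rough stand-in for what lossy compression does to low-order bits, but it shows why LSB marks are classed as fragile.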

There are also open-source technical standards such as C2PA that have been adopted by organizations like OpenAI. This standard allows good faith actors to maintain a chain of causation and signature on digital objects and recognize each other’s content. Ben Harack points out that if there were universal avoidance of works lacking a perfect C2PA chain, either through widespread legislation or strong norms (analogous to how web browsers disallow non-TLS websites), this could incentivize widespread adoption of the standard and make watermarking a much more powerful tool for distinguishing AI-generated from human-generated works.
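The idea of a signed chain on a digital object can be sketched in a few lines of Python. This is a toy illustration only: it is not the C2PA manifest format, it uses a shared-secret HMAC from the standard library as a stand-in for the certificate-based signatures a real provenance standard would use, and the names `sign_step` and `verify_chain` and the record fields are hypothetical.

```python
import hashlib
import hmac
import json


def sign_step(content: bytes, action: str, key: bytes, prev_sig: str = "") -> dict:
    """Create a signed record linking this version of the content to the previous step."""
    record = {
        "action": action,
        "content_hash": hashlib.sha256(content).hexdigest(),
        "prev_signature": prev_sig,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record


def verify_chain(chain, final_content: bytes, key: bytes) -> bool:
    """Check every signature, every link, and that the last record matches the content."""
    prev_sig = ""
    for record in chain:
        body = {k: v for k, v in record.items() if k != "signature"}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, record["signature"]):
            return False  # record was altered or signed with a different key
        if record["prev_signature"] != prev_sig:
            return False  # a step is missing or out of order
        prev_sig = record["signature"]
    final_hash = hashlib.sha256(final_content).hexdigest()
    return bool(chain) and chain[-1]["content_hash"] == final_hash


key = b"demo-key"  # assumption: a shared demo secret, not a real signing certificate
original = b"generated image bytes"
edited = b"generated image bytes, cropped"

step1 = sign_step(original, "created_by_model", key)
step2 = sign_step(edited, "cropped", key, step1["signature"])
chain = [step1, step2]

intact = verify_chain(chain, edited, key)
tampered = verify_chain(chain, b"something else", key)
broken = verify_chain([step2], edited, key)  # first step removed
```

In this sketch `intact` is `True`, while `tampered` and `broken` are `False`: changing the content or dropping a step breaks verification. That is the property that would let platforms treat works without a complete chain differently.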

Text is much harder to watermark subtly, as the information content of text is relatively sensitive to small adjustments. Changing a few letters in a paragraph is more noticeable than changing many pixels in an image, for example. Watermarking can still be applied to metadata, and there are techniques derived from steganography that add hidden messages to text, though these can be disrupted and aren’t under major consideration by legislators or AI labs.
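As one illustration of a steganographic text mark and how easily it can be disrupted, the following Python sketch hides a short tag in invisible zero-width Unicode characters. This is just one simple technique chosen for illustration, not necessarily one of the methods the paragraph above has in mind, and the function names are hypothetical.

```python
ZERO = "\u200b"  # zero-width space, used here to encode bit 0
ONE = "\u200c"   # zero-width non-joiner, used here to encode bit 1


def hide(cover_text: str, tag: str) -> str:
    """Append the tag to the text as a run of invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in tag.encode("utf-8"))
    hidden = "".join(ONE if b == "1" else ZERO for b in bits)
    return cover_text + hidden


def reveal(text: str) -> str:
    """Recover whatever tag is encoded in the zero-width characters."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def strip_invisible(text: str) -> str:
    """A trivial 'attack': drop the zero-width characters."""
    return "".join(ch for ch in text if ch not in (ZERO, ONE))


visible = "The weather is mild today."
marked = hide(visible, "AI")

found = reveal(marked)                      # the hidden tag
cleaned = strip_invisible(marked)           # looks identical to a reader
found_after = reveal(cleaned)               # nothing left to recover
```

The marked text displays the same as the original, and `found` is the hidden tag, but a one-line filter (or simply retyping or paraphrasing the text) removes the mark entirely, so `found_after` is empty.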

Importantly, all these labeling and watermarking techniques can be embedded in the weights of generative AI models, for example in a final layer of a neural network, meaning it is possible to have robust but invisible signals in AI-generated content that, if interpreted correctly, could be used to identify what particular model generated a piece of work.

Watermarking also involves tradeoffs between robustness and detectability: robust watermarking techniques alter the content more fundamentally, which makes the watermark easier to detect. This means robustness can also trade off against security, as more obscure and undetectable watermarks are harder to extract information from, and thus more secure. For example, brain scans contain incredibly sensitive information, and so researchers have developed fragile but secure watermarking techniques for fMRI. To quote a thorough review of watermarking and steganography:

“It is tough to achieve a watermarking system that is simultaneously robust and secure.”

Further, fragile watermarking standards could lead to false confidence, as any standards will inevitably incentivize powerful groups to break them.

Overall, modern digital watermarking techniques can be reasonably robust, and difficult but not impossible to remove. Watermarking may raise the barrier to passing AI-generated content off as human-generated, and provide some tools for identifying the provenance of AI-generated content (especially for images or audio content), but watermarking isn’t perfect and hasn’t been widely adopted.

What are current regulatory policies around disclosure requirements for AI systems?

The US

The Executive Order on AI states that Biden’s administration will “develop effective labeling and content provenance mechanisms, so that Americans are able to determine when content is generated using AI and when it is not.” In particular:

Section 4.5(a): Requires the Secretary of Commerce to submit a report identifying existing and developable standards and tools for authenticating content, tracking its provenance, and detecting and labeling AI-generated content.

Section 10.1(b)(viii)(C): Requires the Director of OMB to issue guidance to government agencies that includes the specification of reasonable steps to watermark or otherwise label generative AI output.

Section 8(a): Encourages independent regulatory agencies to emphasize requirements related to the transparency of AI models.

The AI Disclosure Act was proposed in 2023, though it has not yet passed the House or Senate, instead being referred to the Subcommittee on Innovation, Data, and Commerce. If passed, the act would require any output generated by AI to include the text: “Disclaimer: this output has been generated by artificial intelligence.”

China

China’s 2022 rules for deep synthesis, which address the online provision and use of deep fakes and similar technology, require providers to watermark and conspicuously label deep fakes. The regulation also requires the notification and consent of any individual whose biometric information is edited (e.g. whose voice or face is edited or added to audio or visual media).

The 2023 Interim Measures for the Management of Generative AI Services, which addresses public-facing generative AI in mainland China, requires content created by generative AI to be conspicuously labeled as such and digitally watermarked. Developers must also label the data they use in training AI clearly, and disclose the users and user groups of their services.

The EU

Article 52 of the EU AI Act lists the transparency obligations for AI developers. These largely relate to AI systems “intended to directly interact with natural persons”, where natural persons are individual people (excluding legal persons, which can include businesses). For concision, we will just call these “public-facing” AIs. Notably, the following requirements have exemptions for AI used to detect, prevent, investigate, or prosecute crimes (assuming other laws and rights are observed).

Article 52.1: Requires developers to ensure users of public-facing AI are informed or obviously aware that they are interacting with an AI.

Article 52.1a: Requires AI-generated content to be watermarked (with an exemption for AI assisting in standard editing or which doesn’t substantially alter input data).

Article 52.2: Requires developers of AI that recognizes emotions or categorizes biometric data (e.g. distinguishing children from adults in video footage) to inform the people being processed.

Article 52.3: Requires deep fakes to be labeled as AI-generated (with a partial exemption for use in art, satire, etc, in which case developers can disclose the existence of the deep fake less intrusively). AI-generated text designed to inform on matters of public interest must disclose that it’s AI-generated, unless the text undergoes human review, and someone takes editorial responsibility.

Article 52b: Requires developers of general purpose AI with systemic risk to notify the EU Commission within 2 weeks of meeting any of the following requirements defined in article 52a.1:

Possessing “high impact capabilities”, as evaluated by appropriate technical tools.

By decision of the Commission, if they believe a general purpose AI has capabilities or impact equivalent to “high impact capabilities”.

Article 52c: Requires providers of GPAI to publish a summary of the content used for training the model, and 60f and 60k require developers to disclose any copyrighted material in their training data in their summary.

Convergence’s Analysis

Unclear definitions of what constitutes an application of AI will lead to inconsistent disclosure requirements and enforcement.

AI is becoming embedded in many creative tools, such as image-editing tools like Photoshop and GIMP. Among other functions, these can be used to “uncrop” images, generating additional content. AI is also important in procedurally generated video games and VR spaces.

These uses of AI lead to gray areas and edge cases that aren’t clearly covered by legislation, and individuals using these tools may not be able to tell whether they’re using compliant or illegal tools.

Current legal definitions are far from comprehensive enough to fully distinguish and legislate these overlapping use cases.

Mandatory labeling of AI-generated content is a lightweight but imperfect method to keep users informed and reduce the spread of misinformation and similar risks from generative AI.

Labeling AI-generated text, images, video, and so on is a simple way to make users clearly understand that content is AI-generated. Further, it’s not expensive or complex to add labeling mechanisms to generative AI.

Labeling has extensive precedent in most jurisdictions, such as food and medication labels.

While compliance can be high for such mandatory labeling, there’s variance in efficacy. For example, the World Health Organization found that inadequate labeling of medication plays a role in non-adherence to medication prescriptions, and some studies have found that improving labeling improves health outcomes.

Further, compliance can be low, especially when violations by smaller organizations or individuals aren’t actively addressed. For example, though many major websites are GDPR-compliant, a 2020 survey found that only 11.8% of (a scrapable subset of) the top 10,000 websites in the UK were compliant.

Mandatory watermarking is a lightweight way to improve traceability and accountability for AI developers.

Like labeling, watermarking is easy for developers to do, and invisible watermarks have the advantage of not interfering with the users’ experience.

If AI developers include watermarking in their generative AI models, the watermarks can be used to identify precisely which model generated a piece of content. This is especially important when generative AI is used to generate harmful content, such as misinformation, deep fake porn, or other provocative material, as models should be trained not to produce such content. Watermarking allows us to find and address the root of the problem and hold the developers legally accountable.

Labels and watermarks can be disrupted or removed by motivated users, especially in text generation.

Labels and watermarking involve adding information to content, and it is usually possible to manually (or even automatically) remove or disrupt this information.

This means that it’s unlikely any content platform could guarantee that AI-generated content is always clearly distinguishable to people.

Despite the potential fragility of labeling and watermarking, they can still be important aspects of a larger, layered strategy, making it more difficult to produce misinformation, or for AI developers to avoid accountability.

In particular, societal education about AI will be a critical aspect of such a layered strategy.

Organizations such as Meta and DeepMind are researching more advanced methods of watermarking during AI development.