policy report
Published by Convergence Analysis, this series is designed to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current state of AI regulation.
AI Disclosures
What situations do disclosure requirements for AI systems cover?
The public and regulators have legal rights to understand goods and services. For example, food products must have clear nutritional labels; medications must disclose their side effects and contraindications; and machinery must come with safety instructions.
In the case of AI, these legally mandated disclosures can cover several topics, such as:
How do labels and watermarks work for AI-generated content?
Labels and watermarks vary in design; some are subtle, some conspicuous; some easy to remove, some difficult. For example, Dall-E 2 images have five coloured squares in their bottom-right corner, a conspicuous label that’s easy to remove.
However, Dall-E 3 will add invisible watermarks to generated images, which are much harder to remove. Watermarking techniques are less visible than labels, and are evaluated on criteria such as perceptibility and robustness. A technique is considered robust if the resulting watermark resists both benign and malicious modifications; semi-robust if it resists benign modifications; and fragile if the watermark isn’t detectable after any minor transformation. Note that fragile and semi-robust techniques are still useful, for example in detecting tampering.
Imperceptible watermarking methods might embed a signal in the “noise” of the image such that it isn’t detectable to the human eye, and is difficult to fully remove, while still being clearly identifiable to a machine. This is part of steganography, the field of “representing information within another message or physical object”.
For example, the Least Significant Bit (LSB) technique adjusts the least important bits in data. In the binary number 1001001, the leftmost “1” represents 2⁶, while the rightmost “1” represents just 1, so it can be changed to carry part of a message with little disruption. LSB is relatively fragile. Other techniques, such as those based on the Discrete Cosine Transform (DCT), use Fourier-related transforms to subtly adjust images (and other data) at a more fundamental level, hiding signals in the higher-frequency components of the image. These are more robust against simple attacks such as adding noise, compressing the image, or applying filters, making it more difficult to remove the watermark without significantly disrupting the image in question. Other popular techniques include the Discrete Wavelet Transform (DWT), Singular Value Decomposition (SVD), and hybrids of multiple techniques.
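The LSB idea described above can be illustrated with a minimal sketch in Python. This is an illustration under stated assumptions, not the method used by any particular product: the function names `embed_lsb` and `extract_lsb` are hypothetical, and the "image" is simply a list of 8-bit pixel values (the first value, 73, is 1001001 in binary, matching the example above).

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bit of each pixel value."""
    # Flatten the message into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover data is too small for this message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read n_bytes back out of the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for px in pixels[b * 8:(b + 1) * 8]:
            value = (value << 1) | (px & 1)
        out.append(value)
    return bytes(out)


# Hypothetical 16-value grayscale "image"; 73 is 0b1001001.
cover = [73, 200, 15, 128, 255, 0, 64, 99] * 2
stego = embed_lsb(cover, b"AI")
recovered = extract_lsb(stego, 2)

# A mild, benign-looking change (rounding every value down to an even
# number) wipes the hidden bits, which is why LSB counts as fragile.
rounded = [px // 2 * 2 for px in stego]
after_rounding = extract_lsb(rounded, 2)
```

Each pixel value changes by at most 1, which is why the mark is imperceptible, but the same property means that almost any re-encoding of the image destroys it.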
There are also open-source technical standards such as C2PA that have been adopted by organizations like OpenAI. This standard allows good-faith actors to maintain a chain of causation and signatures on digital objects and to recognize each other’s content. Ben Harack points out that if there were universal avoidance of works lacking a perfect C2PA chain, either through widespread legislation or strong norms (analogous to how web browsers disallow non-TLS websites), this could incentivize widespread adoption of the standard and make watermarking a much more powerful tool for distinguishing AI-generated from human-generated works.
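To make the idea of a "chain of causation and signatures" more concrete, here is a toy sketch in Python. It is not the C2PA format: the real standard uses signed manifests and public-key certificates, whereas this sketch assumes a single shared secret key and HMAC signatures purely for brevity. The names `sign_step` and `verify_chain`, the record fields, and the demo key are all hypothetical.

```python
import hashlib
import hmac


def sign_step(key, content, action, prev_signature):
    """Create a record binding the content, the action, and the previous record."""
    content_hash = hashlib.sha256(content).hexdigest()
    message = f"{prev_signature}|{action}|{content_hash}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {
        "action": action,
        "content_hash": content_hash,
        "prev": prev_signature,
        "signature": signature,
    }


def verify_chain(key, chain, final_content):
    """Check every link, then check that the last record matches the content."""
    prev = ""
    for record in chain:
        message = f"{prev}|{record['action']}|{record['content_hash']}".encode()
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        if record["prev"] != prev:
            return False
        if not hmac.compare_digest(expected, record["signature"]):
            return False
        prev = record["signature"]
    final_hash = hashlib.sha256(final_content).hexdigest()
    return bool(chain) and chain[-1]["content_hash"] == final_hash


key = b"demo-key"  # assumption: one shared secret instead of certificates
original = b"generated image bytes"
edited = b"generated image bytes, cropped"

chain = [sign_step(key, original, "created by model", "")]
chain.append(sign_step(key, edited, "cropped", chain[-1]["signature"]))

intact = verify_chain(key, chain, edited)
tampered = verify_chain(key, chain, b"content swapped after signing")
```

Because each record's signature covers the previous signature, removing or altering any step breaks verification for everything after it. Content with no chain at all simply cannot be verified, which is why the incentive argument above depends on widespread adoption.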
Text is much harder to watermark subtly, as the information content of text is relatively sensitive to small adjustments. Changing a few letters in a paragraph is more noticeable than changing many pixels in an image, for example. Watermarking can still be applied to metadata, and there are techniques derived from steganography that add hidden messages to text, though these can be disrupted and aren’t under major consideration by legislators or AI labs.
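As one illustration of a steganographic text technique and of how easily it can be disrupted, the sketch below hides a short tag in zero-width Unicode characters appended to a sentence. The source does not name a specific method, so this choice is an assumption made for illustration, and the function names `hide`, `reveal`, and `strip_invisible` are hypothetical.

```python
ZERO = "\u200b"  # zero-width space, used here to encode a 0 bit
ONE = "\u200c"   # zero-width non-joiner, used here to encode a 1 bit


def hide(cover_text, secret):
    """Append the secret to the cover text as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    payload = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return cover_text + payload


def reveal(text):
    """Recover a hidden message, or an empty string if none survives."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def strip_invisible(text):
    """Simulate a trivial disruption: drop the zero-width characters."""
    return "".join(ch for ch in text if ch not in (ZERO, ONE))


marked = hide("This paragraph was written by a model.", "gpt")
hidden = reveal(marked)
cleaned = strip_invisible(marked)
hidden_after_cleaning = reveal(cleaned)
```

The marked text looks identical on screen, but a single filtering step, or retyping the text, removes the mark entirely.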
Importantly, all these labeling and watermarking techniques can be embedded in the weights of generative AI models, for example in a final layer of a neural network, meaning it is possible to have robust but invisible signals in AI-generated content that, if interpreted correctly, could be used to identify what particular model generated a piece of work.
Watermarking also involves trade-offs between robustness and detectability: robust watermarking techniques alter the content more fundamentally, which makes the watermark easier to detect. This means robustness can also trade off against security, as more obscure and less detectable watermarks are harder to extract information from, and thus more secure. For example, brain scans contain incredibly sensitive information, so researchers have developed fragile but secure watermarking techniques for fMRI. To quote a thorough review of watermarking and steganography:
“It is tough to achieve a watermarking system that is simultaneously robust and secure.”
Further, fragile watermarking standards could lead to false confidence, as any standards will inevitably incentivize powerful groups to break them.
Overall, modern digital watermarking techniques can be reasonably robust, and difficult but not impossible to remove. Watermarking may raise the barrier to passing AI-generated content off as human-generated, and provide some tools for identifying the provenance of AI-generated content (especially images or audio), but watermarking isn’t perfect and hasn’t been widely adopted.
What are current regulatory policies around disclosure requirements for AI systems?
The US
The Executive Order on AI states that Biden’s administration will “develop effective labeling and content provenance mechanisms, so that Americans are able to determine when content is generated using AI and when it is not.” In particular:
The AI Disclosure Act was proposed in 2023, though it has not yet passed the House or Senate, instead being referred to the Subcommittee on Innovation, Data, and Commerce. If passed, the act would require any output generated by AI to include the text: “Disclaimer: this output has been generated by artificial intelligence.”
China
China’s 2022 rules for deep synthesis, which address the online provision and use of deep fakes and similar technology, require providers to watermark and conspicuously label deep fakes. The regulation also requires the notification and consent of any individual whose biometric information is edited (e.g. whose voice or face is edited or added to audio or visual media).
The 2023 Interim Measures for the Management of Generative AI Services, which address public-facing generative AI in mainland China, require content created by generative AI to be conspicuously labeled as such and digitally watermarked. Developers must also clearly label the data they use in training AI, and disclose the users and user groups of their services.
The EU
Article 52 of the EU AI Act lists the transparency obligations for AI developers. These largely relate to AI systems “intended to directly interact with natural persons”, where natural persons are individual people (excluding legal persons, which can include businesses). For concision, we will just call these “public-facing” AIs. Notably, the following requirements have exemptions for AI used to detect, prevent, investigate, or prosecute crimes (assuming other laws and rights are observed).