Getting started with Content Credentials
This is a technical introduction to the Content Authenticity Initiative (CAI) open-source SDK that provides initial context and an overview of the technical aspects of implementing CAI solutions.
The CAI open-source SDK is fully compliant with the C2PA specification, but it's important to note that the specification is more general than the SDK, which doesn't support every feature in the specification.
Important terminology
To understand how CAI works, you need to understand some basic vocabulary. Having this vocabulary makes it easier to discuss CAI's technical aspects. The definitions below are summarized and slightly simplified from the Glossary in the C2PA specification. This document is meant to convey the general concepts and may not cover all technical details or edge cases that the specification addresses.
Let's start with a broad definition of content provenance, which means information on the creation, authorship, and editing of a digital asset such as an image. Content Credentials include this provenance information, along with the cryptographic means to authenticate that the information is correctly tied to the content and is unchanged from when it was originally added. Sometimes these two terms are used loosely to mean the same thing: technology to verify the origin and history of a digital asset.
In practice, Content Credentials are kept in a C2PA manifest store, and the CAI SDK works with that. A manifest store consists of one or more individual manifests, each containing information about the asset. The most recently added manifest is called the active manifest. The active manifest has content bindings that can be validated with the asset; that is, it's hashed with the asset to ensure its validity.
As illustrated below, the manifest contains assertions about the asset's creation and history, wrapped up with additional information into entities called claims that are digitally signed with an actor's private key (the claim signature).
Now, let's drill down a bit to clarify some of these terms.
Actor: A human or hardware or software product. For example, a camera, image editing software, cloud service, or the person using such tools.
Asset: A digital media file or stream of data of certain specific image, video, or audio formats. In the future, the types of supported assets will expand. A composed asset generalizes this concept, for example an image superimposed on top of another image.
Action: An operation that an actor performs on an asset. For example, "create," "embed," or "change contrast."
Assertion: Part of the manifest "asserted" by an actor that contains data about an asset's creation, authorship, how it's been edited, and other relevant information. For example, an assertion might be "change image contrast." For a list of standard C2PA assertions, see the C2PA Specification.
Certificate Authority (CA): A trusted third party that verifies the identity of an organization applying for a digital certificate. After verifying the organization's identity, the CA issues a certificate and binds the organization's identity to a public key. A digital certificate can be trusted because it is chained to the CA's root certificate.
Certificate (public key certificate or digital certificate): An electronic document that vouches for the holder's identity. Like a passport, the certificate is issued by a trusted third party (the CA), cannot be forged, and contains identifying information.
Claim: Digitally-signed and tamper-evident data that references a set of assertions by one or more actors, concerning an asset, and the information necessary to represent the content binding. For example, a claim could specify that a particular image was edited by John Doe using Product X on 05/08/2021 at 11am to change the image contrast.
Claim signature: Part of the manifest that is the digital signature on the claim using an actor's private key.
Content binding: Information that associates digital content to a specific manifest associated with a specific asset, either as a hard binding or a soft binding.
Manifest (C2PA manifest): Information about the provenance of an asset based on the combination of:
- One or more assertions (including content bindings).
- A single claim.
- A claim signature.
Manifest store: A collection of manifests associated with an asset. The most recent manifest in the manifest store is the active manifest, which has the set of content bindings that are hashed with the asset and thus can be validated.
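To make the relationship between these terms concrete, here is a minimal conceptual sketch in Python. The class and field names are illustrative assumptions, not the SDK's types, and the real C2PA structures carry far more detail.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Assertion:
    label: str  # for example "c2pa.actions"
    data: dict


@dataclass
class Manifest:
    assertions: List[Assertion]  # one or more assertions
    claim: dict                  # a single claim that references the assertions
    claim_signature: bytes       # digital signature over the claim


@dataclass
class ManifestStore:
    manifests: List[Manifest] = field(default_factory=list)

    @property
    def active_manifest(self) -> Optional[Manifest]:
        # The most recently added manifest is the active manifest.
        return self.manifests[-1] if self.manifests else None


store = ManifestStore()
store.manifests.append(
    Manifest(
        assertions=[Assertion("c2pa.actions", {"actions": [{"action": "c2pa.created"}]})],
        claim={"assertions": ["c2pa.actions"]},
        claim_signature=b"placeholder-signature",  # not a real signature
    )
)
print(store.active_manifest.claim)
```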
For more definitions and detail, see the C2PA specification.
How it works
A manifest is a binary data structure that describes the history and identity data attached to a digital asset. The CAI SDK enables applications and websites to attach a manifest to an asset and display it when requested. This helps viewers to understand the origin and evolution of the asset.
Although the manifest structure described in the C2PA specification is a complex binary structure, the CAI SDK works with a JSON manifest format that's easier to understand and use. It's essentially a declarative language for representing and creating a manifest in binary format. For more information on the CAI JSON manifest, see Working with manifests.
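As a rough illustration of what such a JSON manifest can look like, the following Python sketch builds a small manifest definition and serializes it. The field names follow the manifest definitions commonly used with C2PA Tool as best understood here, and the values are hypothetical; treat Working with manifests as the authoritative reference.

```python
import json

# Hypothetical values; field names are an assumption to check against the docs.
manifest_definition = {
    "claim_generator": "my_app/0.1",
    "title": "Edited photo",
    "assertions": [
        {
            "label": "c2pa.actions",
            "data": {"actions": [{"action": "c2pa.created"}]},
        }
    ],
}

manifest_json = json.dumps(manifest_definition, indent=2)
print(manifest_json)
```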
The CAI uses cryptographic asset hashing to provide verifiable, tamper-evident signatures to indicate that the asset and metadata haven't been altered since the asset was published with the attached manifest. This means that a hash function converts the digital asset data to a unique "fingerprint," which is signed using a certificate. Once the credentials are signed, if the asset is changed then its fingerprint also changes. That's why it's called "tamper evident." You're not prevented from changing an image that has Content Credentials, but if you do, the credentials are no longer valid unless you change it using a tool that updates and re-hashes the credentials, along with a timestamp and optionally a description of what changed. The C2PA specification refers to this as hard binding.
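The "fingerprint" idea can be sketched with Python's standard library. This is a simplified illustration only: it hashes the whole byte string with SHA-256 (chosen here for illustration), whereas a real hard binding follows the rules in the C2PA specification about which bytes are hashed.

```python
import hashlib


def fingerprint(asset_bytes: bytes) -> str:
    """Return a hex digest that serves as the asset's fingerprint."""
    return hashlib.sha256(asset_bytes).hexdigest()


original = b"\xff\xd8 example image bytes \xff\xd9"  # stand-in for real image data
recorded = fingerprint(original)  # the value recorded at signing time

tampered = bytearray(original)
tampered[5] ^= 0x01  # flip a single bit

print(fingerprint(original) == recorded)         # True: asset is unchanged
print(fingerprint(bytes(tampered)) == recorded)  # False: tampering is evident
```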
Introduction to public key infrastructure
For authentication, C2PA uses public key infrastructure (PKI) technology, a widely used international standard: The same technology used to secure websites using HTTPS, to secure email using S/MIME, and for many other digital security purposes. C2PA uses the international X.509 standard, the most common format for public key certificates.
A signed digital certificate verifies the authenticity of an entity, such as a server, a client, or an application. You typically purchase a certificate from a third-party certificate authority. A certificate can contain the following information to verify the identity of an entity:
- Organizational information that uniquely identifies the owner of the certificate, such as organizational name and address. You supply this information when you get the certificate.
- Public key: The receiver of the certificate uses the public key to verify digital signatures that the certificate owner creates with the corresponding private key, which confirms the owner's identity. The owner keeps the private key secret.
- Name of the CA that uniquely identifies the issuer of the certificate.
- Digital signature of the CA that verifies the certificate's authenticity. The signature is checked against the corresponding CA certificate to verify that the certificate originated from a trusted certificate authority.
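The relationship between a private key and its public key can be sketched with textbook RSA using only the Python standard library. This is a toy under loud assumptions: the key is built from two small, well-known primes, there is no padding, and it is not how the CAI SDK signs. It only shows that the private key creates a signature and the public key verifies it.

```python
import hashlib

# Toy key from two Mersenne primes. Far too small and too structured for real use.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (requires Python 3.8+)


def digest_as_int(message: bytes) -> int:
    # Reduce the SHA-256 digest modulo N so it fits the toy key.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    # Only the holder of the private exponent D can compute this value.
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    # Anyone with the public values (N, E) can check the signature.
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"claim data")
print(verify(b"claim data", signature))
print(verify(b"claim data, altered", signature))
```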
Signing and certificates
Once you initially attach a manifest to an asset, it has information about the origin of the asset, such as the name of the tool that created it (for example, Photoshop) and your name as the actor who created or modified it. Each time someone edits or updates the asset using a tool that supports CAI, it adds a new manifest with the actions taken and the certificate of the tool/site; this becomes the active manifest, which then references any prior manifests as ingredients.
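The way each edit adds a new active manifest that references the previous one as an ingredient can be sketched as follows. The class, labels, and tool names are hypothetical and greatly simplified; a real manifest can have several ingredients.

```python
from typing import Dict, List, Optional


class SimpleStore:
    def __init__(self) -> None:
        self.manifests: Dict[str, dict] = {}
        self.active_label: Optional[str] = None

    def add(self, label: str, tool: str, actions: List[str]) -> None:
        # The new manifest references the previous active manifest as an ingredient.
        ingredients = [self.active_label] if self.active_label else []
        self.manifests[label] = {"tool": tool, "actions": actions, "ingredients": ingredients}
        self.active_label = label  # the newest manifest becomes the active one

    def history(self) -> List[str]:
        # Walk from the active manifest back through its ingredients.
        # In this sketch each manifest has at most one ingredient.
        trail: List[str] = []
        label = self.active_label
        while label is not None:
            manifest = self.manifests[label]
            trail.append(f"{manifest['tool']}: {', '.join(manifest['actions'])}")
            label = manifest["ingredients"][0] if manifest["ingredients"] else None
        return trail


store = SimpleStore()
store.add("manifest-1", "Camera app", ["c2pa.created"])
store.add("manifest-2", "Photo editor", ["c2pa.color_adjustments"])
print(store.history())
```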
Each manifest is digitally signed with the application's or client's credentials. To make validation of the credentials possible, the manifest also includes the signing certificate chain. The "chain" of certificates starts with the certificate from the last tool that signed the manifest (known as the "end-entity") followed by the certificate that signed it, and so on, back to the original CA issuer. In other words, a user knows they can trust that the manifest is valid because there is a "trust chain" that goes back to a trusted root certificate authority. That's why you need to acquire a security certificate from a legitimate certificate authority.
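The idea of following a chain from the end-entity certificate back to a trusted root can be sketched like this. It is a deliberately simplified model that assumes hypothetical certificate names and checks only that issuer and subject names link up; real validation also verifies each signature, validity period, and key usage.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def chain_leads_to_trusted_root(chain: List[Cert], trusted_roots: Set[str]) -> bool:
    """chain[0] is the end-entity certificate; each later entry issued the one before it."""
    if not chain:
        return False
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert.issuer != issuer_cert.subject:
            return False  # a link in the chain is broken
    # The last certificate must be issued by (or be) a trusted root.
    return chain[-1].issuer in trusted_roots


chain = [
    Cert(subject="Example Editing Tool", issuer="Example Intermediate CA"),
    Cert(subject="Example Intermediate CA", issuer="Example Root CA"),
]
print(chain_leads_to_trusted_root(chain, {"Example Root CA"}))
```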
In practice, to use a certificate with the CAI SDK, follow this process:
- Purchase security credentials (certificate and key) from a certificate authority. Either email protection or document signing certificates are valid.
- Extract the certificate by using a tool such as OpenSSL. You could also host the certificate in a secure environment like a hardware security module (HSM).
- Use one of the supporting CAI libraries or C2PA Tool to sign manifests using the certificate.
For a short tutorial example, see Signing manifests. For more information on how to get and use a signing certificate in production, see Getting and using a signing certificate.
Getting a security certificate
To create or modify Content Credentials, you must have a valid security certificate and key that conform to the requirements laid out in the C2PA specification Trust Model section.
You must purchase an X.509 v3 security certificate from a certificate authority (CA). There are many CAs that issue certificates. Some of the most popular ones are:
- GlobalSign: S/MIME email signing, document signing
- IdenTrust: S/MIME email signing, document signing
- Comodo Cybersecurity: S/MIME email signing cert, document signing cert
- DigiCert: S/MIME email signing cert, document signing cert
The above list is for reference only; inclusion does not imply endorsement by CAI or Adobe, Inc.
When you purchase a certificate, you must select at least one of the extended key usage (EKU) fields that specify what the certificate can be used for: email protection or document signing. Applications that use the CAI SDK won't accept the certificate unless it has one of these EKUs.
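The EKU requirement amounts to a simple membership check, sketched below. The object identifiers are the standard ones for email protection (RFC 5280) and document signing (RFC 9336). The function name is hypothetical, extracting the EKU list from a certificate is out of scope here, and the check covers only the two EKUs named in this section; the C2PA Trust Model remains the authoritative list.

```python
from typing import Iterable

EMAIL_PROTECTION = "1.3.6.1.5.5.7.3.4"    # id-kp-emailProtection
DOCUMENT_SIGNING = "1.3.6.1.5.5.7.3.36"   # id-kp-documentSigning
ACCEPTED_EKUS = {EMAIL_PROTECTION, DOCUMENT_SIGNING}


def has_accepted_eku(cert_ekus: Iterable[str]) -> bool:
    """Return True if the certificate lists at least one accepted EKU."""
    return bool(ACCEPTED_EKUS & set(cert_ekus))


SERVER_AUTH = "1.3.6.1.5.5.7.3.1"  # a TLS server EKU, not accepted on its own
print(has_accepted_eku([SERVER_AUTH]))
print(has_accepted_eku([SERVER_AUTH, EMAIL_PROTECTION]))
```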
Extracting the certificate
To work with the certificate, you need to extract it. When the CAI SDK adds Content Credentials to an asset, it incorporates the certificate (including the associated public key) into the manifest.
The private key associated with the certificate is extremely sensitive. Always treat it with the highest security to ensure your credentials are not compromised. If someone does obtain your private key, they will be able to sign C2PA manifests and other content on your behalf without your consent.
Using the certificate to sign a manifest
The simplest way to add a C2PA manifest to an asset file is by using C2PA Tool (c2patool). You can run C2PA Tool manually from the command line (for example, during development) and more generally from any executable program that can call out to the shell, such as a Node.js application as shown in the c2patool Node.js service example.
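Calling C2PA Tool from a program might look like the following Python sketch. It assumes c2patool is installed and that the invocation takes the form `c2patool <source> -m <manifest.json> -o <output>`; confirm the flags with `c2patool --help` for your version. The file names are hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_sign_command(source: str, manifest_json: str, output: str) -> List[str]:
    # Assumed invocation: c2patool <source> -m <manifest.json> -o <output>
    return ["c2patool", source, "-m", manifest_json, "-o", output]


def sign_asset(source: str, manifest_json: str, output: str) -> None:
    # Fail early with a clear message if the tool is not installed.
    if shutil.which("c2patool") is None:
        raise RuntimeError("c2patool was not found on PATH")
    # Passing a list (not a shell string) avoids shell-quoting problems.
    subprocess.run(build_sign_command(source, manifest_json, output), check=True)


print(build_sign_command("image.jpg", "manifest.json", "signed.jpg"))
```

Starting a new process for every asset is simple but adds overhead, which is one reason this approach does not scale as well as using a library.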
The prerelease libraries for Node.js, Python, and C++/C can also add and sign a manifest.
Similarly, using the Rust SDK, you can add a manifest to an asset file, referencing the certificate and private key file. For a simple example of creating and signing a manifest from a C program, see the c2c2pa repository.
Accessing a private key and certificate directly from the file system is fine during development, but doing so in production may be insecure. Instead, use a Key Management Service (KMS) or a hardware security module (HSM) to access the certificate and key; for example, as shown in the C2PA Python Example.
Verify known certificate list
The C2PA Verify tool uses a list of known certificates (sometimes referred to as a "trust list") to determine whether a Content Credential was issued by a known source. The known certificate list applies only to Verify. For more information, see Verify tool known certificate list.
Identity
To identify who created or modified an asset, identity needs to be verifiable and bound to an asset and its manifest store. The CAI SDK supports the W3C verifiable credentials standard recommendation (part of the C2PA specification), but doesn't currently have a way to validate these credentials or ensure that they properly reflect authorship of the content. An actor can add one or more identities to a manifest using the W3C verifiable credentials data model. Currently, a verifier must trust the manifest signer to properly authenticate the identity.
Identity can be bolstered with other kinds of evidence, such as Adobe connected accounts. In the future, these verifiable credentials will be strongly bound to the manifest and media and will be independently verifiable.
In addition to simply adding a name and organization, Adobe tools can use the Connected Accounts service to connect social media accounts such as Behance, Instagram, or Twitter to an identity in a manifest. This service uses OAuth, so a user must be able to log in to the account to connect it.
The Creator Assertions Working Group (CAWG) is developing a technical specification for an identity assertion for use in the C2PA ecosystem. CAI expects to adopt and implement this specification in the SDK at some point in the future.
How to use the SDK
The CAI open-source SDK consists of:
- C2PA Tool, a command-line tool for working with manifests and media. This tool is a wrapper around the Rust SDK and provides most of the same capabilities that it does.
- Language-specific libraries in C/C++, Python, Node.js and client JavaScript. NOTE: The C/C++, Python, Node.js libraries are prerelease versions whose APIs are subject to change.
- Rust library, which enables a desktop, mobile, or embedded application to create and sign manifests, embed manifests in certain file formats, and parse and validate manifests.
Behind the scenes, C2PA Tool and language-specific libraries are built using the Rust library to ensure consistency.
The following diagram provides a high-level view of how to use the open-source CAI SDK.
Applications can use the CAI SDK in several different ways:
- Web pages can use the JavaScript library to display Content Credentials.
- Applications can "shell out" to call C2PA Tool directly.
- Applications written in C++, Python, or Node.js can use the APIs of the corresponding language libraries to:
  - Create, modify, and sign manifests.
  - Embed manifests into media files.
  - Parse and validate manifests.
Similarly, applications written in many programming languages can use the Rust Foreign Function Interface to call the Rust API and perform those same functions.
Native desktop or mobile applications
Applications written in C++, Python, or Node.js can use the corresponding prerelease library APIs. Applications written in any language can call C2PA Tool directly, though doing so is not highly scalable.
Alternatively, native applications can use Rust's Foreign Function Interface (FFI) to call functions in the Rust library. The FFI enables interoperability between Rust and code written in other languages.
Although the underlying technology of the Rust library supports all major programming languages, the bindings and APIs to make all of them workable and easy to use are still in development.
A Windows application can use the FFI to call Rust functions from languages such as C++ or C#. For an example, see the c2c2pa repository.
An Android application can use JNI (Java Native Interface) to call Rust functions from Java or Kotlin code. This requires creating a shared library (a .so file) with Rust code that exposes functions with the #[no_mangle] attribute and the extern "C" keyword. Java and Kotlin code can load and invoke the shared library using System.loadLibrary() and native methods.
An iOS application can use the C-ABI (C Application Binary Interface) to call Rust functions from Swift or Objective-C code. This also requires creating a shared library (a .dylib file) with Rust code that exposes functions with the #[no_mangle] attribute and the extern "C" keyword. For a simple example, see lib.rs in the c2c2pa repository. Swift or Objective-C code can link and invoke the shared library using the @_silgen_name attribute and unsafe blocks.
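The same mechanism, a host language calling a function exported with the C ABI from a shared library, can be illustrated in Python with the standard ctypes module. Because no Rust-built library is assumed to be available here, this sketch calls strlen from the platform C library as a stand-in; a library built from Rust code that exports extern "C" functions would be loaded and called the same way. It assumes a POSIX system such as Linux or macOS.

```python
import ctypes
import ctypes.util

# Load the platform C library. If find_library returns None, CDLL(None)
# exposes the symbols already loaded into the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"content credentials"))  # 19
```

Declaring argtypes and restype matters: the host language has no other way to know the function's signature across the C ABI boundary.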
Websites
A website can serve web pages that use the JavaScript library to display manifest data using client JavaScript. The ability to create and sign manifests from JavaScript via WebAssembly is under consideration and may be released in the future.
A server-side web application can create, modify, and sign claims (and view them) by:
- Executing a shell command to invoke C2PA Tool. For an example, see the c2patool Node.js service example. While this approach works, it is not highly scalable.
- Using the prerelease Node.js, Python, or C++/C libraries.
- Binding to the Rust library and using it, similarly to native applications.
Embedded applications
An embedded application can use the Rust FFI (foreign function interface) to call Rust functions from languages such as C or C++, similarly to a native application.
Embedded applications have unique constraints tied to the devices on which they run, including small memory footprint, low-powered hardware, intermittent network access, unique operating systems, or the lack of an operating system (running on bare metal). For these reasons, if you want to develop a CAI-enabled embedded application, please contact the CAI team directly.
Attaching and storing the manifest
Once you've generated a manifest, you must attach it to the asset for it to be useful.
Storing a manifest in a file
You can embed a manifest directly in a digital asset (image, audio, or video file). Currently, this is the only approach available to developers other than Adobe, which operates the Content Credentials Cloud store. Adobe tools may both store the manifest in the file and upload it to the cloud.
Embedding a manifest in an asset usually increases the size of the file. Two things contribute to this increase:
- Whether a thumbnail image is stored and its quality. This typically adds around 100KB – 1MB to the size of the file, depending on the quality of the thumbnail.
- The size of the certificate chains: The signing chain and the timestamp certificate chain may each be around 10KB.
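Using the approximate figures above, a back-of-the-envelope estimate of the added size can be sketched as follows. The function name is hypothetical, and the estimate ignores the size of the assertions and other manifest data.

```python
KB = 1024


def estimate_manifest_overhead(thumbnail_bytes: int,
                               chain_bytes: int = 10 * KB,
                               chains: int = 2) -> int:
    """Thumbnail plus the signing and timestamp certificate chains."""
    return thumbnail_bytes + chains * chain_bytes


low = estimate_manifest_overhead(100 * KB)    # small thumbnail
high = estimate_manifest_overhead(1024 * KB)  # high-quality thumbnail (1MB)
print(low // KB, high // KB)  # 120 1044
```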
Storing a manifest in the cloud
In addition to storing the manifest with the asset, Adobe tools use the Adobe Content Credentials Cloud, a public, persistent storage option for attribution and history data. Publishing to this cloud keeps files smaller and makes Content Credentials more resilient, because if stripped from an asset they can be recovered by searching with the C2PA Verify tool. On the other hand, embedding the manifest keeps the file self-contained and means no web access is required to validate.
Currently, Adobe Content Credentials Cloud (ACCC) is only available to Adobe tools, though in the future other credentials clouds may become available if other organizations want to host a credential service. Currently, most manifests will be embedded in assets since ACCC is specific to Adobe.
Displaying manifest data
The manifest is only useful to end users if it can be displayed meaningfully in a user interface. How a UI does that is of course up to the designer, but because the full set of manifest data for an asset can be very complex, there are four recommended levels of disclosure:
- Level 1: Indicates that manifest data is present and whether it has been validated.
- Level 2: Summarizes the manifest data enough to describe how the asset came to its current state.
- Level 3: Provides a detailed display of all relevant manifest data, including ingredient manifests. The implementation determines what's relevant and how to display it.
- Level 4: Provides the complete data including full detail of signatures and trust signals, for sophisticated, forensic investigatory use.
The example shown below illustrates one way to implement levels of disclosure:
- The Content Credentials pin shown over the image indicates that the image has Content Credentials (level 1).
- Clicking on the pin displays level 2 information, and an Inspect button.
- Clicking the Inspect button opens the image on the Verify site that provides level 3 information.
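One way to think about the four levels in code is as progressively larger views over the same validated data. The sketch below is only an illustration: the report dictionary and its keys are hypothetical, not a format produced by the SDK.

```python
from enum import IntEnum


class DisclosureLevel(IntEnum):
    INDICATOR = 1  # credentials are present and whether they validated
    SUMMARY = 2    # how the asset came to its current state
    DETAILED = 3   # all relevant manifest data, including ingredients
    FORENSIC = 4   # full signature and trust details


def view_for(report: dict, level: DisclosureLevel) -> dict:
    """Return only the fields appropriate for the requested level."""
    view = {"has_credentials": True, "validated": report["validated"]}
    if level >= DisclosureLevel.SUMMARY:
        view["summary"] = report["summary"]
    if level >= DisclosureLevel.DETAILED:
        view["assertions"] = report["assertions"]
        view["ingredients"] = report["ingredients"]
    if level >= DisclosureLevel.FORENSIC:
        view["signature_info"] = report["signature_info"]
    return view


report = {  # hypothetical, already-validated manifest data
    "validated": True,
    "summary": "Created in a camera app, then edited",
    "assertions": ["c2pa.actions"],
    "ingredients": ["manifest-1"],
    "signature_info": {"issuer": "Example CA"},
}
print(sorted(view_for(report, DisclosureLevel.SUMMARY)))
```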
Example uses
Numerous Adobe tools and products implement Content Credentials. You can optionally attach credentials to images you create or modify using Photoshop and Lightroom. Additionally, all images created using the Firefly generative AI tool automatically have attached Content Credentials. Behance displays Content Credentials if they are attached to media shared on the site, and still images provided through Adobe Stock automatically have Content Credentials attached.
The C2PA's Verify website inspects image files and displays their Content Credentials if they exist. See Using the Verify tool for more information.
Additionally, several other organizations have implemented CAI solutions, including:
- PixelStream: A version-controlled sharing platform for authentic media. It's kind of like GitHub for authentic media assets, but it's built on top of C2PA tooling instead of Git.
- SmartFrame Technologies: Validating provenance with secure capture to enhance innovative image streaming delivery.
- The New York Times R&D: Using secure sourcing to combat misinformation in news publishing.
For more case studies, see https://contentauthenticity.org/case-studies.