Reference
Media: string-encoding non-JSON data
JSON Schema has a set of keywords to describe and optionally validate non-JSON data stored inside JSON strings. Because writing validators for every media type is difficult, JSON Schema validators are not required to validate the contents of JSON strings based on these keywords. However, applications that consume validated JSON can use these keywords to encode and decode data when storing and transmitting such content.
contentMediaType and contentEncoding
The contentMediaType keyword specifies the media type of the content of a string, as described in RFC 2046. The Internet Assigned Numbers Authority (IANA) has officially registered a comprehensive list of media types, but the set of supported types depends on the application and operating system. Mozilla Developer Network maintains a shorter list of media types that are important for the web.
Example
The following schema specifies a string containing an HTML file using the document's default encoding.
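A minimal sketch of such a schema, written as a Python dict and serialized with the standard `json` module. The names `html_schema` and `html_instance` and the sample markup are illustrative assumptions, not part of the specification.

```python
import json

# Hypothetical schema: a string whose content is an HTML document.
# contentEncoding is omitted, so the markup is stored in the string as-is.
html_schema = {
    "type": "string",
    "contentMediaType": "text/html",
}

# Illustrative instance: HTML markup embedded directly in a JSON string.
html_instance = "<!DOCTYPE html><html><head></head><body><p>Hello</p></body></html>"

# json.dumps shows how the schema and the instance look as JSON text.
schema_text = json.dumps(html_schema, indent=2)
serialized = json.dumps(html_instance)
print(schema_text)
print(serialized)
```

Because no contentEncoding is given, a consumer only needs to parse the JSON string to recover the original markup.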
The contentEncoding keyword specifies the encoding used to store the contents, as specified in RFC 2045, Section 6.1 and RFC 4648.
The acceptable values are the following:
- quoted-printable
- base16
- base32
- base64
If not specified, the encoding is the same as the containing JSON document.
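To see what these encodings look like in practice, the following sketch encodes the same bytes with each of the listed values using Python's standard `base64` and `quopri` modules. The sample payload and the `encoded` and `decoders` tables are illustrative assumptions.

```python
import base64
import quopri

# Illustrative payload: UTF-8 text containing one non-ASCII character.
raw = "café".encode("utf-8")

# One encoded form per contentEncoding value listed above.
encoded = {
    "quoted-printable": quopri.encodestring(raw).decode("ascii"),
    "base16": base64.b16encode(raw).decode("ascii"),
    "base32": base64.b32encode(raw).decode("ascii"),
    "base64": base64.b64encode(raw).decode("ascii"),
}

# Matching decoders, keyed by the same contentEncoding value.
decoders = {
    "quoted-printable": quopri.decodestring,
    "base16": base64.b16decode,
    "base32": base64.b32decode,
    "base64": base64.b64decode,
}

for name, text in encoded.items():
    # Every encoding must round-trip back to the original bytes.
    assert decoders[name](text.encode("ascii")) == raw
    print(f"{name}: {text}")
```

Each encoded form is plain ASCII, which is what makes it safe to place inside a JSON string.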
There are two main scenarios:
- Same encoding as JSON document: Leave contentEncoding unspecified and include the content in a string as-is. This is suitable for text-based content types (e.g., text/html, application/xml) and assumes UTF-8 encoding in most cases.
- Binary data: Set contentEncoding to base64 and encode the content using Base64. This is appropriate for binary content types such as images (image/png) or audio files (audio/mpeg).
Example
The following schema indicates that a string contains a PNG file and is encoded using Base64:
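A minimal sketch of such a schema as a Python dict, together with an instance built using the standard `base64` module. The 8-byte PNG signature stands in for a complete image file, and the variable names are illustrative assumptions.

```python
import base64
import json

# Hypothetical schema: a Base64-encoded string holding PNG data.
png_schema = {
    "type": "string",
    "contentEncoding": "base64",
    "contentMediaType": "image/png",
}

# Stand-in for a real image: the 8-byte signature that starts every PNG file.
png_bytes = b"\x89PNG\r\n\x1a\n"

# Base64 turns the binary bytes into ASCII text that fits in a JSON string.
png_instance = base64.b64encode(png_bytes).decode("ascii")

print(json.dumps(png_schema, indent=2))
print(json.dumps(png_instance))  # prints "iVBORw0KGgo="
```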
To better understand how contentEncoding and contentMediaType are applied in practice, let's consider the process of transmitting non-JSON data:
Diagram: Sender -> contentEncoding, contentMediaType -> Encoded data -> Transmission -> Consumer application -> contentEncoding, contentMediaType -> Decoded data
- The sender encodes the content, using contentEncoding to specify the encoding method (e.g., base64) and contentMediaType to indicate the media type of the original content.
- The encoded data is then transmitted.
- Upon receiving the data, the consumer application uses the contentEncoding and contentMediaType information to select the appropriate decoding method.
- Finally, the consumer application decodes the data, restoring it to its original form.
This process ensures that the non-JSON content is properly encoded for transmission and accurately decoded by the recipient, maintaining the integrity of the data throughout the process.
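The steps above can be sketched as a round trip in Python. The `send` and `receive` functions and the `ENCODERS` and `DECODERS` tables are hypothetical helpers written for this illustration; they are not part of JSON Schema or of any validator library, and only the Base16, Base32, and Base64 encodings are covered.

```python
import base64
import json

# Hypothetical lookup tables mapping a contentEncoding value to stdlib codecs.
ENCODERS = {
    "base16": base64.b16encode,
    "base32": base64.b32encode,
    "base64": base64.b64encode,
}
DECODERS = {
    "base16": base64.b16decode,
    "base32": base64.b32decode,
    "base64": base64.b64decode,
}


def send(data: bytes, schema: dict) -> str:
    """Sender side: encode the content as the schema describes, then wrap it in JSON."""
    encoding = schema.get("contentEncoding")
    if encoding is None:
        # No contentEncoding: the content is text and is embedded as-is (UTF-8 assumed).
        text = data.decode("utf-8")
    else:
        text = ENCODERS[encoding](data).decode("ascii")
    return json.dumps(text)


def receive(document: str, schema: dict) -> bytes:
    """Consumer side: parse the JSON string, then undo the declared encoding."""
    text = json.loads(document)
    encoding = schema.get("contentEncoding")
    if encoding is None:
        return text.encode("utf-8")
    return DECODERS[encoding](text.encode("ascii"))


audio_schema = {"type": "string", "contentEncoding": "base64", "contentMediaType": "audio/mpeg"}
payload = b"\x00\x01\xfe\xff"  # arbitrary binary bytes standing in for audio data

transmitted = send(payload, audio_schema)
restored = receive(transmitted, audio_schema)
print(transmitted, restored == payload)
```

Both sides read the encoding from the same schema, so the consumer never has to guess how the string was produced.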
contentSchema
The value of contentSchema must be a valid JSON schema that defines the structure and constraints of the content. It is used in conjunction with contentMediaType when the instance is a string. If contentMediaType is absent, the value of contentSchema is ignored.
Full example
The following schema indicates that a string contains a JSON object encoded using Base64:
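A minimal sketch of such a schema as a Python dict, with a consumer that decodes the string and checks the result. The property `name`, the helper `decode_instance`, and the hand-written required-property check are illustrative assumptions; a real application would more likely pass the decoded value to a JSON Schema validator.

```python
import base64
import json

# Hypothetical schema: a Base64-encoded string whose content is a JSON object.
schema = {
    "type": "string",
    "contentMediaType": "application/json",
    "contentEncoding": "base64",
    "contentSchema": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
}

# Build an instance: JSON text -> UTF-8 bytes -> Base64 text.
payload = {"name": "example"}
instance = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")


def decode_instance(value: str, schema: dict):
    """Decode a string instance; return None when contentSchema does not apply."""
    if "contentMediaType" not in schema:
        return None  # contentSchema is ignored without contentMediaType
    raw = base64.b64decode(value)
    return json.loads(raw)


decoded = decode_instance(instance, schema)

# Simplified stand-in for validating `decoded` against contentSchema.
content_schema = schema["contentSchema"]
is_valid = isinstance(decoded, dict) and all(
    key in decoded for key in content_schema["required"]
)
print(decoded, is_valid)
```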