6.4.4 Escaping of Values

When a non-BINARY value is serialized during either system view or document view export, it is first converted to string form using standard value conversion (see 6.2.6 Property Type Conversion). BINARY values are instead encoded using Base64.

Within the resulting string, any occurrence of one of the five characters corresponding to the five predefined entity references in XML, namely ampersand (&), less-than symbol (<), greater-than symbol (>), apostrophe (') and quotation mark ("), must be escaped as &amp;, &lt;, &gt;, &apos; and &quot;, respectively.
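The substitution described above can be sketched in Java as follows. This is an illustration only: the class XmlValueEscaper and the method escapeXml are hypothetical names, not part of the JCR API, and the input is assumed to be the string form of the value after standard value conversion.

```java
// Hypothetical helper for illustration; not part of the JCR API.
final class XmlValueEscaper {
    private XmlValueEscaper() {}

    /**
     * Replaces each of the five XML special characters with its
     * predefined entity reference. All other characters are copied as-is.
     */
    static String escapeXml(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '\'': out.append("&apos;"); break;
                case '"':  out.append("&quot;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the string is processed in a single pass, an ampersand introduced by one substitution is never escaped a second time.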

In document view serialization, if the property being serialized is multi-valued (or if the implementation chooses to encode spaces in single-value properties too, see below), then the value or values must be further encoded by escaping any occurrence of one of the four whitespace characters: space, tab, carriage return and line feed. The scheme used to encode these characters is the same as that described in 6.4.3 Escaping of Names. Note that in this restricted context, applying those escaping rules amounts to the following: a space becomes _x0020_, a tab becomes _x0009_, a carriage return becomes _x000D_, a line feed becomes _x000A_, and any underscore (_) that occurs as the first character of a sequence that could be misinterpreted as an escape sequence becomes _x005f_.
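A minimal Java sketch of this whitespace escaping is given below. The class DocViewWhitespaceEscaper and its methods are hypothetical names introduced for illustration. The sketch assumes that an underscore "could be misinterpreted" when it is followed by x, four hexadecimal digits, and then either an underscore or one of the four whitespace characters; the whitespace case is an interpretation made here, on the grounds that the escaped form of that character itself begins with an underscore.

```java
// Hypothetical helper for illustration; not part of the JCR API.
final class DocViewWhitespaceEscaper {
    private DocViewWhitespaceEscaper() {}

    /** Escapes space, tab, CR and LF, plus any underscore that starts an escape-like sequence. */
    static String escapeWhitespace(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case ' ':  out.append("_x0020_"); break;
                case '\t': out.append("_x0009_"); break;
                case '\r': out.append("_x000D_"); break;
                case '\n': out.append("_x000A_"); break;
                case '_':
                    out.append(startsEscapeLikeSequence(value, i) ? "_x005f_" : "_");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    // True if s[i..] reads as "_xHHHH" followed by '_' or by a whitespace
    // character (assumption: whitespace is emitted as "_x...", so it would
    // close the sequence in the output).
    private static boolean startsEscapeLikeSequence(String s, int i) {
        if (i + 6 >= s.length() || s.charAt(i + 1) != 'x') {
            return false;
        }
        for (int k = i + 2; k <= i + 5; k++) {
            if (!isHexDigit(s.charAt(k))) {
                return false;
            }
        }
        char end = s.charAt(i + 6);
        return end == '_' || end == ' ' || end == '\t' || end == '\r' || end == '\n';
    }

    private static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
}
```

For example, escapeWhitespace("red wine") yields red_x0020_wine, while a value that already contains the literal text _x0020_ yields _x005f_x0020_, so that the literal is not mistaken for an escaped space on import. An ordinary underscore, as in my_name, is left unchanged.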

Finally, in document view export, the value of the attribute representing a multi-value property is constructed by concatenating the results of the above escaping into a space-delimited list.
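The concatenation step can be sketched as follows. The class MultiValueAttribute and the method join are hypothetical names, and the sketch assumes that each element of the input list has already been through both escaping steps described above, so that it contains no literal whitespace.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of the JCR API.
final class MultiValueAttribute {
    private MultiValueAttribute() {}

    /**
     * Joins already-escaped values into one space-delimited attribute value.
     * Rejects any value that still contains literal whitespace, since that
     * would make the list ambiguous on re-import.
     */
    static String join(List<String> escapedValues) {
        for (String v : escapedValues) {
            if (containsWhitespace(v)) {
                throw new IllegalArgumentException("value is not whitespace-escaped: " + v);
            }
        }
        return String.join(" ", escapedValues);
    }

    private static boolean containsWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                return true;
            }
        }
        return false;
    }
}
```

Given the two escaped values red_x0020_wine and beer, join produces the attribute value red_x0020_wine beer. Since the only literal spaces in the result are delimiters, splitting on spaces recovers the individual escaped values.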

In document view export (though not in system view), if multi-value property serialization is supported (see 6.4.2.5 Multi-value Properties), then a mechanism must be adopted whereby the distinction between multi-value and single-value properties is not lost upon re-import. One option is to apply the escaping of space literals to the values of all single-value properties as well. Another option is that, when an XML document is imported in document view, each attribute is assumed to be a single-value property unless out-of-band information defines it to be multi-valued (for example, if the applicable node type defines the property as multi-valued or the XML document is associated with a schema definition that indicates that the attribute is a list value).

Note that the value of a jcr:xmlcharacters property used to represent XML text (see 6.4.2.3 XML Text) is not space-escaped, regardless of the prevailing multi-value property serialization policy.