Status Update (6 April 2021): Feedback, comments, error reports on this specification should be sent via GitHub https://github.com/w3c/qtspecs/issues or email to public-qt-comments@w3.org.
Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in these non-normative formats: XML and Change markings relative to 1.0 Recommendation.
Copyright © 2014 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document defines serialization of an instance of the data model as defined in [XQuery and XPath Data Model (XDM) 3.0] into a sequence of octets. Serialization is designed to be a component that can be used by other specifications such as [XSL Transformations (XSLT) Version 3.0] or [XQuery 3.0: An XML Query Language].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is one document in a set of six documents that have been progressed to Recommendation together (XQuery 3.0, XQueryX 3.0, XPath 3.0, Data Model 3.0, Functions and Operators 3.0, and Serialization 3.0).
This is a Recommendation of the W3C. It was jointly developed by the W3C XSLT Working Group and the W3C XML Query Working Group, each of which is part of the XML Activity.
This Recommendation of XSLT and XQuery Serialization 3.0 represents the second version of a previous W3C Recommendation.
This specification is designed to be referenced normatively from other specifications defining a host language for it; it is not intended to be implemented outside a host language. The implementability of this specification has been tested in the context of its normative inclusion in host languages defined by the XQuery 3.0 and XSLT 3.0 (expected in 2014) specifications; see the XQuery 3.0 implementation report (and, in the future, the WGs expect that there will also be a — possibly member-only — XSLT 3.0 implementation report) for details.
This document incorporates minor changes made against the Proposed Recommendation of 22 October 2013. Changes to this document since the Proposed Recommendation are detailed in F Change Log.
Please report errors in this document using W3C's public Bugzilla system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string “[SER30]” in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the XML Query Working Group and also maintains a public list of any patent disclosures made in connection with the deliverables of the XSL Working Group; those pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Introduction
1.1 Terminology
1.2 Namespaces
2 Sequence Normalization
3 Serialization Parameters
3.1 Setting Serialization Parameters by Means of a Data Model Instance
4 Phases of Serialization
5 XML Output Method
5.1 The Influence of Serialization Parameters upon the XML Output Method
5.1.1 XML Output Method: the version Parameter
5.1.2 XML Output Method: the html-version Parameter
5.1.3 XML Output Method: the encoding Parameter
5.1.4 XML Output Method: the indent and suppress-indentation Parameters
5.1.5 XML Output Method: the cdata-section-elements Parameter
5.1.6 XML Output Method: the omit-xml-declaration and standalone Parameters
5.1.7 XML Output Method: the doctype-system and doctype-public Parameters
5.1.8 XML Output Method: the undeclare-prefixes Parameter
5.1.9 XML Output Method: the normalization-form Parameter
5.1.10 XML Output Method: the media-type Parameter
5.1.11 XML Output Method: the use-character-maps Parameter
5.1.12 XML Output Method: the byte-order-mark Parameter
5.1.13 XML Output Method: the escape-uri-attributes Parameter
5.1.14 XML Output Method: the include-content-type Parameter
5.1.15 XML Output Method: the item-separator Parameter
6 XHTML Output Method
6.1 The Influence of Serialization Parameters upon the XHTML Output Method
6.1.1 XHTML Output Method: the version Parameter
6.1.2 XHTML Output Method: the html-version Parameter
6.1.3 XHTML Output Method: the encoding Parameter
6.1.4 XHTML Output Method: the indent and suppress-indentation Parameters
6.1.5 XHTML Output Method: the cdata-section-elements Parameter
6.1.6 XHTML Output Method: the omit-xml-declaration and standalone Parameters
6.1.7 XHTML Output Method: the doctype-system and doctype-public Parameters
6.1.8 XHTML Output Method: the undeclare-prefixes Parameter
6.1.9 XHTML Output Method: the normalization-form Parameter
6.1.10 XHTML Output Method: the media-type Parameter
6.1.11 XHTML Output Method: the use-character-maps Parameter
6.1.12 XHTML Output Method: the byte-order-mark Parameter
6.1.13 XHTML Output Method: the escape-uri-attributes Parameter
6.1.14 XHTML Output Method: the include-content-type Parameter
6.1.15 XHTML Output Method: the item-separator Parameter
6.2 Polyglot markup and namespace declarations
7 HTML Output Method
7.1 Markup for Elements
7.2 Writing Attributes
7.3 Writing Character Data
7.4 The Influence of Serialization Parameters upon the HTML Output Method
7.4.1 HTML Output Method: the version and html-version Parameters
7.4.2 HTML Output Method: the encoding Parameter
7.4.3 HTML Output Method: the indent and suppress-indentation Parameters
7.4.4 HTML Output Method: the cdata-section-elements Parameter
7.4.5 HTML Output Method: the omit-xml-declaration and standalone Parameters
7.4.6 HTML Output Method: the doctype-system and doctype-public Parameters
7.4.7 HTML Output Method: the undeclare-prefixes Parameter
7.4.8 HTML Output Method: the normalization-form Parameter
7.4.9 HTML Output Method: the media-type Parameter
7.4.10 HTML Output Method: the use-character-maps Parameter
7.4.11 HTML Output Method: the byte-order-mark Parameter
7.4.12 HTML Output Method: the escape-uri-attributes Parameter
7.4.13 HTML Output Method: the include-content-type Parameter
7.4.14 HTML Output Method: the item-separator Parameter
8 Text Output Method
8.1 The Influence of Serialization Parameters upon the Text Output Method
8.1.1 Text Output Method: the version Parameter
8.1.2 Text Output Method: the html-version Parameter
8.1.3 Text Output Method: the encoding Parameter
8.1.4 Text Output Method: the indent and suppress-indentation Parameters
8.1.5 Text Output Method: the cdata-section-elements Parameter
8.1.6 Text Output Method: the omit-xml-declaration and standalone Parameters
8.1.7 Text Output Method: the doctype-system and doctype-public Parameters
8.1.8 Text Output Method: the undeclare-prefixes Parameter
8.1.9 Text Output Method: the normalization-form Parameter
8.1.10 Text Output Method: the media-type Parameter
8.1.11 Text Output Method: the use-character-maps Parameter
8.1.12 Text Output Method: the byte-order-mark Parameter
8.1.13 Text Output Method: the escape-uri-attributes Parameter
8.1.14 Text Output Method: the include-content-type Parameter
8.1.15 Text Output Method: the item-separator Parameter
9 Character Maps
10 Conformance
A References
A.1 Normative References
A.2 Informative References
B Schema for Serialization Parameters
C Summary of Error Conditions
D List of URI Attributes
E Checklist of Implementation-Defined Features (Non-Normative)
F Change Log (Non-Normative)
F.1 Changes applied for the Recommendation
F.2 Changes applied for the Candidate Recommendation
F.3 Changes applied for the fifth Public Working Draft
F.4 Changes applied for the fourth Public Working Draft
F.5 Changes applied for the third Public Working Draft
F.6 Changes applied for the second Public Working Draft
F.7 Changes applied for the first Public Working Draft
This document defines serialization of the W3C XQuery and XPath Data Model 3.0 (XDM), which is the data model of at least [XML Path Language (XPath) 3.0], [XSL Transformations (XSLT) Version 3.0], and [XQuery 3.0: An XML Query Language], and any other specifications that reference it.
In this document, examples and material labeled as "Note" are provided for explanatory purposes and are not normative.
Serialization is the process of converting an instance of the [XQuery and XPath Data Model (XDM) 3.0] into a sequence of octets. Serialization is well-defined for most data model instances.
In this specification, where they appear in upper case, the words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY", "REQUIRED", and "RECOMMENDED" are to be interpreted as described in [RFC2119].
[Definition: As is indicated in 10 Conformance, conformance criteria for serialization are determined by other specifications that refer to this specification. A serializer is software that implements some or all of the requirements of this specification in accordance with such conformance criteria.] A serializer is not REQUIRED to directly provide a programming interface that permits a user to set serialization parameters or to provide an input sequence for serialization.
Certain aspects of serialization are described in this specification as implementation-defined or implementation-dependent.
[Definition: Implementation-defined indicates an aspect that MAY differ between serializers, but whose actual behavior MUST be specified either by another specification that sets conformance criteria for serialization (see 10 Conformance) or in documentation that accompanies the serializer.]
[Definition: Implementation-dependent indicates an aspect that MAY differ between serializers, and whose actual behavior is not REQUIRED to be specified either by another specification that sets conformance criteria for serialization (see 10 Conformance) or in documentation that accompanies the serializer.]
[Definition: In some instances, the sequence that is input to serialization cannot be successfully converted into a sequence of octets given the set of serialization parameter (3 Serialization Parameters) values specified. A serialization error is said to occur in such an instance.] In some cases, a serializer is REQUIRED to signal such an error. What it means to signal a serialization error is determined by the relevant conformance criteria (10 Conformance) to which the serializer conforms. In other cases, there is an implementation-defined choice between signaling a serialization error and performing a recovery action. Such a recovery action will allow a serializer to produce a sequence of octets that might not fully reflect the usual requirements of the parameter settings that are in effect.
[Definition: Where this specification indicates that two strings are to be compared without regard to case, the serializer MUST translate any characters in the range #x41 (LATIN CAPITAL LETTER A) to #x5A (LATIN CAPITAL LETTER Z), inclusive, to the corresponding lower-case letters in the range #x61 (LATIN SMALL LETTER A) to #x7A (LATIN SMALL LETTER Z) only for the purposes of making the comparison. The comparison succeeds if the two strings are the same length and the code point of each character in the first string is equal to the code point of the character in the corresponding position in the second string.]
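The comparison defined above can be sketched in Python as follows. This is an illustration only: the function names `fold_ascii_upper` and `equal_ignoring_case` are hypothetical and do not appear in this specification. The sketch folds only the 26 letters in the range #x41 to #x5A, which is why it does not rely on `str.lower()` (that method would also fold non-ASCII letters).

```python
def fold_ascii_upper(s: str) -> str:
    """Map only A-Z (#x41-#x5A) to a-z (#x61-#x7A); leave every other character as is."""
    return "".join(
        chr(ord(ch) + 0x20) if 0x41 <= ord(ch) <= 0x5A else ch
        for ch in s
    )


def equal_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings code point by code point after ASCII-only folding."""
    fa, fb = fold_ascii_upper(a), fold_ascii_upper(b)
    # Same length, and equal code points at every corresponding position.
    return len(fa) == len(fb) and all(x == y for x, y in zip(fa, fb))


print(equal_ignoring_case("UTF-8", "utf-8"))
print(equal_ignoring_case("\u00c9", "\u00e9"))  # non-ASCII letters are not folded
```

Under this sketch, `UTF-8` and `utf-8` compare as equal, while `É` (#xC9) and `é` (#xE9) do not, because characters outside the stated range are left untranslated.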
Many terms used in this document are defined in the XPath specification [XML Path Language (XPath) 3.0] or the Data Model specification [XQuery and XPath Data Model (XDM) 3.0]. Particular attention is drawn to the following:
[Definition: The term atomization is defined in Section 2.4.2 Atomization XP30. It is a process that takes as input a sequence of nodes and atomic values XP30, and returns a sequence of atomic values XP30, in which the nodes are replaced by their typed values XP30 as defined in [XQuery and XPath Data Model (XDM) 3.0].]
[Definition: The term Node is defined as part of Section 6 Nodes DM30. There are seven kinds of nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.]
[Definition: The term sequence is defined in Section 2 Basics XP30. A sequence is an ordered collection of zero or more items.]
[Definition: The term function is defined in Section 2.8.1 Functions DM30.]
[Definition: The term string value is defined in Section 5.13 string-value Accessor DM30. Every node has a string value. For example, the string value of an element is the concatenation of the string values of all its descendant text nodes.]
[Definition: The term expanded QName is defined in Section 2 Basics XP30. An expanded QName consists of an optional namespace URI and a local name. An expanded QName also retains its original namespace prefix (if any), to facilitate casting the expanded QName into a string.]
[Definition: An element or attribute that is in no namespace, or an expanded QName whose namespace part is an empty sequence, is referred to as having a null namespace URI.]
[Definition: An element or attribute that does not have a null namespace URI is referred to as having a non-null namespace URI.]
[Definition: A space character, TAB character, CR character or NL character is referred to as a whitespace character.]
Where this specification indicates that an XSLT instruction is evaluated, the behavior is as specified by [XSL Transformations (XSLT) Version 2.0]. Where it indicates that an XQuery expression is evaluated, the behavior is as specified by [XQuery 3.0: An XML Query Language].
This specification refers to several namespaces that affect the process of serialization. These are:
[Definition: the Output declaration namespace, http://www.w3.org/2010/xslt-xquery-serialization];
[Definition: the XML namespace, http://www.w3.org/XML/1998/namespace];
[Definition: the XHTML namespace, http://www.w3.org/1999/xhtml];
[Definition: the SVG namespace, http://www.w3.org/2000/svg]; and
[Definition: the MathML namespace, http://www.w3.org/1998/Math/MathML.]
Wherever an element node or attribute node is said to be in a particular namespace, it is understood that the namespace URI of the node is equal to the namespace URI corresponding to that namespace. Wherever a namespace node is said to be a namespace node for a particular namespace, it is understood that the string value of the node is equal to the namespace URI corresponding to that namespace.
An instance of the data model that is input to the serialization process is a sequence. Prior to serializing a sequence using any of the output methods whose behavior is specified by this document (3 Serialization Parameters), the serializer MUST first compute a normalized sequence for serialization; it is the normalized sequence that is actually serialized. [Definition: The purpose of sequence normalization is to create a sequence that can be serialized as a well-formed XML document or external general parsed entity, that also reflects the content of the input sequence to the extent possible.] [Definition: The result of the sequence normalization process is a result tree.]
The normalized sequence for serialization is constructed by applying all of the following rules in order, with the initial sequence being input to the first step, and the sequence that results from any step being used as input to the subsequent step. For any implementation-defined output method, it is implementation-defined whether this sequence normalization process takes place.
Where the process of converting the input sequence to a normalized sequence indicates that a value MUST be cast to xs:string, that operation is defined in Section 18.1.1 Casting to xs:string and xs:untypedAtomic FO30 of [XQuery and XPath Functions and Operators 3.0]. Where a step in the sequence normalization process indicates that a node should be copied, the copy is performed in the same way as an XSLT xsl:copy-of instruction that has a validation attribute whose value is preserve and has a select attribute whose effective value is the node, as described in Section 11.9.2 Deep Copy XT of [XSL Transformations (XSLT) Version 2.0], or equivalently in the same way as an XQuery content expression as described in Step 1e of Section 3.9.1.3 Content XQ30 of [XQuery 3.0: An XML Query Language], where the construction mode is preserve. The steps in computing the normalized sequence are:
If the sequence that is input to serialization is empty, create a sequence S1 that consists of a zero-length string. Otherwise, copy each item in the sequence that is input to serialization to create the new sequence S1.
For each item in S1, if the item is atomic, obtain the lexical representation of the item by casting it to an xs:string and copy the string representation to the new sequence; otherwise, copy the item to the new sequence. The new sequence is S2.
If the item-separator serialization parameter is absent, then for each subsequence of adjacent strings in S2, copy a single string to the new sequence equal to the values of the strings in the subsequence concatenated in order, each separated by a single space. Copy all other items to the new sequence. Otherwise, copy each item in S2 to the new sequence, inserting between each pair of items a string whose value is equal to the value of the item-separator parameter. The new sequence is S3.
For each item in S3, if the item is a string, create a text node in the new sequence whose string value is equal to the string; otherwise, copy the item to the new sequence. The new sequence is S4.
For each item in S4, if the item is a document node, copy its children to the new sequence; otherwise, copy the item to the new sequence. The new sequence is S5.
For each subsequence of adjacent text nodes in S5, copy a single text node to the new sequence equal to the values of the text nodes in the subsequence concatenated in order. Any text nodes with values of zero length are dropped. Copy all other items to the new sequence. The new sequence is S6.
It is a serialization error [err:SENR0001] if an item in S6 is an attribute node, a namespace node or a function. Otherwise, construct a new sequence, S7, that consists of a single document node and copy all the items in the sequence, which are all nodes, as children of that document node.
S7 is the normalized sequence.
The result tree rooted at the document node that is created by the final step of this sequence normalization process is the instance of the data model to which the rules of the appropriate output method are applied. If the sequence normalization process results in a serialization error, the serializer MUST signal the error.
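The seven steps above can be sketched in Python under a deliberately simplified model. Everything in the sketch is an assumption made for illustration and is not part of this specification: the `Node` class, the `normalize` function, the use of Python callables to stand in for function items, and `to_string`, which only approximates casting to xs:string. Nodes are also shared rather than deep-copied.

```python
from dataclasses import dataclass, field
from typing import List


class SerializationError(Exception):
    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


@dataclass
class Node:
    kind: str                     # "document", "element", "attribute", "namespace", "text", ...
    value: str = ""               # string content, used here for text nodes
    children: List["Node"] = field(default_factory=list)


def to_string(atom) -> str:
    # Rough stand-in for casting an atomic value to xs:string.
    if isinstance(atom, bool):
        return "true" if atom else "false"
    return str(atom)


def normalize(seq, item_separator=None) -> Node:
    s1 = list(seq) or [""]                                        # step 1
    s2 = [it if isinstance(it, Node) or callable(it) else to_string(it)
          for it in s1]                                           # step 2
    s3 = []                                                       # step 3
    if item_separator is None:
        for it in s2:
            if isinstance(it, str) and s3 and isinstance(s3[-1], str):
                s3[-1] = s3[-1] + " " + it    # join adjacent strings with one space
            else:
                s3.append(it)
    else:
        for i, it in enumerate(s2):
            if i > 0:
                s3.append(item_separator)     # separator between each pair of items
            s3.append(it)
    s4 = [Node("text", it) if isinstance(it, str) else it for it in s3]   # step 4
    s5 = []                                                       # step 5
    for it in s4:
        if isinstance(it, Node) and it.kind == "document":
            s5.extend(it.children)
        else:
            s5.append(it)
    s6 = []                                                       # step 6
    for it in s5:
        if isinstance(it, Node) and it.kind == "text":
            if it.value == "":
                continue                      # drop zero-length text nodes
            if s6 and isinstance(s6[-1], Node) and s6[-1].kind == "text":
                s6[-1] = Node("text", s6[-1].value + it.value)
                continue
        s6.append(it)
    for it in s6:                                                 # step 7
        if callable(it) or it.kind in ("attribute", "namespace"):
            raise SerializationError("SENR0001")
    return Node("document", children=s6)


print(normalize([1, 2]).children)
print(normalize([1, 2], item_separator=",").children)
```

With no item-separator, the atomic values 1 and 2 yield a single text node whose value is "1 2"; with an item-separator of "," they yield a single text node whose value is "1,2", because the separator strings become text nodes that are merged in step 6.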
Note:
If the item-separator serialization parameter is absent, the sequence normalization process for a sequence $seq is equivalent to constructing a document node using the XSLT instruction:
<xsl:document>
  <xsl:copy-of select="$seq" validation="preserve"/>
</xsl:document>
or the XQuery expression:
declare construction preserve;
document { $seq }
If the item-separator serialization parameter is present, the sequence normalization process for a sequence $seq is equivalent to constructing a document node using the XSLT instruction:
<xsl:document>
  <xsl:for-each select="$seq">
    <xsl:sequence select="if (position() gt 1) then $sep else ()"/>
    <xsl:choose>
      <xsl:when test=". instance of node()">
        <xsl:sequence select="."/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:document>
or the XQuery expression:
declare construction preserve;
document {
  for $item at $pos in $seq
  let $node := if ($item instance of node()) then $item else text { $item }
  return if ($pos eq 1) then $node else ($sep, $node)
}
where the value of the sep variable is a string whose value is equal to the value of the item-separator serialization parameter.
This process results in a serialization error [err:SENR0001] if $seq contains functions, attribute nodes or namespace nodes.
There are a number of parameters that influence how serialization is performed. Host languages MAY allow users to specify any or all of these parameters, but they are not REQUIRED to be able to do so. However, the host language specification MUST specify how the values of all applicable parameters are to be determined.
It is a serialization error [err:SEPM0016] if a parameter value is invalid for the given parameter. It is the responsibility of the host language to specify how invalid values should be handled at the level of that language.
The following serialization parameters are defined:
Serialization parameter name | Permitted values for parameter |
---|---|
byte-order-mark | One of the enumerated values yes or no. This parameter indicates whether the serialized sequence of octets is to be preceded by a Byte Order Mark. (See Section 5.1 of [Unicode Encoding].) The actual octet order used is implementation-dependent. If the encoding defines no Byte Order Mark, or if the Byte Order Mark is prohibited for the specific Unicode encoding or implementation environment, then this parameter is ignored. |
cdata-section-elements | A list of expanded QNames, possibly empty. |
doctype-public | A string of PubidChar XML characters. This parameter MAY be absent. |
doctype-system | A string of Unicode characters that does not include both an apostrophe (#x27) and a quotation mark (#x22) character. This parameter MAY be absent. |
encoding | A string of Unicode characters in the range #x21 to #x7E (that is, printable ASCII characters); the value SHOULD be a charset registered with the Internet Assigned Numbers Authority [IANA], [RFC2978] or begin with the characters x- or X-. |
escape-uri-attributes | One of the enumerated values yes or no. |
html-version | A decimal value. This parameter MAY be absent. |
include-content-type | One of the enumerated values yes or no. |
indent | One of the enumerated values yes or no. |
item-separator | A string of Unicode characters. This parameter MAY be absent. |
media-type | A string of Unicode characters specifying the media type (MIME content type) [RFC2046]; the charset parameter of the media type MUST NOT be specified explicitly in the value of the media-type parameter. If the destination of the serialized output is annotated with a media type, this parameter MAY be used to provide such an annotation. For example, it MAY be used to set the media type in an HTTP header. |
method | An expanded QName with a null namespace URI, and the local part of the name equal to one of xml, xhtml, html or text, or having a non-null namespace URI. If the namespace URI is non-null, the parameter specifies an implementation-defined output method. |
normalization-form | One of the enumerated values NFC, NFD, NFKC, NFKD, fully-normalized or none, or an implementation-defined value of type NMTOKEN. |
omit-xml-declaration | One of the enumerated values yes or no. |
standalone | One of the enumerated values yes, no or omit. |
suppress-indentation | A list of expanded QNames, possibly empty. |
undeclare-prefixes | One of the enumerated values yes or no. |
use-character-maps | A list of pairs, possibly empty, with each pair consisting of a single Unicode character and a string of Unicode characters. |
version | A string of Unicode characters. |
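A few of the constraints in the table above lend themselves to a mechanical check. The following Python sketch covers only the enumerated parameters, doctype-system and encoding, and reports a violation as [err:SEPM0016]. The names `ParameterError` and `check_parameter` are hypothetical, and how a host language actually surfaces invalid values is left to that language.

```python
class ParameterError(ValueError):
    def __init__(self, code: str, message: str):
        super().__init__(f"[err:{code}] {message}")
        self.code = code


YES_NO = frozenset({"yes", "no"})

# Parameters whose permitted values are a closed enumeration in the table.
ENUMERATED = {
    "byte-order-mark": YES_NO,
    "escape-uri-attributes": YES_NO,
    "include-content-type": YES_NO,
    "indent": YES_NO,
    "omit-xml-declaration": YES_NO,
    "standalone": frozenset({"yes", "no", "omit"}),
    "undeclare-prefixes": YES_NO,
}


def check_parameter(name: str, value: str) -> None:
    """Raise ParameterError if value is not permitted for the named parameter."""
    allowed = ENUMERATED.get(name)
    if allowed is not None and value not in allowed:
        raise ParameterError("SEPM0016", f"{name}: {value!r} is not one of {sorted(allowed)}")
    if name == "doctype-system" and "'" in value and '"' in value:
        raise ParameterError("SEPM0016", "doctype-system contains both ' and \"")
    if name == "encoding" and any(not 0x21 <= ord(ch) <= 0x7E for ch in value):
        raise ParameterError("SEPM0016", "encoding contains a character outside #x21-#x7E")


check_parameter("indent", "yes")          # passes silently
check_parameter("standalone", "omit")     # passes silently
```

Parameters that the sketch does not know about are accepted without comment, which is a simplification: the remaining rows of the table (for example normalization-form, which admits implementation-defined values) need checks of their own.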
The value of the method parameter is an expanded QName. If the value has a null namespace URI, then the local name identifies a method specified in this document and MUST be one of xml, html, xhtml, or text; in this case, the output method specified MUST be used for serializing. If the namespace URI is non-null, then it identifies an implementation-defined output method; the behavior in this case is not specified by this document.
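The two cases for the method parameter can be sketched as a small classification function. The function name `classify_method`, the tuple it returns and the `InvalidMethod` exception are illustrative assumptions; the sketch also assumes that an unrecognized local name in no namespace is treated as an invalid parameter value ([err:SEPM0016]).

```python
from typing import Optional, Tuple

STANDARD_METHODS = frozenset({"xml", "html", "xhtml", "text"})


class InvalidMethod(ValueError):
    code = "SEPM0016"  # assumption: reported as an invalid parameter value


def classify_method(namespace_uri: Optional[str], local_name: str) -> Tuple[str, str]:
    """Return ("standard", name) or ("implementation-defined", "{uri}name")."""
    if not namespace_uri:
        # Null namespace URI: the local name must be one of the four methods.
        if local_name not in STANDARD_METHODS:
            raise InvalidMethod(f"unknown output method {local_name!r}")
        return ("standard", local_name)
    # Non-null namespace URI: behavior is outside the scope of this document.
    return ("implementation-defined", "{%s}%s" % (namespace_uri, local_name))


print(classify_method(None, "xhtml"))
print(classify_method("http://example.com/ser", "json"))
```

The namespace http://example.com/ser and the local name json in the usage lines are made up for the example.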
In those cases where they have no important effect on the content of the serialized result, details of the output methods defined by this specification are left unspecified and are regarded as implementation-dependent. Whether a serializer uses apostrophes or quotation marks to delimit attribute values in the XML output method is an example of such a detail.
The detailed semantics of each parameter will be described separately for each output method for which it is applicable. If the semantics of a parameter are not described for an output method, then it is not applicable to that output method.
Implementations MAY define additional serialization parameters, and MAY allow users to do so. For this purpose, the name of a serialization parameter is considered to be a QName; the parameters listed above are QNames in no namespace, while any additional serialization parameters that are either implementation-defined or defined by the host language MUST have names that are namespace-qualified. Any such additional serialization parameters MUST NOT be in the namespace http://www.w3.org/2010/xslt-xquery-serialization. A host language MAY specify the means by which an implementation can define such an additional serialization parameter, and implementations MAY provide mechanisms by which users can define such an additional serialization parameter.
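The naming rule for additional serialization parameters amounts to two conditions on the namespace part of the name. A minimal Python sketch follows; the function name `is_valid_extension_parameter_name` is hypothetical, and the namespace http://example.com/ser is made up for the usage lines.

```python
from typing import Optional

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def is_valid_extension_parameter_name(namespace_uri: Optional[str], local_name: str) -> bool:
    """An additional parameter must be namespace-qualified and outside OUTPUT_NS."""
    # local_name is accepted for completeness; only the namespace part is constrained here.
    return bool(namespace_uri) and namespace_uri != OUTPUT_NS


print(is_valid_extension_parameter_name("http://example.com/ser", "line-width"))
print(is_valid_extension_parameter_name(None, "line-width"))
print(is_valid_extension_parameter_name(OUTPUT_NS, "line-width"))
```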
If the serialization method is one of the four methods xml, html, xhtml, or text, then the additional serialization parameters MAY affect the output of the serializer to the extent (but only to the extent) that this specification leaves the output implementation-defined or implementation-dependent. For example, such parameters might control whether namespace declarations on an element are written before or after the attributes of the element, or they might define the number of space or tab characters to be inserted when the indent parameter is set to yes; but they could not instruct the serializer to suppress the error that occurs when the HTML output method encounters characters that are not permitted (see error [err:SERE0014]).
A host language MAY provide, by reference to this section, a mechanism by which the settings of serialization parameters are supplied in the form of an instance of the data model as specified in [XQuery and XPath Data Model (XDM) 3.0]. The instance of the data model used to determine the settings of serialization parameters MUST be processed as if by the procedure described below.
With the exception of the use-character-maps parameter, the setting of each serialization parameter defined in this specification is equal to the result of evaluating the XQuery expression
(validate lax { document { . } })
  /output:serialization-parameters
  /output:*[local-name() eq $param-name]/data(@value)
or equivalently the XSLT instructions
<xsl:sequence>
<xsl:variable name="validated-instance">
<xsl:document validation="lax">
<xsl:sequence select="."/>
</xsl:document>
</xsl:variable>
<xsl:sequence select="$validated-instance
/output:serialization-parameters
/output:*[local-name() eq $param-name]/data(@value)"/>
</xsl:sequence>
with the supplied instance of the data model as the context item, the param-name variable having as its value a value of type xs:string equal to the local part of the name of the particular serialization parameter, and the other components of the dynamic context and static context as specified in the subsequent tables. If in any case evaluating this expression would yield an error, serialization error [err:SEPM0017] results.
If the result of evaluating this expression for a particular serialization parameter is the empty sequence:
if the parameter is either cdata-section-elements or suppress-indentation and the result of evaluating the XQuery expression
(validate lax { document { . } })
  /output:serialization-parameters
  /output:*[local-name() eq $param-name]
or equivalently the XSLT instructions
<xsl:sequence>
<xsl:variable name="validated-instance">
<xsl:document validation="lax">
<xsl:sequence select="."/>
</xsl:document>
</xsl:variable>
<xsl:sequence select="$validated-instance
/output:serialization-parameters
/output:*[local-name() eq $param-name]"/>
</xsl:sequence>
with the same settings of the static context and dynamic context is not an empty sequence, the setting of the parameter is the empty list;
otherwise, the setting of the parameter is absent.
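The lookup rules above, including the distinction between an empty list and an absent setting, can be approximated in Python with the standard `xml.etree.ElementTree` module. This is a sketch under several assumptions: the function name `parameter_setting` is hypothetical, `None` stands for "absent", no schema validation is performed (so the `validate lax` step and typed values are not reproduced), list values are returned as lexical tokens rather than expanded QNames, and only the first matching element is consulted.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"
LIST_VALUED = {"cdata-section-elements", "suppress-indentation"}


def parameter_setting(root: ET.Element, param_name: str):
    """Return the setting for param_name, or None if the setting is absent."""
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        return None
    element = root.find(f"{{{OUTPUT_NS}}}{param_name}")
    if element is None:
        return None                      # no such element: the setting is absent
    value = element.get("value")
    if param_name in LIST_VALUED:
        # Element present but no tokens: the setting is the empty list.
        return (value or "").split()
    return value                         # None when @value is missing, i.e. absent


doc = ET.fromstring(
    '<output:serialization-parameters xmlns:output="' + OUTPUT_NS + '">'
    '<output:indent value="yes"/>'
    '<output:cdata-section-elements value=""/>'
    '</output:serialization-parameters>'
)
print(parameter_setting(doc, "indent"))
print(parameter_setting(doc, "cdata-section-elements"))
print(parameter_setting(doc, "suppress-indentation"))
```

In the usage lines, indent is set to yes, cdata-section-elements is the empty list because its element is present with no tokens, and suppress-indentation is absent because no such element exists.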
The components of the static context used in evaluating the XQuery expressions or XSLT instructions are as defined in the following table.
Static Context Component | XQuery or XSLT | Setting |
---|---|---|
XPath 1.0 compatibility mode | Both | false |
Statically known namespaces | XQuery | The pair (output,http://www.w3.org/2010/xslt-xquery-serialization) |
Statically known namespaces | XSLT | The pairs (output,http://www.w3.org/2010/xslt-xquery-serialization), (xslt,http://www.w3.org/1999/XSL/Transform) |
Default element/type namespace | Both | "none" |
Default function namespace | Both | http://www.w3.org/2005/xpath-functions |
In-scope schema types, In-scope element declarations, Substitution groups, In-scope attribute declarations | Both | As defined by the schema for serialization parameters (B Schema for Serialization Parameters) and any additional implementation-defined in-scope schema components |
In-scope variables | Both | {param-name} |
Context item static type | Both | node() |
Statically-known function signatures | Both | {fn:data($arg as item()*) as xs:anyAtomicType*} |
Statically known collations | Both | { (http://www.w3.org/2005/xpath-functions/collation/codepoint, The Unicode codepoint collation) } |
Default collation | Both | The Unicode codepoint collation |
Construction mode | XQuery | strip |
Ordering mode | XQuery | ordered |
Default order for empty sequences | XQuery | least |
Boundary space policy | XQuery | strip |
Copy-namespaces mode | XQuery | (preserve,inherit) |
Base URI | Both | Absent |
Statically known documents | Both | None |
Statically known collections | Both | None |
Statically known default collection type | Both | node()* |
Statically known decimal formats | Both | None |
Set of named keys | XSLT | {} |
Values of system properties | XSLT | None |
Set of available instructions | XSLT | The empty set (not needed for evaluating these expressions). |
The remaining components of the dynamic context used in evaluating the XQuery expressions or XSLT instructions are as defined in the following table.
Dynamic Context Component | XQuery or XSLT | Setting |
---|---|---|
Context position | Both | 1 |
Context size | Both | 1 |
Variable values | Both | The param-name variable has a value of type xs:string equal to the local part of the name of the serialization parameter under consideration |
Function implementations | Both | The implementation of fn:data |
Current dateTime | Both | Absent |
Implicit timezone | Both | Absent |
Available documents | Both | None |
Available collections | Both | None |
Default collection | Both | None |
Current template rule | XSLT | Absent |
Current mode | XSLT | The default mode |
Current group | XSLT | Absent |
Current grouping key | XSLT | Absent |
Current captured substrings | XSLT | The empty sequence |
Output state | XSLT | Temporary output state |
In the case of the use-character-maps
parameter,
the XQuery expression
(validate lax { document { . } }) /output:serialization-parameters/output:use-character-maps /output:character-map[@character eq $char]/string(@map-string)
or equivalently the XSLT instructions
<xsl:sequence>
<xsl:variable name="validated-instance">
<xsl:document validation="lax">
<xsl:sequence select="."/>
</xsl:document>
</xsl:variable>
<xsl:sequence select="$validated-instance
/output:serialization-parameters
/output:use-character-maps
/output:character-map[@character eq $char]
/string(@map-string)"/>
</xsl:sequence>
is evaluated for each Unicode character that is permitted in an XML document. The dynamic context and static context used to evaluate the expression are as defined above, except that in-scope variables is the set {char} and the value of the variable "char" is a value of type xs:string of length one whose value is the Unicode character under consideration. If the result of evaluating the expression is not an empty sequence, the pair consisting of the Unicode character and the result of evaluating the expression is part of the list of pairs in the value of the use-character-maps parameter. It is a serialization error [err:SEPM0018] if the result of evaluating this expression for any character is a sequence of length greater than one.
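The per-character evaluation described above can be approximated without an XQuery processor. The following Python sketch is illustrative only: the names `read_character_map` and `SerializationError` are assumptions made for this example, it performs no schema validation (so it does not reproduce the effect of `validate lax`), and it assumes the root element is already known to be `output:serialization-parameters`.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


class SerializationError(Exception):
    """Hypothetical exception carrying an error code such as 'SEPM0018'."""


def read_character_map(params_xml):
    """Collect (character, map-string) pairs from a parameters document."""
    root = ET.fromstring(params_xml)
    path = "{%s}use-character-maps/{%s}character-map" % (OUTPUT_NS, OUTPUT_NS)
    pairs = {}
    for entry in root.findall(path):
        char = entry.get("character")
        if char in pairs:
            # Two entries for one character: the expression in the text
            # would yield a sequence of length two for that character.
            raise SerializationError("SEPM0018")
        pairs[char] = entry.get("map-string")
    return pairs
```

Iterating over the `character-map` elements once gives the same pairs as evaluating the expression for every permitted character, because a character with no entry simply contributes nothing.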
Using the same settings of the components of the dynamic context and static context, serialization error [err:SEPM0019] results if the result of evaluating the following XQuery expression is not true
(document { . })/output:serialization-parameters /(count(distinct-values(*/node-name(.))) eq (count(*)))
or equivalently if the result of evaluating the following XSLT instructions is not true.
<xsl:sequence>
  <xsl:variable name="doc">
    <xsl:document>
      <xsl:sequence select="."/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$doc/output:serialization-parameters
    /(count(distinct-values(*/node-name(.))) eq (count(*)))"/>
</xsl:sequence>
The result of evaluating either will be false if the data model instance supplies a value for any particular serialization parameter more than once, or will be the empty sequence if the data model instance does not have as its root node an element node or a document node with an element node child, where the local part of the name of the element node is serialization-parameters and the namespace URI is http://www.w3.org/2010/xslt-xquery-serialization.
Note:
A serializer or implementation of a host language does not need to be accompanied by an XQuery processor nor by a general-purpose schema validator in order to meet the requirements of this section. It merely needs to be capable of extracting values from an XDM instance that conforms to the schema for serialization parameters, while checking that the constraints implied by the schema and additional constraints implied by the XQuery validate expression or explicitly stated in this section are satisfied.
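As the note suggests, the extraction can be done with an ordinary XML parser. The following Python sketch is one possible approach, not a normative algorithm: `read_parameters` and `SerializationError` are hypothetical names, no schema validation is attempted, and QName-valued parameters are returned as raw strings without namespace resolution.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


class SerializationError(Exception):
    """Hypothetical exception carrying an error code such as 'SEPM0019'."""


def read_parameters(params_xml):
    """Return {local-name: value} for parameters in the output namespace."""
    root = ET.fromstring(params_xml)
    if root.tag != "{%s}serialization-parameters" % OUTPUT_NS:
        # The check expression yields the empty sequence, which is not true.
        raise SerializationError("SEPM0019")
    seen = set()
    values = {}
    for child in root:
        if child.tag in seen:
            # Same expanded name twice: count(distinct-values(...)) != count(*)
            raise SerializationError("SEPM0019")
        seen.add(child.tag)
        if child.tag.startswith("{" + OUTPUT_NS + "}"):
            values[child.tag.split("}", 1)[1]] = child.get("value")
        # Elements in other namespaces are implementation-defined; this
        # sketch ignores them after the duplicate check.
    return values
```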
The host language MAY provide additional mechanisms for overriding the values of any serialization parameters specified through the mechanism defined in this section, as well as additional mechanisms for specifying the values of any serialization parameters whose values are absent after applying the mechanism defined in this section.
If the instance of the data model contains elements or attributes that are in a namespace other than http://www.w3.org/2010/xslt-xquery-serialization, the implementation MAY interpret them to specify the values of implementation-defined serialization parameters in an implementation-defined manner.
The following XML document, if converted to a data model instance and processed using the mechanism described in this section, would specify the settings of the method, version and indent serialization parameters with the values xml, 1.0 and yes, respectively.
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:indent value="yes"/>
</output:serialization-parameters>
The following document would specify the setting of the cdata-section-elements serialization parameter, whose value is the pair of expanded QNames (http://example.org/book/chapter, heading) and (http://example.org/book, footnote).
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:book="http://example.org/book"
    xmlns="http://example.org/book/chapter">
  <output:cdata-section-elements value="heading book:footnote"/>
</output:serialization-parameters>
The following document would specify the value of the method serialization parameter with the value html. Notice that in this example, the default namespace declaration in scope has no effect on the interpretation of the setting of the method parameter.
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns="http://example.org/ext">
  <output:method value="html"/>
</output:serialization-parameters>
The following document would specify the value of the method serialization parameter with value equal to the expanded QName (http://example.org/ext, jsp), and the use-character-maps parameter with value equal to the list of pairs («, <%), (», %>).
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
    xmlns:ext="http://example.org/ext">
  <output:method value="ext:jsp"/>
  <output:use-character-maps>
    <output:character-map character="«" map-string="&lt;%"/>
    <output:character-map character="»" map-string="%&gt;"/>
  </output:use-character-maps>
</output:serialization-parameters>
Serialization comprises five phases of processing (preceded optionally by the sequence normalization process described in 2 Sequence Normalization).
For an implementation-defined output method, any of these phases MAY be skipped or MAY be performed in a different order than is specified here. For the output methods defined in this specification, these phases are carried out sequentially as follows:
A meta
element is added to the normalized sequence
along with discarding an existing meta
element, as
controlled by the include-content-type
parameter for
the XHTML and HTML output methods.
Markup generation produces the character representation of those parts of the serialized result that describe the structure of the normalized sequence. In the cases of the XML, HTML and XHTML output methods, this phase produces the character representations of the following:
the document type declaration;
start tags and end tags (except for attribute values, whose representation is produced by the character expansion phase);
processing instructions; and
comments.
In the cases of the XML and XHTML output methods, this phase also produces the following:
the XML or text declaration; and
empty element tags (except for the attribute values).
In the case of the text output method, this phase replaces the single document node produced by sequence normalization with a new document node that has exactly one child, which is a text node. The string value of the new text node is the string value of the document node that was produced by sequence normalization.
Character expansion is concerned with the representation of characters appearing in text and attribute nodes in the normalized sequence. For each text and attribute node, the following rules are applied in sequence.
If the node is an attribute that is a URI attribute
value and the escape-uri-attributes
parameter is
set to require escaping of URI attributes, apply URI escaping as defined
below, and skip rules b-e. Otherwise, continue with rule b.
[Definition: URI escaping consists of the following three steps applied in sequence to the content of URI attribute values:]
normalize to NFC using the method defined in Section 5.4.6 fn:normalize-unicode FO30
percent-encode any special characters in the URI using the method defined in Section 6.4 fn:escape-html-uri FO30
escape according to the rules of the XML or HTML output method, whichever is applicable, any characters that require escaping, and any characters that cannot be represented in the selected encoding. For example, replace < with &lt;. (See also section 7.3 Writing Character Data)
[Definition: The values of attributes listed in D List of URI Attributes are URI attribute values. Attributes are not considered to be URI attributes simply because they are namespace declaration attributes or have the type annotation xs:anyURI.]
If the node is a text node whose parent element is selected by
the rules of the cdata-section-elements
parameter for
the applicable output method, create CDATA sections as described
below, and skip rules c-e. Otherwise, continue with rule c.
Apply the following two processes in sequence to create CDATA sections
Unicode Normalization if requested by
the normalization-form
parameter.
apply changes as detailed in the description of the
cdata-section-elements
parameter for the applicable
output method.
Apply character mapping as determined by the
use-character-maps
parameter for the applicable output
method. For characters that were substituted by this process, skip
rules d and e. For the remaining characters that were not modified
by character mapping, continue with rule d.
Apply Unicode Normalization if requested by
the normalization-form
parameter.
[Definition: Unicode Normalization is the process of removing alternate representations of equivalent sequences from textual data, to convert the data into a form that can be binary-compared for equivalence, as specified in [UAX #15: Unicode Normalization Forms]. For specific recommendations for character normalization on the World Wide Web, see [Character Model for the World Wide Web 1.0: Normalization]. ]
The meanings associated with the possible values of the
normalization-form
parameter are defined in section
5.1.9 XML Output Method: the
normalization-form Parameter.
Continue with step e.
Escape according to the rules of the XML or HTML output method, whichever is applicable, any characters (such as < and &) where XML or HTML requires escaping, and any characters that cannot be represented in the selected encoding. For example, replace < with &lt;. (See also section 7.3 Writing Character Data). For characters such as > where XML defines a built-in entity but does not require its use in all circumstances, it is implementation-dependent whether the character is escaped.
Indentation, as controlled by the indent
parameter and the suppress-indentation
parameter, MAY add or remove whitespace
according to the rules defined by the applicable output method.
Encoding, as controlled by the encoding
parameter, converts the character stream produced by the previous
phases into an octet stream.
Note:
Serialization is only defined in terms of encoding the result as a stream of octets. However, a serializer MAY provide an option that allows the encoding phase to be skipped, so that the result of serialization is a stream of Unicode characters. The effect of any such option is implementation-defined, and a serializer is not required to support such an option.
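The ordering of the character expansion rules is easier to follow in code. The Python sketch below is a simplified illustration under several assumptions: the function names are hypothetical, only the XML escaping rules are modelled, the `fn:escape-html-uri` step is approximated by percent-encoding every character outside the printable US-ASCII range (code points 32 to 126), and normalization is applied separately to each run of characters that was not touched by character mapping.

```python
import unicodedata
from xml.sax.saxutils import escape


def escape_uri_attribute(value):
    """URI escaping: NFC, then percent-encoding, then markup escaping."""
    nfc = unicodedata.normalize("NFC", value)
    encoded = []
    for ch in nfc:
        if 32 <= ord(ch) <= 126:
            encoded.append(ch)
        else:
            # Percent-encode each UTF-8 octet with upper-case hex digits.
            encoded.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
    return escape("".join(encoded), {'"': "&quot;"})


def expand_text(text, char_map, normalization_form="none"):
    """Character mapping, then normalization and escaping of the rest."""
    out = []
    run = []

    def flush():
        # Normalize and escape the pending run of unmapped characters.
        if not run:
            return
        chunk = "".join(run)
        if normalization_form != "none":
            chunk = unicodedata.normalize(normalization_form, chunk)
        out.append(escape(chunk))
        run.clear()

    for ch in text:
        if ch in char_map:
            flush()
            # Mapped characters bypass normalization and escaping.
            out.append(char_map[ch])
        else:
            run.append(ch)
    flush()
    return "".join(out)
```

Because substituted strings are emitted verbatim, a character map can place unescaped markup in the output, which is exactly why the later rules are skipped for those characters.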
The XML output method serializes the normalized sequence as an XML entity that MUST satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both. A serialization error [err:SERE0003] results if the serializer is unable to satisfy those rules, except for content modified by the character expansion phase of serialization, as described in 4 Phases of Serialization. The effects of the character expansion phase could result in the serialized output being not well-formed, but will not result in a serialization error. If a serialization error results, the serializer MUST signal the error.
If the document node of the normalized sequence has a single element node child and no text node children, then the serialized output is a well-formed XML document entity, and the serialized output MUST conform to the appropriate version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1]. If the normalized sequence does not take this form, then the serialized output is a well-formed XML external general parsed entity, which, when referenced within a trivial XML document wrapper like this:
<?xml version="version"?>
<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>
where entity-URI
is a URI for the entity, and the
value of the version
pseudo-attribute is the value of
the version
parameter, produces a document which
MUST itself be a well-formed XML document
conforming to the corresponding version of the XML Namespaces
Recommendation [XML Names] or [XML Names 1.1].
[Definition: A reconstructed tree may be constructed by parsing the XML document and converting it into an instance of the data model as specified in [XQuery and XPath Data Model (XDM) 3.0].] The result of serialization MUST be such that the reconstructed tree is the same as the result tree except for the following permitted differences:
If the document was produced by adding a document wrapper, as
described above, then it will contain an extra doc
element as the document element.
The order of attribute and namespace nodes in the two trees MAY be different.
The following properties of corresponding nodes in the two trees MAY be different:
The reconstructed tree MAY contain additional attributes and text nodes resulting from the expansion of default and fixed values in its DTD or schema; also, in the presence of a DTD, non-CDATA attributes may lose whitespace characters as a result of attribute value normalization.
The type annotations of the nodes in the two trees MAY be different. Type annotations in a result tree are discarded when the tree is serialized. Any new type annotations obtained by parsing the document will depend on whether the serialized XML document is assessed against a schema, and this MAY result in type annotations that are different from those in the original result tree.
Note:
In order to influence the type annotations in the instance of
the data model that would result from processing a serialized XML
document, the author of the XSLT stylesheet, XQuery expression or
other process might wish to create the instance of the data model
that is input to the serialization process so that it makes use of
mechanisms provided by [XML Schema],
such as xsi:type
and xsi:schemaLocation
attributes. The serialization process will not automatically create
such attributes in the serialized document if those attributes were
not part of the result
tree that is to be serialized.
Similarly, it is possible that an element node in the instance of the data model that is to be serialized has the nilled property with the value true, but no xsi:nil attribute. The serialization process will not create such an attribute in the serialized document simply to reflect the value of the property. The value of the nilled property has no direct effect on the serialized result.
Additional namespace nodes MAY be present in the reconstructed tree if the serialization process did not undeclare one or more namespaces, as described in 5.1.8 XML Output Method: the undeclare-prefixes Parameter, and the starting instance of the data model contained an element node with a namespace node that declared some prefix, but a child element of that node did not have any namespace node that declared the same prefix.
The result tree MAY contain namespace nodes that are not present in the reconstructed tree, as the process of creating an instance of the data model MAY ignore namespace declarations in some circumstances. See Section 6.2.3 Construction from an Infoset DM30 and Section 6.2.4 Construction from a PSVI DM30 of [XQuery and XPath Data Model (XDM) 3.0] for additional information.
If the indent parameter has the value yes,
additional text nodes consisting of whitespace characters MAY be present in the reconstructed tree; and
text nodes in the result tree that contained only whitespace characters MAY correspond to text nodes in the reconstructed tree that contain additional whitespace characters that were not present in the result tree
See 5.1.4 XML Output Method: the indent
and suppress-indentation Parameters for more information on
the indent
parameter.
Additional nodes MAY be present in the reconstructed tree due to the effect of character mapping in the character expansion phase, and the values of attribute nodes and text nodes in the reconstructed tree MAY be different from those in the result tree, due to the effects of URI expansion, character mapping and Unicode Normalization in the character expansion phase of serialization.
Note:
The use-character-maps
parameter can cause
arbitrary characters to be inserted into the serialized XML
document in an unescaped form, including characters that would be
considered to be part of XML markup. Such characters could result
in arbitrary new element nodes,
attribute nodes, and so on, in
the reconstructed tree that results from
processing the serialized XML document.
A consequence of this rule is that certain characters MUST be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically, CR, NEL and LINE SEPARATOR characters in text nodes MUST be output respectively as "&#xD;", "&#x85;", and "&#x2028;", or their equivalents; while CR, NL, TAB, NEL and LINE SEPARATOR characters in attribute nodes MUST be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents. In addition, the non-whitespace control characters #x1 through #x1F and #x7F through #x9F in text nodes and attribute nodes MUST be output as character references.
For example, an attribute with the value "x" followed by "y" separated by a newline will result in the output "x&#xA;y" (or with any equivalent character reference). The XML output cannot be "x" followed by a literal newline followed by a "y" because after parsing, the attribute value would be "x y" as a consequence of the XML attribute normalization rules.
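The round-trip requirement can be demonstrated with a small Python sketch. The helper names below are hypothetical and the escaping is deliberately minimal (it handles only the characters named in this section plus `&`, `<` and the double quote); the final check relies on the attribute-value normalization performed by the parser behind `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Characters that MUST become character references, per node kind.
TEXT_REFS = {"\r": "&#xD;", "\u0085": "&#x85;", "\u2028": "&#x2028;"}
ATTR_REFS = dict(TEXT_REFS, **{"\n": "&#xA;", "\t": "&#x9;"})
MARKUP = {"&": "&amp;", "<": "&lt;"}


def _is_control(ch):
    """Non-whitespace controls #x1-#x1F, plus #x7F-#x9F."""
    cp = ord(ch)
    return (0x1 <= cp <= 0x1F and ch not in "\t\n\r") or (0x7F <= cp <= 0x9F)


def _escape(value, refs, markup):
    out = []
    for ch in value:
        if ch in refs:
            out.append(refs[ch])
        elif ch in markup:
            out.append(markup[ch])
        elif _is_control(ch):
            out.append("&#x%X;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def escape_text(value):
    return _escape(value, TEXT_REFS, MARKUP)


def escape_attribute(value):
    return _escape(value, ATTR_REFS, dict(MARKUP, **{'"': "&quot;"}))


# Round trip: the escaped newline survives, a literal one becomes a space.
escaped = ET.fromstring('<e a="%s"/>' % escape_attribute("x\ny")).get("a")
literal = ET.fromstring('<e a="x\ny"/>').get("a")
```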
Note:
XML 1.0 did not permit an XML processor to normalize NEL or LINE
SEPARATOR characters to a LINE FEED character. However, if a
document entity that specifies version 1.1 invokes an external
general parsed entity with no text declaration or a text
declaration that specifies version 1.0, the external parsed entity
is processed according to the rules of XML 1.1. For this reason,
NEL and LINE SEPARATOR characters in text and attribute nodes MUST always be
escaped using character references, regardless of the value of the
version
parameter.
XML 1.0 permitted control characters in the range #x7F through
#x9F to appear as literal characters in an XML document, but XML
1.1 requires such characters, other than NEL, to be escaped as
character references. An external general parsed entity with no
text declaration or a text declaration that specifies a version
pseudo-attribute with value 1.0
that is invoked by an
XML 1.1 document entity MUST follow the rules of
XML 1.1. Therefore, the non-whitespace control characters in the
ranges #x1 through #x1F and #x7F through #x9F MUST
always be escaped, regardless of the value of the
version
parameter.
It is a serialization error [err:SEPM0004] to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the instance of the data model contains text nodes or multiple element nodes as children of the root node. The serializer MUST either signal the error, or recover by ignoring the request to output a document type declaration or standalone parameter.
version Parameter
The version parameter specifies the version of XML and the version of Namespaces in XML to be used for outputting the instance of the data model. The version output in the XML declaration (if an XML declaration is not omitted) MUST correspond to the version of XML that the serializer used for outputting the instance of the data model. The value of the version parameter MUST match the VersionNum production of the XML Recommendation [XML10] or [XML11]. A serialization error [err:SESU0013] results if the value of the version parameter specifies a version of XML that is not supported by the serializer; the serializer MUST signal the error.
This document provides the normative definition of serialization for the XML output method if the version parameter has either the value 1.0 or 1.1. For any other value of the version parameter, the behavior is implementation-defined. In that case the implementation-defined behavior MAY supersede all other requirements of this recommendation.
If the serialized result would contain an NCName that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter, a serialization error [err:SERE0005] results. The serializer MUST signal the error.
If the serialized result would contain a character that is not
permitted by the version of XML specified by the
version
parameter, a serialization error [err:SERE0006] results. The serializer
MUST signal the error.
For example, if the version parameter has the value 1.0, and the instance of the data model contains a non-whitespace control character in the range #x1 to #x1F, a serialization error [err:SERE0006] results. If the version parameter has the value 1.1 and a comment node in the instance of the data model contains a non-whitespace control character in the range #x1 to #x1F or a control character other than NEL in the range #x7F to #x9F, a serialization error [err:SERE0006] results.
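The two cases in the example above can be checked mechanically. The following Python sketch assumes the Char productions of XML 1.0 and XML 1.1 and the RestrictedChar production of XML 1.1; the function names and the `allows_char_refs` flag are inventions for this illustration, and any version other than `1.0` is treated as `1.1`.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying an error code such as 'SERE0006'."""


def is_xml_char(cp, version):
    """Char production of XML 1.0 or (for any other version) XML 1.1."""
    common = (0x20 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD
              or 0x10000 <= cp <= 0x10FFFF)
    if version == "1.0":
        return cp in (0x9, 0xA, 0xD) or common
    return 0x1 <= cp <= 0x1F or common


def is_restricted(cp):
    """RestrictedChar production of XML 1.1 (controls other than NEL)."""
    return (0x1 <= cp <= 0x8 or 0xB <= cp <= 0xC or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)


def check_characters(value, version, allows_char_refs):
    """Raise SERE0006 if value cannot be serialized in this context.

    allows_char_refs is True for text and attribute values and False for
    contexts such as comments, where no character reference is possible.
    """
    for ch in value:
        cp = ord(ch)
        if not is_xml_char(cp, version):
            raise SerializationError("SERE0006")
        if version != "1.0" and is_restricted(cp) and not allows_char_refs:
            raise SerializationError("SERE0006")
```

In XML 1.1 a restricted character in a text node is acceptable because it can be written as a character reference, whereas a comment offers no such escape.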
html-version Parameter
The html-version parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.
encoding Parameter
The encoding parameter specifies the encoding to be used for outputting the instance of the data model. Serializers are REQUIRED to support values of UTF-8 and UTF-16. A serialization error [err:SESU0007] occurs if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding. The serializer MUST signal the error, or recover by using UTF-8 or UTF-16 instead. The serializer MUST NOT use an encoding whose name does not match the EncName production of the XML Recommendation [XML10].
When outputting a newline character in the instance of the data model, the serializer is free to represent it using any character sequence that will be normalized to a newline character by an XML parser, unless a specific mapping for the newline character is provided in a character map (see 9 Character Maps).
When outputting any other character that is defined in the selected encoding, the character MUST be output using the correct representation of that character in the selected encoding.
It is possible that the instance of the data model will contain a character that cannot be represented in the encoding that the serializer is using for output. In this case, if the character occurs in a context where XML recognizes character references (that is, in the value of an attribute node or text node), then the character MUST be output as a character reference. A serialization error [err:SERE0008] occurs if such a character appears in a context where character references are not allowed (for example, if the character occurs in the name of an element). The serializer MUST signal the error.
For example, if a text node contains the character LATIN SMALL LETTER E WITH ACUTE (#xE9), and the value of the encoding parameter is US-ASCII, the character MUST be serialized as a character reference. If a comment node contains the same character, a serialization error [err:SERE0008] results.
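Python's standard codec machinery happens to model this distinction closely. In the sketch below, `encode_text` and `encode_comment` are hypothetical helper names; the `xmlcharrefreplace` error handler is a standard Python feature that emits decimal character references, which are equivalent to the hexadecimal form.

```python
from xml.sax.saxutils import escape


class SerializationError(Exception):
    """Hypothetical exception carrying an error code such as 'SERE0008'."""


def encode_text(text, encoding):
    """Text node content: unrepresentable characters become references."""
    return escape(text).encode(encoding, "xmlcharrefreplace")


def encode_comment(text, encoding):
    """Comment content: character references are not recognized here."""
    try:
        return ("<!--%s-->" % text).encode(encoding)
    except UnicodeEncodeError as exc:
        raise SerializationError("SERE0008") from exc
```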
indent and suppress-indentation Parameters
The indent and suppress-indentation parameters control whether the serializer MAY adjust the whitespace in the serialized result so that a person will find it easier to read. If the indent parameter has the value yes, the serializer MAY output whitespace characters in addition to the whitespace characters in the instance of the data model. It MAY also elide from the output whitespace characters that occurred in the instance of the data model or replace such whitespace characters with other whitespace characters.
[Definition: The term content has the same meaning as the term Content defined in Section 3.1 Start-Tags, End-Tags, and Empty-Element Tags of [XML10].] [Definition: The immediate content of an element is the part of the content of the element that is not also in the content of a child element of that element.]
If the indent parameter has the value no, the serializer MUST NOT output any additional whitespace characters, nor elide or replace whitespace characters. If the indent parameter has the value yes, the serializer MUST use an algorithm for dealing with whitespace characters that satisfies all of the following constraints.
If more than one constraint applies, the serializer
MUST apply the most restrictive constraint. That
is, if any applicable constraint indicates that whitespace
MUST NOT be added, elided or replaced, that
constraint prevails; if an applicable constraint indicates that
whitespace SHOULD NOT be added, elided or
replaced, while all other applicable constraints indicate that
whitespace MAY be added, elided or replaced,
whitespace SHOULD NOT be added, elided or
replaced.
Whitespace characters MAY be added adjacent to a text node only if the text node contains only whitespace characters. Whitespace characters in such a text node MAY also be elided or replaced. For example, a tab MAY be inserted as a replacement for existing spaces.
Whitespace characters MAY be added, elided or
replaced in the immediate content of an element whose type
annotation is xs:untyped
or xs:anyType
and that has element node children, in the immediate content
of an element whose content model is element only, or outside the
content of any element.
Whitespace characters MUST NOT be added, elided or replaced in the immediate content of an element whose content model is known to be simple or empty.
Whitespace characters SHOULD NOT be added, elided or replaced in places where the characters would constitute significant whitespace, for example, in the immediate content of an element that is annotated with a type other than xs:untyped or xs:anyType, and whose content model is known to be mixed.
Whitespace characters MUST NOT be added,
elided or replaced in the content of an element whose expanded
QName is a member of the list of expanded QNames in the value of
the suppress-indentation
parameter.
Whitespace characters MUST NOT be added, elided or replaced in a part of the result document that is controlled by an xml:space attribute with value preserve. (See [XML10] for more information about the xml:space attribute.)
Note:
The effect of these rules is to ensure that whitespace is only
added in places where (a) XSLT's
<xsl:strip-space>
declaration could cause it to
be removed, and (b) it does not affect the string value of any element node with simple content. It is usually
not safe to indent document types that include elements with mixed
content.
Note:
The whitespace added may possibly be based on whitespace stripped from either the source document or the stylesheet (in the case of XSLT), or guided by other means that might depend on the host language, in the case of an instance of the data model created using some other process.
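A conservative reading of the constraints above can be sketched in Python. This is an illustration only: `indentable` is a hypothetical function, every element is assumed to be annotated xs:untyped (no schema information is consulted), and any element whose immediate content holds non-whitespace text is left alone, which is stricter than the rules require.

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def indentable(root, suppress=frozenset()):
    """List tags of elements whose immediate content may gain whitespace."""
    allowed = []

    def visit(el, suppressed, preserved):
        space = el.get(XML_SPACE)
        if space is not None:
            # xml:space is inherited until a descendant overrides it.
            preserved = (space == "preserve")
        # suppress-indentation covers the whole content of the element.
        suppressed = suppressed or el.tag in suppress
        texts = [el.text] + [child.tail for child in el]
        whitespace_only = all(not t or not t.strip(" \t\r\n") for t in texts)
        if len(el) and whitespace_only and not suppressed and not preserved:
            allowed.append(el.tag)
        for child in el:
            visit(child, suppressed, preserved)

    visit(root, False, False)
    return allowed


sample = ET.fromstring(
    '<a><b><c/></b><p>text <i>x</i></p>'
    '<pre xml:space="preserve"><d/></pre></a>')
```

For the `sample` tree, only `a` and `b` qualify: `p` has mixed content, `pre` is governed by `xml:space="preserve"`, and the remaining elements have no element children.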
cdata-section-elements Parameter
The cdata-section-elements parameter contains a list of expanded QNames. If the expanded QName of the parent of a text node is a member of the list, then the text node MUST be output as a CDATA section, except in those circumstances described below.
If the text node contains the sequence of characters ]]>, then the currently open CDATA section MUST be closed following the ]] and a new CDATA section opened before the >.
If the text node contains characters that are not representable in the character encoding being used to output the instance of the data model, then the currently open CDATA section MUST be closed before such characters, the characters MUST be output using character references or entity references, and a new CDATA section MUST be opened for any further characters in the text node.
CDATA sections MUST NOT be used except where
they have been explicitly requested by the user, either by using
the cdata-section-elements
parameter, or by using some
other implementation-defined mechanism.
Note:
This is phrased to permit an implementor to provide an option that attempts to preserve CDATA sections present in the source document.
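The two splitting rules for CDATA sections can be sketched in Python as follows. The function name `cdata_section` is hypothetical, and the sketch assumes that the encoding name is one that Python's codec registry recognizes; unrepresentable characters are written as hexadecimal character references outside any CDATA section.

```python
def cdata_section(text, encoding="utf-8"):
    """Serialize a text node as one or more CDATA sections."""
    out = []
    run = []

    def flush():
        # Emit the pending run as CDATA, splitting every "]]>" so that
        # "]]" ends one section and ">" starts the next.
        if run:
            chunk = "".join(run).replace("]]>", "]]]]><![CDATA[>")
            out.append("<![CDATA[" + chunk + "]]>")
            run.clear()

    for ch in text:
        try:
            ch.encode(encoding)
            run.append(ch)
        except UnicodeEncodeError:
            # Close the open section and fall back to a character reference.
            flush()
            out.append("&#x%X;" % ord(ch))
    flush()
    return "".join(out)
```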
omit-xml-declaration and standalone Parameters
The XML output method MUST output an XML declaration if the omit-xml-declaration parameter has the value no. The XML declaration MUST include both version information and an encoding declaration. If the standalone parameter has the value yes or the value no, the XML declaration MUST include a standalone document declaration with the same value as the value of the standalone parameter. If the standalone parameter has the value omit, the XML declaration MUST NOT include a standalone document declaration; this ensures that it is both an XML declaration (allowed at the beginning of a document entity) and a text declaration (allowed at the beginning of an external general parsed entity).
A serialization error [err:SEPM0009] results if the omit-xml-declaration parameter has the value yes, and
the standalone parameter has a value other than omit; or
the version parameter has a value other than 1.0 and the doctype-system parameter is specified.
The serializer MUST signal the error.
Otherwise, if the omit-xml-declaration parameter has the value yes, the XML output method MUST NOT output an XML declaration.
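The interaction of these parameters can be summarized in a short Python sketch. The function `xml_declaration` and its argument names are hypothetical; parameter values are passed as the strings used in this section, and an unspecified doctype-system is represented by `None`.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying an error code such as 'SEPM0009'."""


def xml_declaration(version, encoding, omit_xml_declaration, standalone,
                    doctype_system=None):
    """Return the XML declaration to output, or '' when it is omitted."""
    if omit_xml_declaration == "yes":
        if standalone != "omit":
            raise SerializationError("SEPM0009")
        if version != "1.0" and doctype_system is not None:
            raise SerializationError("SEPM0009")
        return ""
    decl = '<?xml version="%s" encoding="%s"' % (version, encoding)
    if standalone in ("yes", "no"):
        decl += ' standalone="%s"' % standalone
    # With standalone=omit the result is also a valid text declaration.
    return decl + "?>"
```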
doctype-system and doctype-public Parameters
If the doctype-system parameter is specified, the XML output method MUST output a document type declaration immediately before the first element. The name following <!DOCTYPE MUST be the name of the first element, if any. If the doctype-public parameter is also specified, then the XML output method MUST output PUBLIC followed by the public identifier and then the system identifier; otherwise, it MUST output SYSTEM followed by the system identifier. The internal subset MUST be empty. The doctype-public parameter MUST be ignored unless the doctype-system parameter is specified.
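As a brief illustration, the rules above might be coded as follows in Python. The function name `doctype_declaration` is hypothetical, and the sketch assumes that neither identifier contains a double quote character; a real serializer would need to choose the quote delimiter accordingly.

```python
def doctype_declaration(first_element_name, doctype_system=None,
                        doctype_public=None):
    """Return the document type declaration, or '' if none is required."""
    if doctype_system is None:
        # doctype-public on its own is ignored.
        return ""
    if doctype_public is not None:
        return '<!DOCTYPE %s PUBLIC "%s" "%s">' % (
            first_element_name, doctype_public, doctype_system)
    return '<!DOCTYPE %s SYSTEM "%s">' % (first_element_name, doctype_system)
```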
undeclare-prefixes Parameter
The Data Model allows an element node that binds a non-empty prefix to have a child element node that does not bind that same prefix. In Namespaces in XML 1.1 ([XML Names 1.1]), this can be represented accurately by undeclaring prefixes. For the undeclaring prefix of the child element node, if the undeclare-prefixes parameter has the value yes, the output method is XML or XHTML, and the version parameter value is greater than 1.0, the serializer MUST undeclare its namespace. If the undeclare-prefixes parameter has the value no and the output method is XML or XHTML, then the undeclaration of prefixes MUST NOT occur.
Consider an element x:foo with four in-scope namespaces that associate prefixes with URIs as follows:
x is associated with http://example.org/x
y is associated with http://example.org/y
z is associated with http://example.org/z
xml is associated with http://www.w3.org/XML/1998/namespace
Suppose that it has a child element x:bar with three in-scope namespaces:
x is associated with http://example.org/x
y is associated with http://example.org/y
xml is associated with http://www.w3.org/XML/1998/namespace
If namespace undeclaration is in effect, it will be serialized this way:
<x:foo xmlns:x="http://example.org/x"
       xmlns:y="http://example.org/y"
       xmlns:z="http://example.org/z">
  <x:bar xmlns:z="">...</x:bar>
</x:foo>
In Namespaces in XML 1.0 ([XML
Names]), prefix undeclaration is not possible. If the output
method is XML or XHTML, the value of the
undeclare-prefixes
parameter is yes
, and
the value of the version
parameter is
1.0
, a serialization error [err:SEPM0010] results; the serializer
MUST signal the error.
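The interaction of the undeclare-prefixes and version parameters can be sketched as follows. This is a non-normative illustration: the SerializationError class, the function name, and the representation of in-scope namespaces as prefix-to-URI dictionaries are all assumptions made for the example, and the output method is assumed to be XML or XHTML.

```python
class SerializationError(Exception):
    """Hypothetical error type carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def prefix_undeclarations(parent_ns, child_ns, undeclare_prefixes, version):
    """Return the xmlns:prefix="" attributes to write on the child element."""
    if not undeclare_prefixes:
        # Value "no": undeclaration of prefixes does not occur.
        return []
    if version == "1.0":
        # Namespaces in XML 1.0 cannot undeclare prefixes.
        raise SerializationError(
            "SEPM0010", "undeclare-prefixes is yes but version is 1.0")
    # Non-empty prefixes bound by the parent but not by the child.
    dropped = sorted(p for p in parent_ns if p and p not in child_ns)
    return [f'xmlns:{p}=""' for p in dropped]


foo = {"x": "http://example.org/x", "y": "http://example.org/y",
       "z": "http://example.org/z",
       "xml": "http://www.w3.org/XML/1998/namespace"}
bar = {"x": "http://example.org/x", "y": "http://example.org/y",
       "xml": "http://www.w3.org/XML/1998/namespace"}
```

Applied to the x:foo and x:bar example above with version 1.1, the sketch yields the single undeclaration for the prefix z.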
normalization-form Parameter
The normalization-form
parameter is applicable to
the XML output method. The values NFC
and
none
MUST be supported by the
serializer. A
serialization
error [err:SESU0011] results if the value of the
normalization-form
parameter specifies a normalization
form that is not supported by the serializer; the serializer MUST signal the
error.
The meanings associated with the possible values of the
normalization-form
parameter are as follows:
NFC
specifies the serialized result will be in
Normalization Form C, using the rules specified in [UAX #15: Unicode Normalization
Forms].
NFD
specifies the serialized result will be in
Normalization Form D, as specified in [UAX #15: Unicode Normalization
Forms].
NFKC
specifies the serialized result will be in
Normalization Form KC, as specified in [UAX #15: Unicode Normalization
Forms].
NFKD
specifies the serialized result will be in
Normalization Form KD, as specified in [UAX #15: Unicode Normalization
Forms].
fully-normalized
specifies the serialized result
will be in fully normalized text, as specified in Section
5.4.6 fn:normalize-unicode FO30 of
[XQuery and XPath Functions and
Operators 3.0].
none
specifies that no Unicode
Normalization will be applied.
An implementation-defined value has an implementation-defined effect.
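As a non-normative illustration, the values defined by [UAX #15: Unicode Normalization Forms] map directly onto the normalization routine of Python's standard unicodedata module. The function and error class below are hypothetical. The sketch deliberately does not support fully-normalized or implementation-defined values, so it reports those with [err:SESU0011]; a serializer is only required to support NFC and none.

```python
import unicodedata


class SerializationError(Exception):
    """Hypothetical error type carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# Forms this sketch supports beyond "none".
_UAX15_FORMS = frozenset({"NFC", "NFD", "NFKC", "NFKD"})


def apply_normalization_form(text, normalization_form="none"):
    """Apply the requested Unicode normalization to serialized text."""
    if normalization_form == "none":
        return text
    if normalization_form in _UAX15_FORMS:
        return unicodedata.normalize(normalization_form, text)
    # Anything else is unsupported by this particular sketch.
    raise SerializationError(
        "SESU0011", f"unsupported normalization form: {normalization_form}")
```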
If the value of the parameter is fully-normalized
,
then no relevant construct of the parsed entity created by
the serializer may
start with a composing character. The term relevant
construct has the meaning defined in section 2.13 of [XML11]. If this condition is not satisfied, a
serialization
error [err:SERE0012] MUST be
signaled.
Note:
Specifying fully-normalized
as the value of this
parameter does not guarantee that the XML document output by the
serializer will in
fact be fully normalized as defined in [XML11]. This is because the serializer does not check that the text is
include normalized
, which would involve checking all
external entities that it refers to (such as an external DTD).
Furthermore, the serializer does not check whether any character
escape generated using character maps represents a composing
character.
media-type Parameter
The media-type
parameter is applicable to the XML
output method. See 3 Serialization
Parameters for more information.
use-character-maps Parameter
The use-character-maps
parameter is applicable to
the XML output method. The result of serialization using the XML
output method is not guaranteed to be well-formed XML if character
maps have been specified. See 9
Character Maps for more information.
byte-order-mark Parameter
The byte-order-mark
parameter is applicable to the
XML output method. See 3 Serialization
Parameters for more information.
Note:
The byte order mark may be undesirable under certain
circumstances; for example, to concatenate resulting XML fragments
without additional processing to remove the byte order mark.
Therefore this specification does not mandate the
byte-order-mark
parameter to have the value
yes
when the encoding is UTF-16, even though the XML
1.0 and XML 1.1 specifications state that entities encoded in
UTF-16 MUST begin with a byte order mark.
Consequently, this specification does not guarantee that the
resulting XML fragment, without a byte order mark, will not cause
an error when processed by a conforming XML processor.
escape-uri-attributes Parameter
The escape-uri-attributes
parameter is not
applicable to the XML output method. It is the responsibility of
the host
language to specify whether an error occurs if this parameter
is specified in combination with the XML output method, or if the
parameter is simply dropped.
include-content-type Parameter
The include-content-type
parameter is not
applicable to the XML output method. It is the responsibility of
the host
language to specify whether an error occurs if this parameter
is specified in combination with the XML output method, or if the
parameter is simply dropped.
item-separator Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization.
The XHTML output method serializes the instance of the data model as XML, using the HTML compatibility guidelines defined in the XHTML specification ([XHTML 1.0]) or the XHTML syntax of current drafts of HTML5 and related specifications (see [HTML5] and [Polyglot]).
At the time this document was published, the current version of [HTML5] was that cited in A.1 Normative References. Like all draft W3C specifications, [HTML5] is subject to revision before final publication as a W3C Recommendation. For all information normatively derived in this specification from [HTML5], processors conforming to this specification MUST take the information in question from the version cited in A.1 Normative References, or from later versions of [HTML5] published by W3C. If they take the information from versions other than the one cited in A.1 Normative References, then it is implementation-defined which future version of [HTML5] is used as the source of the information, including the lists of elements recognized as HTML elements, void elements, phrasing content, and Boolean attributes. If future versions of [HTML5] differ from the current draft in any of these areas, implementations MAY support multiple versions, and MAY provide a user option for choosing which one to use.
[Definition: An element node is recognized as an HTML element by the XHTML output method if]
the element node is in the XHTML namespace, regardless of the value of
the html-version
serialization parameter or if
the html-version
serialization parameter is
absent; or
the value of the html-version
serialization parameter is 5.0
, the element has a
null
namespace URI, and the local part of the name is equal to the
name of an element defined by HTML5 [HTML5],
making the comparison without regard to case.
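The two conditions of this definition can be sketched as a predicate. This is a non-normative illustration: the function name is hypothetical, html_version is modelled as an optional number (None when the parameter is absent), and HTML5_ELEMENT_NAMES is a deliberately tiny placeholder for the full list of element names defined by [HTML5].

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Placeholder subset only; a real serializer would use the full HTML5 list.
HTML5_ELEMENT_NAMES = frozenset(
    {"html", "head", "body", "p", "br", "div", "span"})


def is_recognized_as_html_element(namespace_uri, local_name,
                                  html_version=None):
    """Decide whether the XHTML output method recognizes an HTML element."""
    # Condition 1: XHTML namespace, whatever html-version is (or if absent).
    if namespace_uri == XHTML_NS:
        return True
    # Condition 2: html-version 5.0, null namespace URI, and a local name
    # equal to an HTML5 element name, compared without regard to case.
    return (html_version == 5.0
            and not namespace_uri
            and local_name.lower() in HTML5_ELEMENT_NAMES)
```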
Note:
As noted elsewhere, processors conforming to this specification MUST support the list of elements defined in the version of [HTML5] current at the time this specification is published, or that given in some later version of [HTML5]. If they support the list in a later version, it is implementation-defined which version of [HTML5] they support.
It is entirely the responsibility of the person or process that
creates the instance of the data model to ensure that the instance
of the data model conforms to the [XHTML 1.0]
or [XHTML 1.1] specification if the
html-version
serialization parameter is absent or has
a value less than 5.0
or that it
conforms to the XHTML syntax of HTML5 if the
value of the html-version
serialization parameter is
5.0
. It is not an error if the instance of the
data model is invalid XHTML. Equally, it is entirely under the
control of the person or process that creates the instance of the
data model whether the output conforms to XHTML 1.0 Strict, XHTML
1.0 Transitional, the XHTML syntax of HTML5 (see [HTML5]), [Polyglot] or any other specific definition of
XHTML.
The serialization of the instance of the data model follows the same rules as for the XML output method, with the general exceptions noted below and parameter-specific exceptions in 6.1 The Influence of Serialization Parameters upon the XHTML Output Method. These differences are based on the HTML compatibility guidelines published in Appendix C of [XHTML 1.0] and on [Polyglot], both of which are designed to ensure that as far as possible, XHTML is rendered correctly on user agents designed originally to handle HTML.
If the value of the html-version
serialization
parameter is 5.0
, the instance of the data model that
is to be serialized is first subjected to prefix
normalization.
[Definition: During prefix normalization, any element node in the instance of the data model that is to be serialized that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element's namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the instance of the data model is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.]
The process of prefix normalization is equivalent to replacing the instance of the data model that is to be serialized with the result of the transformation described by this XSLT stylesheet, with the instance of the data model as the initial context item.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:xhtml="http://www.w3.org/1999/xhtml"
                xmlns:svg="http://www.w3.org/2000/svg"
                xmlns:mathml="http://www.w3.org/1998/Math/MathML">

  <xsl:template match="xhtml:*|svg:*|mathml:*">
    <xsl:element name="{local-name()}" namespace="{namespace-uri()}">
      <xsl:call-template name="copy-namespace-nodes"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="node()|@*">
    <xsl:copy copy-namespaces="no">
      <xsl:call-template name="copy-namespace-nodes"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template name="copy-namespace-nodes">
    <xsl:copy-of select="namespace::*
        [not(. = ('http://www.w3.org/1999/xhtml',
                  'http://www.w3.org/2000/svg',
                  'http://www.w3.org/1998/Math/MathML'))]"/>
  </xsl:template>

</xsl:stylesheet>
[Definition: The following XHTML elements have an EMPTY content model: area, base, br, col, embed, hr, img, input, link, meta, basefont, frame, isindex, and param.]
[Definition: The void elements of HTML5 are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr.]
Note:
This list of void elements is that given for void elements in section 8.1.2 of the draft of [HTML5] current at the time this document is published. As noted elsewhere, processors conforming to this specification MAY support the list of void elements included in later versions of [HTML5].
[Definition: An element node is expected to be empty if it is recognized as an HTML element and if either]
the html-version
serialization parameter is absent
or has a value less than 5.0
and the content model is
EMPTY, or
the html-version
serialization parameter has the
value 5.0
and the element is a void element.
If an element node that has no child nodes is
not expected to be empty, the
serializer
MUST NOT use the minimized form. That is, it
MUST output <p></p>
and
not <p />
.
If an element that has no children is
expected
to be empty, the serializer MUST use the
minimized tag syntax, for example <br />
,
as the alternative syntax <br></br>
allowed by XML gives uncertain results in many legacy
user agents. If the html-version
serialization parameter is absent or has a value less
than 5.0
, the serializer MUST include a space
before the trailing />
, e.g.
<br />
, <hr />
and
<img src="karen.jpg" alt="Karen" />
.
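The choice between the minimized form and a start-tag and end-tag pair can be sketched as follows. This is a non-normative illustration with hypothetical function names. It assumes the element has already been recognized as an HTML element, that it has no attributes and no children, and that local names are compared exactly as given. The sketch always writes the space before the trailing />, which is required when html-version is absent or less than 5.0.

```python
XHTML_EMPTY = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "basefont", "frame", "isindex", "param"})

HTML5_VOID = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "meta", "param", "source", "track", "wbr"})


def is_expected_to_be_empty(local_name, html_version=None):
    """Apply the 'expected to be empty' definition to a recognized element."""
    if html_version is None or html_version < 5.0:
        return local_name in XHTML_EMPTY
    if html_version == 5.0:
        return local_name in HTML5_VOID
    return False  # Versions above 5.0 are outside the scope of this sketch.


def serialize_childless_element(local_name, html_version=None):
    """Serialize an attribute-free element that has no child nodes."""
    if is_expected_to_be_empty(local_name, html_version):
        return f"<{local_name} />"
    return f"<{local_name}></{local_name}>"
```

Note that wbr is minimized only under html-version 5.0, and basefont only when html-version is absent or below 5.0, because the two lists differ.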
If the html-version
serialization parameter is absent or has a value less
than 5.0
, the serializer MUST NOT use the
entity reference &apos;
which, although
defined in XML and therefore in XHTML, is not defined
in versions of HTML prior to HTML5, and
is not recognized by all HTML user agents.
If the html-version
serialization parameter is absent or has a value less
than 5.0
, the serializer SHOULD output
namespace declarations in a way that is consistent with the
requirements of the XHTML DTD if this is possible. If the
value of the html-version
serialization
parameter is 5.0
, the serializer SHOULD output
namespace declarations in a way that ensures that the
namespace declarations in the resulting document are quirk-compatible
(as defined in 6.2 Polyglot
markup and namespace declarations).
Note:
If the html
element is generated by an XSLT literal
result element of the form <html
xmlns="http://www.w3.org/1999/xhtml"> ... </html>
,
or by an XQuery direct element constructor of the same form, then
the html
element in the result document will have a
node name whose prefix is "",
which will satisfy the requirements of the DTD. In other cases the
prefix assigned to the element is implementation-dependent.
Note:
The XHTML 1.0 DTDs require the declaration
xmlns="http://www.w3.org/1999/xhtml"
to appear on the
html
element, and only on the html
element. The [Polyglot] specification
(see also 6.2 Polyglot markup
and namespace declarations below) permits namespace
declarations to appear in a conforming document, but there are
restrictions on the elements on which they can appear.
The serializer MUST output namespace declarations that are consistent with the namespace nodes present in the result tree, but it SHOULD avoid outputting redundant namespace declarations on elements where the DTD would make them invalid, for versions prior to HTML5, or where they would not be quirk-compatible, for serialization according to the syntax of HTML5.
Note:
[Polyglot] and Appendix C of [XHTML 1.0] describe a number of compatibility guidelines for users of XHTML who wish to render their XHTML documents with HTML user agents. In some cases, such as the guideline on the form empty elements take, only the serialization process itself has the ability to follow the guideline. In such cases, those guidelines are reflected in the requirements on the serializer described above.
In all other cases, the guidelines can be adhered to by the
instance of the data model that is input to the serialization
process. The guideline on the use of whitespace characters in
attribute values is one such example. Another example is that
xml:lang="..."
does not serialize to both
xml:lang="..."
and lang="..."
as required
by some legacy user agents. It is the responsibility of the person
or process that creates the instance of the data model that is
input to the serialization process to ensure it is created in a way
that is consistent with the guidelines. No serialization error
results if the input instance of the data model does not adhere to
the guidelines.
version Parameter
The behavior for the version
parameter for the
XHTML output method is described in 5.1.1
XML Output Method: the version Parameter.
html-version Parameter
The html-version
parameter specifies whether the
XHTML output method will produce a serialized document following
rules that are tailored to the requirements of the XHTML syntax of
[HTML5] or the requirements of [XHTML 1.0] and [XHTML
1.1].
The differences are described in detail throughout 6 XHTML Output Method.
encoding Parameter
The behavior of the encoding
parameter for the XHTML
output method is described in 5.1.3 XML
Output Method: the encoding Parameter.
indent and suppress-indentation Parameters
If the indent
parameter has the value
yes
, the serializer MAY add or remove
whitespace as it serializes the result tree, if it observes the following
constraints.
Whitespace MUST NOT be added other than before or after an element, or adjacent to an existing whitespace character.
Whitespace MUST NOT be added or removed
adjacent to an inline element. The inline elements are those
elements recognized as HTML elements that
are in the %inline category of any of the XHTML 1.0 DTDs,
in the %inline.class category of the XHTML 1.1 DTD, those
elements defined to be phrasing content in [HTML5], and elements recognized as HTML elements with
local names ins
and del
if they are used
as inline elements (i.e., if they do not contain element
children).
[Definition: The elements listed as phrasing content in [HTML5] are: a, abbr, area (if it is a descendant of a map element), audio, b, bdi, bdo, br, button, canvas, cite, code, data, datalist, del, dfn, em, embed, i, iframe, img, input, ins, kbd, keygen, label, map, mark, math, meter, noscript, object, output, progress, q, ruby, s, samp, script, select, small, span, strong, sub, sup, svg, template, textarea, time, u, var, video, and wbr.]
Note:
This list of phrasing content is that given in section 3.2.4.1.5 Phrasing content of the draft of [HTML5] current at the time this document is published. As noted elsewhere, processors conforming to this specification MAY support the list of phrasing-content elements included in later versions of [HTML5].
Whitespace MUST NOT be added or removed
inside a formatted element, the formatted elements being those
recognized as HTML elements with
local names pre
, script
,
style
, title
, and
textarea
.
Whitespace characters MUST NOT be added in the
content of an element whose expanded QName matches a
member of the list of expanded QNames in the value of the
suppress-indentation
parameter. The expanded
QName of an element node is considered to match a member of the
list of expanded QNames if:
the two expanded QNames are equal;
the expanded QNames both have null namespace URIs, and the local parts of the two QNames are equal without regard to case; or
the value of the html-version
serialization parameter is 5.0
, the local parts of the
two QNames are equal without regard to case and one QName has a
null
namespace URI and the namespace URI of the other is equal to
the XHTML
namespace URI.
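The three matching conditions for suppress-indentation can be sketched as follows. This is a non-normative illustration: the function name is hypothetical, an expanded QName is modelled as a (namespace URI, local part) pair with None for a null namespace URI, and case-insensitive comparison is approximated with str.lower().

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def qname_matches(element, listed, html_version=None):
    """Test an element's expanded QName against one listed QName."""
    (e_ns, e_local), (l_ns, l_local) = element, listed
    # Condition 1: the two expanded QNames are equal.
    if element == listed:
        return True
    same_local = e_local.lower() == l_local.lower()
    # Condition 2: both null namespace URIs, local parts equal ignoring case.
    if e_ns is None and l_ns is None and same_local:
        return True
    # Condition 3: html-version 5.0, local parts equal ignoring case, one
    # QName in no namespace and the other in the XHTML namespace.
    return (html_version == 5.0 and same_local
            and {e_ns, l_ns} == {None, XHTML_NS})
```

Two names that are both in the XHTML namespace but differ in case do not match under any of the three conditions.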
Note:
The effect of the above constraints is to ensure any insertion or deletion of whitespace would not affect how an HTML user agent that conforms to the specified version of HTML would render the output, assuming the serialized document does not refer to any HTML style sheets.
The HTML definition of whitespace is different from the XML definition: see section 9.1 of the [HTML] 4.01 specification.
cdata-section-elements Parameter
The behavior of the cdata-section-elements
parameter
for the XHTML output method is described in 5.1.5 XML Output Method: the
cdata-section-elements Parameter.
omit-xml-declaration and standalone Parameters
The behavior of the omit-xml-declaration
and
standalone
parameters for the XHTML output method is
described in 5.1.6 XML
Output Method: the omit-xml-declaration and standalone
Parameters.
Note:
As with the XML output method, the XHTML output method specifies
that an XML declaration will be output unless it is suppressed
using the omit-xml-declaration
parameter. Appendix C.1
of [XHTML 1.0] provides advice on the
consequences of including, or omitting, the XML declaration.
doctype-system and doctype-public Parameters
If the value of the html-version
serialization parameter is 5.0
, the
doctype-system
serialization parameter is absent,
the first element node child of the document node that
is to be serialized is recognized as an HTML element, the
local part of the QName of which is equal to the string
HTML
, without regard to case, and any text
node preceding that element in document order contains only
whitespace characters, then the XHTML output method
MUST output a document type declaration
immediately before the first element, with no public or system
identifier. The name following <!DOCTYPE
MUST be the same as the local part of the
name of the element.
Otherwise, the behavior for
doctype-system
and doctype-public
parameters for the XHTML output method is described in 5.1.7 XML Output Method: the doctype-system and
doctype-public Parameters.
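The special case for html-version 5.0 can be sketched as follows. This is a non-normative illustration with hypothetical names: the caller is assumed to have already determined whether the first element child is recognized as an HTML element, and to supply the string values of any text nodes that precede it. A result of None means the rules of the XML output method apply instead.

```python
XML_WHITESPACE = " \t\r\n"


def xhtml5_doctype(html_version, doctype_system, first_element_local_name,
                   first_element_is_html, preceding_text):
    """Return the identifier-free DOCTYPE, or None to use the XML rules."""
    if (html_version == 5.0
            and doctype_system is None
            and first_element_is_html
            and first_element_local_name.lower() == "html"
            and all(t.strip(XML_WHITESPACE) == "" for t in preceding_text)):
        # The name is the local part of the element name, as written.
        return f"<!DOCTYPE {first_element_local_name}>"
    return None
```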
undeclare-prefixes Parameter
The behavior of the undeclare-prefixes
parameter for
the XHTML output method is described in 5.1.8 XML Output Method: the
undeclare-prefixes Parameter.
normalization-form Parameter
The behavior of the normalization-form
parameter for
the XHTML output method is described in 5.1.9 XML Output Method: the
normalization-form Parameter.
media-type Parameter
The behavior of the media-type
parameter for the XHTML
output method is described in 5.1.10
XML Output Method: the media-type Parameter.
use-character-maps Parameter
The behavior of the use-character-maps
parameter for
the XHTML output method is described in 5.1.11 XML Output Method: the
use-character-maps Parameter.
byte-order-mark Parameter
The behavior of the byte-order-mark
parameter for the
XHTML output method is described in 5.1.12 XML Output Method: the
byte-order-mark Parameter.
escape-uri-attributes Parameter
If the escape-uri-attributes
parameter has the
value yes
, the XHTML output method
MUST apply URI escaping to URI attribute values, except that
relative URIs MUST NOT be absolutized.
Note:
This escaping is deliberately confined to non-ASCII characters,
because escaping of ASCII characters is not always appropriate, for
example when URIs or URI fragments are interpreted locally by the
HTML user agent. Even in the case of non-ASCII characters, escaping
can sometimes cause problems. More precise control of URI escaping is therefore
available by setting escape-uri-attributes
to
no
, and controlling the escaping of URIs by using
methods defined in Section
6.2 fn:encode-for-uri FO30 and
Section
6.3 fn:iri-to-uri FO30.
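The following sketch illustrates escaping that leaves printable ASCII untouched. It is non-normative and rests on assumptions not stated in this section: that the value is first normalized to NFC, and that each character outside the printable ASCII range (code points 32 to 126) is replaced by the percent-encoded octets of its UTF-8 representation. The function name is hypothetical, and the precise definition of URI escaping given elsewhere in this specification takes precedence.

```python
import unicodedata


def escape_uri_attribute(value):
    """Percent-encode characters outside printable ASCII (assumed rule)."""
    out = []
    for ch in unicodedata.normalize("NFC", value):
        if 32 <= ord(ch) <= 126:
            out.append(ch)  # Printable ASCII is left exactly as it is.
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

The function never resolves its argument against a base URI, so relative URIs stay relative.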
include-content-type Parameter
If the instance of the data model includes a head
element recognized as an HTML element, and the
include-content-type
parameter has the value
yes
, the XHTML output method MUST add
a meta
element as the first child element of the
head
element, specifying the character encoding
actually used.
For example,
<head> <meta http-equiv="Content-Type" content="text/html; charset=EUC-JP" /> ...
The content type SHOULD be set to the value
given for the media-type
parameter.
Note:
It is recommended that the host language use as default value for this
parameter one of the MIME types ([RFC2046])
registered for XHTML. Currently, these are text/html
(registered by [RFC2854]) and
application/xhtml+xml
(registered by [RFC3236]). Note that some user agents fail to
recognize the charset parameter if the content type is not
text/html
.
If a meta
element has been added to the
head
element as described above, then any existing
meta
element child of the head
element
having an http-equiv
attribute with the value
"Content-Type", making the comparison without regard to
case after first stripping leading and trailing spaces from the
value of the attribute solely for the purposes of
comparison, MUST be discarded.
Note:
This process removes possible parameters in the attribute value. For example,
<meta http-equiv="Content-Type" content="text/html;version='3.0'" />
in the data model instance would be replaced by,
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
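The insertion and discarding of meta elements can be sketched as follows. This is a non-normative illustration: the function name is hypothetical, and the children of the head element are modelled as (local name, attribute dictionary) pairs, all assumed to be recognized as HTML elements.

```python
def add_content_type_meta(head_children, media_type, encoding):
    """Prepend a Content-Type meta element and discard existing ones."""

    def is_content_type_meta(child):
        name, attrs = child
        # Strip leading and trailing spaces, then compare ignoring case.
        http_equiv = attrs.get("http-equiv", "").strip(" ")
        return name == "meta" and http_equiv.lower() == "content-type"

    kept = [c for c in head_children if not is_content_type_meta(c)]
    new_meta = ("meta", {"http-equiv": "Content-Type",
                         "content": f"{media_type}; charset={encoding}"})
    # The new meta element becomes the first child element of head.
    return [new_meta] + kept


head = [("title", {}),
        ("meta", {"http-equiv": " content-type ",
                  "content": "text/html;version='3.0'"}),
        ("meta", {"name": "author", "content": "x"})]
```

Only the meta element whose http-equiv value matches is dropped; other meta elements, such as the author entry in the sample data, are retained.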
item-separator Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization.
[Definition: The namespace declarations in a document are quirk-compatible if and only if the document satisfies the following constraints:]
Each occurrence of the html
element declares the
default namespace to be the XHTML namespace, i.e.
http://www.w3.org/1999/xhtml
.
Each occurrence of the MathML math
element declares
the default namespace to be the MathML namespace, i.e.
http://www.w3.org/1998/Math/MathML
.
Each occurrence of the SVG svg
element declares the
default namespace to be the SVG namespace, i.e.
http://www.w3.org/2000/svg
.
For each occurrence of an attribute in the XLink namespace
(http://www.w3.org/1999/xlink
), some namespace
declaration is in scope binding the prefix xlink
to
that namespace. Namespace declarations for this prefix and
namespace appear only on non-HTML elements.
Note:
It is recommended, for compatibility with [Polyglot], that such namespace declarations
be placed on the enclosing svg
or math
elements.
No other namespace declarations occur in the document.
Note:
This definition is derived from the draft of [Polyglot] current at the time this document was published. Users and implementors of this specification are encouraged to consult the most recent draft of [Polyglot] and to support them if possible. Such support is, however, unrelated to conformance to this specification.
The HTML output method serializes the instance of the data model as HTML.
For example, the following XSL stylesheet generates HTML output:
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" version="4.0"/>

  <xsl:template match="/">
    <html>
      <xsl:apply-templates/>
    </html>
  </xsl:template>

  ...

</xsl:stylesheet>
In the example, the version
attribute of the
xsl:output
element indicates the version of the HTML
Recommendation [HTML] (or [HTML5]) to which the serialized result is to
conform.
At the time this document was published, the current version of [HTML5] was that cited in A.1 Normative References. Like all draft W3C specifications, [HTML5] is subject to revision before final publication as a W3C Recommendation. For all information normatively derived in this specification from [HTML5], processors conforming to this specification MUST take the information in question from the version cited in A.1 Normative References, or from later versions of [HTML5] published by W3C. If they take the information from versions other than the one cited in A.1 Normative References, then it is implementation-defined which future version of [HTML5] is used as the source of the information, including the lists of elements recognized as HTML elements, void elements, phrasing elements, and Boolean attributes. If future versions of [HTML5] differ from the current draft in any of these areas, implementations MAY support multiple versions, and MAY provide a user option for choosing which one to use.
It is entirely the responsibility of the person or process that creates the instance of the data model to ensure that the instance of the data model conforms to the HTML Recommendation [HTML]. It is not an error if the instance of the data model is invalid HTML. Equally, it is entirely under the control of the person or process that creates the instance of the data model whether the output conforms to HTML. If the result tree is valid HTML, the serializer MUST serialize the result in a way that conforms with the version of HTML specified by the requested HTML version.
As is described in detail below, the HTML output method will not output an element differently from the XML output method unless the element is to be serialized as an HTML element. [Definition: The portion of the serialized document representing the result of serializing an element that is not to be serialized as an HTML element is known as an XML Island.] [Definition: An element node is serialized as an HTML element if]
the expanded QName of the element has a null namespace URI, regardless of the value of the requested HTML version, or
the value of the requested HTML version is 5.0
or
greater, and the element node is in the XHTML namespace.
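The definition above reduces to a two-branch predicate. The sketch below is non-normative; the function name is hypothetical and the requested HTML version is modelled as a number.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_serialized_as_html_element(namespace_uri, html_version):
    """Decide whether the HTML output method treats a node as HTML."""
    # A null namespace URI qualifies regardless of the requested version.
    if not namespace_uri:
        return True
    # XHTML-namespace elements qualify only for version 5.0 or greater.
    return html_version >= 5.0 and namespace_uri == XHTML_NS
```

Any element for which this predicate is false is serialized as part of an XML Island.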
If the element is to be serialized as an HTML element, but
the local part of the expanded QName is not recognized as the name
of an HTML element, the element MUST be output in
the same way as a non-empty, inline element such as
span
. In particular:
Any namespace node in the result tree for the XML namespace is ignored
by the HTML output method. In addition, if the requested HTML
version is 5.0
, any element node that has a prefix
and is in the XHTML namespace, MathML
namespace, or SVG namespace MUST be
serialized with an unprefixed element name. The serializer
MUST serialize an attribute with the name
xmlns
whose value is equal to the namespace URI of the
element node, unless an ancestor element in the serialized result
already has an attribute named xmlns
with the same
value, and no intervening element has an attribute named
xmlns
with a different value. If the
element node has a namespace node for the default namespace whose
value is not equal to the namespace URI of the element node,
the namespace node is ignored. The serializer MUST
NOT serialize a namespace declaration for the namespace
node declaring the element node's prefix, unless an attribute of
the element node has the same prefix. For
namespace nodes in the result tree that are not ignored, the
HTML output method MUST represent these namespaces
using attributes named xmlns
or
xmlns:
prefix in the same way as the XML
output method would represent them when the version
parameter is set to 1.0
.
If the result
tree contains elements or attributes whose names have a
non-null namespace URI, the HTML
output method MUST generate namespace-prefixed
QNames for these nodes in the
same way as the XML output method would do when the
version
parameter is set to 1.0
.
Where special rules are defined later in this section for serializing specific HTML elements and attributes, these rules MUST NOT be applied to an element that is not to be serialized as an HTML element or an attribute whose name has a non-null namespace URI. However, the generic rules for the HTML output method that apply to all elements and attributes, for example the rules for escaping special characters in the text and the rules for indentation, MUST be used also for namespaced elements and attributes.
When serializing an element whose name is not defined in the
HTML specification, but that is to be serialized as an HTML element, the
HTML output method MUST apply the same rules (for
example, indentation rules) as when serializing a span
element. The descendants of such an element MUST
be serialized as if they were descendants of a span
element.
When serializing an element whose name is in a non-null
namespace, the HTML output method MUST apply the
same rules (for example, indentation rules) as when serializing a
div
element. The descendants of such an element
MUST be serialized as if they were descendants of
a div
element, except for the influence of the
cdata-section-elements
serialization parameter on any
text node children of the element.
The HTML output method MUST NOT output an
end-tag for an empty element if the element type has an empty
content model, and the value of the requested HTML
version is less than 5.0
, or the element is
a void element
and the value of the requested HTML version is
5.0
.
For HTML 4.0, the element types that have an empty content
model are area
, base
,
basefont
, br
, col
,
embed
, frame
,
hr
, img
, input
,
isindex
, link
, meta
and
param
. For HTML5, the void elements are as defined above in
6 XHTML Output Method. It
is implementation-defined whether the
basefont
, frame
and isindex
elements, which are not part of HTML5, are considered to be void
elements when the requested HTML version has the value
5.0
.
For example, an element written as <br/>
or
<br></br>
in an XSLT stylesheet
MUST be output as <br>
.
Note:
The markup generation step of the phases of serialization only creates start tags and end tags for the HTML output method, never XML-style empty element tags. As such, a serializer MUST serialize an HTML element that has no children, but whose content model is not empty, using a pair of adjacent start and end element tags, or as a solitary start tag if permitted by the context.
For any element node that is to be serialized as an HTML element, the
HTML output method MUST compare the local
part of the name of the element node with the names of HTML
elements making the comparison without regard to
case. If the local part of the name of the element
node compares equal to that of any HTML element, the element node
MUST be recognized as being that kind of HTML
element. For example, elements named br
,
BR
or Br
MUST all be
recognized as the HTML br
element and output without
an end-tag.
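The end-tag rule, combined with case-insensitive recognition, can be sketched as follows. This is a non-normative illustration with a hypothetical function name. It assumes the element is to be serialized as an HTML element and has no attributes or children, it writes the name as given, and it makes one of the permitted implementation-defined choices by not treating basefont, frame and isindex as void when the requested version is 5.0.

```python
HTML4_EMPTY = frozenset({
    "area", "base", "basefont", "br", "col", "embed", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param"})

HTML5_VOID = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input", "keygen",
    "link", "meta", "param", "source", "track", "wbr"})


def serialize_childless_html_element(name, html_version=4.0):
    """Serialize a childless, attribute-free element for HTML output."""
    local = name.lower()  # Recognition ignores case: br, BR and Br agree.
    if html_version < 5.0:
        no_end_tag = local in HTML4_EMPTY
    else:
        no_end_tag = local in HTML5_VOID
    # HTML output never uses the XML-style empty-element tag.
    return f"<{name}>" if no_end_tag else f"<{name}></{name}>"
```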
The HTML output method MUST NOT perform
escaping for any text node descendant, nor for any attribute
of an element node descendant, of a
script
or style
element.
For example, a script
element created by an XQuery
direct element constructor or an XSLT literal result element, such
as:
<script>if (a &lt; b) foo()</script>
or
<script><![CDATA[if (a < b) foo()]]></script>
MUST be output as
<script>if (a < b) foo()</script>
A common requirement is to output a script
element
as shown in the example below:
<script type="application/ecmascript"> document.write ("<em>This won't work</em>") </script>
This is invalid HTML, for the reasons explained in section B.3.2 of the [HTML] 4.01 specification. Nevertheless, it is possible to output this fragment, using either of the following constructs:
Firstly, by use of a script
element created by an
XQuery direct element constructor or an XSLT literal result
element:
<script type="application/ecmascript"> document.write ("<em>This won't work</em>") </script>
Secondly, by constructing the markup from ordinary text characters:
<script type="application/ecmascript"> document.write ("&lt;em&gt;This won't work&lt;/em&gt;") </script>
As the [HTML] specification points out, the correct way to write this is to use the escape conventions for the specific scripting language. For JavaScript, it can be written as:
<script type="application/ecmascript"> document.write ("<em>This will work<\/em>") </script>
The [HTML] 4.01 specification also shows examples of how to write this in various other scripting languages. The escaping MUST be done manually; it will not be done by the serializer.
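The following Python sketch illustrates the two points made above: text inside `script` and `style` is written without escaping, and any escaping of `</` for the scripting language is the author's job. Both function names are hypothetical. The sketch handles only a text node whose parent is the named element, and the choice of which characters to escape elsewhere (`&` and `<`) is a simplifying assumption.

```python
RAW_TEXT_ELEMENTS = frozenset({"script", "style"})


def serialize_text(text: str, parent_local_name: str) -> str:
    """Serialize a text node that is a child of the named element."""
    if parent_local_name.lower() in RAW_TEXT_ELEMENTS:
        # No escaping is performed inside script or style.
        return text
    # Elsewhere, escape markup-significant characters (simplified).
    return text.replace("&", "&amp;").replace("<", "&lt;")


def escape_end_tag_open(javascript: str) -> str:
    """Author-side escaping of "</" using the JavaScript convention.

    This is done manually before serialization; the serializer itself
    never performs it.
    """
    return javascript.replace("</", "<\\/")
```

For instance, `serialize_text("if (a < b) foo()", "script")` leaves the `<` untouched, whereas the same text under a `p` element is escaped.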
The HTML output method MUST NOT escape "<" characters occurring in attribute values.
A boolean attribute is an attribute with only a single allowed value in any of the HTML DTDs, or one that is specified to be a Boolean attribute by [HTML5], where the allowed value is equal, without regard to case, to the name of the attribute. The HTML output method MUST output any boolean attribute in minimized form if and only if the value of the attribute node is actually equal to the name of the attribute, making the comparison without regard to case.
[Definition: The attributes identified as Boolean attributes in [HTML5] are those given in the following table (using just the local name of their parent elements): ]
Attribute | Element(s) |
---|---|
async | script |
autofocus | button, input, keygen, select, textarea |
autoplay | audio, video |
checked | input |
controls | audio, video |
default | track |
defer | script |
disabled | button, fieldset, input, keygen, optgroup, option, select, textarea |
formnovalidate | button, input |
hidden | HTML elements |
ismap | img |
loop | audio, video |
multiple | input, select |
muted | audio, video |
novalidate | form |
open | details |
open | dialog |
readonly | input, textarea |
required | input, select, textarea |
reversed | ol |
scoped | style |
seamless | iframe |
selected | option |
typemustmatch | object |
Note:
This list of Boolean attributes is that given in the index of the draft of [HTML5] current at the time this document is published. As noted elsewhere, processors conforming to this specification MAY support the list of Boolean attributes included in later versions of [HTML5].
For example, a start-tag created using the following XQuery direct element constructor or XSLT literal result element
<OPTION selected="selected">
MUST be output as
<OPTION selected>
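The minimization rule can be sketched as follows. The function name `serialize_attribute` is hypothetical, the set holds only a few entries from the table above, the element on which the attribute appears is not checked, and the escaping of non-minimized values is deliberately simplified.

```python
# A few of the Boolean attributes from the table above (not the full list).
BOOLEAN_ATTRIBUTES = frozenset({"checked", "disabled", "readonly", "selected"})


def serialize_attribute(name: str, value: str) -> str:
    """Serialize one attribute, minimizing Boolean attributes.

    Minimized form is used if and only if the value equals the
    attribute name, comparing without regard to case.
    """
    if name.lower() in BOOLEAN_ATTRIBUTES and value.lower() == name.lower():
        return f" {name}"
    # Simplified escaping; assumes the value is delimited by double quotes.
    escaped = value.replace("&", "&amp;").replace('"', "&quot;")
    return f' {name}="{escaped}"'
```

Note that `selected="yes"` is not minimized, because the value does not equal the attribute name.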
The HTML output method MUST NOT escape a
&
character occurring in an attribute value
immediately followed by a {
character (see Section
B.7.1 of the HTML Recommendation [HTML]).
For example, a start-tag created using the following XQuery direct element constructor or XSLT literal result element
<BODY bgcolor='&amp;{{randomrbg}};'>
MUST be output as
<BODY bgcolor='&{randomrbg};'>
See 7.4 The Influence of Serialization Parameters upon the HTML Output Method for additional directives on how attributes MAY be written.
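The two attribute-value rules stated above (no escaping of `<`, and no escaping of `&` when it is immediately followed by `{`) can be illustrated with a short Python sketch. The function name `escape_attribute_value` is hypothetical, and the sketch assumes that attribute values are delimited by double quotes.

```python
import re

# Matches "&" unless it is immediately followed by "{".
_AMP_NOT_BEFORE_BRACE = re.compile(r"&(?!\{)")


def escape_attribute_value(value: str) -> str:
    """Escape an attribute value for the HTML output method (sketch)."""
    escaped = _AMP_NOT_BEFORE_BRACE.sub("&amp;", value)
    # "<" is deliberately left alone; only the delimiter quote is escaped.
    return escaped.replace('"', "&quot;")
```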
The HTML output method MAY output a character
using a character entity reference in preference to using a numeric
character reference, if an entity is defined for the character in
the version of HTML that the output method is using. Entity
references and character references SHOULD be used
only where the character is not present in the selected encoding,
or where the visual representation of the character is unclear (as
with &nbsp;, for example).
When outputting a sequence of whitespace
characters in the instance of the data model, within an
element where whitespace characters are treated
normally (but not in elements such as pre
and
textarea
), the HTML output method MAY
represent it using any sequence of whitespace
characters that will be treated in the same way by an
HTML user agent. See section 3.5 of [XHTML Modularization] for some
additional information on handling of whitespace by an HTML user
agent for versions of HTML prior to HTML5, and see
[HTML5] for information on the handling of
whitespace characters by an HTML5 user agent.
Note:
The terms space character and white_space character defined in HTML5 do not match the definition of whitespace character in this specification.
Certain characters are permitted in XML, but not in
HTML prior to HTML5 — for example, the control
characters #x7F-#x9F are permitted in both XML 1.0
and XML 1.1, and the control characters #x1-#x8, #xB, #xC and
#xE-#x1F are permitted in XML 1.1, but none of these
is permitted in HTML prior to HTML5. It is a
serialization
error [err:SERE0014] to use the HTML output method
if such characters appear in the instance of the data
model and the value of the requested HTML version is less than
5.0
. The serializer MUST signal the
error.
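A check for the characters named above might look like the following Python sketch. The function names are hypothetical, the requested HTML version is assumed to be available as a number, and `ValueError` merely stands in for whatever mechanism the host language uses to report serialization errors.

```python
def _is_disallowed_control(ch: str) -> bool:
    """True for the control characters not permitted before HTML5."""
    cp = ord(ch)
    return (
        0x01 <= cp <= 0x08
        or cp in (0x0B, 0x0C)
        or 0x0E <= cp <= 0x1F
        or 0x7F <= cp <= 0x9F
    )


def check_html_characters(text: str, html_version: float) -> None:
    """Raise an error standing in for err:SERE0014 when required."""
    if html_version >= 5.0:
        return  # The restriction applies only to versions below 5.0.
    for ch in text:
        if _is_disallowed_control(ch):
            raise ValueError(
                f"SERE0014: character #x{ord(ch):X} is not permitted "
                "in HTML prior to HTML5"
            )
```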
The HTML output method MUST terminate
processing instructions with >
rather than
?>
. It is a serialization error [err:SERE0015] to use the HTML output
method when >
appears within a processing
instruction in the data model instance being serialized.
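The processing instruction rule can be sketched as follows. The function name is hypothetical, and `ValueError` stands in for the host language's way of reporting err:SERE0015.

```python
def serialize_processing_instruction(target: str, data: str) -> str:
    """Serialize a processing instruction for the HTML output method."""
    if ">" in data:
        raise ValueError(
            "SERE0015: '>' appears within a processing instruction"
        )
    # Terminated with ">" rather than "?>".
    return f"<?{target} {data}>" if data else f"<?{target}>"
```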
version and html-version Parameters
The html-version
or the
version
serialization parameter indicates
the version of the HTML Recommendation [HTML] or [HTML5]
to which the serialized result is to conform. [Definition: If the
html-version
serialization parameter is not absent,
the requested HTML version is the value of the
html-version
serialization parameter; otherwise, it is
the value of the version
serialization
parameter.] If the serializer does not support the version of HTML
specified by the requested HTML version, it
MUST signal a serialization error [err:SESU0013].
This document provides the normative definition of serialization for the HTML output method if the requested HTML version has the lexical form of a value of type decimal whose value is 1.0 or greater, but no greater than 5.0. For any other value of the version parameter, the behavior is implementation-defined. In that case, the implementation-defined behavior MAY supersede all other requirements of this recommendation.
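The selection of the requested HTML version, and the range for which this document is normative, can be sketched in Python. Both function names are hypothetical, parameters are modelled as optional strings (with `None` meaning absent), and the regular expression is an approximation of the decimal lexical form assumed for this illustration.

```python
import re
from decimal import Decimal
from typing import Optional

# Approximation of the lexical form of a decimal value (no exponent).
_DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def requested_html_version(
    html_version: Optional[str], version: Optional[str]
) -> Optional[str]:
    """html-version wins when it is not absent; otherwise use version."""
    return html_version if html_version is not None else version


def is_normatively_defined(requested: str) -> bool:
    """True if the value is a decimal in the range 1.0 to 5.0 inclusive."""
    if not _DECIMAL_LEXICAL.fullmatch(requested):
        return False
    return Decimal("1.0") <= Decimal(requested) <= Decimal("5.0")
```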
encoding Parameter
The encoding
parameter specifies the encoding to be
used. Serializers are
REQUIRED to support values of UTF-8
and UTF-16
. A serialization error [err:SESU0007] occurs if an output encoding
other than UTF-8
or UTF-16
is requested
and the serializer
does not support that encoding. The serializer MUST signal the
error.
It is possible that the instance of the data model will contain
a character that cannot be represented in the encoding that the
serializer is using
for output. In this case, if the character occurs in a context
where HTML recognizes character references, then the character
MUST be output as a character entity reference or
decimal numeric character reference; otherwise (for example, in a
script
or style
element or in a comment),
the serializer
MUST signal a serialization error [err:SERE0008].
See 7.4.13 HTML Output
Method: the include-content-type Parameter regarding how
this parameter is used with the include-content-type
parameter.
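The treatment of unsupported encodings and unrepresentable characters described above can be illustrated with Python's standard codecs. The function name `encode_for_output` and the `references_allowed` flag are hypothetical; `ValueError` stands in for err:SESU0007 and err:SERE0008; and the sketch always chooses decimal numeric character references, which is only one of the permitted options.

```python
import codecs


def encode_for_output(text: str, encoding: str, references_allowed: bool) -> bytes:
    """Encode already-escaped text, handling unrepresentable characters."""
    try:
        codecs.lookup(encoding)
    except LookupError as exc:
        raise ValueError(f"SESU0007: unsupported encoding {encoding}") from exc
    if references_allowed:
        # Context where HTML recognizes character references:
        # emit decimal numeric character references.
        return text.encode(encoding, errors="xmlcharrefreplace")
    try:
        # For example, inside script, style or a comment.
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        raise ValueError(
            f"SERE0008: character cannot be represented in {encoding}"
        ) from exc
```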
indent and suppress-indentation Parameters
If the indent
parameter has the value
yes
, then the HTML output method MAY
add or remove whitespace as it serializes the result tree, if it
observes the following constraints.
Whitespace MUST NOT be added other than before or after an element, or adjacent to an existing whitespace character.
Whitespace MUST NOT be added or removed
adjacent to an inline element. The inline elements are those
included in the %inline
category of any of the HTML
4.01 DTDs or those elements defined to be phrasing content in
[HTML5], as well as the
ins
and del
elements if they are used as
inline elements (i.e., if they do not contain element
children).
Whitespace MUST NOT be added or removed
inside a formatted element, the formatted elements being
pre
, script
, style
,
title
, and
textarea
.
Whitespace characters MUST NOT be added in the
content of an element whose expanded QName matches a
member of the list of expanded QNames in the value of the
suppress-indentation
parameter. The expanded
QName of an element node is considered to match a member of the
list of expanded QNames if:
the two expanded QNames are equal;
the expanded QNames both have null namespace URIs, and the local parts of the two QNames are equal without regard to case; or
the value of the requested HTML version is 5.0
, the
local parts of the two QNames are equal without regard to
case and one QName has a null namespace URI and the namespace URI
of the other is equal to the XHTML namespace URI.
Note:
The effect of the above constraints is to ensure any insertion or deletion of whitespace would not affect how a conforming HTML user agent would render the output, assuming the serialized document does not refer to any HTML style sheets.
Note that the HTML definition of whitespace is different from the XML definition (see section 9.1 of the [HTML] specification).
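The three matching conditions for the suppress-indentation parameter can be sketched as a Python predicate. The function name is hypothetical, an expanded QName is modelled as a `(namespace URI or None, local part)` pair, and the requested HTML version is assumed to be available as a number.

```python
from typing import Optional, Tuple

XHTML_NS = "http://www.w3.org/1999/xhtml"
QName = Tuple[Optional[str], str]  # (namespace URI or None, local part)


def matches_suppress_indentation(
    element: QName, listed: QName, html_version: float
) -> bool:
    """Apply the three matching conditions listed above."""
    (ns_a, local_a), (ns_b, local_b) = element, listed
    # 1. The two expanded QNames are equal.
    if element == listed:
        return True
    same_local_ignoring_case = local_a.lower() == local_b.lower()
    # 2. Both namespace URIs are null; local parts match ignoring case.
    if ns_a is None and ns_b is None:
        return same_local_ignoring_case
    # 3. HTML 5.0 only: one namespace is null, the other is XHTML.
    if html_version == 5.0 and same_local_ignoring_case:
        return {ns_a, ns_b} == {None, XHTML_NS}
    return False
```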
cdata-section-elements Parameter
The cdata-section-elements
parameter is not
applicable to the HTML output method, except in the case of
XML Islands.
omit-xml-declaration and standalone Parameters
The omit-xml-declaration
and
standalone
parameters are not applicable to the HTML
output method.
doctype-system and doctype-public Parameters
If the doctype-public
or
doctype-system
parameters are specified, then the HTML
output method MUST output a document type
declaration. If the doctype-public
parameter is
specified, then the output method MUST output
PUBLIC
followed by the specified public identifier; if
the doctype-system
parameter is also specified, it
MUST also output the specified system identifier
following the public identifier. If the doctype-system
parameter is specified but the doctype-public
parameter is not specified, then the output method
MUST output SYSTEM
followed by the
specified system identifier.
If the value of the requested HTML version is 5.0
, the
doctype-public
and doctype-system
serialization parameters are both absent, the first element
node child of the document node that is to be serialized is
to be serialized as an HTML element, the local
part of the QName of which is equal to the string
HTML
, without regard to case, and any text
nodes that precede that element node in document order contain only
whitespace characters, then the HTML output method
MUST output a document type declaration, with no
public or system identifier.
If the HTML output method MUST output a
document type declaration, it MUST be serialized
immediately before the first element, if any, and the name
following <!DOCTYPE
MUST be
HTML
or html
.
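The rules for the document type declaration can be sketched as follows. The function name and the `html5_default` flag are hypothetical; the flag stands for the HTML 5.0 conditions described above having been checked by the caller. Delimiting the identifiers with double quotes is an assumption of this sketch.

```python
from typing import Optional


def doctype_declaration(
    doctype_public: Optional[str],
    doctype_system: Optional[str],
    html5_default: bool = False,
    name: str = "html",
) -> Optional[str]:
    """Return the document type declaration, or None if none is output."""
    if doctype_public is not None:
        decl = f'<!DOCTYPE {name} PUBLIC "{doctype_public}"'
        if doctype_system is not None:
            # The system identifier follows the public identifier.
            decl += f' "{doctype_system}"'
        return decl + ">"
    if doctype_system is not None:
        return f'<!DOCTYPE {name} SYSTEM "{doctype_system}">'
    if html5_default:
        # HTML 5.0 case: no public or system identifier.
        return f"<!DOCTYPE {name}>"
    return None
```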
undeclare-prefixes Parameter
The undeclare-prefixes
parameter is not applicable
to the HTML output method.
normalization-form Parameter
The normalization-form
parameter is applicable to
the HTML output method. The values NFC
and
none
MUST be supported by the
serializer. A
serialization
error [err:SESU0011] results if the value of the
normalization-form
parameter specifies a normalization
form that is not supported by the serializer; the serializer MUST signal the
error.
media-type Parameter
The media-type
parameter is applicable to the HTML
output method. See 3 Serialization
Parameters for more information. See 7.4.13 HTML Output Method: the
include-content-type Parameter regarding how this parameter
is used with the include-content-type
parameter.
use-character-maps Parameter
The use-character-maps
parameter is applicable to
the HTML output method. See 9
Character Maps for more information.
byte-order-mark Parameter
The byte-order-mark
parameter is applicable to the
HTML output method. See 3 Serialization
Parameters for more information.
escape-uri-attributes Parameter
If the escape-uri-attributes
parameter has the
value yes
, the HTML output method
MUST apply URI escaping to URI attribute values, except that
relative URIs MUST NOT be absolutized.
Note:
This escaping is deliberately confined to non-ASCII characters,
because escaping of ASCII characters is not always appropriate, for
example when URIs or URI fragments are interpreted locally by the
HTML user agent. Even in the case of non-ASCII characters, escaping
can sometimes cause problems. More precise control of URI escaping is therefore
available by setting escape-uri-attributes
to
no
, and controlling the escaping of URIs by using
methods defined in Section
6.2 fn:encode-for-uri FO30 and
Section
6.3 fn:iri-to-uri FO30.
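As an illustration of escaping that is confined to non-ASCII characters, the following Python sketch percent-encodes the UTF-8 octets of each non-ASCII character and leaves every ASCII character, including spaces and reserved characters, untouched. The function name is hypothetical and the sketch is an assumption-laden approximation: the normative definition of URI escaping is given elsewhere in this specification, and details such as Unicode normalization are omitted here.

```python
def escape_non_ascii(uri: str) -> str:
    """Percent-encode only the non-ASCII characters of a URI value."""
    out = []
    for ch in uri:
        if ord(ch) < 0x80:
            # ASCII characters are never escaped by this sketch.
            out.append(ch)
        else:
            # One %HH triplet per UTF-8 octet of the character.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

Relative URIs pass through unchanged apart from the escaping, which is consistent with the requirement that they not be absolutized.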
include-content-type Parameter
If there is a head
element, and the
include-content-type
parameter has the value
yes
, the HTML output method MUST add
a meta
element as the first child element of the
head
element specifying the character encoding
actually used.
For example,
<HEAD> <META http-equiv="Content-Type" content="text/html; charset=EUC-JP"> ...
The content type MUST be set to the value given
for the media-type
parameter.
If a meta
element has been added to the
head
element as described above, then any existing
meta
element child of the head
element
having an http-equiv
attribute with the value
"Content-Type", making the comparison without regard to
case after first stripping leading and trailing spaces from the
value of the attribute solely for the purposes of
comparison, MUST be discarded.
Note:
This process removes possible parameters in the attribute value. For example,
<meta http-equiv="Content-Type" content="text/html;version='3.0'"/>
in the data model instance would be replaced by,
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
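The two steps described above (adding a `meta` element and discarding any existing one that declares a content type) can be sketched in Python. Both function names are hypothetical, attributes are modelled as a plain dictionary, and the exact spelling and spacing of the generated `content` value are assumptions of this sketch.

```python
from typing import Dict


def content_type_meta(media_type: str, encoding: str) -> str:
    """Build the meta element added as the first child of head."""
    return (
        f'<meta http-equiv="Content-Type" '
        f'content="{media_type}; charset={encoding}">'
    )


def is_superseded_meta(attributes: Dict[str, str]) -> bool:
    """True if an existing meta element must be discarded.

    Leading and trailing spaces are stripped, solely for comparison,
    and the comparison ignores case.
    """
    http_equiv = attributes.get("http-equiv")
    if http_equiv is None:
        return False
    return http_equiv.strip(" ").lower() == "content-type"
```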
item-separator Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization.
The Text output method serializes the instance of the data model by outputting the string value of the document node created by the markup generation step of the phases of serialization without any escaping.
A newline character in the instance of the data model MAY be output using any character sequence that is conventionally used to represent a line ending in the chosen system environment.
version Parameter
The version
parameter is not applicable to the Text
output method.
html-version Parameter
The html-version
parameter is not applicable to the
Text output method.
encoding Parameter
The encoding
parameter identifies the encoding that
the Text output method MUST use to convert
sequences of characters to sequences of bytes. Serializers are
REQUIRED to support values of UTF-8
and UTF-16
. A serialization error [err:SESU0007] occurs if the serializer does not support the
encoding specified by the encoding
parameter. The
serializer
MUST signal the error. If the instance of the data
model contains a character that cannot be represented in the
encoding that the serializer is using for output, the serializer
MUST signal a serialization error [err:SERE0008].
indent and suppress-indentation Parameters
The indent
and
suppress-indentation
parameters are not
applicable to the Text output method.
cdata-section-elements Parameter
The cdata-section-elements
parameter is not
applicable to the Text output method.
omit-xml-declaration and standalone Parameters
The omit-xml-declaration
and
standalone
parameters are not applicable to the Text
output method.
doctype-system and doctype-public Parameters
The doctype-system
and doctype-public
parameters are not applicable to the Text output method.
undeclare-prefixes Parameter
The undeclare-prefixes
parameter is not applicable
to the Text output method.
normalization-form Parameter
The normalization-form
parameter is applicable to
the Text output method. The values NFC
and
none
MUST be supported by the
serializer. A
serialization
error [err:SESU0011] results if the value of the
normalization-form
parameter specifies a normalization
form that is not supported by the serializer; the serializer MUST signal the
error.
media-type Parameter
The media-type
parameter is applicable to the Text
output method. See 3 Serialization
Parameters for more information.
use-character-maps Parameter
The use-character-maps
parameter is applicable to
the Text output method. See 9
Character Maps for more information.
byte-order-mark Parameter
The byte-order-mark
parameter is applicable to the
Text output method. See 3 Serialization
Parameters for more information.
escape-uri-attributes Parameter
The escape-uri-attributes
parameter is not
applicable to the Text output method.
include-content-type Parameter
The include-content-type
parameter is not
applicable to the Text output method.
item-separator Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization.
The use-character-maps
parameter is a list of
characters and corresponding string substitutions.
Character maps allow a specific character appearing in a text or attribute node in the instance of the data model to be replaced with a specified string of characters during serialization. The string that is substituted is output "as is," and the serializer performs no checks that the resulting document is well-formed. This mechanism can therefore be used to introduce arbitrary markup in the serialized output. See Section 25.1 Character Maps XT30 of [XSL Transformations (XSLT) Version 3.0] for examples of using character mapping in XSLT.
Character mapping is applied to the characters that actually appear in a text or attribute node in the instance of the data model, before any other serialization operations such as escaping or Unicode Normalization are applied. If a character is mapped, then it is not subjected to XML or HTML escaping, nor to Unicode Normalization. The string that is substituted for a character is not validated or processed in any way by the serializer, except for translation into the target encoding. In particular, it is not subjected to XML or HTML escaping, it is not subjected to Unicode Normalization, and it is not subjected to further character mapping.
Character mapping is not applied to characters in text nodes whose parent elements are listed
in the cdata-section-elements
parameter, nor to
characters for which output escaping has been disabled (disabling
output escaping is a feature in all versions of XSLT),
nor to characters in attribute values that are subject to URI escaping defined for
the HTML and XHTML output methods, unless URI escaping has been disabled using the
escape-uri-attributes
parameter in the output
definition.
On serialization, occurrences of a character specified in the
use-character-maps
parameter in text nodes and attribute values are replaced by the
corresponding string from the use-character-maps
parameter.
Note:
Using a character map can result in non-well-formed documents if the string contains XML-significant characters. For example, it is possible to create documents containing unmatched start and end tags, references to entities that are not declared, or attributes that contain tags or unescaped quotation marks.
If a character is mapped, then it is not subjected to XML or HTML escaping.
A serialization error [err:SERE0008] occurs if character mapping causes the output of a string containing a character that cannot be represented in the encoding that the serializer is using for output. The serializer MUST signal the error.
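The interaction between character mapping and escaping can be sketched in Python. The function name is hypothetical, the character map is modelled as a dictionary from single characters to replacement strings, and the escaping applied to unmapped characters is simplified to `&` and `<`.

```python
from typing import Mapping


def serialize_text_with_map(text: str, character_map: Mapping[str, str]) -> str:
    """Apply a character map, then escape only the unmapped characters."""
    out = []
    for ch in text:
        if ch in character_map:
            # Substituted "as is": no escaping and no further mapping.
            out.append(character_map[ch])
        elif ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        else:
            out.append(ch)
    return "".join(out)
```

Because the replacement string is never checked, a map entry such as `«` to `<b>` introduces markup into the output, which is how a character map can produce a document that is not well-formed.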
Serialization is intended primarily as a component of a host language. [Definition: A host language is another specification that includes, by reference, this specification and all of its requirements. A host language might be a programming language such as [XSL Transformations (XSLT) Version 3.0] or [XQuery 3.0: An XML Query Language], or it might be an application programming interface (API) intended to be used by programs written in some other high-level programming language. The use of the term language is not intended to preclude the possibility that this specification might be referenced outside the context of a programming language specification.] This document relies on specifications that use it to specify conformance criteria for Serialization in their respective environments. Specifications that set conformance criteria for their use of Serialization MUST NOT change the semantic definitions of Serialization as given in this specification, except by subsetting and/or compatible extensions. It is the responsibility of the host language to specify how serialization errors are to be handled.
Certain facilities in this specification are described as producing implementation-defined results. A claim that asserts conformance with this specification MUST be accompanied by documentation stating the effect of each implementation-defined feature. For convenience, a non-normative checklist of implementation-defined features is provided at E Checklist of Implementation-Defined Features.
The following schema describes the structure of a Data Model instance that can be used to specify the settings of serialization parameters using the mechanism described in 3.1 Setting Serialization Parameters by Means of a Data Model Instance.
A copy of this schema is available at http://www.w3.org/2014/04/xslt-xquery-serialization/schema-for-serialization-parameters.xsd.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3.org/2010/xslt-xquery-serialization"
           xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
           elementFormDefault="qualified">

  <xs:annotation>
    <xs:documentation>
      This is a schema for serialization parameters for XSLT and XQuery
      Serialization 3.0.

      This schema is available for use under the conditions of the W3C
      Software License published at
      http://www.w3.org/Consortium/Legal/copyright-software-19980720

      It defines a schema for XML Infoset instances with which a user of a
      host language MAY specify serialization parameters for use in
      serializing an instance of the XQuery and XPath Data Model. It also
      provides hooks that allow the inclusion of implementation-defined
      serialization parameters and implementation-defined modifiers to
      serialization parameters.
    </xs:documentation>
  </xs:annotation>

  <xs:simpleType name="QNames-type">
    <xs:list itemType="xs:QName"/>
  </xs:simpleType>

  <xs:simpleType name="yes-no-type">
    <xs:restriction base="xs:token">
      <xs:enumeration value="no"/>
      <xs:enumeration value="yes"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="yes-no-omit-type">
    <xs:restriction base="xs:token">
      <xs:enumeration value="no"/>
      <xs:enumeration value="omit"/>
      <xs:enumeration value="yes"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="char-type">
    <xs:restriction base="xs:string">
      <xs:maxLength value="1"/>
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="encoding-string-type">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Za-z][A-Za-z0-9._\-]*"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="method-type">
    <xs:union>
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="html"/>
          <xs:enumeration value="text"/>
          <xs:enumeration value="xml"/>
          <xs:enumeration value="xhtml"/>
        </xs:restriction>
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="xs:QName">
          <xs:pattern value=".*:.*"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

  <xs:simpleType name="pubid-char-string-type">
    <xs:restriction base="xs:string">
      <xs:pattern value="([- \r\n\ta-zA-Z0-9'()+,./:=?;!*#@$_%])*"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="system-id-string-type">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^']*|[^&quot;]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!--
    - Base type of all serialization parameter types
    -->
  <xs:complexType name="base-param-type">
    <xs:complexContent>
      <xs:restriction base="xs:anyType">
        <xs:anyAttribute namespace="##other" processContents="lax"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Generic string serialization parameters
    -->
  <xs:complexType name="string-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="xs:string" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Generic decimal serialization parameters
    -->
  <xs:complexType name="decimal-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="xs:decimal" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter type for "yes" or "no"
    - serialization parameters
    -->
  <xs:complexType name="yes-no-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:yes-no-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter type for list of xs:QName
    - serialization parameters
    -->
  <xs:complexType name="QNames-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:QNames-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter type for "yes", "no" or "omit"
    - serialization parameters
    -->
  <xs:complexType name="yes-no-omit-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:yes-no-omit-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter type for NMTOKEN serialization parameters
    -->
  <xs:complexType name="NMTOKEN-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="xs:NMTOKEN" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Base element declaration for all serialization parameter elements
    -->
  <xs:element name="serialization-parameter-element" abstract="true"
              type="output:base-param-type"/>

  <!--
    - Serialization parameter element for byte-order-mark parameter
    -->
  <xs:element id="byte-order-mark" name="byte-order-mark"
              type="output:yes-no-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for cdata-section-elements parameter
    -->
  <xs:element id="cdata-section-elements" name="cdata-section-elements"
              type="output:QNames-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter type for doctype-public parameter
    -->
  <xs:complexType name="doctype-public-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:pubid-char-string-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter element for doctype-public parameter
    -->
  <xs:element id="doctype-public" name="doctype-public"
              type="output:doctype-public-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter type for doctype-system parameter
    -->
  <xs:complexType name="doctype-system-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:system-id-string-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter element for doctype-system parameter
    -->
  <xs:element id="doctype-system" name="doctype-system"
              type="output:doctype-system-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter type for encoding parameter
    -->
  <xs:complexType name="encoding-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:encoding-string-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter element for encoding parameter
    -->
  <xs:element id="encoding" name="encoding"
              type="output:encoding-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for escape-uri-attributes parameter
    -->
  <xs:element id="escape-uri-attributes" name="escape-uri-attributes"
              type="output:yes-no-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for html-version parameter
    -->
  <xs:element id="html-version" name="html-version"
              type="output:decimal-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for include-content-type parameter
    -->
  <xs:element id="include-content-type" name="include-content-type"
              type="output:yes-no-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for indent parameter
    -->
  <xs:element id="indent" name="indent"
              type="output:yes-no-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for item-separator parameter
    -->
  <xs:element id="item-separator" name="item-separator"
              type="output:string-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for media-type parameter
    -->
  <xs:element id="media-type" name="media-type"
              type="output:string-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter type for method parameter
    -->
  <xs:complexType name="method-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:attribute name="value" type="output:method-type" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter element for method parameter
    -->
  <xs:element id="method" name="method"
              type="output:method-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for normalization-form parameter
    -->
  <xs:element id="normalization-form" name="normalization-form"
              type="output:NMTOKEN-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for omit-xml-declaration parameter
    -->
  <xs:element id="omit-xml-declaration" name="omit-xml-declaration"
              type="output:yes-no-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for standalone parameter
    -->
  <xs:element id="standalone" name="standalone"
              type="output:yes-no-omit-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for suppress-indentation parameter
    -->
  <xs:element id="suppress-indentation" name="suppress-indentation"
              type="output:QNames-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for undeclare-prefixes parameter
    -->
  <xs:element id="undeclare-prefixes" name="undeclare-prefixes"
              type="output:yes-no-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter type for use-character-maps
    - parameter
    -->
  <xs:complexType name="use-character-maps-param-type">
    <xs:complexContent>
      <xs:extension base="output:base-param-type">
        <xs:sequence>
          <xs:element name="character-map" minOccurs="0" maxOccurs="unbounded">
            <xs:complexType>
              <xs:attribute name="character" type="output:char-type"/>
              <xs:attribute name="map-string" type="xs:string"/>
              <xs:anyAttribute namespace="##other" processContents="lax"/>
            </xs:complexType>
          </xs:element>
          <xs:any minOccurs="0" namespace="##other" processContents="lax"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!--
    - Serialization parameter element for use-character-maps parameter
    -->
  <xs:element id="use-character-maps" name="use-character-maps"
              type="output:use-character-maps-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <!--
    - Serialization parameter element for version parameter
    -->
  <xs:element id="version" name="version"
              type="output:string-param-type"
              substitutionGroup="output:serialization-parameter-element"/>

  <xs:element name="serialization-parameters">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="output:serialization-parameter-element"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
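As a non-normative illustration of an instance that this schema is meant to describe, the following Python sketch builds a `serialization-parameters` element using the standard `xml.etree.ElementTree` module. The function name `build_serialization_parameters` is hypothetical, only parameters that carry a single `value` attribute are handled, and no schema validation is performed.

```python
import xml.etree.ElementTree as ET

# Target namespace of the schema above.
OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def build_serialization_parameters(settings: dict) -> ET.Element:
    """Build an output:serialization-parameters element.

    Each key is a parameter name (for example "method") and each
    value is the string placed in that element's value attribute.
    """
    ET.register_namespace("output", OUTPUT_NS)
    root = ET.Element(f"{{{OUTPUT_NS}}}serialization-parameters")
    for name, value in settings.items():
        ET.SubElement(root, f"{{{OUTPUT_NS}}}{name}", {"value": value})
    return root


params = build_serialization_parameters(
    {"method": "html", "html-version": "5.0", "indent": "yes"}
)
print(ET.tostring(params, encoding="unicode"))
```

The `use-character-maps` parameter has child elements rather than a `value` attribute and would need separate handling.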
This document uses the err prefix, which represents the same namespace URI (http://www.w3.org/2005/xqt-errors) as defined in [XML Path Language (XPath) 3.0]. Use of this namespace prefix binding in this document is not normative.
It is an error if an item in S6 in sequence normalization is an attribute node or a namespace node.
It is an error if the serializer is unable to satisfy the rules for either a well-formed XML document entity or a well-formed XML external general parsed entity, or both, except for content modified by the character expansion phase of serialization.
It is an error to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the instance of the data model contains text nodes or multiple element nodes as children of the root node.
It is an error if the serialized result would contain an NCName that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter.
It is an error if the serialized result would contain a character that is not permitted by the version of XML specified by the version parameter.
It is an error if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding.
It is an error if a character that cannot be represented in the encoding that the serializer is using for output appears in a context where character references are not allowed (for example if the character occurs in the name of an element).
It is an error if the omit-xml-declaration parameter has the value yes, and the standalone parameter has a value other than omit; or the version parameter has a value other than 1.0 and the doctype-system parameter is specified.
It is an error if the output method is xml or xhtml, the value of the undeclare-prefixes parameter is yes, and the value of the version parameter is 1.0.
It is an error if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer.
It is an error if the value of the normalization-form parameter is fully-normalized and any relevant construct of the result begins with a combining character.
It is an error if the serializer does not support the version of XML specified by the version parameter or the version of HTML specified by the html-version or the version serialization parameter.
It is an error to use the HTML output method if characters which are permitted in XML but not in the requested HTML version appear in the instance of the data model.
It is an error to use the HTML output method when > appears within a processing instruction in the data model instance being serialized.
It is an error if a parameter value is invalid for the defined domain.
It is an error if evaluating an expression in order to extract the setting of a serialization parameter from a data model instance would yield an error.
It is an error if evaluating an expression in order to extract the setting of the use-character-maps serialization parameter from a data model instance would yield a sequence of length greater than one.
It is an error if an instance of the data model used to specify the settings of serialization parameters specifies the value of the same parameter more than once, or if the instance does not have as its root node an element node or a document node with an element node child, where the local part of the name of the element node is serialization-parameters and the namespace URI is http://www.w3.org/2010/xslt-xquery-serialization.
This error has been removed.
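Several of the errors above depend only on the parameter settings, so a host could detect them before any output is produced. The following sketch illustrates that idea under stated assumptions: parameters are held in a plain dict, an absent key means "not specified", and the defaults used (omit-xml-declaration no, standalone omit, version 1.0, undeclare-prefixes no, method xml) are choices made for this example rather than values taken from this document. The omit-xml-declaration rule is read here as applying both of its sub-conditions only when omit-xml-declaration is yes; that reading is an assumption. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def check_parameters_root(root):
    """Check the root element name and reject repeated parameters."""
    problems = []
    if root.tag != f"{{{OUTPUT_NS}}}serialization-parameters":
        problems.append("root element is not output:serialization-parameters")
    seen = set()
    for child in root:
        if child.tag in seen:
            problems.append(f"parameter specified more than once: {child.tag}")
        seen.add(child.tag)
    return problems


def check_parameter_combinations(params):
    """Return descriptions of conflicting parameter settings.

    Defaults below are assumptions of this sketch, not normative values.
    """
    problems = []
    version = params.get("version", "1.0")
    if params.get("omit-xml-declaration", "no") == "yes":
        if params.get("standalone", "omit") != "omit":
            problems.append("omit-xml-declaration is yes but standalone is not omit")
        if version != "1.0" and "doctype-system" in params:
            problems.append(
                "omit-xml-declaration is yes, version is not 1.0, "
                "and doctype-system is specified"
            )
    if (
        params.get("method", "xml") in ("xml", "xhtml")
        and params.get("undeclare-prefixes", "no") == "yes"
        and version == "1.0"
    ):
        problems.append("undeclare-prefixes is yes but version is 1.0")
    return problems


# A parameters document that sets the method parameter twice.
doc = ET.fromstring(
    f'<output:serialization-parameters xmlns:output="{OUTPUT_NS}">'
    '<output:method value="xml"/>'
    '<output:method value="html"/>'
    "</output:serialization-parameters>"
)
print(check_parameters_root(doc))
print(check_parameter_combinations(
    {"method": "xml", "undeclare-prefixes": "yes", "version": "1.0"}
))
```

Returning a list of descriptions rather than raising on the first problem is a design choice for the sketch; a conforming serializer would signal the corresponding serialization error instead.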
The following list of attributes are declared as type %URI or %UriList for a given HTML or XHTML element, with the exception of the name attribute for element A, which is not a URI type. The name attribute for element A SHOULD be escaped as is recommended by the HTML Recommendation [HTML] in Appendix B.2.1.
Attributes | Elements |
---|---|
action | FORM |
archive | OBJECT |
background | BODY |
cite | BLOCKQUOTE, DEL, INS, Q |
classid | OBJECT |
codebase | APPLET, OBJECT |
data | OBJECT |
datasrc | BUTTON, DIV, INPUT, OBJECT, SELECT, SPAN, TABLE, TEXTAREA |
for | SCRIPT |
formaction | BUTTON, INPUT |
href | A, AREA, BASE, LINK |
icon | COMMAND |
longdesc | FRAME, IFRAME, IMG |
manifest | HTML |
name | A |
poster | VIDEO |
profile | HEAD |
src | AUDIO, EMBED, FRAME, IFRAME, IMG, INPUT, SCRIPT, SOURCE, TRACK, VIDEO |
usemap | IMG, INPUT, OBJECT |
value | INPUT |
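The table above lends itself to a simple lookup. The following sketch transcribes it into a dictionary and tests whether a given attribute on a given element is listed. The names URI_ATTRIBUTES and is_uri_attribute are hypothetical, and folding names to a single case before comparing is an assumption of the sketch; it performs the lookup only, not the escaping itself.

```python
# Attribute name -> elements on which the table lists it.
URI_ATTRIBUTES = {
    "action": {"FORM"},
    "archive": {"OBJECT"},
    "background": {"BODY"},
    "cite": {"BLOCKQUOTE", "DEL", "INS", "Q"},
    "classid": {"OBJECT"},
    "codebase": {"APPLET", "OBJECT"},
    "data": {"OBJECT"},
    "datasrc": {"BUTTON", "DIV", "INPUT", "OBJECT", "SELECT", "SPAN",
                "TABLE", "TEXTAREA"},
    "for": {"SCRIPT"},
    "formaction": {"BUTTON", "INPUT"},
    "href": {"A", "AREA", "BASE", "LINK"},
    "icon": {"COMMAND"},
    "longdesc": {"FRAME", "IFRAME", "IMG"},
    "manifest": {"HTML"},
    "name": {"A"},  # not a URI type, but listed because it SHOULD be escaped
    "poster": {"VIDEO"},
    "profile": {"HEAD"},
    "src": {"AUDIO", "EMBED", "FRAME", "IFRAME", "IMG", "INPUT", "SCRIPT",
            "SOURCE", "TRACK", "VIDEO"},
    "usemap": {"IMG", "INPUT", "OBJECT"},
    "value": {"INPUT"},
}


def is_uri_attribute(element_name, attribute_name):
    """Return True if the table lists this attribute for this element.

    Case folding here is an assumption made for the sketch.
    """
    elements = URI_ATTRIBUTES.get(attribute_name.lower(), set())
    return element_name.upper() in elements


print(is_uri_attribute("a", "href"))
print(is_uri_attribute("div", "href"))
```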
This appendix provides a summary of Serialization features whose effect is explicitly implementation-defined. The conformance rules (see 10 Conformance) require vendors to provide documentation that explains how these choices have been exercised.
method serialization parameter, then the parameter specifies an implementation-defined output method. (See 3 Serialization Parameters)
http://www.w3.org/2010/xslt-xquery-serialization, the implementation MAY interpret them to specify the values of implementation-defined serialization parameters in an implementation-defined manner. (See 3.1 Setting Serialization Parameters by Means of a Data Model Instance)
version parameter for the XML or XHTML output method for which this document does not provide a normative definition, the behavior is implementation-defined. (See 5.1.1 XML Output Method: the version Parameter)
normalization-form parameter is not NFC, NFD, NFKC, NFKD, fully-normalized, or none, then the meaning of the value and its effect is implementation-defined. (See 5.1.9 XML Output Method: the normalization-form Parameter)
basefont, frame and isindex elements, which are not part of HTML5, are considered to be void elements when the requested HTML version has the value 5.0. (See 7.1 Markup for Elements)
version parameter for the HTML output method for which this document does not provide a normative definition, the behavior is implementation-defined. (See 7.4.1 HTML Output Method: the version and html-version Parameters)
This appendix details the changes that have been made since the publication of the [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)].
The following changes have been applied since the publication of the Candidate Recommendation to produce this document.
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 25149 | None | Editorial | Remove normative dependencies on documents whose technical stability is not assured. | |
Bugzilla bug 25156 | None | Editorial | Correct typo in sample XSLT expression for use-character-maps parameter. |
The following changes have been applied since the publication of the fifth Public Working Draft to produce this, the sixth Public Working Draft.
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 20245 | None | Editorial | Editorial improvements to the description of how void elements and elements with an empty content model are processed by the XHTML output method. | |
Bugzilla bug 20251 and Bugzilla bug 20261 | None | Substantive | Corrections to the description of prefix stripping for the XHTML output method. |
The following changes have been applied since the publication of the fourth Public Working Draft to produce this, the fifth Public Working Draft.
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 16311 | None | Substantive | Added new serialization parameter for specifying a separator that is inserted between items in the sequence that is to be serialized. | |
None | None | Editorial | Added XSLT instructions equivalent to XQuery expressions for setting serialization parameters by means of a data model instance, and other editorial corrections and improvements. | |
None | None | Editorial | Clarified the definition of host language to make it clear that APIs can be considered to be host languages. | |
Bugzilla bug 6129 | None | Substantive | Extended the definitions of the HTML and XHTML output methods to include support for HTML5 serialization. | |
Bugzilla bug 17619 | None | Editorial | Text associated with links to the definitions of the terms NCName, EncName and VersionNum was repeated several times. | |
Bugzilla bug 15915 | None | Editorial | Made uses of the terms "absent" and "unspecified" consistent. | |
Bugzilla bug 17282 | None | Substantive | Changed type of the normalization-form serialization parameter to NMTOKEN. This would be an incompatible change from XQuery 1.0, for any implementation that supported the Serialization Feature, and supported an implementation-defined value for the normalization-form serialization parameter that did not have the lexical form of an NMToken. |
The following changes were applied following the publication of the third Public Working Draft to produce the fourth Public Working Draft. None of these changes introduces an incompatibility with [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)].
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 12852 | None | Substantive | Corrected the type of the media-type serialization parameter in the Schema for Serialization Parameters. | |
Bugzilla bug 13688 | None | Substantive | Corrected the regular expression associated with the encoding-string-type type in the Schema for Serialization Parameters, so that hyphens are permitted to appear in the encoding serialization parameter. | |
Bugzilla bug 10176 | SE.E20 | Substantive | Clarified what it means for the html output method to output an XML island as XML. | |
Bugzilla bug 14751 | None | Editorial | Corrected typographical errors in the comments associated with the yes-no-param-type and encoding-param-type types in the Schema for Serialization Parameters. |
The following changes were applied after the publication of the second Public Working Draft to produce the third Public Working Draft. None of these changes introduces an incompatibility with [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)], except as noted below.
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 11635 | SE.E19 | Substantive | Clarified that serialization error SEPM0010 applies to the xhtml output method as well as the xml output method. |
The following changes were applied after the publication of the first Public Working Draft to produce the second Public Working Draft. None of these changes introduces an incompatibility with [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)], except as noted below.
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 6535 | None | Substantive | Added definition of the suppress-indentation serialization parameter. | |
Bugzilla bug 7829 | SE.E14 | Substantive | Clarified how minimized attributes are handled under the rules of the HTML output method. | |
Bugzilla bug 8245 | SE.E15 | Editorial | Corrected description of a serialization error that mentions which control characters are not permitted under the rules of the HTML output method. | |
Bugzilla bug 7823 | SE.E16 | Substantive | Clarified how the script and style elements are handled for the HTML output method. | |
Bugzilla bug 8651 | SE.E17 | Substantive | Clarified what it means to compare without regard to case. | |
Bugzilla bug 8206 | SE.E18 | Editorial | Clarified what it means to escape according to HTML or XML rules. | |
Bugzilla bug 6808 | None | Substantive | Relaxed rules for the XML output method that specify where a serializer is permitted to add whitespace. This introduces an incompatibility only inasmuch as the serialized results produced by a serializer conforming to this specification could differ from the results a serializer that adheres to [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)] would be permitted to produce. | |
Bugzilla bug 9302 | None | Substantive | Defined a mechanism for specifying serialization parameter settings in the form of a data model instance. | |
None | None | Editorial | Replaced all uses of the words legal and illegal with more appropriate terms. |
The following changes were applied after the publication of [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)] to produce the first Public Working Draft. None of these changes introduces an incompatibility with [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)].
Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
Bugzilla bug 6723 | SE.E13 | Substantive | Clarified how HTML elements that have no children but whose content model is not empty are serialized. | |
Bugzilla bug 6732 | SE.E12 | Substantive | Clarified for which versions of XML and HTML this document makes normative statements. | |
None | None | Substantive | Take into account presence of function items in a sequence that is to be serialized. | |
None | None | Editorial | Miscellaneous minor editorial corrections and improvements. |