\n","\n\n","MUST be output as","\n\n","A common requirement is to output a "," element\nas shown in the example below:","\n\n","This is invalid HTML, for the reasons explained in\nsection B.3.2 of the [HTML] 4.01\nspecification. Nevertheless, it is possible to output this\nfragment, using either of the following constructs:","Firstly, by use of a "," element created by an\nXQuery direct element constructor or an XSLT literal result\nelement:","Secondly, by constructing the markup from ordinary text\ncharacters:","\n\n","As the [HTML] specification points out,\nthe correct way to write this is to use the escape conventions for\nthe specific scripting language. For JavaScript, it can be written\nas:","\n\n","The [HTML] 4.01 specification also shows\nexamples of how to write this in various other scripting languages.\nThe escaping MUST be done manually; it will not be\ndone by the serializer .","7.2 Writing\nAttributes"," escape\n\"","\" characters occurring in attribute values.","A boolean attribute is an attribute with only a single allowed\nvalue in any of the HTML DTDs or that is specified to be\na Boolean\nattribute by [HTML5] , where\nthe allowed value is equal without regard to case to the name of\nthe attribute. 
The HTML output method MUST output\nany boolean attribute in minimized form if and only if the value of\nthe attribute node actually is equal to the name of the attribute\nmaking the comparison without regard to case .","[Definition : The attributes identified as\nBoolean attributes in [HTML5] are those\ngiven in the following table (using just the local name of their\nparent elements): ]","Attribute","Element(s)","async","autofocus","button, input, keygen, select, textarea","autoplay","audio, video","checked","controls","default","defer","disabled","button, fieldset, input, keygen, optgroup, option, select,\ntextarea","formnovalidate","button, input","hidden","HTML elements","ismap","loop","multiple","input, select","muted","novalidate","form","open","details","dialog","readonly","input, textarea","required","input, select, textarea","reversed","ol","scoped","seamless","selected","option","typemustmatch","This list of Boolean attributes is that given in the \nindex of the draft of [HTML5] current at\nthe time this document is published. As noted elsewhere, processors\nconforming to this specification MAY support the\nlist of Boolean attributes included in later versions of [HTML5] .","For example, a start-tag created using the following XQuery\ndirect element constructor or XSLT literal result element","\n\n","\n \n"," escape a\n"," character occurring in an attribute value\nimmediately followed by a "," character (see ","Section\nB.7.1"," of the HTML Recommendation ",").","\n\n","\n\n","See 7.4 The Influence of Serialization\nParameters upon the HTML Output Method for additional\ndirectives on how attributes MAY be written.","7.3 Writing\nCharacter Data"," output a character\nusing a character entity reference in preference to using a numeric\ncharacter reference, if an entity is defined for the character in\nthe version of HTML that the output method is using. 
Entity\nreferences and character references "," be used\nonly where the character is not present in the selected encoding,\nor where the visual representation of the character is unclear (as\nwith "," ",", for example).","When outputting a sequence of ","whitespace\ncharacters"," in the instance of the data model, within an\nelement where whitespace ","characters are"," treated\nnormally (but not in elements such as ","), the HTML output method ","\nrepresent it using any sequence of whitespace\n","characters"," that will be treated in the same way by an\nHTML user agent. See section 3.5 of ","[XHTML Modularization]"," for some\nadditional information on handling of whitespace by an HTML user\nagent ","for versions of HTML prior to HTML5, and see the\n[HTML5] for information on the handling of\nwhitespace characters by an HTML5 user agent.","The terms space character and white_space character\ndefined in HTML5 do not match the definition of whitespace\ncharacter in this specification.","Certain characters are ","permitted"," in XML, but not in\nHTML ","prior to HTML5","— for example, the control\ncharacters #x7F-#x9F, are permitted in both XML 1.0\nand XML 1.1, and the control characters #x1-#x8, #xB, #xC and\n#xE-#x1F are permitted in XML 1.1, but none of these\nis permitted in HTML prior to HTML5 ",". It is a\n","] to use the HTML output method\n","if"," such characters appear in the instance of the data\nmodel "," is less than\n"," terminate\nprocessing instructions with "," rather than\n","?>",". It is a ","] to use the HTML output\nmethod when "," appears within a processing\ninstruction in the data model instance being serialized.","7.4 The Influence of\nSerialization Parameters upon the HTML Output Method","7.4.1 HTML Output\nMethod: the ","Parameters"," or the","serialization parameter"," indicates\nthe version of the HTML Recommendation ","or [HTML5] ","\nto which the serialized result is to conform. 
[",": ","If the\n"," serialization parameter is not absent,\nthe "," is the value of the\n"," serialization parameter; otherwise, it is\nthe value of the "," serialization\nparameter.","] If the "," does not support the version of HTML\nspecified by ","the requested HTML version ",", it\n"," signal a ","].","This document provides the normative definition of serialization\nfor the HTML output method if the requested HTML version has the lexical\nform of a value of type decimal whose value is 1.0 or greater, but\nno greater than 5.0 . For any other value of version\nparameter, the behavior is implementation-defined . In that case the implementation-defined \nbehavior MAY supersede all other requirements of\nthis recommendation.","7.4.2 HTML\nOutput Method: the "," parameter specifies the encoding to be\nused. ","It is possible that the instance of the data model will contain\na character that cannot be represented in the encoding that the\n"," is using\nfor output. In this case, if the character occurs in a context\nwhere HTML recognizes character references, then the character\n"," be output as a character entity reference or\ndecimal numeric character reference; otherwise (for example, in a\n"," element or in a comment),\nthe ","7.4.13 HTML Output\nMethod: the include-content-type Parameter"," regarding how\nthis parameter is used with the ","\nparameter.","7.4.3 HTML Output\nMethod: the ",", then the HTML output method ","\nadd or remove whitespace as it serializes the ","if it\nobserves the following constraints."," be added or removed\nadjacent to an inline element. 
The inline elements are those\nincluded in the ","%inline"," category of any of the HTML\n4.01 DTDs ","or those elements defined to be phrasing content in\n[HTML5] ",", as well as the\n"," elements if they are used as\ninline elements (i.e., if they do not contain element\nchildren)."," be added or removed\ninside a formatted element, the formatted elements being\n",", the\nlocal parts of the two QNames are equal ","without regard to\ncase"," and one QName has a "," and the namespace URI\nof the other is equal to the XHTML namespace URI.","The effect of the above constraints is to ensure any\ninsertion or deletion of whitespace would not affect how a\nconforming HTML user agent would render the output, assuming the\nserialized document does not refer to any HTML style\nsheets.","Note that the HTML definition of whitespace is different from\nthe XML definition (see section 9.1 of the [HTML] specification).","7.4.4 HTML Output Method: the\n"," parameter is not\napplicable to the HTML output method, except in the case of\n","XML Islands","7.4.5 HTML Output Method: the\n"," parameters are not applicable to the HTML\noutput method.","7.4.6 HTML Output\nMethod: the "," parameters are specified, then the HTML\noutput method "," output a document type\ndeclaration. If the "," parameter is\nspecified, then the output method "," output\n"," followed by the specified public identifier; if\nthe "," parameter is also specified, it\n"," also output the specified system identifier\nfollowing the public identifier. 
If the ","\nparameter is specified but the ","\nparameter is not specified, then the output method\n"," followed by the\nspecified system identifier.","\nserialization parameters are both absent, ","the first element\nnode child of"," the document node that is to be serialized is\nto be ",", the local\npart of the QName of which is equal to the string\n","and any text\nnode that precedes that element node in document contain only\nwhitespace characters,"," then the HTML output method\n"," output a document type declaration, with no\npublic or system identifier.","If the HTML output method "," output a\ndocument type declaration, it "," be serialized\nimmediately before the first element, if any, and the name\nfollowing "," be\n","7.4.7 HTML Output Method: the\n"," parameter is not applicable\nto the HTML output method.","7.4.8 HTML Output Method: the\n"," parameter is applicable to\nthe HTML output method. The values ","7.4.9 HTML\nOutput Method: the "," parameter is applicable to the HTML\noutput method. See "," for more information. See ","7.4.13 HTML Output Method: the\ninclude-content-type Parameter"," regarding how this parameter\nis used with the ","7.4.10 HTML Output Method: the\n"," parameter is applicable to\nthe HTML output method. See ","7.4.11 HTML Output Method: the\n"," parameter is applicable to the\nHTML output method. 
See ","7.4.12 HTML Output Method: the\n",", the HTML output method\n","7.4.13 HTML Output Method: the\n","If there is a "," element, and the\n",", the HTML output method "," element specifying the character encoding\nactually used.","\n\n \n...\n"," be set to the value given\nfor the "," attribute with the value\n\"Content-Type\"",", making the comparison without regard to\ncase after first stripping leading and trailing spaces from the\nvalue of the attribute solely for the purposes of\ncomparison, ","\n \n","\n \n","7.4.14 HTML Output Method: the\n","8 Text Output\nMethod","The Text output method serializes the instance of the data model\nby outputting the string value of the document node created by the markup generation\nstep of the phases of\nserialization without any escaping.","A newline character in the instance of the data model\nMAY be output using any character sequence that is\nconventionally used to represent a line ending in the chosen system\nenvironment.","8.1 The Influence of\nSerialization Parameters upon the Text Output Method","8.1.1 Text Output\nMethod: the "," parameter is not applicable to the Text\noutput method.","8.1.2\nText Output Method: the "," parameter is not applicable to the\nText output method.","8.1.3 Text\nOutput Method: the "," parameter identifies the encoding that\nthe Text output method "," use to convert\nsequences of characters to sequences of bytes. ","] occurs if the "," does not support the\nencoding specified by the "," parameter. The\n"," signal the error. 
If the instance of the data\nmodel contains a character that cannot be represented in the\nencoding that the "," is using for output, the ","8.1.4 Text Output\nMethod: the "," parameters are"," not\napplicable to the Text output method.","8.1.5 Text Output Method: the\n"," parameter is not\napplicable to the Text output method.","8.1.6 Text Output Method: the\n"," parameters are not applicable to the Text\noutput method.","8.1.7 Text Output\nMethod: the ","\nparameters are not applicable to the Text output method.","8.1.8 Text Output Method: the\n"," parameter is not applicable\nto the Text output method.","8.1.9 Text Output Method: the\n"," parameter is applicable to\nthe Text output method. The values ","8.1.10 Text\nOutput Method: the "," parameter is applicable to the Text\noutput method. See ","8.1.11 Text Output Method: the\n"," parameter is applicable to\nthe Text output method. See ","8.1.12 Text Output Method: the\n"," parameter is applicable to the\nText output method. See ","8.1.13 Text Output Method: the\n","8.1.14 Text Output Method: the\n","8.1.15 Text Output Method: the\n","9 Character\nMaps"," parameter is a list of\ncharacters and corresponding string substitutions.","Character maps allow a specific character appearing in a text or\nattribute node in the instance\nof the data model to be replaced with a specified string of\ncharacters during serialization. The string that is substituted is\noutput \"as is,\" and the serializer performs no checks that the resulting\ndocument is well-formed. This mechanism can therefore be used to\nintroduce arbitrary markup in the serialized output. See Section 25.1\nCharacter Maps XT30 of [XSL Transformations (XSLT) Version 3.0] for\nexamples of using character mapping in XSLT.","Character mapping is applied to the characters that actually\nappear in a text or attribute node in the instance of the data model, before any\nother serialization operations such as escaping or Unicode\nNormalization are applied. 
If a character is mapped, then it is\nnot subjected to XML or HTML escaping, nor to Unicode\nNormalization. The string that is substituted for a character is\nnot validated or processed in any way by the serializer , except for translation into the\ntarget encoding. In particular, it is not subjected to XML or HTML\nescaping, it is not subjected to Unicode Normalization, and it is\nnot subjected to further character mapping.","Character mapping is not applied to characters in text "," whose parent elements are listed\nin the "," parameter, nor to\ncharacters for which output escaping has been disabled (disabling\noutput escaping is ","a feature in all versions of XSLT","),\nnor to characters in attribute values that are subject to "," defined for\nthe HTML and XHTML output methods, unless "," has been disabled using the\n"," parameter in the output\ndefinition.","On serialization, occurrences of a character specified in the\n"," in text "," and attribute values are replaced by the\ncorresponding string from the ","Using a character map can result in non-well-formed documents if\nthe string contains XML-significant characters. For example, it is\npossible to create documents containing unmatched start and end\ntags, references to entities that are not declared, or attributes\nthat contain tags or unescaped quotation marks.","If a character is mapped, then it is not subjected to XML or\nHTML escaping.","A serialization error [err:SERE0008 ] occurs if character mapping\ncauses the output of a string containing a character that cannot be\nrepresented in the encoding that the serializer is using for output. The serializer \nMUST signal the error.","10 Conformance","Serialization is intended primarily as a component of a\nhost language .\n[Definition : A host language is\nanother specification that includes, by reference, this\nspecification and all of its requirements. 
A host language might be\na programming language such as [XSL\nTransformations (XSLT) Version 3.0] or [XQuery 3.0: An XML Query Language] , or it\nmight be an application programming interface (API) intended to be\nused by programs written in some other high-level programming\nlanguage. The use of the term language is not intended to\npreclude the possibility that this specification might be\nreferenced outside the context of a programming language\nspecification. ] This document relies on\nspecifications that use it to specify conformance criteria for\nSerialization in their respective environments. Specifications that\nset conformance criteria for their use of Serialization\nMUST NOT change the semantic definitions of\nSerialization as given in this specification, except by subsetting\nand/or compatible extensions. It is the responsibility of the\nhost language to\nspecify how serialization errors are to be\nhandled.","Certain facilities in this specification are described as\nproducing implementation-defined results. A claim that asserts\nconformance with this specification MUST be\naccompanied by documentation stating the effect of each\nimplementation-defined feature. For convenience, a non-normative\nchecklist of implementation-defined features is provided at\nE Checklist of\nImplementation-Defined Features .","A References","A.1 Normative References","XQuery and XPath Data Model (XDM)\n3.0","XQuery and XPath\nData Model (XDM) 3.0 , Norman Walsh, Anders Berglund,\nJohn Snelson, Editors. World Wide Web Consortium, 08 April 2014.\nThis version is\nhttp://www.w3.org/TR/2014/REC-xpath-datamodel-30-20140408/. The\nlatest\nversion is available at\nhttp://www.w3.org/TR/xpath-datamodel-30/.","XQuery and XPath Functions and Operators\n3.0","XQuery and XPath\nFunctions and Operators 3.0 , Michael Kay, Editor. World\nWide Web Consortium, 08 April 2014. This version is\nhttp://www.w3.org/TR/2014/REC-xpath-functions-30-20140408/. 
The\nlatest\nversion is available at\nhttp://www.w3.org/TR/xpath-functions-30/.","HTML5","HTML5 ,\nRobin Berjon, Steve Faulkner, Travis Leithead, et. al. ,\nEditors. World Wide Web Consortium, 04 Feb 2014. This\nversion is http://www.w3.org/TR/2014/CR-html5-20140204/. The\nlatest version is\navailable at http://www.w3.org/TR/html5/.","HTML 4.01\nSpecification , Dave Raggett, Arnaud Le Hors, and Ian\nJacobs, Editors. World Wide Web Consortium, 24 Dec 1999.\nThis version is http://www.w3.org/TR/1999/REC-html401-19991224. The\nlatest version is\navailable at http://www.w3.org/TR/html401.","IANA","Character\nSets . Internet Assigned Numbers Authority. Oct\n2012.","RFC2046","Multipurpose Internet\nMail Extensions (MIME) Part Two: Media Types , N. Freed,\nN. Borenstein. Network Working Group, IETF, Nov 1996.","RFC2119","Key words\nfor use in RFCs to Indicate Requirement Levels , S.\nBradner. Network Working Group, IETF, Mar 1997.","RFC2978","IANA\nCharset Registration Procedures , N. Freed and J. Postel\nNetwork Working Group, IETF, Oct 2000.","Unicode Encoding","Unicode\nCharacter Encoding Model , Unicode Consortium. Unicode\nStandard Annex #17.","UAX #15: Unicode Normalization\nForms","Unicode\nNormalization Forms , Unicode Consortium. Unicode\nStandard Annex #15.","XHTML\n1.0","XHTML™ 1.0 The\nExtensible HyperText Markup Language (Second Edition) ,\nSteven Pemberton, Editor. World Wide Web Consortium,\n01 Aug 2002. This version is\nhttp://www.w3.org/TR/2002/REC-xhtml1-20020801. The latest version is available at\nhttp://www.w3.org/TR/xhtml1.","XHTML\n1.1","XHTML™ 1.1 -\nModule-based XHTML - Second Edition , Shane McCarron and\nMasayasu Ishikawa, Editors. World Wide Web Consortium,\n23 Nov 2010. This version is\nhttp://www.w3.org/TR/2010/REC-xhtml11-20101123. The latest version is available at\nhttp://www.w3.org/TR/xhtml11/.","XML10","Extensible Markup\nLanguage (XML) 1.0 (Fifth Edition) , Tim Bray, Jean\nPaoli, Michael Sperberg-McQueen, et. al. , Editors. 
World\nWide Web Consortium, 26 Nov 2008. This version is\nhttp://www.w3.org/TR/2008/REC-xml-20081126/. The latest version is available at\nhttp://www.w3.org/TR/xml.","XML11","Extensible Markup\nLanguage (XML) 1.1 (Second Edition) , Tim Bray, Jean\nPaoli, Michael Sperberg-McQueen, et. al. , Editors. World\nWide Web Consortium, 16 Aug 2006. This version is\nhttp://www.w3.org/TR/2006/REC-xml11-20060816. The latest version is available at\nhttp://www.w3.org/TR/xml11/.","XML\nNames","Namespaces in\nXML 1.0 (Third Edition) , Tim Bray, Dave Hollander,\nAndrew Layman, et. al. , Editors. World Wide Web\nConsortium, 08 Dec 2009. This version is\nhttp://www.w3.org/TR/2009/REC-xml-names-20091208/. The latest version is available at\nhttp://www.w3.org/TR/xml-names.","XML Names 1.1","Namespaces\nin XML 1.1 (Second Edition) , Tim Bray, Dave Hollander,\nAndrew Layman, and Richard Tobin, Editors. World Wide Web\nConsortium, 16 Aug 2006. This version is\nhttp://www.w3.org/TR/2006/REC-xml-names11-20060816. The latest version is available\nat http://www.w3.org/TR/xml-names11/.","XML\nPath Language (XPath) 3.0","XML Path\nLanguage (XPath) 3.0 , Jonathan Robie, Don Chamberlin,\nMichael Dyck, John Snelson, Editors. World Wide Web Consortium, 08\nApril 2014. This version is\nhttp://www.w3.org/TR/2014/REC-xpath-30-20140408/. The latest version is available at\nhttp://www.w3.org/TR/xpath-30/.","XQuery 3.0: An XML Query Language","XQuery 3.0: An\nXML Query Language , Jonathan Robie, Don Chamberlin,\nMichael Dyck, John Snelson, Editors. World Wide Web Consortium, 08\nApril 2014. This version is\nhttp://www.w3.org/TR/2014/REC-xquery-30-20140408/. The latest version is available\nat http://www.w3.org/TR/xquery-30/.","XSL\nTransformations (XSLT) Version 2.0","XSL\nTransformations (XSLT) Version 2.0 (Second Edition) ,\nMichael Kay, Editor. World Wide Web Consortium, 23 January 2007.\nThis version is http://www.w3.org/TR/2007/REC-xslt20-20070123/. 
The\nlatest version is\navailable at http://www.w3.org/TR/xslt20/.","A.2 Informative References","Character Model for the World Wide Web 1.0:\nNormalization","Character\nModel for the World Wide Web 1.0: Normalization ,\nFrançois Yergeau, Martin Dürst, Richard Ishida, et. al. ,\nEditors. World Wide Web Consortium, 01 May 2012. This\nversion is http://www.w3.org/TR/2012/WD-charmod-norm-20120501/. The\nlatest version is\navailable at http://www.w3.org/TR/charmod-norm/.","Polyglot","Polyglot\nMarkup: A robust profile of the HTML5 vocabulary , Eliot\nGraff and Leif Halvard Silli, Editors. World Wide Web Consortium,\n04 Feb 2014. This version is\nhttp://www.w3.org/TR/2014/WD-html-polyglot-20140204/. The latest version is\navailable at http://www.w3.org/TR/html-polyglot/.","RFC2854","The\n'text/html' Media Type , D. Connolly, L. Masinter.\nNetwork Working Group, IETF, Jun 2000.","RFC3236","The\n'application/xhtml+xml' Media Type , M. Baker and P.\nStark. Network Working Group, IETF, Jan 2002.","XML Schema","XML Schema\nPart 1: Structures Second Edition , Henry Thompson, David\nBeech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web\nConsortium, 28 Oct 2004. This version is\nhttp://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version is available\nat http://www.w3.org/TR/xmlschema-1/.","XHTML Modularization","XHTML™\nModularization 1.1 - Second Edition , Shane McCarron,\nEditor. World Wide Web Consortium, 29 Jul 2010. This\nversion is\nhttp://www.w3.org/TR/2010/REC-xhtml-modularization-20100729. The\nlatest\nversion is available at\nhttp://www.w3.org/TR/xhtml-modularization/.","XQuery 1.0 and XPath 2.0 Data\nModel","XQuery\n1.0 and XPath 2.0 Data Model (XDM) (Second Edition) ,\nNorman Walsh, Mary Fernández, Ashok Malhotra, et. al. ,\nEditors. World Wide Web Consortium, 14 December 2010. This version\nis http://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/. 
The\nlatest version \nis available at http://www.w3.org/TR/xpath-datamodel/.","XSLT 2.0 and XQuery 1.0 Serialization\n(Second Edition)","\nXSLT 2.0 and XQuery 1.0 Serialization (Second Edition), W3C\nRecommendation , Henry Zongaro, Norman Walsh, Joanne\nTong, et. al. , Editors. World Wide Web Consortium,\n14 December 2010. This version is\nhttp://www.w3.org/TR/2010/REC-xslt-xquery-serialization-20101214/","XSL\nTransformations (XSLT) Version 3.0","XSL\nTransformations (XSLT) Version 3.0 , Michael Kay, Editor.\nWorld Wide Web Consortium, 12 December 2013. This version is\nhttp://www.w3.org/TR/2013/WD-xslt-30-20131212/. The latest version is available at\nhttp://www.w3.org/TR/xslt-30/.","B Schema\nfor Serialization Parameters","The following schema describes the structure of a Data Model\ninstance that can be used to specify the settings of serialization\nparameters using the mechanism described in 3.1 Setting Serialization\nParameters by Means of a Data Model Instance .","A copy of this schema is available at \nhttp://www.w3.org/2014/04/xslt-xquery-serialization/schema-for-serialization-parameters.xsd .","C Summary of Error\nConditions","This document uses the ","err"," prefix which represents\nthe same namespace URI (http://www.w3.org/2005/xqt-errors) as\ndefined in ","[XML Path Language (XPath) 3.0]",".\nUse of this namespace prefix binding in this document is not\nnormative.","It is an error if an item in S6 in sequence\nnormalization is an attribute node or a namespace node.","It is an error if the serializer is unable to satisfy the rules for\neither a well-formed XML document entity or a well-formed XML\nexternal general parsed entity, or both, except for content\nmodified by the character expansion phase of serialization.","It is an error to specify the doctype-system parameter, or to\nspecify the standalone parameter with a value other than\n",", if the instance of the data model contains text\nnodes or multiple element nodes as children of the root 
node.","It is an error if the serialized result would contain an\n","It is an error if the serialized result would contain a\ncharacter that is not permitted by the version of XML specified by\nthe ","It is an error if an output encoding other than\n"," is requested and the\n"," does not\nsupport that encoding.","It is an error if a character that cannot be represented in the\nencoding that the serializer is using for output appears in a\ncontext where character references are not allowed (for example if\nthe character occurs in the name of an element).","It is an error if the "," attribute has a value other than\n","; or the "," parameter has a\nvalue other than "," and the\n","It is an error if the output method is ","or\n",", the value of the\n"," parameter is 1.0.","It is an error if the value of the\n"," and any relevant construct of the\nresult begins with a combining character."," does not support the version of XML\nspecified by the ","or the\nversion of HTML specified by the "," or the\n","It is an error to use the HTML output method if characters which\nare permitted in XML but not in the requested HTML\nversion appear in the instance of the data model.","It is an error to use the HTML output method when\n"," appears within a processing instruction in the\ndata model instance being serialized.","It is an error if a parameter value is invalid for the defined\ndomain.","It is an error if evaluating an expression in order to extract\nthe setting of a serialization parameter from a data model instance\nwould yield an error.","It is an error if evaluating an expression in order to extract\nthe setting of the "," serialization\nparameter from a data model instance would yield a sequence of\nlength greater than one.","It is an error if an instance of the data model used to specify\nthe settings of serialization parameters specifies the value of the\nsame parameter more than once",", or if the instance does not\nhave as its root node an element node or a document 
node with an\nelement node child, where the local part of the name of the element\nnode is "," and the namespace URI\nis\n","err:SERE0020","This error has been removed.","D List of URI Attributes","The attributes in the following list are declared as type\n","%URI","%UriList"," for a given HTML or\nXHTML element, with the exception of the ","name","\nattribute for element ","A"," which is not a URI type. The\n"," attribute for element "," be escaped as is recommended by the HTML\nRecommendation "," in Appendix B.2.1.","Attributes","Elements","action","FORM","archive","OBJECT","background","BODY","BLOCKQUOTE, DEL, INS, Q","classid","codebase","APPLET, OBJECT","datasrc","BUTTON, DIV, INPUT, OBJECT, SELECT, SPAN, TABLE, TEXTAREA","for","SCRIPT","formaction","BUTTON, INPUT","href","A, AREA, BASE, LINK","icon","COMMAND","longdesc","FRAME, IFRAME, IMG","manifest","poster","VIDEO","profile","HEAD","src","AUDIO, EMBED, FRAME, IFRAME, IMG, INPUT,\nSCRIPT, SOURCE, TRACK, VIDEO","usemap","IMG, INPUT, OBJECT","value","INPUT","E Checklist of\nImplementation-Defined Features (Non-Normative)","This appendix provides a summary of Serialization features whose\neffect is explicitly implementation-defined. The conformance rules (see\n10 Conformance) require vendors\nto provide documentation that explains how these choices have been\nexercised.","For any implementation-defined output method, it is implementation-defined\nwhether the sequence normalization process takes\nplace. (See 2 Sequence\nNormalization)","If the namespace URI is non-null for the ","\nserialization parameter, then the parameter specifies an ","\noutput method. (See ","The effect of additional serialization parameters on the output\nof the serializer,\nwhere the name of such a parameter MUST be\nnamespace-qualified, is implementation-defined or implementation-dependent. The extent of this effect\non the output MUST NOT override the provisions of\nthis specification. 
(See 3 Serialization\nParameters)","Implementation-defined schema components\nMAY be included in the set of schema components\nthat are used in evaluating an XQuery expression or XSLT\ninstruction in the process of using an XDM instance to determine\nthe settings of serialization parameters. (See 3.1 Setting Serialization\nParameters by Means of a Data Model Instance)","If an instance of the data model used to determine the\nsettings of serialization parameters contains elements or\nattributes that are in a namespace other than\n"," (See ","The effect of providing an option that allows the encoding\nphase to be skipped, so that the result of serialization is a\nstream of Unicode characters, is implementation-defined. The serializer is not required to\nsupport such an option. (See 4 Phases of\nSerialization)","If an implementation supports a value of the\n"," parameter for the XML or XHTML output method\nfor which this document does not provide a normative definition,\nthe behavior is ","5.1.1 XML Output Method: the version\nParameter","A serializer\nMAY provide an implementation-defined mechanism to place CDATA\nsections in the result\ntree. (See 5.1.5 XML\nOutput Method: the cdata-section-elements Parameter)"," form\nparameter is not "," then the\nmeaning of the value and its effect is ",".\n(See ","5.1.9 XML Output Method:\nthe normalization-form Parameter","For information used in the XHTML and HTML output methods\nfor which this specification cites [HTML5],\nimplementations MUST take the information in\nquestion from the version cited in A.1 Normative References, or\nfrom later versions of [HTML5] published by\nW3C. 
If they take the information from versions other than the one\ncited in A.1 Normative\nReferences , then it is implementation-defined which future version of\n[HTML5] is used as the source of the\ninformation; this includes (but is not limited to) the lists of\nelements recognized as HTML elements , void elements , phrasing content , and\nBoolean\nattributes . (See 6 XHTML\nOutput Method )","For the HTML output method, it is ","\nwhether the "," elements, which are not part of HTML5 are\nconsidered to be void elements when the "," has the value ","7.1 Markup for Elements"," parameter for the HTML output method for which\nthis document does not provide a normative definition, the behavior\nis ","7.4.1 HTML Output Method: the version and\nhtml-version Parameters","F Change Log\n(Non-Normative)","This appendix details the changes that have been made since the\npublication of the [XSLT 2.0 and\nXQuery 1.0 Serialization (Second Edition)] .","F.1\nChanges applied for the Recommendation","The following changes have been applied since the publication of\nthe Candidate Recommendation to produce this document.","Bugzilla bug (if applicable)","Erratum (if applicable)","Category","Description of change","Affected sections","Bugzilla bug\n25149","Editorial","Remove normative dependencies to documents whose technical\nstability is not assured.","3.1 Setting\nSerialization Parameters by Means of a Data Model\nInstance","4 Phases of Serialization","6 XHTML Output Method","6.1.4 XHTML Output Method: the indent\nand suppress-indentation Parameters","6.2 Polyglot markup and\nnamespace declarations","7.2 Writing Attributes","7.3 Writing Character\nData","7.4.1 HTML Output Method: the version\nand html-version Parameters","7.4.3 HTML Output Method: the indent\nand suppress-indentation Parameters","9 Character Maps","A.1 Normative\nReferences","A.2 Informative\nReferences","E Checklist of\nImplementation-Defined Features","Bugzilla bug\n25156","Correct typo in sample XSLT expression 
for\n","F.2 Changes\napplied for the Candidate Recommendation","The following changes have been applied since the publication of\nthe fifth Public Working Draft to produce this, the sixth Public\nWorking Draft.","Bugzilla bug\n20245","Editorial improvements to the description of how void elements\nand elements with an empty content model are processed by the XHTML\noutput method.","Bugzilla bug\n20251 and Bugzilla bug\n20261 ","Substantive","Corrections to the description of prefix stripping for the\nXHTML output method.","F.3\nChanges applied for the fifth Public Working Draft","The following changes have been applied since the publication of\nthe fourth Public Working Draft to produce this, the fifth Public\nWorking Draft.","Bugzilla bug\n16311","Added new serialization parameter for specifying a separator\nthat is inserted between items in the sequence that is to be\nserialized.","5.1.15 XML Output Method: the\nitem-separator Parameter","6.1.15 XHTML Output Method:\nthe item-separator Parameter","7.4.14 HTML Output Method: the\nitem-separator Parameter","8.1.15 Text Output Method: the\nitem-separator Parameter","Added XSLT instructions equivalent to XQuery expressions for\nsetting serialization parameters by means of a data model instance,\nand other editorial corrections and improvements.","Clarified the definition of host language to make it clear that\nAPIs can be considered to be host languages.","Bugzilla\n6129","Extended the definitions of the HTML and XHTML output methods\nto include support for HTML5 serialization.","3 Serialization Parameters","5.1.2 XML Output Method: the\nhtml-version Parameter","6.1.2 XHTML Output Method: the\nhtml-version Parameter","6.1.7 XHTML Output Method: the\ndoctype-system and doctype-public Parameters","6.1.14 XHTML Output\nMethod: the include-content-type Parameter","7 HTML Output Method","7.4.6 HTML Output Method: the\ndoctype-system and doctype-public Parameters","Bugzilla\n17619","Text associated with links to the 
definitions of the terms\nNCName, EncName and VersionNum was repeated several times.","5.1.3 XML Output Method: the encoding\nParameter","Bugzilla\n15915","Made uses of the terms \"absent\" and \"unspecified\"\nconsistent.","Bugzilla\n17282","Changed type of the ","\nserialization parameter to ",". This would be an\nincompatible change from XQuery 1.0, for any implementation that\nsupported the Serialization Feature, and supported an\nimplementation-defined value for the\n"," serialization parameter that did\nnot have the lexical form of an ","NMToken","B Schema for Serialization\nParameters","F.4\nChanges applied for the fourth Public Working Draft","The following changes were applied\nfollowing the publication of the third Public Working\nDraft to produce the fourth Public Working Draft. None\nof these changes introduces an incompatibility with [XSLT 2.0 and XQuery 1.0 Serialization\n(Second Edition)] .","Bugzilla bug\n12852","Corrected the type of the "," serialization\nparameter in the Schema for Serialization Parameters.","Bugzilla bug\n13688","Corrected the regular expression associated with the\n","encoding-string-type"," type in the Schema for\nSerialization Parameters, so that hyphens are permitted to appear\nin the ","Bugzilla bug\n10176","SE.E20","Clarified what it means for the html output method to output an\nXML island as XML.","Bugzilla bug\n14751","Corrected typographical errors in the comments associated with\nthe ","yes-no-param-type","encoding-param-type"," types in the Schema for\nSerialization Parameters.","F.5\nChanges applied for the third Public Working Draft","The following changes were applied after the publication of the\nsecond Public Working Draft to produce the third Public Working\nDraft. 
None of these changes introduces an incompatibility with\n[XSLT 2.0 and XQuery 1.0\nSerialization (Second Edition)] , except as noted below.","Bugzilla bug\n11635","SE.E19","Clarified that serialization error SEPM0010 applies to the\nxhtml output method as well as the xml output method.","F.6\nChanges applied for the second Public Working Draft","The following changes were applied after the publication of the\nfirst Public Working Draft to produce the second Public Working\nDraft. None of these changes introduces an incompatibility with\n[XSLT 2.0 and XQuery 1.0\nSerialization (Second Edition)] , except as noted below.","Bugzilla bug\n6535","Added definition of the ","\nserialization parameter.","5.1.4 XML Output Method: the indent and\nsuppress-indentation Parameters","8.1.4 Text Output Method: the indent\nand suppress-indentation Parameters","Bugzilla bug\n7829","SE.E14","Clarified how minimized attributes are handled under the rules\nof the HTML output method.","Bugzilla bug\n8245","SE.E15","Corrected description of a serialization error that mentions\nwhich control characters are not permitted under the rules of the\nHTML output method","Bugzilla bug\n7823","SE.E16","Clarified how the ","\nelements are handled for the HTML output method.","Bugzilla bug\n8651","SE.E17","Clarified what it means to compare without regard to case.","Bugzilla bug\n8206","SE.E18","Clarified what it means to escape according to HTML or XML\nrules.","Bugzilla bug\n6808","Relaxed rules for the XML output method that specify where a\nserializer is permitted to add whitespace. 
This introduces an\nincompatibility only inasmuch as the serialized results produced by\na serializer conforming to this specification could differ from the\nresults a serializer that adheres to [XSLT 2.0 and XQuery 1.0 Serialization\n(Second Edition)] would be permitted to produce.","Bugzilla bug\n9302","Defined a mechanism for specifying serialization parameter\nsettings in the form of a data model instance.","Replaced all uses of the words legal and\nillegal with more appropriate terms.","F.7\nChanges applied for the first Public Working Draft","The following changes were applied after the publication of\n[XSLT 2.0 and XQuery 1.0\nSerialization (Second Edition)] to produce the first Public\nWorking Draft. None of these changes introduces an incompatibility\nwith [XSLT 2.0 and XQuery 1.0\nSerialization (Second Edition)] .","Bugzilla bug\n6723","SE.E13","Clarified how HTML elements that have no children but whose\ncontent model is not empty are serialized.","Bugzilla bug\n6732","SE.E12","Clarified for which versions of XML and HTML this document\nmakes normative statements.","Take into account presence of function items in a sequence that\nis to be serialized.","Miscellaneous minor editorial corrections and\nimprovements."]}
XSLT and XQuery Serialization 3.0
W3C Recommendation
08 April 2014
Status Update (6 April 2021): Feedback, comments, error reports on this specification should be sent via GitHub
https://github.com/w3c/qtspecs/issues or email to public-qt-comments@w3.org .
This version:
http://www.w3.org/TR/2014/REC-xslt-xquery-serialization-30-20140408/
Latest version of XSLT and XQuery Serialization 3.0:
http://www.w3.org/TR/xslt-xquery-serialization-30/
Previous versions of XSLT and XQuery Serialization 3.0:
http://www.w3.org/TR/2013/PR-xslt-xquery-serialization-30-20131022/ ,
http://www.w3.org/TR/2013/CR-xslt-xquery-serialization-30-20130328/ ,
http://www.w3.org/TR/2013/WD-xslt-xquery-serialization-30-20130108/ ,
http://www.w3.org/TR/2011/WD-xslt-xquery-serialization-30-20111213/ ,
http://www.w3.org/TR/2011/WD-xslt-xquery-serialization-30-20110614/ ,
http://www.w3.org/TR/2010/WD-xslt-xquery-serialization-30-20101214/ ,
http://www.w3.org/TR/2009/WD-xslt-xquery-serialization-11-20091215/
Most recent version of XSLT and XQuery Serialization 3:
http://www.w3.org/TR/xslt-xquery-serialization-3/
Most recent version of XSLT and XQuery Serialization:
http://www.w3.org/TR/xslt-xquery-serialization/
Most recent Recommendation of XSLT and XQuery
Serialization:
http://www.w3.org/TR/2010/REC-xslt-xquery-serialization-20101214/
Editors:
Henry Zongaro, IBM Canada Software Lab - Toronto Site <http://www-03.ibm.com/software/ca/en/canadalab/locations.html>
Andrew Coleman, IBM Hursley Laboratories <andrew_coleman@uk.ibm.com>
C. M. Sperberg-McQueen, Black Mesa Technologies <http://blackmesatech.com/>
Please check the
errata for any errors or issues reported since
publication.
See also
translations .
This document is also available in these non-normative formats:
XML and Change
markings relative to 1.0 Recommendation .
Copyright © 2014 W3C ®
(MIT , ERCIM ,
Keio , Beihang ), All Rights Reserved. W3C
liability ,
trademark
and document
use rules apply.
Status of this Document
This section describes the status of this document at the
time of its publication. Other documents may supersede this
document. A list of current W3C publications and the latest
revision of this technical report can be found in the W3C technical reports index at
http://www.w3.org/TR/.
This is one document in a set of six documents that have been
progressed to Recommendation together (XQuery 3.0, XQueryX 3.0,
XPath 3.0, Data Model 3.0, Functions and Operators 3.0, and
Serialization 3.0).
This is a Recommendation
of the W3C. It was jointly developed by the W3C XSLT Working Group and the W3C
XML Query Working Group ,
each of which is part of the XML Activity .
This Recommendation of XSLT and XQuery Serialization 3.0
represents the second version of
a previous W3C Recommendation .
This specification is designed to be referenced normatively from
other specifications defining a host language for it; it is not
intended to be implemented outside a host language. The
implementability of this specification has been tested in the
context of its normative inclusion in host languages defined by the
XQuery 3.0 and XSLT
3.0 (expected in 2014) specifications; see the XQuery
3.0 implementation report (and, in the future, the WGs expect
that there will also be a — possibly member-only — XSLT 3.0
implementation report) for details.
This document incorporates minor changes made against the
Proposed
Recommendation of 22 October 2013. Changes to this document
since the Proposed
Recommendation are detailed in F
Change Log .
Please report errors in this document using W3C's public Bugzilla system
(instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla ).
If access to that system is not feasible, you may send your
comments to the W3C XSLT/XPath/XQuery public comments mailing list,
public-qt-comments@w3.org .
It will be very helpful if you include the string “[SER30]” in the
subject line of your report, whether made in Bugzilla or in email.
Please use multiple Bugzilla entries (or, if necessary, multiple
email messages) if you have more than one comment to make. Archives
of the comments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/ .
This document has been reviewed by W3C Members, by software
developers, and by other W3C groups and interested parties, and is
endorsed by the Director as a W3C Recommendation. It is a stable
document and may be used as reference material or cited from
another document. W3C's role in making the Recommendation is to
draw attention to the specification and to promote its widespread
deployment. This enhances the functionality and interoperability of
the Web.
This document was produced by groups operating under the
5
February 2004 W3C Patent Policy . W3C maintains a public
list of any patent disclosures made in connection with the
deliverables of the XML Query Working Group and also maintains a
public
list of any patent disclosures made in connection with the
deliverables of the XSL Working Group; those pages also include
instructions for disclosing a patent. An individual who has actual
knowledge of a patent which the individual believes contains
Essential Claim(s) must disclose the information in accordance
with
section 6 of the W3C Patent Policy .
1 Introduction
This document defines serialization of the W3C XQuery and XPath
Data Model 3.0 (XDM), which is the data model of at
least [XML Path Language (XPath) 3.0] ,
[XSL Transformations (XSLT) Version 3.0] ,
and [XQuery 3.0: An XML Query Language] ,
and any other specifications that reference it.
In this document, examples and material labeled as "Note" are
provided for explanatory purposes and are not normative.
Serialization is the process of converting an instance of the
[XQuery and XPath Data Model (XDM)
3.0] into a sequence of octets. Serialization is well-defined
for most data model instances.
1.1 Terminology
In this specification, where they appear in upper case, the
words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY",
"REQUIRED", and "RECOMMENDED" are to be interpreted as described in
[RFC2119] .
[Definition : As is indicated in 10 Conformance , conformance criteria for
serialization are determined by other specifications that refer to
this specification. A serializer is software that implements
some or all of the requirements of this specification in accordance
with such conformance criteria.] A serializer is not
REQUIRED to directly provide a programming
interface that permits a user to set serialization parameters or to
provide an input sequence for serialization. In this
document, material labeled as "Note" and examples are provided for
explanatory purposes and are not normative.
Certain aspects of serialization are described in this
specification as implementation-defined or implementation-dependent .
[Definition :
Implementation-defined indicates an aspect that
MAY differ between serializers , but whose actual behavior
MUST be specified either by another specification
that sets conformance criteria for serialization (see 10 Conformance ) or in documentation that
accompanies the serializer .]
[Definition :
Implementation-dependent indicates an aspect that
MAY differ between serializers , and whose actual behavior is not
REQUIRED to be specified either by another
specification that sets conformance criteria for serialization (see
10 Conformance ) or in
documentation that accompanies the serializer .]
[Definition : In some instances, the
sequence that is input to serialization cannot be successfully
converted into a sequence of octets given the set of serialization
parameter (3 Serialization
Parameters ) values specified. A serialization error
is said to occur in such an instance.] In some cases, a serializer is
REQUIRED to signal such an error. What it means to
signal a serialization error is determined by the relevant
conformance criteria (10
Conformance ) to which the serializer conforms. In other cases, there is an
implementation-defined choice between signaling a
serialization error and performing a recovery action. Such a
recovery action will allow a serializer to produce a sequence of octets that
might not fully reflect the usual requirements of the parameter
settings that are in effect.
[Definition : Where this specification
indicates that two strings are to be compared without regard to
case , the serializer MUST translate any
characters in the range #x41 (LATIN CAPITAL LETTER A) to #x5A
(LATIN CAPITAL LETTER Z), inclusive, to the corresponding
lower-case letters in the range #x61 (LATIN SMALL LETTER A) to #x7A
(LATIN SMALL LETTER Z) only for the purposes of making the
comparison. The comparison succeeds if the two strings are the same
length and the code point of each character in the first string is
equal to the code point of the character in the corresponding
position in the second string.]
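The comparison defined above can be illustrated with a minimal, non-normative Python sketch. The function names are hypothetical and are not defined by this specification. The point of the sketch is that only the 26 characters #x41 to #x5A are translated, so a general-purpose Unicode case mapping is not a substitute.

```python
def fold_ascii_upper(s: str) -> str:
    """Translate only #x41..#x5A to #x61..#x7A; leave all other characters unchanged."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def equal_without_regard_to_case(a: str, b: str) -> bool:
    """Succeeds when both strings have the same length and the same code
    point at every position after the ASCII-only translation."""
    return fold_ascii_upper(a) == fold_ascii_upper(b)
```

Under this sketch, "CHECKED" and "checked" compare equal, while U+212A (KELVIN SIGN) and "k" do not, even though a full Unicode lower-casing would map one to the other.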
Many terms used in this document are defined in the XPath
specification [XML Path Language (XPath)
3.0] or the Data Model specification [XQuery and XPath Data Model (XDM) 3.0] .
Particular attention is drawn to the following:
[Definition : The term atomization is defined in
Section
2.4.2 Atomization XP30 . It is a
process that takes as input a sequence of nodes and atomic
values XP30 , and returns a sequence of
atomic
values XP30 , in which the nodes are replaced by their typed
values XP30 as defined in [XQuery and XPath Data Model (XDM)
3.0] .]
[Definition : The
term Node is defined as part of Section 6 Nodes
DM30 . There are seven kinds of nodes in the data model: document,
element, attribute, text, namespace, processing instruction, and
comment.]
[Definition : The term sequence is defined in
Section 2
Basics XP30 . A sequence is an ordered collection of zero
or more items.]
[Definition : The term function is defined
in Section
2.8.1 Functions DM30 .]
[Definition : The term string value is
defined in Section
5.13 string-value Accessor DM30 .
Every node has a string value . For
example, the string
value of an element is the concatenation of the string values of all its
descendant text nodes .]
[Definition : The term expanded QName is
defined in Section 2 Basics
XP30 . An expanded QName consists of an optional
namespace URI and a local name. An expanded QName also retains its original
namespace prefix (if any), to facilitate casting the expanded QName
into a string.]
[Definition : An element or attribute that
is in no namespace, or an expanded-QName whose namespace part is an
empty sequence, is referred to as having a null namespace
URI ].
[Definition : An element or
attribute that does not have a null namespace URI , is referred to as
having a non-null namespace URI ].
[Definition : A space character, TAB
character, CR character or NL character is referred to as a
whitespace character .]
Where this specification indicates that an XSLT instruction is
evaluated, the behavior is as specified by [XSL
Transformations (XSLT) Version 2.0] . Where it indicates that an
XQuery expression is evaluated, the behavior is as specified by
[XQuery 3.0: An XML Query Language] .
1.2 Namespaces
This specification refers to several namespaces that affect the
process of serialization. These are:
[Definition : the Output
declaration namespace ,
http://www.w3.org/2010/xslt-xquery-serialization
];
[Definition : the XML namespace ,
http://www.w3.org/XML/1998/namespace
];
[Definition : the XHTML namespace ,
http://www.w3.org/1999/xhtml
];
[Definition : the SVG namespace ,
http://www.w3.org/2000/svg
]; and
[Definition : the MathML namespace ,
http://www.w3.org/1998/Math/MathML
.]
Wherever an element node or attribute node is said to be in a
particular namespace, it is understood that the namespace URI of
the node is equal to the namespace URI corresponding to that
namespace. Wherever a namespace node is said to be a namespace node
for a particular namespace, it is understood that the string value of the node
is equal to the namespace URI corresponding to that namespace.
2 Sequence Normalization
An instance of the data model that is input to the serialization
process is a sequence. Prior to serializing a sequence using any of
the output methods whose behavior is specified by this document
(3 Serialization Parameters ), the
serializer
MUST first compute a normalized sequence for
serialization; it is the normalized sequence that is actually
serialized. [Definition : The purpose of sequence
normalization is to create a sequence that can be serialized as
a well-formed XML document or external general parsed entity, that
also reflects the content of the input sequence to the extent
possible.] [Definition : The result of the sequence
normalization process is a result tree .]
The normalized sequence for serialization is constructed by
applying all of the following rules in order, with the initial
sequence being input to the first step, and the sequence that
results from any step being used as input to the subsequent step.
For any implementation-defined output method, it is implementation-defined
whether this sequence normalization process takes place.
Where the process of converting the input sequence to a
normalized sequence indicates that a value MUST be
cast to xs:string, that operation is defined in
Section 18.1.1 Casting to xs:string and xs:untypedAtomic FO30
of [XQuery and XPath Functions and Operators 3.0] .
Where a step in the sequence normalization process
indicates that a node should be copied, the copy is performed in
the same way as an XSLT xsl:copy-of instruction that
has a validation attribute whose value is preserve
and has a select attribute whose
effective value is the node, as described in Section 11.9.2 Deep
Copy XT of [XSL Transformations (XSLT) Version 2.0] , or equivalently in
the same way as an XQuery content expression as described in Step
1e of Section 3.9.1.3 Content XQ30 of [XQuery 3.0: An XML Query Language] ,
where the construction mode is preserve. The steps in
computing the normalized sequence are:
1. If the sequence that is input to serialization is empty, create
a sequence S1 that consists of a zero-length
string. Otherwise, copy each item in the sequence that is input to
serialization to create the new sequence S1.
2. For each item in S1, if the item is atomic,
obtain the lexical representation of the item by casting it to an
xs:string and copy the string representation to the
new sequence; otherwise, copy the item to the new sequence. The new
sequence is S2.
3. If the item-separator serialization parameter
is absent, then for each subsequence of adjacent strings in
S2, copy a single string to the new sequence
equal to the values of the strings in the subsequence concatenated
in order, each separated by a single space. Copy all other items to
the new sequence. Otherwise, copy each item in
S2 to the new sequence, inserting between each
pair of items a string whose value is equal to the value of the
item-separator parameter. The new sequence is S3.
4. For each item in S3, if the item is a
string, create a text node in
the new sequence whose string value is equal to the string;
otherwise, copy the item to the new sequence. The new sequence is S4.
5. For each item in S4, if the item is a
document node, copy its
children to the new sequence; otherwise, copy the item to the new
sequence. The new sequence is S5.
6. For each subsequence of adjacent text nodes in
S5, copy a single text node to the new sequence
equal to the values of the text nodes in the subsequence
concatenated in order. Any text nodes with values of zero length
are dropped. Copy all other items to the new sequence. The new
sequence is S6.
7. It is a serialization error [err:SENR0001] if an item in
S6 is an attribute node, a namespace node or a function. Otherwise, construct a
new sequence, S7, that consists of a single
document node and copy all the
items in the sequence, which are all nodes, as children of that document node.
S7 is the normalized sequence.
The result tree
rooted at the document node
that is created by the final step of this sequence normalization
process is the instance of the data model to which the rules of the
appropriate output method are applied. If the sequence
normalization process results in a serialization error , the serializer
MUST signal the error.
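The sequence normalization steps can be traced with a small, non-normative Python sketch. Everything in it is an illustrative assumption rather than part of this specification: the Node class is a toy stand-in for data model nodes, Python callables stand in for function items, and to_xs_string only approximates casting to xs:string (the real casting rules for types such as xs:double differ from Python's str).

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for an XDM node (hypothetical, for illustration only)."""
    kind: str                       # "document", "element", "text", "attribute", ...
    value: str = ""                 # string value, used here for text nodes
    children: List["Node"] = field(default_factory=list)


class SerializationError(Exception):
    pass


def to_xs_string(atom) -> str:
    """Rough approximation of casting an atomic value to xs:string."""
    if isinstance(atom, bool):
        return "true" if atom else "false"
    return str(atom)


def is_text(item) -> bool:
    return isinstance(item, Node) and item.kind == "text"


def normalize(seq, item_separator=None) -> Node:
    items = list(seq)
    # Step 1: an empty input becomes a single zero-length string; nodes are copied.
    s1 = [copy.deepcopy(i) if isinstance(i, Node) else i for i in items] if items else [""]
    # Step 2: atomic values are replaced by their string representation.
    s2 = [i if isinstance(i, Node) or callable(i) else to_xs_string(i) for i in s1]
    # Step 3: join adjacent strings with a space, or interleave the separator.
    s3 = []
    if item_separator is None:
        for i in s2:
            if isinstance(i, str) and s3 and isinstance(s3[-1], str):
                s3[-1] = s3[-1] + " " + i
            else:
                s3.append(i)
    else:
        for pos, i in enumerate(s2):
            if pos > 0:
                s3.append(item_separator)
            s3.append(i)
    # Step 4: strings become text nodes.
    s4 = [Node("text", value=i) if isinstance(i, str) else i for i in s3]
    # Step 5: document nodes are replaced by their children.
    s5 = []
    for i in s4:
        if isinstance(i, Node) and i.kind == "document":
            s5.extend(i.children)
        else:
            s5.append(i)
    # Step 6: merge adjacent text nodes, then drop zero-length text nodes.
    s6 = []
    for i in s5:
        if is_text(i) and s6 and is_text(s6[-1]):
            s6[-1] = Node("text", value=s6[-1].value + i.value)
        else:
            s6.append(i)
    s6 = [i for i in s6 if not (is_text(i) and i.value == "")]
    # Step 7: attribute nodes, namespace nodes and functions are errors.
    for i in s6:
        if callable(i) or i.kind in ("attribute", "namespace"):
            raise SerializationError("err:SENR0001")
    return Node("document", children=s6)
```

For example, with no item-separator the input (1, 2, "a") yields a document node with one text child whose value is "1 2 a"; with an item-separator of "," the input (1, an element, 2) yields the text node "1,", the element, and the text node ",2".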
Note:
If the item-separator serialization parameter
is absent, the sequence normalization process for a sequence
$seq is equivalent to constructing a document
node using the XSLT instruction:
<xsl:document>
  <xsl:copy-of select="$seq" validation="preserve"/>
</xsl:document>
or the XQuery expression:
declare construction preserve;
document { $seq }
If the item-separator serialization parameter
is present, the sequence normalization process for a sequence
$seq is equivalent to constructing a document
node using the XSLT instruction:
<xsl:document>
  <xsl:for-each select="$seq">
    <xsl:sequence select="if (position() gt 1) then $sep else ()"/>
    <xsl:choose>
      <xsl:when test=". instance of node()">
        <xsl:sequence select="."/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:document>
or the XQuery expression:
declare construction preserve;
document {
  for $item at $pos in $seq
  let $node :=
    if ($item instance of node()) then
      $item
    else
      text { $item }
  return
    if ($pos eq 1) then
      $node
    else
      ($sep, $node)
}
where the value of the sep variable is a
string whose value is equal to the value of the
item-separator serialization parameter.
This process results in a serialization error [err:SENR0001] if $seq
contains functions, attribute nodes or namespace nodes.
3 Serialization
Parameters
There are a number of parameters that influence how
serialization is performed. Host languages MAY allow
users to specify any or all of these parameters, but they are not
REQUIRED to be able to do so. However, the
host language
specification MUST specify how the values of all
applicable parameters are to be determined.
It is a serialization error [err:SEPM0016 ] if a parameter value is
invalid for the given parameter. It is the responsibility of the
host language to
specify how invalid values should be handled at the level of that
language.
The following serialization parameters are defined:
Serialization parameter name
Permitted values for parameter
byte-order-mark
One of the enumerated values yes or no. This parameter indicates whether the serialized
sequence of octets is to be preceded by a Byte Order Mark. (See
Section 5.1 of [Unicode Encoding].)
The actual octet order used is implementation-dependent. If the encoding
defines no Byte Order Mark, or if the Byte Order Mark is prohibited
for the specific Unicode encoding or implementation environment,
then this parameter is ignored.
cdata-section-elements
A list of expanded QNames, possibly empty.
doctype-public
A string of PubidChar XML characters. This parameter MAY be absent.
doctype-system
A string of Unicode characters that does not
include both an apostrophe (#x27) and a quotation mark (#x22)
character. This parameter MAY be absent.
encoding
A string of Unicode characters in the range #x21 to #x7E (that
is, printable ASCII characters); the value SHOULD
be a charset registered with the Internet Assigned Numbers
Authority [IANA], [RFC2978] or begin with the characters x- or X-.
escape-uri-attributes
One of the enumerated values yes or no.
html-version
A decimal value. This parameter MAY be absent.
include-content-type
One of the enumerated values yes or no.
indent
One of the enumerated values yes or no.
item-separator
A string of Unicode characters. This parameter MAY be absent.
media-type
A string of Unicode characters specifying the media type (MIME
content type) [RFC2046]; the charset
parameter of the media type MUST NOT be specified
explicitly in the value of the media-type parameter.
If the destination of the serialized output is annotated with a
media type, this parameter MAY be used to provide
such an annotation. For example, it MAY be used to
set the media type in an HTTP header.
method
An expanded QName with a null namespace URI, and the local part of
the name equal to one of xml, xhtml, html or text, or having a non-null
namespace URI. If the namespace URI is non-null, the parameter
specifies an implementation-defined output method.
normalization-form
One of the enumerated values NFC, NFD, NFKC, NFKD,
fully-normalized or none, or an implementation-defined
value of type NMTOKEN.
omit-xml-declaration
One of the enumerated values yes or no.
standalone
One of the enumerated values yes, no or omit.
suppress-indentation
A list of expanded QNames, possibly empty.
undeclare-prefixes
One of the enumerated values yes or no.
use-character-maps
A list of pairs, possibly empty, with each pair consisting of a
single Unicode character and a string of Unicode characters.
version
A string of Unicode characters.
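A few of the constraints in the table above can be checked mechanically. The following non-normative Python sketch is one possible way to do so; the function and exception names are hypothetical, only a subset of the parameters is covered, and how an invalid value is actually reported is a matter for the host language.

```python
# Parameters whose permitted values are exactly "yes" or "no".
YES_NO_PARAMETERS = {
    "byte-order-mark",
    "escape-uri-attributes",
    "include-content-type",
    "indent",
    "omit-xml-declaration",
    "undeclare-prefixes",
}


class SerializationError(Exception):
    pass


def check_parameter(name: str, value: str) -> None:
    """Raise SerializationError (err:SEPM0016) if value is not permitted for name.

    Parameters not covered by this sketch are accepted without checking.
    """
    if name in YES_NO_PARAMETERS:
        ok = value in ("yes", "no")
    elif name == "standalone":
        ok = value in ("yes", "no", "omit")
    elif name == "doctype-system":
        # Must not contain both an apostrophe (#x27) and a quotation mark (#x22).
        ok = not ("'" in value and '"' in value)
    elif name == "encoding":
        # Every character must lie in the range #x21 to #x7E.
        ok = all(0x21 <= ord(c) <= 0x7E for c in value)
    else:
        return
    if not ok:
        raise SerializationError(f"err:SEPM0016: invalid value for {name}: {value!r}")
```

A doctype-system value may contain either kind of quote on its own; only a value that contains both is rejected, since it could not be delimited in a document type declaration.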
The value of the method parameter is an expanded QName. If
the value has a null namespace URI, then the local name
identifies a method specified in this document and
MUST be one of xml, html, xhtml, or text; in
this case, the output method specified MUST be
used for serializing. If the namespace URI is non-null, then it
identifies an implementation-defined output method; the behavior in
this case is not specified by this document.
In those cases where they have no important effect on the
content of the serialized result, details of the output methods
defined by this specification are left unspecified and are regarded
as implementation-dependent . Whether a serializer uses apostrophes or
quotation marks to delimit attribute values in the XML output
method is an example of such a detail.
The detailed semantics of each parameter will be described
separately for each output method for which it is
applicable . If the semantics of a parameter are not
described for an output method, then it is not applicable to that
output method.
Implementations MAY define additional
serialization parameters, and MAY allow users to
do so. For this purpose, the name of a serialization parameter is
considered to be a QName; the parameters listed above are QNames in
no namespace, while any additional serialization parameters
that are either implementation-defined or defined by the host language
MUST have names that are namespace-qualified.
Any such additional serialization parameters MUST
NOT be in the namespace
http://www.w3.org/2010/xslt-xquery-serialization
. A
host language
MAY specify the means by which an implementation
can define such an additional serialization parameter, and
implementations MAY provide mechanisms by which
users can define such an additional serialization parameter.
If the serialization method is one of the four methods
xml, html, xhtml, or text, then the additional serialization parameters
MAY affect the output of the serializer to the extent (but only to the
extent) that this specification leaves the output implementation-defined
or implementation-dependent. For example, such
parameters might control whether namespace declarations on an
element are written before or after the attributes of the element,
or they might define the number of space or tab characters to be
inserted when the indent parameter is set to
yes; but they could not instruct the serializer to suppress the
error that occurs when the HTML output method encounters characters
that are not permitted (see error [err:SERE0014]).
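The naming rule for additional serialization parameters can be expressed as a short, non-normative Python check. The function name is hypothetical; the rule it encodes is the one stated above: the name must be namespace-qualified and must not be in the output declaration namespace.

```python
from typing import Optional

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def is_permitted_additional_parameter(namespace_uri: Optional[str]) -> bool:
    """True if a parameter name with this namespace URI may be used for an
    implementation-defined or host-language-defined serialization parameter."""
    return bool(namespace_uri) and namespace_uri != OUTPUT_NS
```

The namespace http://example.com/ext used in such a check would be an arbitrary placeholder chosen by the implementation, not a value defined by this specification.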
3.1 Setting Serialization
Parameters by Means of a Data Model Instance
A host
language MAY provide, by reference to this
section, a mechanism by which the settings of serialization
parameters are supplied in the form of an instance of the data
model as specified in [XQuery and
XPath Data Model (XDM) 3.0] . The instance of the data model
used to determine the settings of serialization parameters
MUST be processed as if by the procedure described
below.
With the exception of the use-character-maps
parameter, the setting of each serialization parameter
defined in this specification is equal to the result
of evaluating the XQuery expression
(validate lax { document { . } })
  /output:serialization-parameters
  /output:*[local-name() eq $param-name]/data(@value)
or equivalently the XSLT instructions
<xsl:sequence>
  <xsl:variable name="validated-instance">
    <xsl:document validation="lax">
      <xsl:sequence select="."/>
    </xsl:document>
  </xsl:variable>
  <xsl:sequence select="$validated-instance
                        /output:serialization-parameters
                        /output:*[local-name() eq $param-name]/data(@value)"/>
</xsl:sequence>
with the supplied instance of the data model as the context
item, the param-name variable having as its value a
value of type xs:string equal to the local part of the
name of the particular serialization parameter, and the other
components of the dynamic context and static context as specified
in the subsequent tables. If in any case evaluating this expression
would yield an error, serialization error [err:SEPM0017] results.
If the result of evaluating this expression for a particular
serialization parameter is the empty sequence:
if the parameter is either cdata-section-elements
or suppress-indentation
and the result of evaluating
the XQuery expression
(validate lax { document { . } })
/output:serialization-parameters
/output:*[local-name() eq $param-name]
or equivalently the XSLT instructions
<xsl:sequence>
<xsl:variable name="validated-instance">
<xsl:document select="." validation="lax">
<xsl:sequence select="."/>
</xsl:document>
</xsl:variable>
<xsl:sequence select="$validated-instance
/output:serialization-parameters
/output:*[local-name() eq $param-name] "/>
</xsl:sequence>
with the same settings of the static context and dynamic context
is not an empty sequence, the setting of the parameter is the empty
list;
otherwise, the setting of the parameter is
absent .
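The lookup rules above can be sketched in Python. This is a minimal illustration rather than a conforming implementation: it assumes the parameters arrive as an already-parsed XML element, skips schema validation entirely (so no [err:SEPM0017] checking), and treats list-valued settings as whitespace-separated strings. The names `parameter_setting`, `ABSENT` and `LIST_PARAMS` are hypothetical.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"
ABSENT = object()  # hypothetical sentinel standing in for an "absent" setting
LIST_PARAMS = {"cdata-section-elements", "suppress-indentation"}


def parameter_setting(root, param_name):
    """Return the setting of one parameter: a value, an empty list, or ABSENT."""
    if root.tag != "{%s}serialization-parameters" % OUTPUT_NS:
        return ABSENT  # the path expression selects nothing
    element = root.find("{%s}%s" % (OUTPUT_NS, param_name))
    if element is None:
        return ABSENT  # no element for this parameter at all
    value = element.get("value")
    if param_name in LIST_PARAMS:
        # Element present: an empty or missing value yields the empty list.
        return value.split() if value else []
    return value if value is not None else ABSENT
```

The use-character-maps parameter is deliberately not handled here, since it follows different rules.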
The components of the static context used in evaluating the
XQuery expressions or XSLT instructions are as defined
in the following table.
Static Context Component | XQuery or XSLT | Setting
XPath 1.0 compatibility mode | Both | false
Statically known namespaces | XQuery | The pair (output,http://www.w3.org/2010/xslt-xquery-serialization)
Statically known namespaces | XSLT | The pairs (output,http://www.w3.org/2010/xslt-xquery-serialization), (xslt,http://www.w3.org/1999/XSL/Transform)
Default element/type namespace | Both | "none"
Default function namespace | Both | http://www.w3.org/2005/xpath-functions
In-scope schema types, In-scope element declarations, Substitution groups, In-scope attribute declarations | Both | As defined by the schema for serialization parameters (B Schema for Serialization Parameters) and any additional implementation-defined in-scope schema components
In-scope variables | Both | {param-name}
Context item static type | Both | node()
Statically-known function signatures | Both | {fn:data($arg as item()*) as xs:anyAtomicType*}
Statically known collations | Both | { (http://www.w3.org/2005/xpath-functions/collation/codepoint, The Unicode codepoint collation) }
Default collation | Both | The Unicode codepoint collation
Construction mode | XQuery | strip
Ordering mode | XQuery | ordered
Default order for empty sequences | XQuery | least
Boundary space policy | XQuery | strip
Copy-namespaces mode | XQuery | (preserve,inherit)
Base URI | Both | Absent
Statically known documents | Both | None
Statically known collections | Both | None
Statically known default collection type | Both | node()*
Statically known decimal formats | Both | None
Set of named keys | XSLT | {}
Values of system properties | XSLT | None
Set of available instructions | XSLT | The empty set (not needed for evaluating these expressions).
The remaining components of the dynamic context used in
evaluating the XQuery expressions or XSLT instructions
are as defined in the following table.
Dynamic Context Component | XQuery or XSLT | Setting
Context position | Both | 1
Context size | Both | 1
Variable values | Both | The param-name variable has a value of type xs:string equal to the local part of the name of the serialization parameter under consideration
Function implementations | Both | The implementation of fn:data
Current dateTime | Both | Absent
Implicit timezone | Both | Absent
Available documents | Both | None
Available collections | Both | None
Default collection | Both | None
Current template rule | XSLT | Absent
Current mode | XSLT | The default mode
Current group | XSLT | Absent
Current grouping key | XSLT | Absent
Current captured substrings | XSLT | The empty sequence
Output state | XSLT | Temporary output state
In the case of the use-character-maps
parameter,
the XQuery expression
(validate lax { document { . } })
/output:serialization-parameters/output:use-character-maps
/output:character-map[@character eq $char]/string(@map-string)
or equivalently the XSLT instructions
<xsl:sequence>
<xsl:variable name="validated-instance">
<xsl:document validation="lax">
<xsl:sequence select="."/>
</xsl:document>
</xsl:variable>
<xsl:sequence select="$validated-instance
/output:serialization-parameters
/output:use-character-maps
/output:character-map[@character eq $char]
/string(@map-string)"/>
</xsl:sequence>
is evaluated for each Unicode character that is permitted in an
XML document. The dynamic context and static context used to
evaluate the expression are as defined above, except that in-scope
variables is the set {char} and the value of the variable "char" is
a value of type xs:string of length one whose value is the Unicode
character under consideration. If the result of evaluating the
expression is not an empty sequence, the pair consisting of the
Unicode character and the result of evaluating the expression is
part of the list of pairs in the value of the use-character-maps
parameter. It is a serialization error [err:SEPM0018] if
the result of evaluating this expression for any character is a
sequence of length greater than one.
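As a rough illustration of the use-character-maps rule, the following Python sketch collects the (character, map-string) pairs from a parsed parameters element. It iterates over the supplied output:character-map elements rather than over every Unicode character, which should produce the same pairs; schema validation is not modelled, and the function name `character_map_pairs` is hypothetical.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def character_map_pairs(root):
    """Collect character -> map-string pairs; reject a character mapped twice."""
    pairs = {}
    path = "{%s}use-character-maps/{%s}character-map" % (OUTPUT_NS, OUTPUT_NS)
    for entry in root.findall(path):
        char = entry.get("character")
        if char in pairs:
            # The per-character expression would return more than one item.
            raise ValueError("SEPM0018: character %r is mapped more than once" % char)
        pairs[char] = entry.get("map-string")
    return pairs
```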
Using the same settings of the components of the dynamic context
and static context, serialization error [err:SEPM0019 ] results if the result of
evaluating the following XQuery expression is not
true
(document { . })/output:serialization-parameters
/(count(distinct-values(*/node-name(.))) eq (count(*)))
or equivalently if the result of evaluating the following XSLT
instructions is not true.
<xsl:sequence>
<xsl:variable name="doc">
<xsl:document>
<xsl:sequence select="."/>
</xsl:document>
</xsl:variable>
<xsl:sequence
select="$doc/output:serialization-parameters
/(count(distinct-values(*/node-name(.))) eq (count(*)))"/>
</xsl:sequence>
The result of evaluating either will be false if the data model
instance supplies a value for any particular
serialization parameter more than once, or will be the empty
sequence if the data model instance does not have as its root node
an element node or a document node with an element node child,
where the local part of the name of the element node is
serialization-parameters and the namespace URI is
http://www.w3.org/2010/xslt-xquery-serialization.
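The duplicate check can be approximated in Python as below. The sketch assumes the root element itself is supplied (not a document node) and compares the number of distinct child element names with the number of children, mirroring the count(distinct-values(...)) eq count(*) test; `check_no_duplicate_parameters` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization"


def check_no_duplicate_parameters(root):
    """Raise if the root is wrong or any child element name occurs twice."""
    if root.tag != "{%s}serialization-parameters" % OUTPUT_NS:
        # The expression yields the empty sequence, which is not true.
        raise ValueError("SEPM0019: root is not output:serialization-parameters")
    names = [child.tag for child in root]  # "{namespace}local" strings
    if len(set(names)) != len(names):
        raise ValueError("SEPM0019: a serialization parameter is supplied more than once")
```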
Note:
A serializer or
implementation of a host language does not need to be accompanied
by an XQuery processor nor by a general-purpose schema validator in
order to meet the requirements of this section. It merely needs to
be capable of extracting values from an XDM instance that conforms
to the schema for serialization parameters, while checking that the
constraints implied by the schema and additional constraints
implied by the XQuery validate expression or explicitly stated in
this section are satisfied.
The host
language MAY provide additional mechanisms for
overriding the values of any serialization parameters specified
through the mechanism defined in this section, as well as
additional mechanisms for specifying the values of any
serialization parameters whose values are absent after
applying the mechanism defined in this section.
If the instance of the data model contains elements or
attributes that are in a namespace other than
http://www.w3.org/2010/xslt-xquery-serialization, the
implementation MAY interpret them to specify the
values of implementation-defined serialization parameters in an
implementation-defined manner.
The following XML document, if converted to a data model
instance and processed using the mechanism described in this
section, would specify the settings of the method, version and
indent serialization parameters with the values xml, 1.0 and yes,
respectively.
<output:serialization-parameters
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
<output:method value="xml"/>
<output:version value="1.0"/>
<output:indent value="yes"/>
</output:serialization-parameters>
The following document would specify the setting of the
cdata-section-elements serialization parameter, whose value is
the pair of expanded QNames
(http://example.org/book/chapter, heading) and
(http://example.org/book, footnote).
<output:serialization-parameters
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
xmlns:book="http://example.org/book"
xmlns="http://example.org/book/chapter">
<output:cdata-section-elements value="heading book:footnote"/>
</output:serialization-parameters>
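The QName resolution in this example can be sketched in Python. The sketch assumes the in-scope namespace bindings are available as a dictionary in which the empty string keys the default namespace; the function name `expand_qnames` is hypothetical. Following the example above, an unprefixed name in this list takes the default namespace.

```python
def expand_qnames(value, namespaces):
    """Resolve a whitespace-separated list of QNames to (uri, local) pairs."""
    expanded = []
    for qname in value.split():
        prefix, sep, local = qname.rpartition(":")
        # An unprefixed name takes the default namespace, keyed here by "".
        uri = namespaces[prefix] if sep else namespaces.get("", "")
        expanded.append((uri, local))
    return expanded


# Bindings in scope on the output:cdata-section-elements element above.
in_scope = {
    "output": "http://www.w3.org/2010/xslt-xquery-serialization",
    "book": "http://example.org/book",
    "": "http://example.org/book/chapter",
}
pairs = expand_qnames("heading book:footnote", in_scope)
```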
The following document would specify the value of the
method serialization parameter with the value html.
Notice that in this example, the default namespace declaration
in scope has no effect on the interpretation of the setting of the
method parameter.
<output:serialization-parameters
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
xmlns="http://example.org/ext">
<output:method value="html"/>
</output:serialization-parameters>
The following document would specify the value of the
method serialization parameter with value equal to the
expanded QName (http://example.org/ext, jsp), and the
use-character-maps parameter with value equal to the list of
pairs («, <%), (», %>).
<output:serialization-parameters
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
xmlns:ext="http://example.org/ext">
<output:method value="ext:jsp"/>
<output:use-character-maps>
<output:character-map character="«" map-string="&lt;%"/>
<output:character-map character="»" map-string="%&gt;"/>
</output:use-character-maps>
</output:serialization-parameters>
4 Phases of
Serialization
Serialization comprises five phases of processing (preceded
optionally by the sequence normalization process described in
2 Sequence Normalization ).
For an implementation-defined output method, any of these
phases MAY be skipped or MAY be
performed in a different order than is specified here. For the
output methods defined in this specification, these phases are
carried out sequentially as follows:
1. A meta element is added to the normalized sequence
   along with discarding an existing meta element, as
   controlled by the include-content-type parameter for
   the XHTML and HTML output methods.
2. Markup generation produces the character representation
   of those parts of the serialized result that describe the structure
   of the normalized sequence. In the cases of the XML, HTML and XHTML
   output methods, this phase produces the character representations
   of the following:
   - the document type declaration;
   - start tags and end tags (except for attribute values, whose
     representation is produced by the character expansion phase);
   - processing instructions; and
   - comments.
   In the cases of the XML and XHTML output methods, this phase
   also produces the following:
   In the case of the text output method, this phase replaces
   the single document node produced by sequence
   normalization with a new document node that has exactly one
   child, which is a text node. The string value of the new text node
   is the string value of the document node that was produced by
   sequence normalization.
3. Character expansion is concerned with the
   representation of characters appearing in text and attribute
   nodes in the normalized sequence. For each text and attribute
   node, the following rules are applied in sequence.
   a. If the node is an attribute that is a URI attribute
      value and the escape-uri-attributes parameter is
      set to require escaping of URI attributes, apply URI escaping
      as defined below, and skip rules b-e. Otherwise, continue with
      rule b.
      [Definition: URI escaping consists of the
      following three steps applied in sequence to the content of
      URI attribute values:]
      i. normalize to NFC using the method defined in Section
         5.4.6 fn:normalize-unicode FO30
      ii. percent-encode any special characters in the URI using the
         method defined in Section 6.4 fn:escape-html-uri FO30
      iii. escape according to the rules of the XML or HTML output
         method, whichever is applicable, any characters that require
         escaping, and any characters that cannot be represented in the
         selected encoding. For example, replace < with &lt;.
         (See also section 7.3 Writing Character Data)
      [Definition: The values of attributes
      listed in D List of URI Attributes are URI attribute values.
      Attributes are not considered to be URI attributes simply
      because they are namespace declaration attributes or have the
      type annotation xs:anyURI.]
   b. If the node is a text node whose parent element is selected by
      the rules of the cdata-section-elements parameter for
      the applicable output method, create CDATA sections as described
      below, and skip rules c-e. Otherwise, continue with rule c.
      Apply the following two processes in sequence to create CDATA
      sections:
      i. Unicode Normalization if requested by
         the normalization-form parameter.
      ii. apply changes as detailed in the description of the
         cdata-section-elements parameter for the applicable
         output method.
   c. Apply character mapping as determined by the
      use-character-maps parameter for the applicable output
      method. For characters that were substituted by this process, skip
      rules d and e. For the remaining characters that were not modified
      by character mapping, continue with rule d.
   d. Apply Unicode Normalization if requested by
      the normalization-form parameter.
      [Definition: Unicode
      Normalization is the process of removing alternate
      representations of equivalent sequences from textual data, to
      convert the data into a form that can be binary-compared for
      equivalence, as specified in [UAX #15: Unicode Normalization
      Forms]. For specific recommendations for character
      normalization on the World Wide Web, see [Character Model for
      the World Wide Web 1.0: Normalization].]
      The meanings associated with the possible values of the
      normalization-form parameter are defined in section
      5.1.9 XML Output Method: the normalization-form Parameter.
      Continue with rule e.
   e. Escape according to the rules of the XML or HTML output
      method, whichever is applicable, any characters (such as
      < and &) where XML or HTML
      requires escaping, and any characters that cannot be represented in
      the selected encoding. For example, replace < with &lt;.
      (See also section 7.3 Writing Character Data). For
      characters such as > where XML defines a built-in
      entity but does not require its use in all circumstances, it is
      implementation-dependent whether the character is escaped.
4. Indentation, as controlled by the indent
   parameter and the suppress-indentation
   parameter, MAY add or remove whitespace
   according to the rules defined by the applicable output method.
5. Encoding, as controlled by the encoding
   parameter, converts the character stream produced by the previous
   phases into an octet stream.
Note:
Serialization is only defined in terms of encoding the result as
a stream of octets. However, a serializer MAY provide an option
that allows the encoding phase to be skipped, so that the result of
serialization is a stream of Unicode characters. The effect of any
such option is implementation-defined , and a serializer is not required to support such
an option.
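The three URI escaping steps of the character expansion phase can be sketched in Python as follows. This is an illustration under assumptions: `escape_html_uri` is a hand-written approximation of fn:escape-html-uri (percent-encoding every character outside printable US-ASCII, codepoints 32 to 126, as UTF-8 octets), and the final step escapes only the characters an XML attribute value delimited by double quotes would need. Both function names are hypothetical.

```python
import unicodedata


def escape_html_uri(text):
    """Percent-encode everything outside printable US-ASCII (32..126)."""
    out = []
    for ch in text:
        if 32 <= ord(ch) <= 126:
            out.append(ch)
        else:
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
    return "".join(out)


def uri_escape_attribute(value):
    """Apply the three URI escaping steps in order."""
    value = unicodedata.normalize("NFC", value)  # step 1: normalize to NFC
    value = escape_html_uri(value)               # step 2: percent-encode
    # step 3: escape markup characters for a double-quoted attribute value
    return value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")
```

After the second step only ASCII characters remain, so in this sketch the "cannot be represented in the selected encoding" case does not arise for ASCII-compatible encodings.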
5 XML Output
Method
The XML output method serializes the normalized sequence as an
XML entity that MUST satisfy the rules for either
a well-formed XML document entity or a well-formed XML external
general parsed entity, or both. A serialization error [err:SERE0003 ] results if the serializer is unable to satisfy
those rules, except for content modified by the character expansion
phase of serialization, as described in 4
Phases of Serialization . The effects of the character
expansion phase could result in the serialized output being not
well-formed, but will not result in a serialization error .
If a serialization error results, the serializer
MUST signal the error.
If the document node of the
normalized sequence has a single element node child and no text node children, then the serialized output is a
well-formed XML document entity, and the serialized output
MUST conform to the appropriate version of the XML
Namespaces Recommendation [XML Names] or
[XML Names 1.1] . If the normalized
sequence does not take this form, then the serialized output is a
well-formed XML external general parsed entity, which, when
referenced within a trivial XML document wrapper like this:
<?xml version="version"?>
<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>
where entity-URI
is a URI for the entity, and the
value of the version
pseudo-attribute is the value of
the version
parameter, produces a document which
MUST itself be a well-formed XML document
conforming to the corresponding version of the XML Namespaces
Recommendation [XML Names] or [XML Names 1.1] .
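A rough way to test which of the two forms a serialized result takes is sketched below in Python. It is an approximation only: instead of resolving a real external entity, it places the serialized text directly inside a doc element, which stands in for the &e; reference in the wrapper above. It assumes the text carries no XML or text declaration, and the name `classify_xml_output` is hypothetical.

```python
import xml.etree.ElementTree as ET


def classify_xml_output(serialized):
    """Say whether the text parses as a document or only inside a wrapper."""
    try:
        ET.fromstring(serialized)
        return "document entity"
    except ET.ParseError:
        pass
    # Substitute the text where the entity reference would be expanded.
    # A ParseError here means the text is not well-formed in either form.
    ET.fromstring("<doc>" + serialized + "</doc>")
    return "external general parsed entity"
```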
[Definition : A reconstructed tree
may be constructed by parsing the XML document and converting it
into an instance of the data model as specified in [XQuery and XPath Data Model (XDM) 3.0] .]
The result of serialization MUST be such that the
reconstructed tree is the same as the
result tree except
for the following permitted differences:
If the document was produced by adding a document wrapper, as
described above, then it will contain an extra doc
element as the document element.
The order of attribute and namespace nodes in the two trees MAY be
different.
The following properties of corresponding nodes in the two trees MAY be
different:
the base-uri property of document nodes and element nodes ;
the document-uri and unparsed-entities properties of document
nodes ;
the type-name and typed-value properties of element and
attribute nodes ;
the nilled property of element nodes ;
the content property of text nodes , due to the effect of the indent
and use-character-maps
parameters.
The reconstructed tree MAY
contain additional attributes and text nodes resulting from the expansion of default and
fixed values in its DTD or schema; also, in the presence of a DTD,
non-CDATA attributes may lose whitespace characters as a result of
attribute value normalization.
The type annotations of the nodes in the two trees MAY be
different. Type annotations in a result tree are discarded when the tree is
serialized. Any new type annotations obtained by parsing the
document will depend on whether the serialized XML document is
assessed against a schema, and this MAY result in
type annotations that are different from those in the original
result tree .
Note:
In order to influence the type annotations in the instance of
the data model that would result from processing a serialized XML
document, the author of the XSLT stylesheet, XQuery expression or
other process might wish to create the instance of the data model
that is input to the serialization process so that it makes use of
mechanisms provided by [XML Schema] ,
such as xsi:type
and xsi:schemaLocation
attributes. The serialization process will not automatically create
such attributes in the serialized document if those attributes were
not part of the result
tree that is to be serialized.
Similarly, it is possible that an element node in the instance of the data model that is to be
serialized has the nilled
property with the value
true
, but no xsi:nil
attribute. The
serialization process will not create such an attribute in the
serialized document simply to reflect the value of the property.
The value of the nilled
property has no direct effect
on the serialized result.
Additional namespace nodes
MAY be present in the reconstructed
tree if the serialization process did not undeclare one or more
namespaces, as described in 5.1.8
XML Output Method: the undeclare-prefixes Parameter , and
the starting instance of the data model contained an element
node with a namespace node that declared some prefix, but a
child element of that node did
not have any namespace node
that declared the same prefix.
The result tree
MAY contain namespace nodes that are not present in the reconstructed
tree , as the process of creating an instance of the data model
MAY ignore namespace declarations in some
circumstances. See Section
6.2.3 Construction from an Infoset
DM30 and Section
6.2.4 Construction from a PSVI DM30
of [XQuery and XPath Data Model (XDM)
3.0] for additional information.
If the indent
parameter has the value
yes
,
See 5.1.4 XML Output Method: the indent
and suppress-indentation Parameters for more information on
the indent
parameter.
Additional nodes
MAY be present in the reconstructed
tree due to the effect of character mapping in the character
expansion phase, and the values of attribute nodes and text nodes in the reconstructed tree MAY
be different from those in the result tree , due to the effects of URI
expansion, character mapping and Unicode Normalization in the
character expansion phase of serialization.
Note:
The use-character-maps
parameter can cause
arbitrary characters to be inserted into the serialized XML
document in an unescaped form, including characters that would be
considered to be part of XML markup. Such characters could result
in arbitrary new element nodes ,
attribute nodes , and so on, in
the reconstructed tree that results from
processing the serialized XML document.
A consequence of this rule is that certain characters
MUST be output as character references, to ensure
that they survive the round trip through serialization and parsing.
Specifically, CR, NEL and LINE SEPARATOR characters in text
nodes MUST be output respectively as "&#xD;", "&#x85;", and
"&#x2028;", or their equivalents; while CR, NL, TAB, NEL and
LINE SEPARATOR characters in attribute nodes
MUST be output respectively as "&#xD;", "&#xA;", "&#x9;",
"&#x85;", and "&#x2028;", or their equivalents. In addition, the
non-whitespace control characters #x1 through #x1F and #x7F through
#x9F in text nodes and attribute nodes
MUST be output as character references.
For example, an attribute with the value "x" followed by "y"
separated by a newline will result in the output
"x&#xA;y" (or with any equivalent character
reference). The XML output cannot be "x" followed by a literal
newline followed by a "y" because after parsing, the attribute
value would be "x y" as a consequence of the XML
attribute normalization rules.
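The round-trip behaviour described here can be checked with a short Python sketch. The `escape` function below is a hypothetical, simplified escaper covering only the characters named in this passage plus the basic markup characters; `parsed_attribute` uses the standard-library parser to show what value comes back after parsing.

```python
import xml.etree.ElementTree as ET

TEXT_REFS = {"\r": "&#xD;", "\u0085": "&#x85;", "\u2028": "&#x2028;"}
ATTR_REFS = {**TEXT_REFS, "\n": "&#xA;", "\t": "&#x9;"}


def is_control(ch):
    """Non-whitespace controls #x1-#x1F, plus #x7F-#x9F."""
    cp = ord(ch)
    return (0x1 <= cp <= 0x1F and ch not in "\t\n\r") or 0x7F <= cp <= 0x9F


def escape(value, in_attribute):
    """Escape a text or attribute value so it survives serialization and parsing."""
    refs = ATTR_REFS if in_attribute else TEXT_REFS
    out = []
    for ch in value:
        if ch in refs:
            out.append(refs[ch])
        elif is_control(ch):
            out.append("&#x%X;" % ord(ch))
        elif ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == '"' and in_attribute:
            out.append("&quot;")
        else:
            out.append(ch)
    return "".join(out)


def parsed_attribute(serialized_value):
    """Parse <e a="..."/> and return the attribute value the parser reports."""
    return ET.fromstring('<e a="%s"/>' % serialized_value).get("a")
```

Passing the escaped form through `parsed_attribute` returns the original newline, whereas a literal newline in the attribute comes back as a space.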
Note:
XML 1.0 did not permit an XML processor to normalize NEL or LINE
SEPARATOR characters to a LINE FEED character. However, if a
document entity that specifies version 1.1 invokes an external
general parsed entity with no text declaration or a text
declaration that specifies version 1.0, the external parsed entity
is processed according to the rules of XML 1.1. For this reason,
NEL and LINE SEPARATOR characters in text and attribute nodes MUST always be
escaped using character references, regardless of the value of the
version
parameter.
XML 1.0 permitted control characters in the range #x7F through
#x9F to appear as literal characters in an XML document, but XML
1.1 requires such characters, other than NEL, to be escaped as
character references. An external general parsed entity with no
text declaration or a text declaration that specifies a version
pseudo-attribute with value 1.0
that is invoked by an
XML 1.1 document entity MUST follow the rules of
XML 1.1. Therefore, the non-whitespace control characters in the
ranges #x1 through #x1F and #x7F through #x9F MUST
always be escaped, regardless of the value of the
version
parameter.
It is a serialization error [err:SEPM0004 ] to specify the
doctype-system parameter, or to specify the standalone parameter
with a value other than omit
, if the instance of the
data model contains text nodes
or multiple element nodes as
children of the root node . The
serializer
MUST either signal the error, or recover by
ignoring the request to output a document type declaration or
standalone
parameter.
5.1 The Influence of
Serialization Parameters upon the XML Output Method
5.1.1 XML Output
Method: the version
Parameter
The version
parameter specifies the version of XML
and the version of Namespaces in XML to be used for outputting the
instance of the data model. The version output in the XML
declaration (if an XML declaration is not omitted)
MUST correspond to the version of XML that the
serializer used for
outputting the instance of the data model. The value of the
version
parameter MUST match the
VersionNum XML
production of the XML Recommendation [XML10] or
[XML11] . A serialization error
[err:SESU0013 ]
results if the value of the version
parameter
specifies a version of XML that is not supported by the serializer ; the serializer
MUST signal the error.
This document provides the normative definition of serialization
for the XML output method if the version
parameter has
either the value 1.0
or 1.1
. For any
other value of version
parameter, the behavior is
implementation-defined . In that case the implementation-defined
behavior MAY supersede all other requirements of
this recommendation.
If the serialized result would contain an NCName Names
that contains a character that is not permitted by the version of
Namespaces in XML specified by the version
parameter,
a serialization
error [err:SERE0005 ] results. The serializer MUST signal the
error.
If the serialized result would contain a character that is not
permitted by the version of XML specified by the
version
parameter, a serialization error [err:SERE0006 ] results. The serializer
MUST signal the error.
For example, if the version
parameter has the value
1.0
, and the instance of the data model contains a
non-whitespace control character in the range #x1 to #x1F, a
serialization
error [err:SERE0006 ] results. If the
version
parameter has the value 1.1
and a
comment node in the instance of
the data model contains a non-whitespace control character in the
range #x1 to #x1F or a control character other than NEL in the
range #x7F to #x9F, a serialization error [err:SERE0006 ] results.
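The character checks in this example can be sketched in Python using the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1. The sketch assumes the version parameter is either 1.0 or 1.1, and it treats a comment as a context where character references are unavailable; the function names are hypothetical.

```python
def is_xml10_char(cp):
    """Char production of XML 1.0."""
    return (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def is_xml11_restricted(cp):
    """RestrictedChar production of XML 1.1 (excludes NEL, #x85)."""
    return (0x1 <= cp <= 0x8 or 0xB <= cp <= 0xC or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)


def check_characters(text, version, in_comment=False):
    """Raise if text holds a character the XML version cannot carry."""
    for ch in text:
        cp = ord(ch)
        if version == "1.0":
            ok = is_xml10_char(cp)
        else:
            # Char production of XML 1.1
            ok = cp != 0 and not 0xD800 <= cp <= 0xDFFF and cp not in (0xFFFE, 0xFFFF)
            # Restricted characters need character references,
            # which cannot appear inside a comment.
            if in_comment and is_xml11_restricted(cp):
                ok = False
        if not ok:
            raise ValueError("SERE0006: character U+%04X not permitted in XML %s" % (cp, version))
```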
5.1.2 XML
Output Method: the html-version
Parameter
The html-version
parameter is not applicable to the
XML output method. It is the responsibility of the host language to specify
whether an error occurs if this parameter is specified in
combination with the XML output method, or if the parameter is
simply dropped.
5.1.3 XML Output
Method: the encoding
Parameter
The encoding parameter specifies the encoding to be
used for outputting the instance of the data model. Serializers are
REQUIRED to support values of UTF-8 and UTF-16.
A serialization error [err:SESU0007] occurs if an output encoding
other than UTF-8 or UTF-16 is requested
and the serializer does not support that encoding.
The serializer MUST signal the error, or recover by using UTF-8
or UTF-16 instead. The serializer MUST NOT use an
encoding whose name does not match the EncName XML
production of the XML Recommendation [XML10].
When outputting a newline character in the instance of the data
model, the serializer
is free to represent it using any character sequence that will be
normalized to a newline character by an XML parser, unless a
specific mapping for the newline character is provided in a
character map (see 9 Character
Maps ).
When outputting any other character that is defined in the
selected encoding, the character MUST be output
using the correct representation of that character in the selected
encoding.
It is possible that the instance of the data model will contain
a character that cannot be represented in the encoding that the
serializer is using
for output. In this case, if the character occurs in a context
where XML recognizes character references (that is, in the value of
an attribute node or text
node ), then the character
MUST be output as a character reference. A
serialization
error [err:SERE0008 ] occurs if such a character appears
in a context where character references are not allowed (for
example, if the character occurs in the name of an element). The
serializer
MUST signal the error.
For example, if a text node
contains the character LATIN SMALL LETTER E WITH ACUTE (#xE9), and
the value of the encoding
parameter is
US-ASCII
, the character MUST be
serialized as a character reference. If a comment node contains the same character, a serialization error
[err:SERE0008 ]
results.
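The two cases in this example can be sketched with Python's codec error handlers. The built-in `xmlcharrefreplace` handler replaces unencodable characters with decimal character references, which is one acceptable form; the function names are hypothetical, and only `&` and `<` are escaped before encoding.

```python
def encode_text_node(text, encoding):
    """Encode a text node, falling back to character references."""
    escaped = text.replace("&", "&amp;").replace("<", "&lt;")
    return escaped.encode(encoding, errors="xmlcharrefreplace")


def encode_comment(content, encoding):
    """Encode a comment; character references are not available here."""
    try:
        return ("<!--" + content + "-->").encode(encoding)
    except UnicodeEncodeError as exc:
        raise ValueError("SERE0008: character not representable in " + encoding) from exc
```

With US-ASCII as the encoding, `encode_text_node("caf\u00e9", "us-ascii")` yields `b"caf&#233;"`, while the same string passed to `encode_comment` raises the error.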
5.1.4 XML Output
Method: the indent
and
suppress-indentation
Parameters
The indent
and suppress-indentation
parameters control whether the serializer MAY adjust the
whitespace in the serialized result so that a person will find it
easier to read. If the indent
parameter has the value
yes
, the serializer MAY output whitespace
characters in addition to the whitespace characters in the instance
of the data model. It MAY also elide from the
output whitespace characters that occurred in the instance of the
data model or replace such whitespace characters with other
whitespace characters.
[Definition :
The term content has the same meaning as the term Content XML
defined in Section 3.1
Start-Tags, End-Tags, and Empty-Element
Tags XML of [XML10] .] [Definition : The
immediate content of an element is the part of the content of the element that is not
also in the content of a
child element of that element.]
If the indent parameter has the value no, the serializer
MUST NOT add, elide or replace whitespace characters. If the
indent parameter has the value yes, the serializer
MUST use an algorithm for dealing with whitespace
characters that satisfies all of the following constraints.
If more than one constraint applies, the serializer
MUST apply the most restrictive constraint. That
is, if any applicable constraint indicates that whitespace
MUST NOT be added, elided or replaced, that
constraint prevails; if an applicable constraint indicates that
whitespace SHOULD NOT be added, elided or
replaced, while all other applicable constraints indicate that
whitespace MAY be added, elided or replaced,
whitespace SHOULD NOT be added, elided or
replaced.
Whitespace characters MAY be added
adjacent to a text node
only if the text node contains only whitespace characters.
Whitespace characters in such a text node MAY also
be elided or replaced. For example, a tab MAY be
inserted as a replacement for existing spaces .
Whitespace characters MAY be added, elided or
replaced in the immediate content of an element whose type
annotation is xs:untyped
or xs:anyType
and that has element node children, in the immediate content
of an element whose content model is element only, or outside the
content of any element.
Whitespace characters MUST NOT be added,
elided or replaced in the immediate content of an element whose
content model is known to be simple or empty .
Whitespace characters SHOULD NOT be added,
elided or replaced in places where the characters
would constitute significant whitespace, for example, in the
immediate
content of an element that is annotated with a type other
than xs:untyped
or xs:anyType
, and
whose content model is known to be mixed.
Whitespace characters MUST NOT be added,
elided or replaced in the content of an element whose expanded
QName is a member of the list of expanded QNames in the value of
the suppress-indentation
parameter.
Whitespace characters MUST NOT be added,
elided or replaced in a part of the result document that is
controlled by an xml:space
attribute with value
preserve
. (See [XML10] for more
information about the xml:space
attribute.)
Note:
The effect of these rules is to ensure that whitespace is only
added in places where (a) XSLT's
<xsl:strip-space>
declaration could cause it to
be removed, and (b) it does not affect the string value of any element node with simple content. It is usually
not safe to indent document types that include elements with mixed
content.
Note:
The whitespace added may possibly be based on whitespace
stripped from either the source document or the stylesheet (in the
case of XSLT), or guided by other means that might depend on the
host language ,
in the case of an instance of the data model created using some
other process.
5.1.5 XML Output Method: the
cdata-section-elements
Parameter
The cdata-section-elements
parameter contains a
list of expanded QNames. If the expanded QName of the parent of a
text node is a member of the
list, then the text node
MUST be output as a CDATA section, except in those
circumstances described below.
If the text node contains
the sequence of characters ]]>
, then the currently
open CDATA section MUST be closed following the
]]
and a new CDATA section opened before the
>
.
If the text node contains
characters that are not representable in the character encoding
being used to output the instance of the data model, then the
currently open CDATA section MUST be closed before
such characters, the characters MUST be output
using character references or entity references, and a new CDATA
section MUST be opened for any further characters
in the text node .
CDATA sections MUST NOT be used except where
they have been explicitly requested by the user, either by using
the cdata-section-elements
parameter, or by using some
other implementation-defined mechanism.
Note:
This is phrased to permit an implementor to provide an option
that attempts to preserve CDATA sections present in the source
document.
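The two splitting rules for CDATA sections can be sketched in Python as below. This is a simplified illustration: it assumes the text node's parent has already been matched against cdata-section-elements, uses hexadecimal character references for unencodable characters, and emits nothing for an empty text node. The name `cdata_sections` is hypothetical.

```python
def cdata_sections(text, encoding="utf-8"):
    """Serialize text as CDATA sections, splitting where required."""
    out = []
    buf = []  # characters belonging to the currently open section

    def flush():
        if buf:
            # Close after "]]" and reopen before ">" wherever "]]>" occurs.
            chunk = "".join(buf).replace("]]>", "]]]]><![CDATA[>")
            out.append("<![CDATA[" + chunk + "]]>")
            buf.clear()

    for ch in text:
        try:
            ch.encode(encoding)
            buf.append(ch)
        except UnicodeEncodeError:
            flush()                         # close the open section
            out.append("&#x%X;" % ord(ch))  # character reference outside CDATA
    flush()
    return "".join(out)
```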
5.1.6 XML Output Method: the
omit-xml-declaration
and standalone
Parameters
The XML output method MUST output an XML
declaration if the omit-xml-declaration
parameter has
the value no
. The XML declaration
MUST include both version information and an
encoding declaration. If the standalone
parameter has
the value yes
or the value no
, the XML
declaration MUST include a standalone document
declaration with the same value as the value of the
standalone
parameter. If the standalone
parameter has the value omit
, the XML declaration
MUST NOT include a standalone document
declaration; this ensures that it is both an XML declaration
(allowed at the beginning of a document entity) and a text
declaration (allowed at the beginning of an external general parsed
entity).
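A serialization error [err:SEPM0009] results if the
omit-xml-declaration
parameter has the value
yes
, and
the standalone
parameter has a value other than omit
; or
the version
parameter has a value other than 1.0
and the doctype-system
parameter is specified.
The serializer
MUST signal the error.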
Otherwise, if the omit-xml-declaration
parameter
has the value yes
, the XML output method MUST
NOT output an XML declaration.
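As a non-normative illustration, the construction of the XML declaration might be sketched in Python as follows. The function name `xml_declaration` and its argument names are hypothetical, and the sketch deliberately does not model the [err:SEPM0009] check.

```python
def xml_declaration(version: str, encoding: str,
                    omit_xml_declaration: str = "no",
                    standalone: str = "omit") -> str:
    """Return the XML declaration to output, or "" if it is omitted.

    Hypothetical helper; error conditions are not checked here.
    """
    if omit_xml_declaration == "yes":
        return ""
    # Version information and an encoding declaration are always included.
    decl = f'<?xml version="{version}" encoding="{encoding}"'
    # A standalone document declaration appears only for "yes" or "no".
    if standalone in ("yes", "no"):
        decl += f' standalone="{standalone}"'
    return decl + "?>"
```

When standalone is omit, the result has the form shared by an XML declaration and a text declaration, as the paragraph above requires.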
5.1.7 XML Output
Method: the doctype-system
and
doctype-public
Parameters
If the doctype-system
parameter is specified, the
XML output method MUST output a document type
declaration immediately before the first element. The name
following <!DOCTYPE
MUST be the
name of the first element, if any. If the
doctype-public
parameter is also specified, then the
XML output method MUST output PUBLIC
followed by the public identifier and then the system identifier;
otherwise, it MUST output SYSTEM
followed by the system identifier. The internal subset
MUST be empty. The doctype-public
parameter MUST be ignored unless the
doctype-system
parameter is specified.
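The rules for the document type declaration can be sketched, non-normatively, in Python. The function name `doctype_declaration` is hypothetical, and the sketch assumes that neither identifier contains a double-quote character.

```python
from typing import Optional


def doctype_declaration(root_name: str,
                        doctype_system: Optional[str] = None,
                        doctype_public: Optional[str] = None) -> str:
    """Return the document type declaration, or "" if none is output.

    Hypothetical helper; assumes the identifiers contain no '"' character.
    """
    if doctype_system is None:
        # doctype-public is ignored unless doctype-system is specified.
        return ""
    if doctype_public is not None:
        return (f'<!DOCTYPE {root_name} PUBLIC '
                f'"{doctype_public}" "{doctype_system}">')
    # The internal subset is always empty, so nothing follows the identifier.
    return f'<!DOCTYPE {root_name} SYSTEM "{doctype_system}">'
```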
5.1.8 XML
Output Method: the undeclare-prefixes
Parameter
The Data Model allows an element node that binds a non-empty prefix to have a child
element node that does not bind
that same prefix. In Namespaces in XML 1.1 ([XML Names 1.1]), this can be represented
accurately by undeclaring prefixes. If the undeclare-prefixes
parameter has the value yes
, the output method is XML
or XHTML, and the version
parameter value is greater
than 1.0
, the serializer MUST undeclare each such prefix on the
child element node. If the undeclare-prefixes
parameter has the
value no
and the output method is XML or XHTML, then
the undeclaration of prefixes MUST NOT occur.
Consider an element x:foo
with four in-scope
namespaces that associate prefixes with URIs as follows:
x
is associated with
http://example.org/x
y
is associated with
http://example.org/y
z
is associated with
http://example.org/z
xml
is associated with
http://www.w3.org/XML/1998/namespace
Suppose that it has a child element x:bar
with
three in-scope namespaces:
x
is associated with
http://example.org/x
y
is associated with
http://example.org/y
xml
is associated with
http://www.w3.org/XML/1998/namespace
If namespace undeclaration is in effect, it will be serialized
this way:
<x:foo xmlns:x="http://example.org/x"
xmlns:y="http://example.org/y"
xmlns:z="http://example.org/z">
<x:bar xmlns:z="">...</x:bar>
</x:foo>
In Namespaces in XML 1.0 ([XML
Names] ), prefix undeclaration is not possible. If the output
method is XML or XHTML, the value of the
undeclare-prefixes
parameter is yes
, and
the value of the version
parameter is
1.0
, a serialization error [err:SEPM0010 ] results; the serializer
MUST signal the error.
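The following non-normative Python sketch shows one way the namespace attributes of a child element could be derived from the in-scope namespaces of parent and child. The function name `child_namespace_attributes` is hypothetical, the dictionaries are assumed to map non-empty prefixes to URIs, and any version other than 1.0 is assumed to be greater than 1.0.

```python
def child_namespace_attributes(parent_ns: dict, child_ns: dict,
                               undeclare_prefixes: str = "no",
                               version: str = "1.0") -> list:
    """List the xmlns:* attributes to write on a child element.

    Hypothetical helper; both dictionaries map non-empty prefixes to URIs.
    """
    if undeclare_prefixes == "yes" and version == "1.0":
        # Prefix undeclaration is not possible in Namespaces in XML 1.0.
        raise ValueError("err:SEPM0010")
    attrs = []
    for prefix, uri in child_ns.items():
        # Declare only bindings that the parent does not already provide.
        if prefix != "xml" and parent_ns.get(prefix) != uri:
            attrs.append(f'xmlns:{prefix}="{uri}"')
    if undeclare_prefixes == "yes":
        for prefix in parent_ns:
            # Undeclare prefixes bound on the parent but not on the child.
            if prefix != "xml" and prefix not in child_ns:
                attrs.append(f'xmlns:{prefix}=""')
    return attrs
```

Applied to the x:foo and x:bar example above, with undeclaration in effect, the sketch yields the single attribute `xmlns:z=""` for x:bar.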
5.1.9 XML Output Method: the
normalization-form
Parameter
The normalization-form
parameter is applicable to
the XML output method. The values NFC
and
none
MUST be supported by the
serializer . A
serialization
error [err:SESU0011 ] results if the value of the
normalization-form
parameter specifies a normalization
form that is not supported by the serializer ; the serializer MUST signal the
error.
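A minimal, non-normative Python sketch of a serializer that supports only the two mandatory values follows. The function name `apply_normalization_form` is hypothetical; the sketch relies on the standard `unicodedata` module and treats every other value as unsupported.

```python
import unicodedata


def apply_normalization_form(text: str, normalization_form: str) -> str:
    """Normalize serialized text according to normalization-form.

    Hypothetical helper supporting only the mandatory values NFC and none.
    """
    if normalization_form == "none":
        return text
    if normalization_form == "NFC":
        return unicodedata.normalize("NFC", text)
    # Any other value is unsupported by this sketch.
    raise ValueError("err:SESU0011: unsupported normalization form")
```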
The meanings associated with the possible values of the
normalization-form
parameter are as follows:
If the value of the parameter is fully-normalized
,
then no relevant construct of the parsed entity created by
the serializer may
start with a composing character. The term relevant
construct has the meaning defined in section 2.13 of [XML11] . If this condition is not satisfied, a
serialization
error [err:SERE0012 ] MUST be
signaled.
Note:
Specifying fully-normalized
as the value of this
parameter does not guarantee that the XML document output by the
serializer will in
fact be fully normalized as defined in [XML11]. This is because the serializer does not check that the text is
include normalized, which would involve checking all
external entities that it refers to (such as an external DTD).
Furthermore, the serializer does not check whether any character
escape generated using character maps represents a composing
character.
5.1.10 XML
Output Method: the media-type
Parameter
The media-type
parameter is applicable to the XML
output method. See 3 Serialization
Parameters for more information.
5.1.11 XML Output Method: the
use-character-maps
Parameter
The use-character-maps
parameter is applicable to
the XML output method. The result of serialization using the XML
output method is not guaranteed to be well-formed XML if character
maps have been specified. See 9
Character Maps for more information.
5.1.12 XML Output Method: the
byte-order-mark
Parameter
The byte-order-mark
parameter is applicable to the
XML output method. See 3 Serialization
Parameters for more information.
Note:
The byte order mark may be undesirable under certain
circumstances; for example, to concatenate resulting XML fragments
without additional processing to remove the byte order mark.
Therefore this specification does not mandate the
byte-order-mark
parameter to have the value
yes
when the encoding is UTF-16, even though the XML
1.0 and XML 1.1 specifications state that entities encoded in
UTF-16 MUST begin with a byte order mark.
Consequently, this specification does not guarantee that the
resulting XML fragment, without a byte order mark, will not cause
an error when processed by a conforming XML processor.
5.1.13 XML Output Method: the
escape-uri-attributes
Parameter
The escape-uri-attributes
parameter is not
applicable to the XML output method. It is the responsibility of
the host
language to specify whether an error occurs if this parameter
is specified in combination with the XML output method, or if the
parameter is simply dropped.
5.1.14 XML Output Method: the
include-content-type
Parameter
The include-content-type
parameter is not
applicable to the XML output method. It is the responsibility of
the host
language to specify whether an error occurs if this parameter
is specified in combination with the XML output method, or if the
parameter is simply dropped.
5.1.15
XML Output Method: the item-separator
Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization .
6 XHTML Output
Method
The XHTML output method serializes the instance of the data
model as XML, using the HTML compatibility guidelines defined in
the XHTML specification ([XHTML 1.0]) or
the XHTML syntax of current drafts of HTML5 and
related specifications (see [HTML5]
and [Polyglot]).
At the time this document was published, the current version of
[HTML5] was that cited in A.1 Normative References . Like
all draft W3C specifications, [HTML5] is
subject to revision before final publication as a W3C
Recommendation. For all information normatively derived in this
specification from [HTML5] , processors
conforming to this specification MUST take the
information in question from the version cited in A.1 Normative References , or
from later versions of [HTML5] published by
W3C. If they take the information from versions other than the one
cited in A.1 Normative
References , then it is implementation-defined which future version of
[HTML5] is used as the source of the
information, including the lists of elements recognized as HTML elements , void elements , phrasing content , and
Boolean
attributes . If future versions of [HTML5]
differ from the current draft in any of these areas,
implementations MAY support multiple versions, and
MAY provide a user option for choosing which one
to use.
[Definition : An element node is
recognized as an HTML element by the XHTML output method
if]
the element node is in the XHTML namespace , regardless of the value of
the html-version
serialization parameter or if
the html-version
serialization parameter is
absent ; or
the value of the html-version
serialization parameter is 5.0
, the element has a
null
namespace URI , and the local part of the name is equal to the
name of an element defined by HTML5 [HTML5] ,
making the comparison without regard to case .
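The definition can be read as a simple predicate. The following Python sketch is non-normative: the function name is hypothetical, and `HTML5_ELEMENT_NAMES` is a deliberately small illustrative subset rather than the full list of elements defined by [HTML5].

```python
from typing import Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumption: an illustrative subset only, not the complete HTML5 list.
HTML5_ELEMENT_NAMES = {"html", "head", "body", "p", "br", "div", "span"}


def is_recognized_html_element(namespace_uri: Optional[str],
                               local_name: str,
                               html_version: Optional[float] = None) -> bool:
    """Decide whether the XHTML output method treats a node as HTML."""
    if namespace_uri == XHTML_NS:
        # Holds whatever the html-version parameter is, even if absent.
        return True
    # Otherwise: html-version 5.0, null namespace URI, and a local name
    # that matches an HTML5 element name without regard to case.
    return (html_version == 5.0
            and not namespace_uri
            and local_name.lower() in HTML5_ELEMENT_NAMES)
```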
Note:
As noted elsewhere, processors conforming to this specification
MUST support the list of elements defined in the
version of [HTML5] current at the time this
specification is published, or that given in some later version of
[HTML5] . If they support the list in a later
version, it is implementation-defined which version of [HTML5] they support.
It is entirely the responsibility of the person or process that
creates the instance of the data model to ensure that the instance
of the data model conforms to the [XHTML 1.0]
or [XHTML 1.1] specification if the
html-version
serialization parameter is absent or has
a value less than 5.0
or that it
conforms to the XHTML syntax of HTML5 if the
value of the html-version
serialization parameter is
5.0
. It is not an error if the instance of the
data model is invalid XHTML. Equally, it is entirely under the
control of the person or process that creates the instance of the
data model whether the output conforms to XHTML 1.0 Strict, XHTML
1.0 Transitional, the XHTML syntax of HTML5 (see [HTML5] ) , [Polyglot] or any other specific definition of
XHTML.
The serialization of the instance of the data model follows the
same rules as for the XML output method, with the general
exceptions noted below and parameter-specific exceptions in
6.1 The Influence of Serialization
Parameters upon the XHTML Output Method . These differences
are based on the HTML compatibility guidelines published in
Appendix C of [XHTML 1.0] and on
[Polyglot] , both of which are
designed to ensure that as far as possible, XHTML is rendered
correctly on user agents designed originally to handle HTML.
If the value of the html-version
serialization
parameter is 5.0
, the instance of the data model that
is to be serialized is first subjected to prefix
normalization .
[Definition : During prefix
normalization , any element node in the instance of the data
model that is to be serialized that is in one of the XHTML namespace , the
SVG namespace or
the MathML
namespace has its name replaced by the local part of its name.
Such an element node is given a default namespace node whose value
is the element's namespace URI. Any namespace node for any of those
three namespaces that was previously present on any element node in
the instance of the data model is also removed, unless the prefix
that that namespace node declared is used as the prefix on the name
of an attribute on that element or an ancestor of that
element.]
The process of prefix normalization is equivalent to
replacing the instance of the data model that is to be serialized
with the result of the transformation described by this XSLT
stylesheet, with the instance of the data model as the initial
context item.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:mathml="http://www.w3.org/1998/Math/MathML">
<xsl:template match="xhtml:*|svg:*|mathml:*">
<xsl:element name="{local-name()}" namespace="{namespace-uri()}">
<xsl:call-template name="copy-namespace-nodes"/>
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
<xsl:template match="node()|@*">
<xsl:copy copy-namespaces="no">
<xsl:call-template name="copy-namespace-nodes"/>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template name="copy-namespace-nodes">
<xsl:copy-of select="namespace::*
[not(. = ('http://www.w3.org/1999/xhtml',
'http://www.w3.org/2000/svg',
'http://www.w3.org/1998/Math/MathML'))]"/>
</xsl:template>
</xsl:stylesheet>
[Definition : The following XHTML elements have an
EMPTY content model: area
, base
,
br
, col
, embed
,
hr
, img
, input
,
link
, meta
, basefont
,
frame
, isindex
, and
param
.]
[Definition :
The void elements of HTML5 are area
,
base
, br
, col
,
embed
, hr
, img
,
input
, keygen
, link
,
meta
, param
, source
,
track
and wbr
.]
Note:
This list of void elements is that given for
void elements in section
8.1.2 of the draft of [HTML5] current at
the time this document is published. As noted elsewhere, processors
conforming to this specification MAY support the
list of void elements included in later versions of [HTML5] .
[Definition : An element node is expected to
be empty if it is recognized as an HTML element and if
either]
the html-version
serialization parameter is absent
or has a value less than 5.0
and the content model is
EMPTY , or
the html-version
serialization parameter has the
value 5.0
and the element is a void element.
If an element node that has no child nodes is
not expected to be empty , the
serializer
MUST NOT use the minimized form. That is, it
MUST output <p></p>
and
not <p />
.
If an element that has no children is
expected
to be empty , the serializer MUST use the
minimized tag syntax, for example <br />
,
as the alternative syntax <br></br>
allowed by XML gives uncertain results in many legacy
user agents. If the html-version
serialization parameter is absent or has a value less
than 5.0
, the serializer MUST include a space
before the trailing />
, e.g.
<br />
, <hr />
and
<img src="karen.jpg" alt="Karen" />
.
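The treatment of elements without children can be sketched, non-normatively, in Python. The function name `serialize_childless` is hypothetical; the element is assumed to be recognized as an HTML element, and any attributes are assumed to be passed already serialized with a leading space. The sketch always writes a space before the trailing `/>`, which is required only when html-version is absent or less than 5.0.

```python
from typing import Optional

# Elements with an EMPTY content model, as listed above.
XHTML_EMPTY = {"area", "base", "br", "col", "embed", "hr", "img", "input",
               "link", "meta", "basefont", "frame", "isindex", "param"}

# Void elements of HTML5, as listed above.
HTML5_VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
              "keygen", "link", "meta", "param", "source", "track", "wbr"}


def serialize_childless(name: str, attributes: str = "",
                        html_version: Optional[float] = None) -> str:
    """Serialize a recognized HTML element that has no child nodes."""
    expected_empty = name in (HTML5_VOID if html_version == 5.0
                              else XHTML_EMPTY)
    if not expected_empty:
        # The minimized form must not be used.
        return f"<{name}{attributes}></{name}>"
    # Minimized tag syntax, with a space before the trailing "/>".
    return f"<{name}{attributes} />"
```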
If the html-version
serialization parameter is absent or has a value less
than 5.0
, the serializer MUST NOT use the
entity reference &apos;
which, although
defined in XML and therefore in XHTML, is not defined
in versions of HTML prior to HTML5, and
is not recognized by all HTML user agents.
If the html-version
serialization parameter is absent or has a value less
than 5.0
, the serializer SHOULD output
namespace declarations in a way that is consistent with the
requirements of the XHTML DTD if this is possible. If the
value of the html-version
serialization
parameter is 5.0
, the serializer SHOULD output
namespace declarations in a way that ensures that the
namespace declarations in the resulting document are quirk-compatible
(as defined in 6.2 Polyglot
markup and namespace declarations ) .
Note:
If the html
element is generated by an XSLT literal
result element of the form <html
xmlns="http://www.w3.org/1999/xhtml"> ... </html>
,
or by an XQuery direct element constructor of the same form, then
the html
element in the result document will have a
node name whose prefix is "",
which will satisfy the requirements of the DTD. In other cases the
prefix assigned to the element is implementation-dependent.
Note:
The XHTML 1.0 DTDs require the declaration
xmlns="http://www.w3.org/1999/xhtml"
to appear on the
html
element, and only on the html
element. The [Polyglot] specification
(see also 6.2 Polyglot markup
and namespace declarations below) permits namespace
declarations to appear in a conforming document, but there are
restrictions on the elements on which they can appear.
The serializer MUST output namespace
declarations that are consistent with the namespace nodes present in the result tree , but it
SHOULD avoid outputting redundant
namespace declarations on elements where the DTD would make them
invalid, for versions prior to HTML5, or where they would not
be quirk-compatible , for serialization
according to the syntax of HTML5 .
Note:
[Polyglot] and
Appendix C of [XHTML 1.0] describe a number
of compatibility guidelines for users of XHTML who wish to render
their XHTML documents with HTML user agents. In some cases, such as
the guideline on the form empty elements take, only the
serialization process itself has the ability to follow the
guideline. In such cases, those guidelines are reflected in the
requirements on the serializer described above.
In all other cases, the guidelines can be adhered to by the
instance of the data model that is input to the serialization
process. The guideline on the use of whitespace characters in
attribute values is one such example. Another example is that
xml:lang="..."
does not serialize to both
xml:lang="..."
and lang="..."
as required
by some legacy user agents. It is the responsibility of the person
or process that creates the instance of the data model that is
input to the serialization process to ensure it is created in a way
that is consistent with the guidelines. No serialization error
results if the input instance of the data model does not adhere to
the guidelines.
6.1 The Influence
of Serialization Parameters upon the XHTML Output Method
6.1.2
XHTML Output Method: the html-version
Parameter
The html-version
parameter specifies whether the
XHTML output method will produce a serialized document following
rules that are tailored to the requirements of the XHTML syntax of
[HTML5] or the requirements of [XHTML 1.0] and [XHTML
1.1] .
The differences are described in detail throughout 6 XHTML Output Method .
6.1.4 XHTML Output
Method: the indent
and
suppress-indentation
Parameters
If the indent
parameter has the value
yes
, the serializer MAY add or remove
whitespace as it serializes the result tree , if it observes the following
constraints.
Whitespace MUST NOT be added other than
before or after an element, or adjacent to an existing whitespace
character.
Whitespace MUST NOT be added or removed
adjacent to an inline element. The inline elements are those
elements recognized as HTML elements that
are in the %inline category of any of the XHTML 1.0 DTDs,
in the %inline.class category of the XHTML 1.1 DTD, those
elements defined to be phrasing content in [HTML5] , and elements recognized as HTML elements with
local names ins
and del
if they are used
as inline elements (i.e., if they do not contain element
children).
[Definition : The elements listed as
phrasing content in [HTML5] are:
a
, abbr
, area
(if it is a
descendant of a map element), audio
, b
,
bdi
, bdo
, br
,
button
, canvas
, cite
,
code
, data
, datalist
,
del
, dfn
, em
,
embed
, i
, iframe
,
img
, input
, ins
,
kbd
, keygen
, label
,
map
, mark
, math
,
meter
, noscript
, object
,
output
, progress
, q
,
ruby
, s
, samp
,
script
, select
, small
,
span
, strong
, sub
,
sup
, svg
, template
,
textarea
, time
, u
,
var
, video
, and wbr
.]
Note:
This list of phrasing content is that given in section
3.2.4.1.5 Phrasing content of the draft of [HTML5] current at the time this document is
published. As noted elsewhere, processors conforming to this
specification MAY support the list of
phrasing-content elements included in later versions of [HTML5] .
Whitespace MUST NOT be added or removed
inside a formatted element, the formatted elements being those
recognized as HTML elements with
local names pre
, script
,
style
, title
, and
textarea
.
Whitespace characters MUST NOT be added in the
content of an element whose expanded QName matches a
member of the list of expanded QNames in the value of the
suppress-indentation
parameter. The expanded
QName of an element node is considered to match a member of the
list of expanded QNames if:
Note:
The effect of the above constraints is to ensure any
insertion or deletion of whitespace would not affect how
an HTML user agent that conforms to the
specified version of HTML would render the output, assuming
the serialized document does not refer to any HTML style
sheets.
The HTML definition of whitespace is different from the XML
definition: see section 9.1 of [HTML] 4.01
specification.
6.1.6 XHTML Output Method: the
omit-xml-declaration
and standalone
Parameters
The behavior for omit-xml-declaration
and
standalone
parameters for the XHTML output method is
described in 5.1.6 XML
Output Method: the omit-xml-declaration and standalone
Parameters .
Note:
As with the XML output method, the XHTML output method specifies
that an XML declaration will be output unless it is suppressed
using the omit-xml-declaration
parameter. Appendix C.1
of [XHTML 1.0] provides advice on the
consequences of including, or omitting, the XML declaration.
6.1.7 XHTML
Output Method: the doctype-system
and
doctype-public
Parameters
If the value of the html-version
serialization parameter is 5.0
, the
doctype-system
serialization parameter is absent,
the first element node child of the document node that
is to be serialized is recognized as an HTML element , the
local part of the QName of which is equal to the string
HTML
, without regard to case , and any text
node preceding that element in document order contains only
whitespace characters, then the XHTML output method
MUST output a document type declaration
immediately before the first element, with no public or system
identifier. The name following <!DOCTYPE
MUST be the same as the local part of the
name of the element .
Otherwise, the behavior for
doctype-system
and doctype-public
parameters for the XHTML output method is described in 5.1.7 XML Output Method: the doctype-system and
doctype-public Parameters .
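A non-normative Python sketch of the HTML5 case follows. The function name `html5_doctype` and its arguments are hypothetical; a return value of `None` is used here to mean that the rules of the XML output method apply instead.

```python
from typing import Optional


def html5_doctype(html_version: Optional[float],
                  doctype_system: Optional[str],
                  first_element_local_name: str,
                  is_html_element: bool,
                  preceding_text: str = "") -> Optional[str]:
    """Return the identifier-free DOCTYPE, or None if the rule is not met."""
    if (html_version == 5.0
            and doctype_system is None
            and is_html_element
            and first_element_local_name.lower() == "html"
            and preceding_text.strip(" \t\r\n") == ""):
        # The name repeats the local part of the element name as written.
        return f"<!DOCTYPE {first_element_local_name}>"
    return None
```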
6.1.13 XHTML Output Method: the
escape-uri-attributes
Parameter
If the escape-uri-attributes
parameter has the
value yes
, the XHTML output method
MUST apply URI escaping to URI attribute values , except that
relative URIs MUST NOT be absolutized.
Note:
This escaping is deliberately confined to non-ASCII characters,
because escaping of ASCII characters is not always appropriate, for
example when URIs or URI fragments are interpreted locally by the
HTML user agent. Even in the case of non-ASCII characters, escaping
can sometimes cause problems. More precise control of URI escaping is therefore
available by setting escape-uri-attributes
to
no
, and controlling the escaping of URIs by using
methods defined in Section
6.2 fn:encode-for-uri FO30 and
Section
6.3 fn:iri-to-uri FO30 .
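The restriction to non-ASCII characters described in the note can be illustrated with a non-normative Python sketch. The function name `escape_non_ascii` is hypothetical, and the use of UTF-8 octets for the percent-encoding is an assumption of this sketch.

```python
def escape_non_ascii(uri: str) -> str:
    """Percent-encode only the non-ASCII characters of a URI value.

    Hypothetical helper; ASCII characters, including spaces, are left
    untouched and relative URIs are not absolutized.
    """
    return "".join(
        ch if ord(ch) < 128
        # Assumption: each non-ASCII character is escaped as UTF-8 octets.
        else "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        for ch in uri
    )
```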
6.1.14 XHTML Output Method: the
include-content-type
Parameter
If the instance of the data model includes a head
element recognized as an HTML element , and the
include-content-type
parameter has the value
yes
, the XHTML output method MUST add
a meta
element as the first child element of the
head
element, specifying the character encoding
actually used.
For example,
<head>
<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP" />
...
The content type SHOULD be set to the value
given for the media-type
parameter.
Note:
It is recommended that the host language use as default value for this
parameter one of the MIME types ([RFC2046] )
registered for XHTML. Currently, these are text/html
(registered by [RFC2854] ) and
application/xhtml+xml
(registered by [RFC3236] ). Note that some user agents fail to
recognize the charset parameter if the content type is not
text/html
.
If a meta
element has been added to the
head
element as described above, then any existing
meta
element child of the head
element
having an http-equiv
attribute with the value
"Content-Type", making the comparison without regard to
case after first stripping leading and trailing spaces from the
value of the attribute solely for the purposes of
comparison, MUST be discarded.
Note:
This process removes possible parameters in the attribute value.
For example,
<meta http-equiv="Content-Type" content="text/html;version='3.0'" />
in the data model instance would be replaced by:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
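The insertion and discarding of meta elements can be sketched, non-normatively, with Python's standard `xml.etree.ElementTree` module. The function name `add_content_type_meta` is hypothetical, and the sketch assumes the caller has already established that the head element is recognized as an HTML element and that include-content-type is yes.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    # Strip the "{namespace}" part that ElementTree puts in front of names.
    return tag.rsplit("}", 1)[-1]


def add_content_type_meta(head: ET.Element, media_type: str,
                          encoding: str) -> ET.Element:
    """Add the Content-Type meta as first child and discard existing ones."""
    for child in list(head):
        if (isinstance(child.tag, str)
                and _local(child.tag).lower() == "meta"
                and child.get("http-equiv", "").strip(" ").lower()
                == "content-type"):
            head.remove(child)
    # Create the new element in the same namespace as the head element.
    namespace = head.tag[:-len(_local(head.tag))]
    meta = ET.Element(namespace + "meta", {
        "http-equiv": "Content-Type",
        "content": f"{media_type}; charset={encoding}",
    })
    head.insert(0, meta)
    return head
```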
6.1.15 XHTML Output Method: the
item-separator
Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization .
6.2 Polyglot markup and namespace
declarations
[Definition : The namespace declarations in a
document are quirk-compatible if and only if the document
satisfies the following constraints:]
Each occurrence of the html
element declares the
default namespace to be the XHTML namespace, i.e.
http://www.w3.org/1999/xhtml
.
Each occurrence of the MathML math
element declares
the default namespace to be the MathML namespace, i.e.
http://www.w3.org/1998/Math/MathML
.
Each occurrence of the SVG svg
element declares the
default namespace to be the SVG namespace, i.e.
http://www.w3.org/2000/svg
.
For each occurrence of an attribute in the XLink namespace
(http://www.w3.org/1999/xlink
), some namespace
declaration is in scope binding the prefix xlink
to
that namespace. Namespace declarations for this prefix and
namespace appear only on non-HTML elements.
Note:
It is recommended, for compatibility with [Polyglot] , that such namespace declarations
be placed on the enclosing svg
or math
elements.
No other namespace declarations occur in the document.
Note:
This definition is derived from the draft of [Polyglot] current at the time this document
was published. Users and implementors of this specification are
encouraged to consult the most recent draft of [Polyglot] and to support it if possible.
Such support is, however, unrelated to conformance to this
specification.
7 HTML Output
Method
The HTML output method serializes the instance of the data model
as HTML.
For example, the following XSLT stylesheet generates HTML
output:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="4.0"/>
<xsl:template match="/">
<html>
<xsl:apply-templates/>
</html>
</xsl:template>
...
</xsl:stylesheet>
In the example, the version
attribute of the
xsl:output
element indicates the version of the HTML
Recommendation [HTML] (or [HTML5] ) to which the serialized result is to
conform.
At the time this document was published, the current version of
[HTML5] was that cited in A.1 Normative References . Like
all draft W3C specifications, [HTML5] is
subject to revision before final publication as a W3C
Recommendation. For all information normatively derived in this
specification from [HTML5] , processors
conforming to this specification MUST take the
information in question from the version cited in A.1 Normative References , or
from later versions of [HTML5] published by
W3C. If they take the information from versions other than the one
cited in A.1 Normative
References , then it is implementation-defined which future version of
[HTML5] is used as the source of the
information, including the lists of elements recognized as HTML elements , void elements , phrasing elements , and
Boolean
attributes . If future versions of [HTML5]
differ from the current draft in any of these areas,
implementations MAY support multiple versions, and
MAY provide a user option for choosing which one
to use.
It is entirely the responsibility of the person or process that
creates the instance of the data model to ensure that the instance
of the data model conforms to the HTML Recommendation [HTML] . It is not an error if the instance of the
data model is invalid HTML. Equally, it is entirely under the
control of the person or process that creates the instance of the
data model whether the output conforms to HTML. If the result
tree is valid HTML, the serializer MUST serialize the
result in a way that conforms with the version of HTML specified by
the requested HTML version .
7.1 Markup for
Elements
As is described in detail below, the HTML output
method will not output an element differently from the
XML output method unless the element is to be serialized as an HTML element .
[Definition : The portion of the serialized document
representing the result of serializing an element , that is
not to be serialized as an HTML element is
known as an XML Island. ] [Definition : An element node is
serialized as an HTML element if]
If the element is to be serialized as an HTML element , but
the local part of the expanded QName is not recognized as the name
of an HTML element, the element MUST be output in
the same way as a non-empty, inline element such as
span
. In particular:
Any namespace node in the result tree for the XML namespace is ignored
by the HTML output method. In addition, if the requested HTML
version is 5.0
, any element node that has a prefix
and is in the XHTML namespace , MathML
namespace , or SVG namespace MUST be
serialized with an unprefixed element name. The serializer
MUST serialize an attribute with the name
xmlns
whose value is equal to the namespace URI of the
element node, unless an ancestor element in the serialized result
already has an attribute named xmlns
with the same
value, and no intervening element has an attribute named
xmlns
with a different value. If the
element node has a namespace node for the default namespace whose
value is not equal to the namespace URI of the element node,
the namespace node is ignored . The serializer MUST
NOT serialize a namespace declaration for the namespace
node declaring the element node's prefix, unless an attribute of
the element node has the same prefix. For
namespace nodes in the result tree that are not ignored , the
HTML output method MUST represent these namespaces
using attributes named xmlns
or
xmlns:
prefix in the same way as the XML
output method would represent them when the version
parameter is set to 1.0
.
If the result
tree contains elements or attributes whose names have a
non-null namespace URI , the HTML
output method MUST generate namespace-prefixed
QNames for these nodes in the
same way as the XML output method would do when the
version
parameter is set to 1.0
.
Where special rules are defined later in this section for
serializing specific HTML elements and attributes, these rules
MUST NOT be applied to an element that is
not to be serialized as an HTML element or
an attribute whose name has a non-null
namespace URI . However, the generic rules for the HTML output
method that apply to all elements and attributes, for example the
rules for escaping special characters in the text and the rules for
indentation, MUST be used also for namespaced
elements and attributes.
When serializing an element whose name is not defined in the
HTML specification, but that is to be serialized as an HTML element , the
HTML output method MUST apply the same rules (for
example, indentation rules) as when serializing a span
element. The descendants of such an element MUST
be serialized as if they were descendants of a span
element.
When serializing an element whose name is in a non-null
namespace, the HTML output method MUST apply the
same rules (for example, indentation rules) as when serializing a
div
element. The descendants of such an element
MUST be serialized as if they were descendants of
a div
element, except for the influence of the
cdata-section-elements
serialization parameter on any
text node children of the element.
The HTML output method MUST NOT output an
end-tag for an empty element if the element type has an empty
content model, and the value of the requested HTML
version is less than 5.0
, or the element is
a void element
and the value of the requested HTML version is
5.0
.
For HTML 4.0, the element types that have an empty content
model are area
, base
,
basefont
, br
, col
,
embed
, frame
,
hr
, img
, input
,
isindex
, link
, meta
and
param
. For HTML5, the void elements are as defined above in
6 XHTML Output Method . It
is implementation-defined whether the
basefont
, frame
and isindex
elements, which are not part of HTML5, are considered to be void
elements when the requested HTML version has the value
5.0
.
For example, an element written as <br/>
or
<br></br>
in an XSLT stylesheet
MUST be output as <br>
.
Note:
The markup generation step of the phases of serialization only creates start tags
and end tags for the HTML output method, never XML-style empty
element tags. As such, a serializer MUST serialize an
HTML element that has no children, but whose content model is not
empty, using a pair of adjacent start and end element tags, or as a
solitary start tag if permitted by the context.
For any element node that is to be serialized as an HTML element , the
HTML output method MUST compare the local
part of the name of the element node with the names of HTML
elements making the comparison without regard to
case. If the local part of the name of the element
node compares equal to that of any HTML element, the element node
MUST be recognized as being that kind of HTML
element. For example, elements named br
,
BR
or Br
MUST all be
recognized as the HTML br
element and output without
an end-tag.
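The end-tag and name-recognition rules above can be sketched in Python as follows. This is a non-normative illustration: the function name serialize_childless_element and its html5_void argument are hypothetical, and the set of HTML5 void elements is supplied by the caller because it is defined in 6 XHTML Output Method rather than in this section.

```python
# Element types with an empty content model in HTML 4.0, as listed above.
HTML4_EMPTY = frozenset({
    "area", "base", "basefont", "br", "col", "embed", "frame",
    "hr", "img", "input", "isindex", "link", "meta", "param",
})


def serialize_childless_element(local_name, html_version=4.0, html5_void=frozenset()):
    """Return the markup for an HTML element node that has no children.

    html5_void is a hypothetical parameter: the caller supplies the void
    element names that apply when the requested HTML version is 5.0.
    """
    # Names are compared without regard to case, so BR, Br and br all match.
    key = local_name.lower()
    no_end_tag = key in (HTML4_EMPTY if html_version < 5.0 else html5_void)
    if no_end_tag:
        return f"<{local_name}>"
    # Never an XML-style empty-element tag: emit adjacent start and end tags.
    return f"<{local_name}></{local_name}>"
```

Whether basefont, frame and isindex belong in html5_void is left to the caller, mirroring the implementation-defined choice described above.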
The HTML output method MUST NOT perform
escaping for any text node descendant, nor for any attribute
of an element node descendant, of a
script
or style
element .
For example, a script
element created by an XQuery
direct element constructor or an XSLT literal result element, such
as:
<script>if (a &lt; b) foo()</script>
or
<script><![CDATA[if (a < b) foo()]]></script>
MUST be output as
<script>if (a < b) foo()</script>
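A minimal Python sketch of this rule for text nodes is shown below. The function names are hypothetical, the caller is assumed to pass the local names of all ancestor elements, and the choice to escape ">" in ordinary text is an assumption of the sketch rather than a requirement stated here.

```python
def escape_text(text):
    """Escape the characters that are significant in ordinary HTML text."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def serialize_text_node(text, ancestor_names):
    """Serialize a text node given the local names of its ancestor elements."""
    # Inside script or style (at any depth) the text is written unescaped.
    if any(name.lower() in ("script", "style") for name in ancestor_names):
        return text
    return escape_text(text)
```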
A common requirement is to output a script
element
as shown in the example below:
<script type="application/ecmascript">
document.write ("<em>This won't work</em>")
</script>
This is invalid HTML, for the reasons explained in
section B.3.2 of the [HTML] 4.01
specification. Nevertheless, it is possible to output this
fragment, using either of the following constructs:
Firstly, by use of a script
element created by an
XQuery direct element constructor or an XSLT literal result
element:
<script type="application/ecmascript">
document.write ("<em>This won't work</em>")
</script>
Secondly, by constructing the markup from ordinary text
characters:
<script type="application/ecmascript">
document.write ("&lt;em&gt;This won't work&lt;/em&gt;")
</script>
As the [HTML] specification points out,
the correct way to write this is to use the escape conventions for
the specific scripting language. For JavaScript, it can be written
as:
<script type="application/ecmascript">
document.write ("<em>This will work<\/em>")
</script>
The [HTML] 4.01 specification also shows
examples of how to write this in various other scripting languages.
The escaping MUST be done manually; it will not be
done by the serializer.
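Because the escaping is manual, a stylesheet or query author might apply a small helper before placing script text in the result tree. The Python sketch below is one hypothetical way to do so for JavaScript; it assumes every "</" in the input lies inside a string literal, where "\/" is read as "/".

```python
def escape_for_inline_script(js_source):
    """Break up each end-tag open delimiter using the JavaScript backslash escape."""
    # Assumption: every occurrence is inside a JavaScript string literal.
    return js_source.replace("</", "<\\/")
```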
7.2 Writing
Attributes
The HTML output method MUST NOT escape
"<" characters occurring in attribute values.
A boolean attribute is an attribute with only a single allowed
value in any of the HTML DTDs or that is specified to be
a Boolean
attribute by [HTML5] , where
the allowed value is equal without regard to case to the name of
the attribute. The HTML output method MUST output
any boolean attribute in minimized form if and only if the value of
the attribute node actually is equal to the name of the attribute
making the comparison without regard to case .
[Definition : The attributes identified as
Boolean attributes in [HTML5] are those
given in the following table (using just the local name of their
parent elements): ]
Attribute
Element(s)
async
script
autofocus
button, input, keygen, select, textarea
autoplay
audio, video
checked
input
controls
audio, video
default
track
defer
script
disabled
button, fieldset, input, keygen, optgroup, option, select,
textarea
formnovalidate
button, input
hidden
HTML elements
ismap
img
loop
audio, video
multiple
input, select
muted
audio, video
novalidate
form
open
details
open
dialog
readonly
input, textarea
required
input, select, textarea
reversed
ol
scoped
style
seamless
iframe
selected
option
typemustmatch
object
Note:
This list of Boolean attributes is that given in the
index of the draft of [HTML5] current at
the time this document is published. As noted elsewhere, processors
conforming to this specification MAY support the
list of Boolean attributes included in later versions of [HTML5] .
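The minimization rule can be sketched in Python as follows. This is a non-normative illustration that copies only a few rows of the table; the names BOOLEAN_ATTRIBUTES, is_boolean_attribute and write_attribute are hypothetical, and matching the attribute name itself without regard to case is an assumption of the sketch.

```python
# A few rows of the table above; "*" stands for "HTML elements" (any element).
BOOLEAN_ATTRIBUTES = {
    "checked": {"input"},
    "selected": {"option"},
    "disabled": {"button", "fieldset", "input", "keygen", "optgroup",
                 "option", "select", "textarea"},
    "hidden": {"*"},
}


def is_boolean_attribute(element_name, attr_name):
    """True if attr_name is a boolean attribute of element_name in this subset."""
    elements = BOOLEAN_ATTRIBUTES.get(attr_name.lower(), set())
    return "*" in elements or element_name.lower() in elements


def write_attribute(element_name, attr_name, value):
    """Return the serialized attribute, minimized when the rule above applies."""
    # Minimize if and only if the value equals the attribute name, ignoring case.
    if is_boolean_attribute(element_name, attr_name) and value.lower() == attr_name.lower():
        return attr_name
    return f'{attr_name}="{value}"'  # value escaping is omitted in this sketch
```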
For example, a start-tag created using the following XQuery
direct element constructor or XSLT literal result element
<OPTION selected="selected">
MUST be output as
<OPTION selected>
The HTML output method MUST NOT escape a
&
character occurring in an attribute value
immediately followed by a {
character (see Section
B.7.1 of the HTML Recommendation [HTML] ).
For example, a start-tag created using the following XQuery
direct element constructor or XSLT literal result element
<BODY bgcolor='&amp;{{randomrbg}};'>
MUST be output as
<BODY bgcolor='&{randomrbg};'>
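The two attribute escaping rules in this section ("<" is never escaped, and "&" is left alone when followed by "{") can be sketched in Python as below. The function name is hypothetical, and delimiting the value with double quotes and escaping that delimiter as &quot; are assumptions of the sketch.

```python
import re

# An ampersand that is NOT immediately followed by "{".
_AMP_NOT_BEFORE_BRACE = re.compile(r"&(?!\{)")


def escape_attribute_value(value):
    """Escape an attribute value that will be delimited by double quotes."""
    escaped = _AMP_NOT_BEFORE_BRACE.sub("&amp;", value)
    # "<" is deliberately left alone; only the delimiter needs escaping.
    return escaped.replace('"', "&quot;")
```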
See 7.4 The Influence of Serialization
Parameters upon the HTML Output Method for additional
directives on how attributes MAY be written.
7.3 Writing
Character Data
The HTML output method MAY output a character
using a character entity reference in preference to using a numeric
character reference, if an entity is defined for the character in
the version of HTML that the output method is using. Entity
references and character references SHOULD be used
only where the character is not present in the selected encoding,
or where the visual representation of the character is unclear (as
with &nbsp;, for example).
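As a non-normative illustration, the preference for entity references could be implemented along the following lines in Python. The sketch uses the HTML 4.01 entity table shipped in the standard library, treats the no-break space as the one character whose visual representation is unclear (an assumption), and does not deal with markup characters such as "<" and "&", which need separate escaping.

```python
from html.entities import codepoint2name  # HTML 4.01 entity names


def write_character(ch, encoding):
    """Return ch itself, an entity reference, or a numeric character reference."""
    try:
        ch.encode(encoding)
        representable = True
    except UnicodeEncodeError:
        representable = False
    # A no-break space is written as a reference even when representable,
    # because its visual representation is unclear.
    if representable and ch != "\u00a0":
        return ch
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else f"&#{ord(ch)};"
```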
When outputting a sequence of whitespace
characters in the instance of the data model, within an
element where whitespace characters are treated
normally (but not in elements such as pre
and
textarea
), the HTML output method MAY
represent it using any sequence of whitespace
characters that will be treated in the same way by an
HTML user agent. See section 3.5 of [XHTML Modularization] for some
additional information on handling of whitespace by an HTML user
agent for versions of HTML prior to HTML5, and see
[HTML5] for information on the handling of
whitespace characters by an HTML5 user agent.
Note:
The terms space character and white_space character
defined in HTML5 do not match the definition of whitespace
character in this specification.
Certain characters are permitted in XML, but not in
HTML prior to HTML5. For example, the control
characters #x7F-#x9F are permitted in both XML 1.0
and XML 1.1, and the control characters #x1-#x8, #xB, #xC and
#xE-#x1F are permitted in XML 1.1, but none of these
is permitted in HTML prior to HTML5. It is a
serialization
error [err:SERE0014 ] to use the HTML output method
if such characters appear in the instance of the data
model and the value of the requested HTML version is less than
5.0
. The serializer MUST signal the
error.
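The check could look roughly like this in Python. The SerializationError class and the function names are hypothetical; only the character ranges and the error code come from the text above.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def _is_disallowed_control(cp):
    # #x1-#x8, #xB, #xC, #xE-#x1F and #x7F-#x9F, as listed above.
    return (0x1 <= cp <= 0x8 or cp in (0xB, 0xC)
            or 0xE <= cp <= 0x1F or 0x7F <= cp <= 0x9F)


def check_control_characters(text, html_version):
    """Raise SERE0014 if text holds a character not permitted before HTML5."""
    if html_version >= 5.0:
        return
    for ch in text:
        if _is_disallowed_control(ord(ch)):
            raise SerializationError(
                "SERE0014", f"character #x{ord(ch):X} is not permitted")
```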
The HTML output method MUST terminate
processing instructions with >
rather than
?>
. It is a serialization error [err:SERE0015 ] to use the HTML output
method when >
appears within a processing
instruction in the data model instance being serialized.
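A minimal Python sketch of this rule follows. The exception class and function name are hypothetical, and omitting the separating space when the instruction has no content is an assumption of the sketch.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def serialize_processing_instruction(target, data):
    """Write a processing instruction the HTML way, terminated by ">"."""
    if ">" in data:
        raise SerializationError(
            "SERE0015", "'>' appears within a processing instruction")
    return f"<?{target} {data}>" if data else f"<?{target}>"
```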
7.4 The Influence of
Serialization Parameters upon the HTML Output Method
7.4.1 HTML Output
Method: the version
and
html-version
Parameters
The html-version
or the
version
serialization parameter indicates
the version of the HTML Recommendation [HTML] or [HTML5]
to which the serialized result is to conform. [Definition : If the
html-version
serialization parameter is not absent,
the requested HTML version is the value of the
html-version
serialization parameter; otherwise, it is
the value of the version
serialization
parameter. ] If the serializer does not support the version of HTML
specified by the requested HTML version , it
MUST signal a serialization error [err:SESU0013 ].
This document provides the normative definition of serialization
for the HTML output method if the requested HTML version has the lexical
form of a value of type decimal whose value is 1.0 or greater, but
no greater than 5.0. For any other value of the version
parameter, the behavior is implementation-defined. In that case the implementation-defined
behavior MAY supersede all other requirements of
this recommendation.
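The selection of the requested HTML version and the range test can be sketched in Python as below. The parameter dictionary and both function names are hypothetical. Python's Decimal accepts some forms (such as exponents) that are not valid lexical forms of a decimal value, so the parse is only an approximation.

```python
from decimal import Decimal, InvalidOperation


def requested_html_version(params):
    """params maps parameter names to values; absent parameters are missing keys."""
    if "html-version" in params:
        return params["html-version"]
    return params["version"]


def is_normatively_defined(version_text):
    """True if the value is a decimal between 1.0 and 5.0 inclusive."""
    try:
        value = Decimal(version_text)  # approximation of the decimal lexical form
    except InvalidOperation:
        return False
    return value.is_finite() and Decimal("1.0") <= value <= Decimal("5.0")
```

A serializer that receives a value for which is_normatively_defined is false falls into the implementation-defined case described above.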
7.4.2 HTML
Output Method: the encoding
Parameter
The encoding
parameter specifies the encoding to be
used. Serializers are
REQUIRED to support values of UTF-8
and UTF-16
. A serialization error [err:SESU0007 ] occurs if an output encoding
other than UTF-8
or UTF-16
is requested
and the serializer
does not support that encoding. The serializer MUST signal the
error.
It is possible that the instance of the data model will contain
a character that cannot be represented in the encoding that the
serializer is using
for output. In this case, if the character occurs in a context
where HTML recognizes character references, then the character
MUST be output as a character entity reference or
decimal numeric character reference; otherwise (for example, in a
script
or style
element or in a comment),
the serializer
MUST signal a serialization error [err:SERE0008 ].
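The distinction between contexts that recognize character references and those that do not can be sketched as follows. The context labels, the exception class and the function name are hypothetical, and the sketch always uses a decimal numeric character reference for simplicity.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# Contexts in which HTML does not recognize character references.
NO_REFERENCE_CONTEXTS = ("script", "style", "comment")


def write_in_encoding(ch, encoding, context):
    """Write ch for the given encoding, falling back to a reference if allowed."""
    try:
        ch.encode(encoding)
        return ch
    except UnicodeEncodeError:
        if context in NO_REFERENCE_CONTEXTS:
            raise SerializationError(
                "SERE0008", f"cannot represent #x{ord(ch):X} in {encoding}") from None
        return f"&#{ord(ch)};"
```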
See 7.4.13 HTML Output
Method: the include-content-type Parameter regarding how
this parameter is used with the include-content-type
parameter.
7.4.3 HTML Output
Method: the indent
and
suppress-indentation
Parameters
If the indent
parameter has the value
yes
, then the HTML output method MAY
add or remove whitespace as it serializes the result tree , if it
observes the following constraints.
Whitespace MUST NOT be added other than
before or after an element, or adjacent to an existing whitespace
character.
Whitespace MUST NOT be added or removed
adjacent to an inline element. The inline elements are those
included in the %inline
category of any of the HTML
4.01 DTD's or those elements defined to be phrasing content in
[HTML5] , as well as the
ins
and del
elements if they are used as
inline elements (i.e., if they do not contain element
children).
Whitespace MUST NOT be added or removed
inside a formatted element, the formatted elements being
pre
, script
, style
,
title
, and
textarea
.
Whitespace characters MUST NOT be added in the
content of an element whose expanded QName matches a
member of the list of expanded QNames in the value of the
suppress-indentation
parameter. The expanded
QName of an element node is considered to match a member of the
list of expanded QNames if:
Note:
The effect of the above constraints is to ensure any
insertion or deletion of whitespace would not affect how a
conforming HTML user agent would render the output, assuming the
serialized document does not refer to any HTML style
sheets.
Note that the HTML definition of whitespace is different from
the XML definition (see section 9.1 of the [HTML] specification).
7.4.4 HTML Output Method: the
cdata-section-elements
Parameter
The cdata-section-elements
parameter is not
applicable to the HTML output method, except in the case of
XML Islands .
7.4.5 HTML Output Method: the
omit-xml-declaration
and standalone
Parameters
The omit-xml-declaration
and
standalone
parameters are not applicable to the HTML
output method.
7.4.6 HTML Output
Method: the doctype-system
and
doctype-public
Parameters
If the doctype-public
or
doctype-system
parameters are specified, then the HTML
output method MUST output a document type
declaration. If the doctype-public
parameter is
specified, then the output method MUST output
PUBLIC
followed by the specified public identifier; if
the doctype-system
parameter is also specified, it
MUST also output the specified system identifier
following the public identifier. If the doctype-system
parameter is specified but the doctype-public
parameter is not specified, then the output method
MUST output SYSTEM
followed by the
specified system identifier.
If the value of the requested HTML version is 5.0
, the
doctype-public
and doctype-system
serialization parameters are both absent, the first element
node child of the document node that is to be serialized is
to be serialized as an HTML element , the local
part of the QName of which is equal to the string
HTML
, without regard to case, and any text
nodes that precede that element node in document order contain only
whitespace characters, then the HTML output method
MUST output a document type declaration, with no
public or system identifier.
If the HTML output method MUST output a
document type declaration, it MUST be serialized
immediately before the first element, if any, and the name
following <!DOCTYPE
MUST be
HTML
or html
.
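The rules of this section can be summarized by the following non-normative Python sketch. The function and its parameters are hypothetical; in particular, root_is_html stands for the condition on the first element child described above, and identifiers are assumed not to contain a double quote.

```python
def doctype_declaration(doctype_public=None, doctype_system=None,
                        html_version=4.0, root_is_html=False, name="html"):
    """Return the document type declaration to emit, or None if there is none.

    root_is_html is a hypothetical flag: the caller sets it when the first
    element child of the document node is an HTML element named html
    (ignoring case) preceded only by whitespace text.
    """
    if doctype_public is not None:
        decl = f'<!DOCTYPE {name} PUBLIC "{doctype_public}"'
        if doctype_system is not None:
            decl += f' "{doctype_system}"'
        return decl + ">"
    if doctype_system is not None:
        return f'<!DOCTYPE {name} SYSTEM "{doctype_system}">'
    # HTML5 case: no identifiers at all, but a declaration is still written.
    if html_version == 5.0 and root_is_html:
        return f"<!DOCTYPE {name}>"
    return None
```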
7.4.7 HTML Output Method: the
undeclare-prefixes
Parameter
The undeclare-prefixes
parameter is not applicable
to the HTML output method.
7.4.8 HTML Output Method: the
normalization-form
Parameter
The normalization-form
parameter is applicable to
the HTML output method. The values NFC
and
none
MUST be supported by the
serializer . A
serialization
error [err:SESU0011 ] results if the value of the
normalization-form
parameter specifies a normalization
form that is not supported by the serializer ; the serializer MUST signal the
error.
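A serializer built on Python's standard library might handle this parameter as sketched below. The exception class and function name are hypothetical. Supporting NFD, NFKC and NFKD in addition to the two required values is a choice made for this sketch, since those are the forms that unicodedata.normalize provides.

```python
import unicodedata


class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# NFC and none are required; the other three are extras of this sketch.
SUPPORTED_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def apply_normalization_form(text, form):
    """Normalize text according to the normalization-form parameter."""
    if form == "none":
        return text
    if form in SUPPORTED_FORMS:
        return unicodedata.normalize(form, text)
    raise SerializationError("SESU0011", f"unsupported normalization form {form!r}")
```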
7.4.10 HTML Output Method: the
use-character-maps
Parameter
The use-character-maps
parameter is applicable to
the HTML output method. See 9
Character Maps for more information.
7.4.11 HTML Output Method: the
byte-order-mark
Parameter
The byte-order-mark
parameter is applicable to the
HTML output method. See 3 Serialization
Parameters for more information.
7.4.12 HTML Output Method: the
escape-uri-attributes
Parameter
If the escape-uri-attributes
parameter has the
value yes
, the HTML output method
MUST apply URI escaping to URI attribute values , except that
relative URIs MUST NOT be absolutized.
Note:
This escaping is deliberately confined to non-ASCII characters,
because escaping of ASCII characters is not always appropriate, for
example when URIs or URI fragments are interpreted locally by the
HTML user agent. Even in the case of non-ASCII characters, escaping
can sometimes cause problems. More precise control of URI escaping is therefore
available by setting escape-uri-attributes
to
no
, and controlling the escaping of URIs by using
methods defined in Section 6.2 fn:encode-for-uri and
Section 6.3 fn:iri-to-uri of [XQuery and XPath Functions and Operators 3.0].
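The precise definition of URI escaping lies outside this section. The Python sketch below therefore rests on an assumption: that escaping means normalizing to NFC and then percent-encoding the UTF-8 octets of every character outside the printable ASCII range. The function name is hypothetical.

```python
import unicodedata


def escape_uri_attribute(value):
    """Percent-encode the UTF-8 octets of characters outside printable ASCII."""
    normalized = unicodedata.normalize("NFC", value)
    out = []
    for ch in normalized:
        if 0x20 <= ord(ch) <= 0x7E:
            # ASCII is left untouched, so a relative URI stays relative.
            out.append(ch)
        else:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)
```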
7.4.13 HTML Output Method: the
include-content-type
Parameter
If there is a head
element, and the
include-content-type
parameter has the value
yes
, the HTML output method MUST add
a meta
element as the first child element of the
head
element specifying the character encoding
actually used.
For example,
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">
...
The content type MUST be set to the value given
for the media-type
parameter.
If a meta
element has been added to the
head
element as described above, then any existing
meta
element child of the head
element
having an http-equiv
attribute with the value
"Content-Type", making the comparison without regard to
case after first stripping leading and trailing spaces from the
value of the attribute solely for the purposes of
comparison, MUST be discarded.
Note:
This process removes possible parameters in the attribute value.
For example,
<meta http-equiv="Content-Type" content="text/html;version='3.0'"/>
in the data model instance would be replaced by,
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
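The insertion and discarding steps can be sketched in Python over a simplified model of the head element's children. The data representation (a list of name and attribute-dictionary pairs) and the function name are hypothetical, as is the case-insensitive matching of the attribute name.

```python
def add_content_type_meta(head_children, media_type, encoding):
    """head_children is a list of (local_name, attributes_dict) pairs."""

    def is_content_type_meta(name, attrs):
        if name.lower() != "meta":
            return False
        # Attribute names are matched case-insensitively here (an assumption).
        http_equiv = next(
            (v for k, v in attrs.items() if k.lower() == "http-equiv"), None)
        # Strip leading and trailing spaces for the comparison only.
        return http_equiv is not None and http_equiv.strip(" ").lower() == "content-type"

    kept = [(n, a) for n, a in head_children if not is_content_type_meta(n, a)]
    new_meta = ("meta", {"http-equiv": "Content-Type",
                         "content": f"{media_type}; charset={encoding}"})
    # The new meta element becomes the first child of head.
    return [new_meta] + kept
```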
7.4.14 HTML Output Method: the
item-separator
Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization .
8 Text Output
Method
The Text output method serializes the instance of the data model
by outputting the string value of the document node created by the markup generation
step of the phases of
serialization without any escaping.
A newline character in the instance of the data model
MAY be output using any character sequence that is
conventionally used to represent a line ending in the chosen system
environment.
8.1 The Influence of
Serialization Parameters upon the Text Output Method
8.1.1 Text Output
Method: the version
Parameter
The version
parameter is not applicable to the Text
output method.
8.1.2
Text Output Method: the html-version
Parameter
The html-version
parameter is not applicable to the
Text output method.
8.1.3 Text
Output Method: the encoding
Parameter
The encoding
parameter identifies the encoding that
the Text output method MUST use to convert
sequences of characters to sequences of bytes. Serializers are
REQUIRED to support values of UTF-8
and UTF-16
. A serialization error [err:SESU0007 ] occurs if the serializer does not support the
encoding specified by the encoding
parameter. The
serializer
MUST signal the error. If the instance of the data
model contains a character that cannot be represented in the
encoding that the serializer is using for output, the serializer
MUST signal a serialization error [err:SERE0008 ].
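In Python, the two error conditions map naturally onto the exceptions raised by str.encode, as the following non-normative sketch shows. The exception class and function name are hypothetical.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def text_output(string_value, encoding="UTF-8"):
    """Convert the string value to bytes without any escaping."""
    try:
        return string_value.encode(encoding)
    except LookupError:
        # The encoding itself is not supported by this serializer.
        raise SerializationError("SESU0007", f"unsupported encoding {encoding}") from None
    except UnicodeEncodeError:
        # A character has no representation in the chosen encoding.
        raise SerializationError("SERE0008", f"character not representable in {encoding}") from None
```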
8.1.4 Text Output
Method: the indent
and
suppress-indentation
Parameters
The indent
and
suppress-indentation
parameters are not
applicable to the Text output method.
8.1.5 Text Output Method: the
cdata-section-elements
Parameter
The cdata-section-elements
parameter is not
applicable to the Text output method.
8.1.6 Text Output Method: the
omit-xml-declaration
and standalone
Parameters
The omit-xml-declaration
and
standalone
parameters are not applicable to the Text
output method.
8.1.7 Text Output
Method: the doctype-system
and
doctype-public
Parameters
The doctype-system
and doctype-public
parameters are not applicable to the Text output method.
8.1.8 Text Output Method: the
undeclare-prefixes
Parameter
The undeclare-prefixes
parameter is not applicable
to the Text output method.
8.1.9 Text Output Method: the
normalization-form
Parameter
The normalization-form
parameter is applicable to
the Text output method. The values NFC
and
none
MUST be supported by the
serializer . A
serialization
error [err:SESU0011 ] results if the value of the
normalization-form
parameter specifies a normalization
form that is not supported by the serializer ; the serializer MUST signal the
error.
8.1.10 Text
Output Method: the media-type
Parameter
The media-type
parameter is applicable to the Text
output method. See 3 Serialization
Parameters for more information.
8.1.11 Text Output Method: the
use-character-maps
Parameter
The use-character-maps
parameter is applicable to
the Text output method. See 9
Character Maps for more information.
8.1.12 Text Output Method: the
byte-order-mark
Parameter
The byte-order-mark
parameter is applicable to the
Text output method. See 3 Serialization
Parameters for more information.
8.1.13 Text Output Method: the
escape-uri-attributes
Parameter
The escape-uri-attributes
parameter is not
applicable to the Text output method.
8.1.14 Text Output Method: the
include-content-type
Parameter
The include-content-type
parameter is not
applicable to the Text output method.
8.1.15 Text Output Method: the
item-separator
Parameter
The effect of the item-separator
serialization
parameter is described in 2 Sequence
Normalization .
9 Character
Maps
The use-character-maps
parameter is a list of
characters and corresponding string substitutions.
Character maps allow a specific character appearing in a text or
attribute node in the instance
of the data model to be replaced with a specified string of
characters during serialization. The string that is substituted is
output "as is," and the serializer performs no checks that the resulting
document is well-formed. This mechanism can therefore be used to
introduce arbitrary markup in the serialized output. See Section 25.1
Character Maps of [XSL Transformations (XSLT) Version 3.0] for
examples of using character mapping in XSLT.
Character mapping is applied to the characters that actually
appear in a text or attribute node in the instance of the data model, before any
other serialization operations such as escaping or Unicode
Normalization are applied. If a character is mapped, then it is
not subjected to XML or HTML escaping, nor to Unicode
Normalization. The string that is substituted for a character is
not validated or processed in any way by the serializer , except for translation into the
target encoding. In particular, it is not subjected to XML or HTML
escaping, it is not subjected to Unicode Normalization, and it is
not subjected to further character mapping.
Character mapping is not applied to characters in text nodes whose parent elements are listed
in the cdata-section-elements
parameter, nor to
characters for which output escaping has been disabled (disabling
output escaping is a feature in all versions of XSLT ),
nor to characters in attribute values that are subject to URI escaping defined for
the HTML and XHTML output methods, unless URI escaping has been disabled using the
escape-uri-attributes
parameter in the output
definition.
On serialization, occurrences of a character specified in the
use-character-maps
parameter in text nodes and attribute values are replaced by the
corresponding string from the use-character-maps
parameter.
Note:
Using a character map can result in non-well-formed documents if
the string contains XML-significant characters. For example, it is
possible to create documents containing unmatched start and end
tags, references to entities that are not declared, or attributes
that contain tags or unescaped quotation marks.
If a character is mapped, then it is not subjected to XML or
HTML escaping.
A serialization error [err:SERE0008 ] occurs if character mapping
causes the output of a string containing a character that cannot be
represented in the encoding that the serializer is using for output. The serializer
MUST signal the error.
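The interaction between character mapping and escaping can be sketched in Python as follows. The function names are hypothetical and the escaping shown is a simplified stand-in for the escaping performed by the XML and HTML output methods. The point illustrated is that a mapped character bypasses escaping, while every other character is escaped as usual.

```python
def escape_char(ch):
    """Simplified markup escaping for a single unmapped character."""
    return {"&": "&amp;", "<": "&lt;", ">": "&gt;"}.get(ch, ch)


def serialize_with_character_map(text, character_map):
    """Apply character_map (a dict from character to string) while serializing."""
    out = []
    for ch in text:
        if ch in character_map:
            # The substitute is written as is: not escaped, not mapped again.
            out.append(character_map[ch])
        else:
            out.append(escape_char(ch))
    return "".join(out)
```

Mapping a character to a string such as "<br>" shows how arbitrary markup, well-formed or not, can reach the output.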
10 Conformance
Serialization is intended primarily as a component of a
host language .
[Definition : A host language is
another specification that includes, by reference, this
specification and all of its requirements. A host language might be
a programming language such as [XSL
Transformations (XSLT) Version 3.0] or [XQuery 3.0: An XML Query Language] , or it
might be an application programming interface (API) intended to be
used by programs written in some other high-level programming
language. The use of the term language is not intended to
preclude the possibility that this specification might be
referenced outside the context of a programming language
specification. ] This document relies on
specifications that use it to specify conformance criteria for
Serialization in their respective environments. Specifications that
set conformance criteria for their use of Serialization
MUST NOT change the semantic definitions of
Serialization as given in this specification, except by subsetting
and/or compatible extensions. It is the responsibility of the
host language to
specify how serialization errors are to be
handled.
Certain facilities in this specification are described as
producing implementation-defined results. A claim that asserts
conformance with this specification MUST be
accompanied by documentation stating the effect of each
implementation-defined feature. For convenience, a non-normative
checklist of implementation-defined features is provided at
E Checklist of
Implementation-Defined Features .
A References
A.1 Normative References
XQuery and XPath Data Model (XDM)
3.0
XQuery and XPath
Data Model (XDM) 3.0 , Norman Walsh, Anders Berglund,
John Snelson, Editors. World Wide Web Consortium, 08 April 2014.
This version is
http://www.w3.org/TR/2014/REC-xpath-datamodel-30-20140408/. The
latest
version is available at
http://www.w3.org/TR/xpath-datamodel-30/.
XQuery and XPath Functions and Operators
3.0
HTML5
HTML5 ,
Robin Berjon, Steve Faulkner, Travis Leithead,
et al.,
Editors. World Wide Web Consortium, 04 Feb 2014. This
version is http://www.w3.org/TR/2014/CR-html5-20140204/. The
latest version is
available at http://www.w3.org/TR/html5/.
HTML
HTML 4.01
Specification , Dave Raggett, Arnaud Le Hors, and Ian
Jacobs, Editors. World Wide Web Consortium, 24 Dec 1999.
This version is http://www.w3.org/TR/1999/REC-html401-19991224. The
latest version is
available at http://www.w3.org/TR/html401.
IANA
RFC2046
RFC2119
RFC2978
Unicode Encoding
UAX #15: Unicode Normalization
Forms
XHTML
1.0
XHTML
1.1
XML10
XML11
XML
Names
Namespaces in
XML 1.0 (Third Edition) , Tim Bray, Dave Hollander,
Andrew Layman,
et al., Editors. World Wide Web
Consortium, 08 Dec 2009. This version is
http://www.w3.org/TR/2009/REC-xml-names-20091208/. The
latest version is available at
http://www.w3.org/TR/xml-names.
XML Names 1.1
Namespaces
in XML 1.1 (Second Edition) , Tim Bray, Dave Hollander,
Andrew Layman, and Richard Tobin, Editors. World Wide Web
Consortium, 16 Aug 2006. This version is
http://www.w3.org/TR/2006/REC-xml-names11-20060816. The
latest version is available
at http://www.w3.org/TR/xml-names11/.
XML
Path Language (XPath) 3.0
XML Path
Language (XPath) 3.0 , Jonathan Robie, Don Chamberlin,
Michael Dyck, John Snelson, Editors. World Wide Web Consortium, 08
April 2014. This version is
http://www.w3.org/TR/2014/REC-xpath-30-20140408/. The
latest version is available at
http://www.w3.org/TR/xpath-30/.
XQuery 3.0: An XML Query Language
XQuery 3.0: An
XML Query Language , Jonathan Robie, Don Chamberlin,
Michael Dyck, John Snelson, Editors. World Wide Web Consortium, 08
April 2014. This version is
http://www.w3.org/TR/2014/REC-xquery-30-20140408/. The
latest version is available
at http://www.w3.org/TR/xquery-30/.
XSL
Transformations (XSLT) Version 2.0
A.2 Informative References
Character Model for the World Wide Web 1.0:
Normalization
Polyglot
RFC2854
RFC3236
XML Schema
XML Schema
Part 1: Structures Second Edition , Henry Thompson, David
Beech, Murray Maloney, and Noah Mendelsohn, Editors. World Wide Web
Consortium, 28 Oct 2004. This version is
http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The
latest version is available
at http://www.w3.org/TR/xmlschema-1/.
XHTML Modularization
XQuery 1.0 and XPath 2.0 Data
Model
XSLT 2.0 and XQuery 1.0 Serialization
(Second Edition)
XSL
Transformations (XSLT) Version 3.0
B Schema
for Serialization Parameters
The following schema describes the structure of a Data Model
instance that can be used to specify the settings of serialization
parameters using the mechanism described in 3.1 Setting Serialization
Parameters by Means of a Data Model Instance .
A copy of this schema is available at
http://www.w3.org/2014/04/xslt-xquery-serialization/schema-for-serialization-parameters.xsd .
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3.org/2010/xslt-xquery-serialization"
xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"
elementFormDefault="qualified">
<xs:annotation>
<xs:documentation>
This is a schema for serialization parameters for
XSLT and XQuery Serialization 3.0.
This schema is available for use under the conditions of the
W3C Software License published at
http://www.w3.org/Consortium/Legal/copyright-software-19980720
It defines a schema for XML Infoset instances with which a user of
a host language MAY specify serialization parameters for use in
serializing an instance of the XQuery and XPath Data Model. It
also provides hooks that allow the inclusion of implementation-
defined serialization parameters and implementation-defined
modifiers to serialization parameters.
</xs:documentation>
</xs:annotation>
<xs:simpleType name="QNames-type">
<xs:list itemType="xs:QName"/>
</xs:simpleType>
<xs:simpleType name="yes-no-type">
<xs:restriction base="xs:token">
<xs:enumeration value="no"/>
<xs:enumeration value="yes"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="yes-no-omit-type">
<xs:restriction base="xs:token">
<xs:enumeration value="no"/>
<xs:enumeration value="omit"/>
<xs:enumeration value="yes"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="char-type">
<xs:restriction base="xs:string">
<xs:maxLength value="1"/>
<xs:minLength value="1"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="encoding-string-type">
<xs:restriction base="xs:string">
<xs:pattern value="[A-Za-z][A-Za-z0-9._\-]*"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="method-type">
<xs:union>
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="html"/>
<xs:enumeration value="text"/>
<xs:enumeration value="xml"/>
<xs:enumeration value="xhtml"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType>
<xs:restriction base="xs:QName">
<xs:pattern value=".*:.*"/>
</xs:restriction>
</xs:simpleType>
</xs:union>
</xs:simpleType>
<xs:simpleType name="pubid-char-string-type">
<xs:restriction base="xs:string">
<xs:pattern value="([- \r\n\ta-zA-Z0-9'()+,./:=?;!*#@$_%])*"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="system-id-string-type">
<xs:restriction base="xs:string">
<xs:pattern value="[^']*|[^&quot;]*"/>
</xs:restriction>
</xs:simpleType>
<!--
- Base type of all serialization parameter types
-->
<xs:complexType name="base-param-type">
<xs:complexContent>
<xs:restriction base="xs:anyType">
<xs:anyAttribute namespace="##other" processContents="lax"/>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
<!--
- Generic string serialization parameters
-->
<xs:complexType name="string-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="xs:string" use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Generic decimal serialization parameters
-->
<xs:complexType name="decimal-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="xs:decimal" use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter type for "yes" or "no"
- serialization parameters
-->
<xs:complexType name="yes-no-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:yes-no-type" use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter type for list of xs:QName
- serialization parameters
-->
<xs:complexType name="QNames-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:QNames-type" use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter type for "yes", "no" or "omit"
- serialization parameters
-->
<xs:complexType name="yes-no-omit-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:yes-no-omit-type"
use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter type for NMTOKEN serialization parameters
-->
<xs:complexType name="NMTOKEN-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="xs:NMTOKEN" use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Base element declaration for all serialization parameter elements
-->
<xs:element name="serialization-parameter-element"
abstract="true"
type="output:base-param-type"/>
<!--
- Serialization parameter element for byte-order-mark parameter
-->
<xs:element id="byte-order-mark" name="byte-order-mark" type="output:yes-no-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for cdata-section-elements parameter
-->
<xs:element id="cdata-section-elements" name="cdata-section-elements" type="output:QNames-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter type for doctype-public parameter
-->
<xs:complexType name="doctype-public-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:pubid-char-string-type"
use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter element for doctype-public parameter
-->
<xs:element id="doctype-public" name="doctype-public" type="output:doctype-public-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter type for doctype-system parameter
-->
<xs:complexType name="doctype-system-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:system-id-string-type"
use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter element for doctype-system parameter
-->
<xs:element id="doctype-system" name="doctype-system" type="output:doctype-system-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter type for encoding parameter
-->
<xs:complexType name="encoding-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:encoding-string-type"
use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter element for encoding parameter
-->
<xs:element id="encoding" name="encoding" type="output:encoding-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for escape-uri-attributes parameter
-->
<xs:element id="escape-uri-attributes" name="escape-uri-attributes" type="output:yes-no-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for html-version parameter
-->
<xs:element id="html-version" name="html-version"
type="output:decimal-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for include-content-type parameter
-->
<xs:element id="include-content-type" name="include-content-type" type="output:yes-no-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for indent parameter
-->
<xs:element id="indent" name="indent" type="output:yes-no-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for item-separator parameter
-->
<xs:element id="item-separator" name="item-separator"
type="output:string-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for media-type parameter
-->
<xs:element id="media-type" name="media-type" type="output:string-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter type for method parameter
-->
<xs:complexType name="method-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:attribute name="value" type="output:method-type"
use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter element for method parameter
-->
<xs:element id="method" name="method" type="output:method-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for normalization-form parameter
-->
<xs:element id="normalization-form" name="normalization-form" type="output:NMTOKEN-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for omit-xml-declaration parameter
-->
<xs:element id="omit-xml-declaration" name="omit-xml-declaration" type="output:yes-no-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for standalone parameter
-->
<xs:element id="standalone" name="standalone" type="output:yes-no-omit-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for suppress-indentation parameter
-->
<xs:element id="suppress-indentation" name="suppress-indentation" type="output:QNames-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for undeclare-prefixes parameter
-->
<xs:element id="undeclare-prefixes" name="undeclare-prefixes" type="output:yes-no-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter type for use-character-maps
- parameter
-->
<xs:complexType name="use-character-maps-param-type">
<xs:complexContent>
<xs:extension base="output:base-param-type">
<xs:sequence>
<xs:element name="character-map" minOccurs="0"
maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="character" type="output:char-type"/>
<xs:attribute name="map-string" type="xs:string"/>
<xs:anyAttribute namespace="##other"
processContents="lax"/>
</xs:complexType>
</xs:element>
<xs:any minOccurs="0" namespace="##other"
processContents="lax"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<!--
- Serialization parameter element for use-character-maps parameter
-->
<xs:element id="use-character-maps" name="use-character-maps"
type="output:use-character-maps-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<!--
- Serialization parameter element for version parameter
-->
<xs:element id="version" name="version"
type="output:string-param-type"
substitutionGroup="output:serialization-parameter-element"/>
<xs:element name="serialization-parameters">
<xs:complexType>
<xs:sequence>
<xs:element ref="output:serialization-parameter-element"
minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
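As an illustration of the element declarations above, the following is a minimal sketch of an instance document that the schema is expected to accept. The parameter values are chosen only for illustration. The sketch assumes that the output prefix is bound to the namespace http://www.w3.org/2010/xslt-xquery-serialization and that locally declared elements such as character-map are namespace-qualified; the opening xs:schema element that would confirm this lies outside this excerpt.

```xml
<!-- Illustrative instance only; all parameter values are assumptions. -->
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <!-- Each parameter element is a member of the substitution group
       headed by output:serialization-parameter-element. -->
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:encoding value="UTF-8"/>
  <output:indent value="yes"/>
  <output:omit-xml-declaration value="no"/>
  <!-- use-character-maps carries child elements rather than a value attribute. -->
  <output:use-character-maps>
    <output:character-map character="&#xA0;" map-string="&amp;nbsp;"/>
  </output:use-character-maps>
</output:serialization-parameters>
```

Because serialization-parameters accepts any number of members of the substitution group, the schema alone does not prevent a parameter from appearing twice; that case is covered by the error conditions rather than by schema validation.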
C Summary of Error Conditions
This document uses the err prefix, which represents the same
namespace URI (http://www.w3.org/2005/xqt-errors) as defined in
[XML Path Language (XPath) 3.0]. Use of this namespace prefix
binding in this document is not normative.
err:SENR0001
It is an error if an item in S6 in sequence
normalization is an attribute node or a namespace node.
err:SERE0003
It is an error if the serializer is unable to satisfy the rules for
either a well-formed XML document entity or a well-formed XML
external general parsed entity, or both, except for content
modified by the character expansion phase of serialization.
err:SEPM0004
It is an error to specify the doctype-system parameter, or to
specify the standalone parameter with a value other than omit, if
the instance of the data model contains text nodes or multiple
element nodes as children of the root node.
err:SERE0005
It is an error if the serialized result would contain an NCName
that contains a character that is not permitted by the version of
Namespaces in XML specified by the version parameter.
err:SERE0006
It is an error if the serialized result would contain a character
that is not permitted by the version of XML specified by the
version parameter.
err:SESU0007
It is an error if an output encoding other than UTF-8 or UTF-16 is
requested and the serializer does not support that encoding.
err:SERE0008
It is an error if a character that cannot be represented in the
encoding that the serializer is using for output appears in a
context where character references are not allowed (for example if
the character occurs in the name of an element).
err:SEPM0009
It is an error if the omit-xml-declaration parameter has the value
yes, and the standalone attribute has a value other than omit; or
the version parameter has a value other than 1.0 and the
doctype-system parameter is specified.
err:SEPM0010
It is an error if the output method is xml or xhtml, the value of
the undeclare-prefixes parameter is yes, and the value of the
version parameter is 1.0.
err:SESU0011
It is an error if the value of the normalization-form parameter
specifies a normalization form that is not supported by the
serializer.
err:SERE0012
It is an error if the value of the normalization-form parameter is
fully-normalized and any relevant construct of the result begins
with a combining character.
err:SESU0013
It is an error if the serializer does not support the version of XML
specified by the version parameter or the version of HTML specified
by the html-version or the version serialization parameter.
err:SERE0014
It is an error to use the HTML output method if characters which
are permitted in XML but not in the requested HTML
version appear in the instance of the data model.
err:SERE0015
It is an error to use the HTML output method when > appears within
a processing instruction in the data model instance being
serialized.
err:SEPM0016
It is an error if a parameter value is invalid for the defined
domain.
err:SEPM0017
It is an error if evaluating an expression in order to extract
the setting of a serialization parameter from a data model instance
would yield an error.
err:SEPM0018
It is an error if evaluating an expression in order to extract
the setting of the use-character-maps serialization parameter from
a data model instance would yield a sequence of length greater than
one.
err:SEPM0019
It is an error if an instance of the data model used to specify
the settings of serialization parameters specifies the value of the
same parameter more than once, or if the instance does not
have as its root node an element node or a document node with an
element node child, where the local part of the name of the element
node is serialization-parameters and the namespace URI is
http://www.w3.org/2010/xslt-xquery-serialization.
err:SERE0020
This error has been removed.
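Two of the parameter-related conditions above can be sketched as instance documents. These fragments are illustrative assumptions, not examples taken from this specification; they assume the output prefix is bound to http://www.w3.org/2010/xslt-xquery-serialization. The first combines the xml output method, version 1.0 and undeclare-prefixes set to yes, which is the combination described for err:SEPM0010:

```xml
<!-- Sketch of the combination described for err:SEPM0010 -->
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:method value="xml"/>
  <output:version value="1.0"/>
  <output:undeclare-prefixes value="yes"/>
</output:serialization-parameters>
```

The second specifies the indent parameter twice, which is the first case described for err:SEPM0019:

```xml
<!-- Sketch of a duplicated parameter, as described for err:SEPM0019 -->
<output:serialization-parameters
    xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
  <output:indent value="yes"/>
  <output:indent value="no"/>
</output:serialization-parameters>
```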
D List of URI Attributes
The following attributes are declared as type %URI or %UriList
for the given HTML or XHTML elements, with the exception of the
name attribute of element A, which is not a URI type. The name
attribute of element A SHOULD be escaped as recommended by the
HTML Recommendation [HTML] in Appendix B.2.1.
| Attributes | Elements |
| action | FORM |
| archive | OBJECT |
| background | BODY |
| cite | BLOCKQUOTE, DEL, INS, Q |
| classid | OBJECT |
| codebase | APPLET, OBJECT |
| data | OBJECT |
| datasrc | BUTTON, DIV, INPUT, OBJECT, SELECT, SPAN, TABLE, TEXTAREA |
| for | SCRIPT |
| formaction | BUTTON, INPUT |
| href | A, AREA, BASE, LINK |
| icon | COMMAND |
| longdesc | FRAME, IFRAME, IMG |
| manifest | HTML |
| name | A |
| poster | VIDEO |
| profile | HEAD |
| src | AUDIO, EMBED, FRAME, IFRAME, IMG, INPUT, SCRIPT, SOURCE, TRACK, VIDEO |
| usemap | IMG, INPUT, OBJECT |
| value | INPUT |
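To illustrate how the list above is used, the sketch below shows a URI attribute before and after escaping. It assumes the HTML output method with the escape-uri-attributes parameter set to yes, neither of which is described in this appendix, and it uses a made-up URL. The escaping shown is the UTF-8 percent-encoding described in Appendix B.2.1 of [HTML].

```xml
<!-- Attribute value as it appears in the data model (hypothetical URL) -->
<a href="http://example.com/café">menu</a>

<!-- Serialized form once the non-ASCII character is escaped:
     é (U+00E9) is encoded in UTF-8 as the bytes C3 A9 -->
<a href="http://example.com/caf%C3%A9">menu</a>
```

An attribute that does not appear in the list for its element, such as title on A, would be left unescaped under the same assumptions.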
E Checklist of Implementation-Defined Features (Non-Normative)
This appendix provides a summary of Serialization features whose
effect is explicitly implementation-defined. The conformance rules
(see 10 Conformance) require vendors to provide documentation that
explains how these choices have been exercised.
For any implementation-defined output method, it is
implementation-defined whether the sequence normalization process
takes place. (See 2 Sequence Normalization)
If the namespace URI is non-null for the method serialization
parameter, then the parameter specifies an implementation-defined
output method. (See 3 Serialization Parameters)
The effect of additional serialization parameters on the output
of the serializer, where the name of such a parameter MUST be
namespace-qualified, is implementation-defined or
implementation-dependent. The extent of this effect on the output
MUST NOT override the provisions of this specification. (See
3 Serialization Parameters)
Implementation-defined schema components MAY be included in the
set of schema components that are used in evaluating an XQuery
expression or XSLT instruction in the process of using an XDM
instance to determine the settings of serialization parameters.
(See 3.1 Setting Serialization Parameters by Means of a Data Model
Instance)
If an instance of the data model used to determine the settings
of serialization parameters contains elements or attributes that
are in a namespace other than
http://www.w3.org/2010/xslt-xquery-serialization, the
implementation MAY interpret them to specify the values of
implementation-defined serialization parameters in an
implementation-defined manner. (See 3.1 Setting Serialization
Parameters by Means of a Data Model Instance)
The effect of providing an option that allows the encoding
phase to be skipped, so that the result of serialization is a
stream of Unicode characters, is implementation-defined. The
serializer is not required to support such an option. (See
4 Phases of Serialization)
If an implementation supports a value of the version parameter
for the XML or XHTML output method for which this document does
not provide a normative definition, the behavior is
implementation-defined. (See 5.1.1 XML Output Method: the version
Parameter)
A serializer MAY provide an implementation-defined mechanism to
place CDATA sections in the result tree. (See 5.1.5 XML Output
Method: the cdata-section-elements Parameter)
If the value of the normalization-form parameter is not NFC, NFD,
NFKC, NFKD, fully-normalized, or none, then the meaning of the
value and its effect is implementation-defined. (See 5.1.9 XML
Output Method: the normalization-form Parameter)
For information used in the XHTML and HTML output methods for
which this specification cites [HTML5], implementations MUST take
the information in question from the version cited in
A.1 Normative References, or from later versions of [HTML5]
published by W3C. If they take the information from versions other
than the one cited in A.1 Normative References, then it is
implementation-defined which future version of [HTML5] is used as
the source of the information; this includes (but is not limited
to) the lists of elements recognized as HTML elements, void
elements, phrasing content, and Boolean attributes. (See 6 XHTML
Output Method)
For the HTML output method, it is implementation-defined whether
the basefont, frame and isindex elements, which are not part of
HTML5, are considered to be void elements when the requested HTML
version has the value 5.0. (See 7.1 Markup for Elements)
If an implementation supports a value of the version parameter
for the HTML output method for which this document does not
provide a normative definition, the behavior is
implementation-defined. (See 7.4.1 HTML Output Method: the version
and html-version Parameters)
F Change Log (Non-Normative)
This appendix details the changes that have been made since the
publication of the [XSLT 2.0 and XQuery 1.0 Serialization (Second
Edition)].
F.1 Changes applied for the Recommendation
The following changes have been applied since the publication of
the Candidate Recommendation to produce this document.
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 25149 | None | Editorial | Remove normative dependencies on documents whose technical stability is not assured. | |
| Bugzilla bug 25156 | None | Editorial | Correct typo in sample XSLT expression for use-character-maps parameter. | |
F.2 Changes applied for the Candidate Recommendation
The following changes have been applied since the publication of
the fifth Public Working Draft to produce this, the sixth Public
Working Draft.
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 20245 | None | Editorial | Editorial improvements to the description of how void elements and elements with an empty content model are processed by the XHTML output method. | |
| Bugzilla bug 20251 and Bugzilla bug 20261 | None | Substantive | Corrections to the description of prefix stripping for the XHTML output method. | |
F.3 Changes applied for the fifth Public Working Draft
The following changes have been applied since the publication of
the fourth Public Working Draft to produce this, the fifth Public
Working Draft.
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 16311 | None | Substantive | Added new serialization parameter for specifying a separator that is inserted between items in the sequence that is to be serialized. | |
| None | None | Editorial | Added XSLT instructions equivalent to XQuery expressions for setting serialization parameters by means of a data model instance, and other editorial corrections and improvements. | |
| None | None | Editorial | Clarified the definition of host language to make it clear that APIs can be considered to be host languages. | |
| Bugzilla 6129 | None | Substantive | Extended the definitions of the HTML and XHTML output methods to include support for HTML5 serialization. | |
| Bugzilla 17619 | None | Editorial | Text associated with links to the definitions of the terms NCName, EncName and VersionNum was repeated several times. | |
| Bugzilla 15915 | None | Editorial | Made uses of the terms "absent" and "unspecified" consistent. | |
| Bugzilla 17282 | None | Substantive | Changed type of the normalization-form serialization parameter to NMTOKEN. This would be an incompatible change from XQuery 1.0, for any implementation that supported the Serialization Feature, and supported an implementation-defined value for the normalization-form serialization parameter that did not have the lexical form of an NMToken. | |
F.4 Changes applied for the fourth Public Working Draft
The following changes were applied following the publication of
the third Public Working Draft to produce the fourth Public Working
Draft. None of these changes introduces an incompatibility with
[XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)].
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 12852 | None | Substantive | Corrected the type of the media-type serialization parameter in the Schema for Serialization Parameters. | |
| Bugzilla bug 13688 | None | Substantive | Corrected the regular expression associated with the encoding-string-type type in the Schema for Serialization Parameters, so that hyphens are permitted to appear in the encoding serialization parameter. | |
| Bugzilla bug 10176 | SE.E20 | Substantive | Clarified what it means for the html output method to output an XML island as XML. | |
| Bugzilla bug 14751 | None | Editorial | Corrected typographical errors in the comments associated with the yes-no-param-type and encoding-param-type types in the Schema for Serialization Parameters. | |
F.5 Changes applied for the third Public Working Draft
The following changes were applied after the publication of the
second Public Working Draft to produce the third Public Working
Draft. None of these changes introduces an incompatibility with
[XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)], except as
noted below.
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 11635 | SE.E19 | Substantive | Clarified that serialization error SEPM0010 applies to the xhtml output method as well as the xml output method. | |
F.6 Changes applied for the second Public Working Draft
The following changes were applied after the publication of the
first Public Working Draft to produce the second Public Working
Draft. None of these changes introduces an incompatibility with
[XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)], except as
noted below.
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 6535 | None | Substantive | Added definition of the suppress-indentation serialization parameter. | |
| Bugzilla bug 7829 | SE.E14 | Substantive | Clarified how minimized attributes are handled under the rules of the HTML output method. | |
| Bugzilla bug 8245 | SE.E15 | Editorial | Corrected description of a serialization error that mentions which control characters are not permitted under the rules of the HTML output method. | |
| Bugzilla bug 7823 | SE.E16 | Substantive | Clarified how the script and style elements are handled for the HTML output method. | |
| Bugzilla bug 8651 | SE.E17 | Substantive | Clarified what it means to compare without regard to case. | |
| Bugzilla bug 8206 | SE.E18 | Editorial | Clarified what it means to escape according to HTML or XML rules. | |
| Bugzilla bug 6808 | None | Substantive | Relaxed rules for the XML output method that specify where a serializer is permitted to add whitespace. This introduces an incompatibility only inasmuch as the serialized results produced by a serializer conforming to this specification could differ from the results a serializer that adheres to [XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)] would be permitted to produce. | |
| Bugzilla bug 9302 | None | Substantive | Defined a mechanism for specifying serialization parameter settings in the form of a data model instance. | |
| None | None | Editorial | Replaced all uses of the words legal and illegal with more appropriate terms. | |
F.7 Changes applied for the first Public Working Draft
The following changes were applied after the publication of
[XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)] to produce
the first Public Working Draft. None of these changes introduces an
incompatibility with [XSLT 2.0 and XQuery 1.0 Serialization (Second
Edition)].
| Bugzilla bug (if applicable) | Erratum (if applicable) | Category | Description of change | Affected sections |
| Bugzilla bug 6723 | SE.E13 | Substantive | Clarified how HTML elements that have no children but whose content model is not empty are serialized. | |
| Bugzilla bug 6732 | SE.E12 | Substantive | Clarified for which versions of XML and HTML this document makes normative statements. | |
| None | None | Substantive | Take into account presence of function items in a sequence that is to be serialized. | |
| None | None | Editorial | Miscellaneous minor editorial corrections and improvements. | |