About: Unicode and HTML

An Entity of Type: Thing, from Named Graph: http://dbpedia.org, within Data Space: dbpedia.org

Web pages authored using HyperText Markup Language (HTML) may contain multilingual text represented with the Unicode universal character set. Key to the relationship between Unicode and HTML is the relationship between the "document character set", which defines the set of characters that may be present in an HTML document and assigns numbers to them, and the "external character encoding", or "charset", used to encode a given document as a sequence of bytes.
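The distinction between the document character set and the external character encoding can be illustrated with a minimal Python sketch. The sample string and the helper names `code_points` and `encoded_bytes` are assumptions made for illustration only; they do not come from any HTML specification. The character numbers stay the same whichever charset is chosen, while the bytes differ.

```python
# Illustrative only: SAMPLE and both helper names are hypothetical.
SAMPLE = "é€"


def code_points(text):
    """Numbers assigned to each character by the document character set (Unicode)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def encoded_bytes(text, charset):
    """Bytes produced by one external character encoding, shown as spaced hex."""
    return text.encode(charset).hex(" ")


if __name__ == "__main__":
    # The code points are independent of the charset.
    print(code_points(SAMPLE))
    # The byte sequence depends on the charset chosen by the author or tool.
    for charset in ("utf-8", "utf-16-be", "windows-1252"):
        print(charset, encoded_bytes(SAMPLE, charset))
```

Here the same two characters occupy five bytes in UTF-8, four in UTF-16BE, and two in Windows-1252. ISO-8859-1 cannot encode the euro sign at all, so `encoded_bytes(SAMPLE, "iso-8859-1")` raises `UnicodeEncodeError`.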

Property	Value
dbo:abstract	Web pages authored using HyperText Markup Language (HTML) may contain multilingual text represented with the Unicode universal character set. Key to the relationship between Unicode and HTML is the relationship between the "document character set", which defines the set of characters that may be present in an HTML document and assigns numbers to them, and the "external character encoding", or "charset", used to encode a given document as a sequence of bytes. In RFC 1866, the initial HTML 2.0 standard, the document character set was defined as ISO-8859-1 (later HTML standards default to the Windows-1252 encoding). RFC 2070 extended it to ISO 10646, which is essentially equivalent to Unicode. The document character set does not vary between documents written in different languages or created on different platforms. The external character encoding is chosen by the author of the document (or by the software the author uses to create it) and determines how the bytes used to store or transmit the document map to characters from the document character set. Characters not present in the chosen external character encoding may be represented by character entity references. The relationship between Unicode and HTML tends to be a difficult topic for computer professionals, document authors, and web users alike. The accurate representation of text in web pages from different natural languages and writing systems is complicated by the details of character encoding, markup language syntax, fonts, and varying levels of support by web browsers. (en)
dbo:wikiPageExternalLink	http://unicode.coeurlumiere.com/ http://www.hotpeachpages.net/a/characters.html http://www.unicodemap.org/ http://www.w3.org/TR/REC-html40/HTMLlat1.ent http://www.w3.org/TR/REC-html40/HTMLspecial.ent http://www.w3.org/TR/REC-html40/HTMLsymbol.ent http://www.w3.org/TR/unicode-xml/ https://web.archive.org/web/20071103125951/http:/unicode.coeurlumiere.com/ https://web.archive.org/web/20110924073701/http:/www.w3.org/TR/html5/semantics.html%23charset http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm http://www.unicode.org/charts/ http://scripts.sil.org/cms/scripts/page.php%3Fsite_id=nrsi http://www.pinyin.info/tools/converter/chars2uninumbers.html http://www.alanwood.net/unicode/ http://www.alanwood.net/unicode/cjk_compatibility_ideographs.html
dbo:wikiPageID	31985 (xsd:integer)
dbo:wikiPageLength	22301 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1116218032 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Qoph dbr:List_of_XML_and_HTML_character_entity_references dbr:Numeric_character_reference dbr:Basic_Multilingual_Plane dbr:Decimal dbc:Unicode dbr:Character_(computing) dbr:Character_encoding dbr:UTF-16 dbr:UTF-16BE dbr:UTF-16LE dbr:UTF-8 dbr:Unicode dbr:Unicode_Transformation_Format dbr:Numerical_digit dbr:Computer_network dbr:Mem dbr:Safari_(web_browser) dbr:Em_dash dbr:Endianness dbr:Ge'ez_alphabet dbr:Google dbr:Greek_alphabet dbr:Mozilla_Firefox dbr:Thorn_(letter) dbr:Arabic_alphabet dbr:Short_I dbr:Simplified_Chinese_characters dbr:Comparison_of_Unicode_encodings dbr:Computer_font dbr:Delta_(letter) dbr:ß dbr:Document_Type_Definition dbr:Web_page dbr:Byte dbr:Traditional_Chinese_characters dbr:Web_browser dbr:Windows-1251 dbr:HTML_email dbr:7_(number) dbr:A dbr:ASCII dbr:Abstraction dbr:Cyrillic_script dbr:Numeric_character_references dbr:Charset_detection dbr:Grapheme dbr:Fe_(rune) dbr:HTML dbr:HTML5 dbr:HTTP dbr:Hangul dbr:Hebrew_alphabet dbr:Hexadecimal dbr:Hiragana dbr:Internet_Explorer dbr:Internet_Explorer_6 dbc:HTML dbr:Character_encodings_in_HTML dbr:Latin_alphabet dbr:Bit dbr:Code2000 dbr:Writing_system dbr:Byte_order_mark dbr:CJK_Unified_Ideographs dbr:File_system dbr:Háček dbr:ISO/IEC_8859-1 dbr:Netscape_Navigator dbr:Octet_(computing) dbr:Opera_(web_browser) dbr:Operating_system dbr:XHTML dbr:XML dbr:Unicode_block dbr:MIME dbr:Markup_language dbr:Universal_Character_Set dbr:World_Wide_Web dbr:Face_with_Tears_of_Joy_emoji dbr:ISO_10646 dbr:List_of_typefaces dbr:Natural_language dbr:Programming_language dbr:Thai_alphabet dbr:Windows-1252 dbr:Syllable dbr:UTF-32BE dbr:UTF-32LE dbr:Malayalam_alphabet dbr:Computer_storage dbr:Runic_alphabet dbr:ISO_8859-1 dbr:Qha dbr:Wikibooks:Unicode/Character_reference dbr:Meta:Help:Special_characters
dbp:date	2007-11-03 (xsd:date)
dbp:url	https://web.archive.org/web/20071103125951/http:/unicode.coeurlumiere.com/
dbp:wikiPageUsesTemplate	dbt:Citation_needed dbt:Essay-like dbt:IETF_RFC dbt:Main dbt:Multiple_issues dbt:Primary_sources dbt:Refimprove dbt:Reflist dbt:Short_description dbt:Snd dbt:Webarchive dbt:Unicode_navigation dbt:Rewrite dbt:SpecialChars dbt:U+ dbt:Toomanylinks dbt:Html_series
dcterms:subject	dbc:Unicode dbc:HTML
rdfs:comment	Web pages authored using HyperText Markup Language (HTML) may contain multilingual text represented with the Unicode universal character set. Key to the relationship between Unicode and HTML is the relationship between the "document character set", which defines the set of characters that may be present in an HTML document and assigns numbers to them, and the "external character encoding", or "charset", used to encode a given document as a sequence of bytes. (en)
rdfs:label	Unicode and HTML (en)
owl:sameAs	freebase:Unicode and HTML wikidata:Unicode and HTML https://global.dbpedia.org/id/3GmQD
prov:wasDerivedFrom	wikipedia-en:Unicode_and_HTML?oldid=1116218032&ns=0
foaf:isPrimaryTopicOf	wikipedia-en:Unicode_and_HTML
is dbo:wikiPageRedirects of	dbr:HTML_Unicode dbr:HTML_unicode dbr:Unicode_&_HTML dbr:Unicode_and_html dbr:Unicode_in_HTML dbr:Unicode_in_MSIE
is dbo:wikiPageWikiLink of	dbr:UTF-8 dbr:Integrated_circuit_layout_design_protection dbr:Swiftfox dbr:Gujarati_script dbr:Unicode_input dbr:Unicode_and_email dbr:Character_encodings_in_HTML dbr:HTML_Unicode dbr:HTML_unicode dbr:Playing_card_suit dbr:Unicode_&_HTML dbr:Unicode_and_html dbr:Unicode_in_HTML dbr:Unicode_in_MSIE
is foaf:primaryTopic of	wikipedia-en:Unicode_and_HTML
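
The abstract above notes that characters missing from the chosen external character encoding may be represented by character references. The following is a minimal Python sketch of that idea, assuming the standard library's `xmlcharrefreplace` error handler and `html.unescape` as stand-ins for what an authoring tool and a browser would do. The function names are hypothetical.

```python
import html


def to_charset_with_references(text, charset):
    """Encode text; characters the charset lacks become decimal &#N; references."""
    return text.encode(charset, errors="xmlcharrefreplace")


def from_charset_with_references(data, charset):
    """Decode bytes, then resolve character references back to characters."""
    return html.unescape(data.decode(charset))


if __name__ == "__main__":
    original = "café €5"
    # ISO-8859-1 has "é" but no euro sign, so only the euro sign is escaped.
    stored = to_charset_with_references(original, "iso-8859-1")
    print(stored)
    print(from_charset_with_references(stored, "iso-8859-1"))
```

The round trip recovers the original text only when the input contains no literal ampersand sequences that already look like references, because `html.unescape` resolves those as well. A real authoring tool would escape `&` before encoding.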