This version is obsolete; I have translated a newer version, "Differences from HTML4".

W3C

HTML5 differences from HTML4

W3C Working Draft 25 May 2011

This Version:
http://www.w3.org/TR/2011/WD-html5-diff-20110525/
Latest Version:
http://www.w3.org/TR/html5-diff/
Latest Editor's Draft:
http://dev.w3.org/html5/html4-differences/
Previous Versions:
http://www.w3.org/TR/2011/WD-html5-diff-20110405/
http://www.w3.org/TR/2011/WD-html5-diff-20110113/
http://www.w3.org/TR/2010/WD-html5-diff-20101019/
http://www.w3.org/TR/2010/WD-html5-diff-20100624/
http://www.w3.org/TR/2010/WD-html5-diff-20100304/
http://www.w3.org/TR/2009/WD-html5-diff-20090825/
http://www.w3.org/TR/2009/WD-html5-diff-20090423/
http://www.w3.org/TR/2009/WD-html5-diff-20090212/
http://www.w3.org/TR/2008/WD-html5-diff-20080610/
http://www.w3.org/TR/2008/WD-html5-diff-20080122/
Editors:
Anne van Kesteren (Opera Software ASA) <annevk@opera.com>
Simon Pieters (Opera Software ASA) <simonp@opera.com>

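Abstract

HTML5 defines the fifth major revision of the core language of the World Wide Web, HTML. "HTML5 differences from HTML4" describes the differences between HTML4 and HTML5 and provides some of the rationale for the changes. This document may not provide accurate information, as the HTML5 specification is still actively in development. When in doubt, always check the HTML5 specification itself. [HTML5]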

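Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is the 25 May 2011 Working Draft produced by the HTML Working Group, part of the HTML Activity. The Working Group intends to publish this document as a Working Group Note to accompany the HTML5 specification. The appropriate forum for comments is W3C Bugzilla. Alternatively, comments sent to public-html-comments@w3.org (subscribe, archives) will be filed in the bug database.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.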

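Table of Contents

  • 1. Introduction
  • 1.1. Open Issues
  • 1.2. Backwards Compatible
  • 1.3. Development Model
  • 2. Syntax
  • 2.1. Character Encoding
  • 2.2. The DOCTYPE
  • 2.3. MathML and SVG
  • 2.4. Miscellaneous
  • 3. Language
  • 3.1. New Elements
  • 3.2. New Attributes
  • 3.3. Changed Elements
  • 3.4. Changed Attributes
  • 3.5. Absent Elements
  • 3.6. Absent Attributes
  • 4. APIs
  • 4.1. Extensions to HTMLDocument
  • 4.2. Extensions to HTMLElement
  • 5. HTML5 Changelogs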
  • 5.1. Changes since 5 April 2011
  • 5.2. Changes from 13 January 2011 to 5 April 2011
  • 5.3. Changes from 19 October 2010 to 13 January 2011
  • 5.4. Changes from 24 June 2010 to 19 October 2010
  • 5.5. Changes from 4 March 2010 to 24 June 2010
  • 5.6. Changes from 25 August 2009 to 4 March 2010
  • 5.7. Changes from 23 April 2009 to 25 August 2009
  • 5.8. Changes from 12 February 2009 to 23 April 2009
  • 5.9. Changes from 10 June 2008 to 12 February 2009
  • 5.10. Changes from 22 January 2008 to 10 June 2008
  • Acknowledgments
  • References

    1. Introduction

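    HTML has been in continuous evolution since it was introduced to the Internet in the early 1990s. Some features were introduced in specifications; others were introduced in software releases. In some respects, implementations and author practices have converged with each other and with specifications and standards, but in other ways they continue to diverge.

    HTML4 became a W3C Recommendation in 1997. While it continues to serve as a rough guide to many of the core features of HTML, it does not provide enough information to build implementations that interoperate with each other and, more importantly, with the large body of deployed content. The same goes for XHTML1, which defines an XML serialization for HTML4, and DOM Level 2 HTML, which defines JavaScript APIs for both HTML and XHTML. HTML5 will replace these documents. [DOM2HTML] [HTML4] [XHTML1]

    The HTML5 draft reflects an effort, started in 2004, to study contemporary HTML implementations and deployed content. The draft:

    1. Defines a single language called HTML5 which can be written in HTML syntax and in XML syntax.
    2. Defines detailed processing models to foster interoperable implementations.
    3. Improves markup for documents.
    4. Introduces markup and APIs for emerging idioms, such as Web applications.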

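    1.1. Open Issues

    HTML5 is still a draft. The contents of HTML5, as well as the contents of this document which depend on HTML5, are still being discussed on the HTML Working Group and WHATWG mailing lists. Open issues are linked from the HTML5 draft.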

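    1.2. Backwards Compatible

    HTML5 is defined in a way that is backwards compatible with how user agents handle deployed content. (Translator's note: "user agent" usually means a browser, but browsers are only one kind of user agent.) To keep the authoring language relatively simple for authors, several elements and attributes are no longer included; they are listed in other sections of this document. Presentational elements, for example, are better dealt with using CSS.

    User agents, however, will always have to support these older elements and attributes, and this is why the HTML5 specification clearly separates requirements for authors and user agents. For instance, this means that authors can no longer use the isindex or plaintext elements, but user agents are required to support them in a way that is compatible with how these elements behave in deployed content.

    Since HTML5 has separate conformance requirements for authors and user agents, there is no longer a need to mark obsolete features as "deprecated".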

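    1.3. Development Model

    The HTML5 specification will not be considered finished before there are at least two complete implementations of the specification. A test suite will be used to measure the completeness of the implementations. This approach differs from previous versions of HTML, where the final specification would typically be approved by a committee before being actually implemented. The goal of this change is to ensure that the specification is implementable, and usable by authors once it is finished.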

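    2. Syntax

    HTML5 defines an HTML syntax that is compatible with HTML4 and XHTML1 documents published on the Web, but is not compatible with the more esoteric SGML features of HTML4, such as processing instructions and shorthand markup, as these are not supported by most user agents. Documents using the HTML syntax are almost always served with the text/html media type.

    HTML5 also defines detailed parsing rules (including "error handling") for this syntax which are largely compatible with popular implementations. User agents must use these rules for resources that have the text/html media type. Here is an example document that conforms to the HTML syntax: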

    <!doctype html>
    <html>
      <head>
        <meta charset="UTF-8">
        <title>Example document</title>
      </head>
      <body>
        <p>Example paragraph</p>
      </body>
    </html>

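    HTML5 also defines a text/html-sandboxed media type for documents using the HTML syntax. It can be used when hosting untrusted content.

    The other syntax that can be used for HTML5 is XML. This syntax is compatible with XHTML1 documents and implementations. Documents using this syntax need to be served with an XML media type, and elements need to be put in the http://www.w3.org/1999/xhtml namespace following the rules set forth by the XML specifications. [XML]

    Below is an example document that conforms to the XML syntax of HTML5. Note that XML documents must be served with an XML media type such as application/xhtml+xml or application/xml.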

    <?xml version="1.0" encoding="UTF-8"?>
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>Example document</title>
      </head>
      <body>
        <p>Example paragraph</p>
      </body>
    </html>

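    2.1. Character Encoding

    For the HTML syntax of HTML5, authors have three means of setting the character encoding:

  • At the transport level, for instance by using the HTTP Content-Type header.
  • Using a Unicode Byte Order Mark (BOM) character at the start of the file. This character provides a signature for the encoding used.
  • Using a meta element with a charset attribute that specifies the encoding within the first 1024 bytes of the document. For example, <meta charset="UTF-8"> could be used to specify the UTF-8 encoding. This replaces the need for <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">, although that syntax is still allowed.

    For the XML syntax, authors have to use the rules set forth in the XML specifications to set the character encoding.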

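    2.2. The DOCTYPE

    The HTML syntax of HTML5 requires a DOCTYPE to be specified to ensure that the browser renders the page in standards mode. The DOCTYPE has no other purpose and is therefore optional for XML. Documents with an XML media type are always handled in standards mode. [DOCTYPE]

    The DOCTYPE declaration is <!DOCTYPE html> and is case-insensitive in the HTML syntax. DOCTYPEs from earlier versions of HTML were longer because the HTML language was SGML based and therefore required a reference to a DTD. With HTML5 this is no longer the case, and the DOCTYPE is only needed to enable standards mode for documents written using the HTML syntax. Browsers already do this for <!DOCTYPE html>.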

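    2.3. MathML and SVG

    The HTML syntax of HTML5 allows MathML and SVG elements to be used inside a document. For example, a very simple document using some of the minimal syntax features could look like this: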

    <!doctype html>
    <title>SVG in text/html</title>
    <p>
     A green circle:
     <svg> <circle r="50" cx="50" cy="50" fill="green"/> </svg>
    </p>

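    More complex combinations are also possible. For example, with the SVG foreignObject element you could nest MathML, HTML, or both inside an SVG fragment that is itself inside HTML.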
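
    As a minimal sketch of such a combination, the document below places an HTML paragraph inside an SVG foreignObject element that itself sits in an HTML document. The sizes, colors and text are illustrative assumptions, not taken from the specification.

    ```html
    <!doctype html>
    <title>HTML inside SVG inside HTML</title>
    <p>An SVG fragment that embeds an HTML paragraph:</p>
    <svg width="220" height="80">
      <!-- Background rectangle drawn by SVG. -->
      <rect width="220" height="80" fill="lightgreen"/>
      <!-- foreignObject switches back to HTML content. -->
      <foreignObject x="10" y="10" width="200" height="60">
        <p>This paragraph is HTML nested in SVG.</p>
      </foreignObject>
    </svg>
    ```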

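    2.4. Miscellaneous

    There are a few other syntax changes worth mentioning:

  • HTML now has native support for IRIs (translator's note: Internationalized Resource Identifiers, defined in RFC 3987), though they can only be fully used if the document encoding is UTF-8 or UTF-16.
  • The lang attribute takes the empty string in addition to a valid language identifier, just like xml:lang does in XML. (Translator's note: xml:lang takes a primary language subtag followed by further subtags, which may be empty; this is defined in RFC 3066.)

    3. Language

    This section is split into several subsections to illustrate more clearly the various differences between HTML4 and HTML5.

    3.1. New Elements

    The following elements have been introduced for better structure: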

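  • section represents a generic document or application section. It can be used together with the h1, h2, h3, h4, h5, and h6 elements to indicate the document structure.

  • article represents an independent piece of content of a document, such as a blog entry or newspaper article.

  • aside represents a piece of content that is only slightly related to the rest of the page.

  • hgroup represents the header of a section.

  • header represents a group of introductory or navigational aids.

  • footer represents a footer for a section and can contain information about the author, copyright information, etc.

  • nav represents a section of the document intended for navigation.

  • figure represents a piece of self-contained flow content, typically referenced as a single unit from the main flow of the document.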

    <figure>
     <video src="example.webm" controls></video>
     <figcaption>Example</figcaption>
    </figure>

    figcaption can be used as the caption (it is optional).
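
    The structural elements above might be combined as in the following sketch of a blog page. The headings, link targets and text are hypothetical placeholders chosen for illustration; only the element names come from the list above.

    ```html
    <!doctype html>
    <html>
      <head>
        <meta charset="UTF-8">
        <title>Example blog</title>
      </head>
      <body>
        <!-- Introductory and navigational aids for the whole page. -->
        <header>
          <hgroup>
            <h1>Example blog</h1>
            <h2>Notes on markup</h2>
          </hgroup>
          <nav>
            <a href="/">Home</a>
            <a href="/archive">Archive</a>
          </nav>
        </header>
        <!-- An independent piece of content. -->
        <article>
          <h1>First post</h1>
          <section>
            <h2>Background</h2>
            <p>Example paragraph</p>
          </section>
          <aside>
            <p>A loosely related remark.</p>
          </aside>
          <footer>
            <p>Written by an example author.</p>
          </footer>
        </article>
        <!-- Footer for the page as a whole. -->
        <footer>
          <p>Copyright notice goes here.</p>
        </footer>
      </body>
    </html>
    ```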

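    Then there are several other new elements:

  • video and audio for multimedia content. Both provide an API so application authors can script their own user interface, but there is also a way to trigger a user interface provided by the user agent. source elements are used together with these elements if there are multiple streams available of different types.

  • track provides text tracks for the video element.

  • embed is used for plugin content.

  • mark represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.

  • progress represents the completion of a task, such as downloading or performing a series of expensive operations.

  • meter represents a measurement, such as disk usage.

  • time represents a date and/or time.

  • ruby, rt and rp allow for marking up ruby annotations. (Translator's note: a ruby annotation is the small run of text placed above or to the right of East Asian characters such as Chinese to show their pronunciation, like the pinyin printed over the words in a primary-school reader. "ruby" is pronounced "rubi".)

  • bdi represents a span of text that is to be isolated from its surroundings for the purposes of bidirectional text formatting. (Translator's note: writing direction has always been a consideration; this is mainly used where LTR and RTL text are mixed.)

  • wbr represents a line break opportunity. (Translator's note: to keep text readable, the browser uses the opportunities marked by wbr to choose more sensible line breaks.)

  • canvas is used for rendering dynamic bitmap graphics on the fly, such as graphs or games.

  • command represents a command the user can invoke.

  • details represents additional information or controls which the user can obtain on demand. The summary element provides its summary, legend, or caption.

  • datalist together with the new list attribute for input can be used to make comboboxes: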

    <input list="browsers">
    <datalist id="browsers">
     <option value="Safari">
     <option value="Internet Explorer">
     <option value="Opera">
     <option value="Firefox">
    </datalist>
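
  • keygen represents a control for key pair generation.

  • output represents some type of output, such as the result of a calculation done through scripting.

    The input element's type attribute now has the following new values:

  • tel
  • search
  • url
  • email
  • datetime
  • date
  • month
  • week
  • time
  • datetime-local
  • number
  • range
  • color

    The idea of these new types is that the user agent can provide the user interface, such as a calendar date picker or integration with the user's address book, and submit a defined format to the server. This gives the user a better experience, because the input is checked before it is sent to the server, which means less time waiting for feedback.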
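
    The following sketch shows how several of the new elements and input types described above could appear together in one page. All file names, URLs, field names and values are hypothetical; the track file in particular assumes a subtitle file that this document does not provide.

    ```html
    <!doctype html>
    <title>New elements and input types</title>

    <!-- Multimedia with alternative streams and a text track. -->
    <video controls>
      <source src="example.webm" type="video/webm">
      <source src="example.mp4" type="video/mp4">
      <track kind="subtitles" src="example.en.vtt" srclang="en" label="English">
      Fallback text for user agents without video support.
    </video>

    <p>Download: <progress value="30" max="100">30%</progress></p>
    <p>Disk usage: <meter value="0.6" min="0" max="1">60%</meter></p>
    <p>Published <time datetime="2011-05-25">25 May 2011</time>.</p>
    <p>Search hit: the working <mark>draft</mark> was updated.</p>

    <details>
      <summary>More information</summary>
      <p>Extra text that is shown on demand.</p>
    </details>

    <!-- output shows the result of a scripted calculation. -->
    <form onsubmit="return false"
          oninput="o.value = a.valueAsNumber + b.valueAsNumber">
      <input name="a" type="number" value="1"> +
      <input name="b" type="number" value="2"> =
      <output name="o" for="a b"></output>
    </form>

    <!-- New input types; the user agent chooses the widgets. -->
    <form action="/subscribe" method="post">
      <p><label>Email: <input type="email" name="email"></label></p>
      <p><label>Phone: <input type="tel" name="phone"></label></p>
      <p><label>Homepage: <input type="url" name="homepage"></label></p>
      <p><label>Birthday: <input type="date" name="birthday"></label></p>
      <p><label>Guests: <input type="number" name="guests" value="2"></label></p>
      <p><label>Volume: <input type="range" name="volume"></label></p>
      <p><label>Color: <input type="color" name="color" value="#008000"></label></p>
      <p><label>Search: <input type="search" name="q"></label></p>
      <p><button>Submit</button></p>
    </form>
    ```

    User agents that do not recognize a given type value fall back to a plain text field, so a form like this degrades to ordinary text inputs.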

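    3.2. New Attributes

    HTML5 has introduced several new attributes to various elements that were already part of HTML4:

  • The a and area elements now have a media attribute for consistency with the link element.

  • The area element now has the hreflang, type and rel attributes for consistency with the a and link elements.

  • The base element can now have a target attribute as well, mainly for consistency with the a element. (This is already widely supported.)

  • The meta element now has a charset attribute, as this was already widely supported and provides a nice way to specify the character encoding for the document.

  • A new autofocus attribute can be specified on the input (except when the type attribute is hidden), select, textarea and button elements. It provides a declarative way to focus a form control during page load. Using this feature should enhance the user experience because, for instance, the user can turn it off if they do not like it.

  • A new placeholder attribute can be specified on the input and textarea elements. It represents a hint intended to aid the user with data entry.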

    <input type=email placeholder="a@b.com">
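
  • The new form attribute for the input, output, select, textarea, button, label, object and fieldset elements allows controls to be associated with a form. These elements can now be placed anywhere on a page, not just as descendants of the form element, and still be associated with a form.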

    <label>Email:
     <input type=email form=foo name=email>
    </label>
    <form id=foo></form>
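
  • The new required attribute applies to input (except when the type attribute is hidden, image or some button type such as submit), select and textarea. It indicates that the user has to fill in a value in order to submit the form. For select, the first option element has to be a placeholder with an empty value.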

    <label>Color: <select name=color required>
     <option value="">Choose one
     <option>Red
     <option>Green
     <option>Blue
    </select></label>
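
  • The fieldset element now allows the disabled attribute, which disables all descendant controls when specified, and the name attribute, which can be used for script access.

  • The input element has several new attributes to specify constraints: autocomplete, min, max, multiple, pattern and step. As mentioned before, it also has a new list attribute which can be used together with the datalist element. When type=image is used, it also has the width and height attributes to specify the dimensions of the image.

  • The input and textarea elements have a new attribute, dirname, which causes the directionality of the control as set by the user (translator's note: that is, the writing direction) to be submitted as well.

  • The textarea element also has two new attributes, maxlength and wrap, which control the maximum input length and the submitted line wrapping behavior, respectively.

  • The form element has a novalidate attribute that can be used to disable form validation on submission (i.e. the form can always be submitted).

  • The input and button elements have formaction, formenctype, formmethod, formnovalidate and formtarget as new attributes. If present, they override the action, enctype, method, novalidate and target attributes on the form element.

  • The menu element has two new attributes: type and label. They allow the element to transform into a menu as found in typical user interfaces, and provide for context menus in conjunction with the global contextmenu attribute.

  • The style element has a new scoped attribute which can be used to enable scoped style sheets. Style rules within such a style element only apply to the local tree. (Translator's note: more precisely, they apply to the subtree rooted at the parent of the style element, i.e. the tree of its siblings.)

  • The script element has a new attribute called async that influences script loading and execution.

  • The html element has a new attribute called manifest that points to an application cache manifest used in conjunction with the API for offline Web applications.

  • The link element has a new attribute called sizes. It can be used in conjunction with the icon relationship (set through the rel attribute; used for e.g. favicons) to indicate the size of the referenced icon, thus allowing for icons of distinct dimensions.

  • The ol element has a new attribute called reversed. When present, it indicates that the list order is descending.

  • The iframe element has three new attributes called sandbox, seamless, and srcdoc which allow for sandboxing content, e.g. blog comments.

    Several attributes from HTML4 now apply to all elements. These are called global attributes: accesskey, class, dir, id, lang, style, tabindex and title. Additionally, XHTML 1.0 only allowed xml:space on some elements; it is now allowed on all elements in XHTML documents.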
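
    A few of the new attributes on existing elements can be sketched together as follows. The file names, URLs and field names are hypothetical and serve only to show where each attribute goes.

    ```html
    <!doctype html>
    <html>
      <head>
        <meta charset="UTF-8">
        <title>New attributes sketch</title>
        <!-- sizes describes the dimensions of the referenced icon. -->
        <link rel="icon" href="icon-32.png" sizes="32x32">
        <!-- async lets the script load without blocking parsing. -->
        <script src="analytics.js" async></script>
      </head>
      <body>
        <form action="/save" method="post">
          <p><label>Title: <input name="title" autofocus required></label></p>
          <p><label>Body:
            <textarea name="body" maxlength="500" wrap="hard" cols="40"></textarea>
          </label></p>
          <p>
            <button>Save</button>
            <!-- Overrides the form's action and skips validation. -->
            <button formaction="/draft" formnovalidate>Save draft</button>
          </p>
        </form>

        <!-- Numbered 3, 2, 1 because of reversed. -->
        <ol reversed>
          <li>Third place</li>
          <li>Second place</li>
          <li>First place</li>
        </ol>

        <!-- Untrusted markup rendered inline in a sandboxed frame. -->
        <iframe sandbox seamless srcdoc="<p>A reader comment.</p>"></iframe>
      </body>
    </html>
    ```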

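    There are also several new global attributes:

  • The contenteditable attribute indicates that the element is an editable area. The user can change the contents of the element and manipulate the markup.
  • The contextmenu attribute can be used to point to a context menu provided by the author.
  • The data-* collection of author-defined attributes. Authors can define any attribute they want as long as they prefix it with data- to avoid clashes with future versions of HTML. The only requirement on these attributes is that they are not used for user agent extensions.
  • The draggable and dropzone attributes can be used together with the new drag and drop API.
  • The hidden attribute indicates that an element is not yet, or is no longer, relevant. (Translator's note: user agents will not render an element that has the hidden attribute. Unlike presentational hiding, such as hiding layers when switching tabs, this applies to every presentation, including screen readers. It is similar to a hidden field in a form.)
  • The role and aria-* collection of attributes can be used to instruct assistive technology.
  • The spellcheck attribute allows for hinting whether content can be checked for spelling or not.

    HTML5 also makes all event handler attributes from HTML4, which take the form onevent-name, global attributes, and adds several new event handler attributes for the new events it defines. One example is the play event, which is used by the API for the media elements (video and audio).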
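
    The global attributes described above can be illustrated with a short sketch. The data-* names, values and the media file are made up for this example.

    ```html
    <!doctype html>
    <title>Global attributes sketch</title>

    <!-- Editable region with a spell-checking hint. -->
    <p contenteditable spellcheck="true">Edit this paragraph in place.</p>

    <!-- Author-defined data-* attributes and a draggable item. -->
    <ul>
      <li data-ship-id="324" data-weapons="laser" draggable="true">Fighter</li>
    </ul>

    <!-- Not relevant yet, so the user agent does not render it. -->
    <p hidden>Shown only after the form has been submitted.</p>

    <!-- role and aria-* attributes describe the widget to assistive technology. -->
    <div role="progressbar" aria-valuemin="0" aria-valuemax="100" aria-valuenow="40">40%</div>

    <!-- Event handler attribute for the new play event. -->
    <video src="example.webm" controls onplay="console.log('playback started')"></video>
    ```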

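    3.3. Changed Elements

    These elements have slightly modified meanings in HTML5 to better reflect how they are used on the Web or to make them more useful:

  • The a element without an href attribute now represents a placeholder for where a link otherwise might have been placed. It can also contain flow content rather than being restricted to phrasing content.

  • The address element is now scoped by the new concept of sectioning. (Translator's note: it now represents contact information for its nearest article or body ancestor.)

  • The b element now represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.

  • The cite element now solely represents the title of a work (e.g. a book, a paper, an essay, a poem, a score, a song, a script, a film, a TV show, a game, a sculpture, a painting, a theatre production, a play, an opera, a musical, an exhibition, a legal case report, etc). Specifically, the example in HTML4 where it is used to mark up the name of a person is no longer considered conforming.

  • The dl element now represents an association list of name-value groups, and is no longer said to be appropriate for dialogue.

  • The head element no longer allows the object element as a child.

  • The hr element now represents a paragraph-level thematic break.

  • The i element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, or a ship name in Western texts.

  • For the label element, the browser should no longer move focus from the label to the control unless such behavior is standard for the underlying platform user interface.

  • The menu element is redefined to be useful for toolbars and context menus.

  • The s element now represents contents that are no longer accurate or no longer relevant.

  • The small element now represents small print, such as side comments.

  • The strong element now represents importance rather than strong emphasis.

  • The u element now represents a span of text with an unarticulated, though explicitly rendered, non-textual annotation, such as labeling the text as a proper name in Chinese text (a Chinese proper name mark; translator's note: thanks to 呂康豪 for the correction here), or labeling the text as misspelt.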
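
    To make the revised meanings more concrete, here is a small sketch that uses several of the changed elements in the senses listed above. The product names, book title and prices are invented for illustration.

    ```html
    <!doctype html>
    <title>Changed elements sketch</title>

    <!-- b: product names in a review, no extra importance. -->
    <p>This review compares the <b>Example Phone</b> and the <b>Sample Tablet</b>.</p>

    <!-- i: a ship name and an idiomatic phrase from another language. -->
    <p>The <i>Queen Mary</i> sailed under the motto <i lang="la">carpe diem</i>.</p>

    <!-- cite: the title of a work, not the name of a person. -->
    <p>My favourite book is <cite>An Example Novel</cite> by Jane Doe.</p>

    <!-- strong: importance. -->
    <p><strong>Warning:</strong> do not unplug the device while it updates.</p>

    <!-- s: content that is no longer accurate. -->
    <p><s>Price: 100 euros.</s> Now 80 euros.</p>

    <!-- small: small print. -->
    <p><small>Terms and conditions apply.</small></p>

    <!-- u: text labeled as misspelt. -->
    <p>The note said <u>recieve</u> where it meant receive.</p>

    <!-- dl: name-value groups. -->
    <dl>
      <dt>Author</dt> <dd>Jane Doe</dd>
      <dt>Published</dt> <dd>2011</dd>
    </dl>

    <!-- hr: paragraph-level thematic break. -->
    <hr>
    <p>A new topic starts here.</p>
    ```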

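    3.4. Changed Attributes

    The value attribute for the li element is no longer deprecated, as it is not presentational. The same goes for the start attribute of the ol element.

    The target attribute for the a and area elements is no longer deprecated, as it is useful in Web applications, e.g. in conjunction with iframe.

    The type attribute on script and style is no longer required if the scripting language is ECMAScript and the styling language is CSS, respectively.

    The border attribute on table only allows the value "1" and the empty string.

    The following attributes are allowed, but authors are discouraged from using them and are instead strongly encouraged to use an alternative solution:

  • The border attribute on img. It is required to have the value "0" when present. Authors can use CSS instead.

  • The language attribute on script. It is required to have the value "JavaScript" (case-insensitive) when present and cannot conflict with the type attribute. Authors can simply omit it as it has no useful function.

  • The name attribute on a. Authors can use the id attribute instead.

  • The summary attribute on table. The HTML5 draft defines several alternative solutions.

    The width and height attributes on img and other elements are no longer allowed to contain percentages.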
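
    The attribute changes above can be sketched in a short document. The link target, frame name and text are hypothetical.

    ```html
    <!doctype html>
    <title>Changed attributes sketch</title>

    <!-- No type attribute needed when the style language is CSS. -->
    <style>
      img { border: 0; } /* CSS instead of border="0" on img */
    </style>

    <!-- start and value are no longer deprecated. -->
    <ol start="3">
      <li>Numbered three</li>
      <li value="10">Numbered ten</li>
      <li>Numbered eleven</li>
    </ol>

    <!-- target is allowed again, here together with an iframe. -->
    <p><a href="help.html" target="help">Open help in the frame below</a></p>
    <iframe name="help"></iframe>

    <!-- id instead of name on a as a link destination. -->
    <h2 id="details">Details</h2>

    <!-- No type or language attribute needed for ECMAScript. -->
    <script>
      document.title = "Changed attributes";
    </script>
    ```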

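    3.5. Absent Elements

    The elements in this section are not to be used by authors. User agents will still have to support them, and various sections in HTML5 define how. For example, the obsolete isindex element is handled by the parser section.

    The following elements are not in HTML5 because their effect is purely presentational and their function is better handled by CSS:

  • basefont
  • big
  • center
  • font
  • strike
  • tt

    The following elements are not in HTML5 because using them damages usability and accessibility:

  • frame
  • frameset
  • noframes

    The following elements are not included because they have not been used often, created confusion, or their function can be handled by other elements:

  • acronym is not included because it has created a lot of confusion. Authors are to use abbr for abbreviations.
  • applet has been obsoleted in favor of object.
  • isindex usage can be replaced by usage of form controls.
  • dir has been obsoleted in favor of ul.

    Finally, the noscript element is only conforming in the HTML syntax. It is not included in the XML syntax, as its usage relies on an HTML parser.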
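
    As a rough sketch of how an author might replace some of the absent elements, the document below uses CSS classes in place of center, big, font, tt and strike, and abbr in place of acronym. The class names and text are assumptions made for this example.

    ```html
    <!doctype html>
    <title>Replacing absent elements</title>
    <style>
      .notice { text-align: center; font-size: larger; color: red; } /* center, big, font */
      .mono { font-family: monospace; }                              /* tt */
      .struck { text-decoration: line-through; }                     /* strike */
    </style>

    <p class="notice">Centered, larger, red text.</p>
    <p>Run <span class="mono">make install</span> to finish.</p>
    <p><span class="struck">Old wording</span> new wording.</p>

    <!-- abbr instead of acronym. -->
    <p>Published by the <abbr title="World Wide Web Consortium">W3C</abbr>.</p>

    <!-- ul instead of dir. -->
    <ul>
      <li>index.html</li>
      <li>style.css</li>
    </ul>
    ```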

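    3.6. Absent Attributes

    Some attributes from HTML4 are no longer allowed in HTML5. The specification defines how user agents should process them in legacy documents, but they are not valid and authors must no longer use them.

    HTML5 has advice on what you can use instead.

  • rev and charset attributes on link and a.
  • shape and coords attributes on a.
  • longdesc attribute on img and iframe.
  • target attribute on link.
  • nohref attribute on area.
  • profile attribute on head.
  • version attribute on html.
  • name attribute on img (use id instead).
  • scheme attribute on meta.
  • archive, classid, codebase, codetype, declare and standby attributes on object.
  • valuetype and type attributes on param.
  • axis and abbr attributes on td and th.
  • scope attribute on td.
  • summary attribute on table.

    In addition, HTML5 has none of the presentational attributes that were in HTML4, as their functions are better handled by CSS: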

  • align attribute on caption, iframe, img, input, object, legend, table, hr, div, h1, h2, h3, h4, h5, h6, p, col, colgroup, tbody, td, tfoot, th, thead and tr.
  • alink, link, text and vlink attributes on body.
  • background attribute on body.
  • bgcolor attribute on table, tr, td, th and body.
  • border attribute on object.
  • cellpadding and cellspacing attributes on table.
  • char and charoff attributes on col, colgroup, tbody, td, tfoot, th, thead and tr.
  • clear attribute on br.
  • compact attribute on dl, menu, ol and ul.
  • frame attribute on table.
  • frameborder attribute on iframe.
  • height attribute on td and th.
  • hspace and vspace attributes on img and object.
  • marginheight and marginwidth attributes on iframe.
  • noshade attribute on hr.
  • nowrap attribute on td and th.
  • rules attribute on table.
  • scrolling attribute on iframe.
  • size attribute on hr.
  • type attribute on li, ol and ul.
  • valign attribute on col, colgroup, tbody, td, tfoot, th, thead and tr.
  • width attribute on hr, table, td, th, col, colgroup and pre.
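
    One possible way to express some of these presentational attributes in CSS is sketched below. The mapping in the comments is an informal guide rather than something this document defines, and the chosen values are arbitrary.

    ```html
    <!doctype html>
    <title>CSS instead of presentational attributes</title>
    <style>
      body { background: white; color: black; } /* bgcolor, text */
      a:link { color: blue; }                   /* link */
      a:visited { color: purple; }              /* vlink */
      a:active { color: red; }                  /* alink */
      table {
        border-collapse: separate;
        border-spacing: 2px;                    /* cellspacing */
        width: 50%;                             /* width */
      }
      td, th {
        padding: 4px;                           /* cellpadding */
        text-align: center;                     /* align */
        vertical-align: top;                    /* valign */
        white-space: nowrap;                    /* nowrap */
      }
      hr { height: 2px; width: 80%; }           /* size, width */
      iframe { border: 0; }                     /* frameborder */
    </style>

    <table>
      <tr><th>Name</th><th>Value</th></tr>
      <tr><td>alpha</td><td>1</td></tr>
    </table>
    <hr>
    <p><a href="#top">Back to top</a></p>
    ```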
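
    4. APIs

    HTML5 introduces a number of APIs that help in creating Web applications. These can be used together with the new elements introduced for applications:

  • An API for playing video and audio, which can be used with the new video and audio elements.
  • An API that enables offline Web applications.
  • An API that allows a Web application to register itself for certain protocols or media types.
  • An editing API in combination with the new global contenteditable attribute.
  • A drag and drop API in combination with the draggable attribute.
  • An API that exposes the history and allows pages to add to it, to prevent breaking the back button.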
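
    Two of these APIs, media playback and session history, can be sketched as below. The element ids, the media file and the URL query string are hypothetical, and the history call assumes the page is served over HTTP rather than opened from a local file.

    ```html
    <!doctype html>
    <title>API sketch</title>
    <video id="clip" src="example.webm"></video>
    <button id="toggle">Play</button>
    <script>
      var clip = document.getElementById("clip");
      var toggle = document.getElementById("toggle");

      // Media API: a custom play/pause control instead of built-in controls.
      toggle.addEventListener("click", function () {
        if (clip.paused) {
          clip.play();
          toggle.textContent = "Pause";
        } else {
          clip.pause();
          toggle.textContent = "Play";
        }
      });

      // History API: add an entry so the back button returns to this state.
      history.pushState({ view: "player" }, "", "?view=player");

      // React when the user navigates through the added entries.
      window.addEventListener("popstate", function (event) {
        console.log("state restored:", event.state);
      });
    </script>
    ```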
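
    4.1. Extensions to HTMLDocument

    HTML5 has extended the HTMLDocument interface from DOM Level 2 HTML in a number of ways. The interface is now implemented on all objects implementing the Document interface, so it stays meaningful in a compound document context. It also has several noteworthy new members:

  • getElementsByClassName() to select elements by their class name. The way this method is defined allows it to work for any content with class attributes and a Document object, such as SVG and MathML.

  • innerHTML as an easy way to parse and serialize an HTML or XML document. This attribute was previously only available on HTMLElement in Web browsers and was not part of any standard.

  • activeElement and hasFocus to determine which element is currently focused and whether the Document has focus, respectively.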
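
    A minimal sketch of getElementsByClassName(), activeElement and hasFocus() on the document follows. The class names and ids are invented, and the logged values depend on the browser and on which element has focus when the script runs.

    ```html
    <!doctype html>
    <title>HTMLDocument extensions</title>
    <p class="note">First note</p>
    <p class="note important">Second note</p>
    <input id="field" autofocus>
    <pre id="log"></pre>
    <script>
      var log = document.getElementById("log");

      // Select every element whose class attribute contains "note".
      var notes = document.getElementsByClassName("note");
      log.textContent += "Elements with class note: " + notes.length + "\n";

      // Report which element is focused and whether the document has focus.
      log.textContent += "Focused element: " + document.activeElement.tagName + "\n";
      log.textContent += "Document has focus: " + document.hasFocus() + "\n";
    </script>
    ```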

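    4.2. Extensions to HTMLElement

    The HTMLElement interface has also gained several extensions in HTML5:

  • getElementsByClassName(), which is basically a scoped version of the one found on HTMLDocument.

  • innerHTML as found in Web browsers today. It is also defined to work in an XML context (when it is used in an XML document).

  • classList is a convenient accessor for className. The object it returns exposes methods (contains(), add(), remove(), and toggle()) for manipulating the element's classes. The a, area and link elements have a similar attribute called relList that provides the same functionality for the rel attribute.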
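
    The classList methods and innerHTML might be used as in this sketch. The class names, id and style rule are assumptions for illustration.

    ```html
    <!doctype html>
    <title>HTMLElement extensions</title>
    <style>
      .active { font-weight: bold; }
    </style>
    <p id="item" class="entry">Click to toggle</p>
    <script>
      var item = document.getElementById("item");

      // Add and remove individual classes without string handling on className.
      item.classList.add("active");
      item.classList.remove("entry");

      // contains() reports whether a class is currently set.
      if (item.classList.contains("active")) {
        // innerHTML parses the string as markup for the element's content.
        item.innerHTML = "Click to <em>toggle</em>";
      }

      // toggle() flips the class on every click.
      item.addEventListener("click", function () {
        item.classList.toggle("active");
      });
    </script>
    ```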

    5. HTML5 Changelogs

    The changelogs in this section indicate what has been changed between publications of the HTML5 drafts. Rationale for changes can be found in the public-html@w3.org and whatwg@whatwg.org mailing list archives, and the WHATWG Weekly series of blog posts. More fundamental rationale is being collected on the WHATWG Rationale wiki page. Many editorial and minor technical changes are not included in these changelogs. Implementors are strongly encouraged to follow the development of the main specification on a frequent basis so they become aware of all changes that affect them early on.

    The changes in the changelogs are in rough chronological order.

    5.1. Changes since 5 April 2011

  • Support for the javascript: scheme in img, object, CSS, etc, has been dropped.
  • The toBlob() method has been added to canvas.
  • The drawFocusRing() method on the canvas 2d context has been split into two methods, drawSystemFocusRing() and drawCustomFocusRing().
  • The values attribute on PropertyNodeList has been replaced with a getValues() method.
  • The select event has been specified.
  • The selectDirection IDL attribute has been added to input and textarea.
  • The :enabled and :disabled pseudo-classes now match fieldset, and the :indeterminate pseudo-class can now match progress.
  • The getKind() method has been added to TrackList.
  • The MediaController API and the mediagroup attribute have been added to synchronize playback of media elements.
  • Some ARIA defaults have changed, and it is now invalid to specify ARIA attributes that match the defaults.
  • The getName() method on TrackList was renamed to getLabel().
  • The border attribute on table is now conforming.
  • The u element is now conforming.
  • The summary attribute on table is now non-conforming.
  • The audio attribute on video was changed to a boolean muted attribute.
  • The Content-Language meta pragma is now non-conforming.

    5.2. Changes from 13 January 2011 to 5 April 2011

  • The pushState and replaceState features have been changed based on implementation feedback in Firefox, and history.state has been introduced.
  • The tracks IDL attribute on media elements has been renamed to textTracks.
  • Event handler content attributes now support ECMAScript strict mode.
  • The forminput and formchange events, and the dispatchFormInput() and dispatchFormChange() methods have been dropped.
  • The rel keywords archives, up, last, index, first and related synonyms have been dropped.
  • Removing a media element from the DOM and inserting it again in the same script now doesn't pause the media element.
  • The video element's letterboxing rules are now specified in terms of CSS 'object-fit'.
  • Cross-origin fonts now don't leak information about the font when drawn on a canvas.
  • The character encoding declaration is now allowed to be within the first 1024 bytes instead of the first 512 bytes.
  • The onerror event handler on window is now invoked for compile-time script errors as well as runtime errors.
  • Script-inserted script elements now have async default to true, which can be set to false to make the scripts execute in insertion order.
  • The atob() and btoa() methods have been specified.
  • The suggested file extension for application cache manifest files has been changed from .manifest to .appcache.
  • The action and formaction attributes are no longer allowed to have the empty string as value.
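
    Three of the changes in this list (atob() and btoa(), history.state, and the async default of script-inserted scripts) can be sketched as follows. The script file names are hypothetical.

    ```html
    <!doctype html>
    <title>Changelog sketch</title>
    <script>
      // btoa() encodes a string as base64; atob() decodes it again.
      var encoded = btoa("hello");
      var decoded = atob(encoded);
      console.log(encoded, decoded);

      // history.state exposes the state object of the current entry.
      console.log(history.state);

      // Script-inserted scripts default to async; setting async to false
      // makes them execute in insertion order.
      ["first.js", "second.js"].forEach(function (src) {
        var script = document.createElement("script");
        script.src = src;
        script.async = false;
        document.head.appendChild(script);
      });
    </script>
    ```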

    5.3. Changes from 19 October 2010 to 13 January 2011

  • Drag and drop model was refined.
  • A new global dropzone attribute was added.
  • A new bdi element was added to aid with user-generated content that may have bidi implications.
  • The dir attribute gained a new "auto" value.
  • A dirname attribute was added to input elements. When specified the directionality as specified by the user will be submitted to the server as well.
  • A new track element and associated TextTrack API were added for video text tracks.
  • The getSelection() API moved to a separate DOM Range draft. Similarly UndoManager has been removed from the W3C copy of HTML5 for now as it is not ready yet.

    5.4. Changes from 24 June 2010 to 19 October 2010

  • Numerous changes to the HTML parsing algorithm based on implementation feedback.
  • The hidden attribute now works for table-related elements.
  • The canvas getContext() method is now defined to be able to handle multiple contexts better.
  • The media elements' startTime IDL attribute was renamed to initialTime and startOffsetTime was added.
  • The prefetch link relationship can now be used on a elements.
  • The datetime attribute of ins and del no longer requires a time to be specified.
  • Using PUT and DELETE as HTTP methods for the form element is no longer supported.
  • The s element is no longer deprecated.
  • The video element has a new audio attribute.
  • Per usual, lots of other minor fixes have been made as well.

    5.5. Changes from 4 March 2010 to 24 June 2010

  • The ping attribute has been removed from the W3C version of HTML5.
  • The title element is optional for iframe srcdoc documents and other scenarios where a title is already available, as is the case with email.
  • keywords is now a standard metadata name for the meta element.
  • The allow-top-navigation value has been added for the sandbox attribute on the iframe element. It allows the embedded content to navigate its parent when specified.
  • The wbr element has been added.
  • The alternate keyword for the rel attribute of the link element can now be used to point to feeds again, even if the feed is not an alternative for the document.
  • The HTML to Atom mapping has been removed from the W3C version of HTML5.
  • In addition lots of minor changes, clarifications, and fixes have been made to the document.

    5.6. Changes from 25 August 2009 to 4 March 2010

  • The dialog element has been removed. A section with advice on how to mark up conversations has effectively replaced it.
  • document.head has been introduced to provide convenient access to the head element from script.
  • The link type feed has been removed. alternate with specific media types is to be used instead.
  • createHTMLDocument() has been introduced as API to allow easy creation of HTML documents.
  • Both the meter and progress elements no longer have "magic" processing of their contents because it could not be made to work internationally.
  • The meter and progress elements, as well as the output element, can now be labeled using the label element.
  • A new media type, text/html-sandboxed, was introduced to allow hosting of potentially hostile content without it causing harm.
  • A srcdoc attribute for the iframe element was introduced to allow embedding of potentially hostile content inline. It is expected to be used together with the sandbox and seamless attributes.
  • The figure element now uses a new element figcaption rather than legend because people want to use HTML5 long before it reaches W3C Recommendation.
  • The details element now uses a new element summary for exactly the same reason.
  • The autobuffer attribute on media elements was renamed to preload.
  • A whole lot of other smaller issues have also been resolved. The above list summarizes what is thought to be of primary interest to authors.

    In addition to all of the above, Microdata, the 2D context API for canvas, and Web Messaging (postMessage() API) have been split into their own drafts at the W3C (the WHATWG still publishes a version of HTML5 that includes them):

  • HTML Microdata
  • HTML Canvas 2D Context
  • HTML5 Web Messaging

    Specific microdata vocabularies are gone altogether in the W3C draft of HTML5 and are not published as a separate draft. The WHATWG draft of HTML5 still includes them.

    5.7. Changes from 23 April 2009 to 25 August 2009

  • When the time element is empty user agents have to render the time in a locale-specific manner.
  • The load event is dispatched at Window, but now has Document as its target.
  • pushState() now affects the Referer (sic) header.
  • onundo and onredo are now on Window.
  • Media elements now have a startTime member that indicates where the current resource starts.
  • header has been renamed to hgroup and a new header element has been introduced.
  • createImageData() now also takes ImageData objects.
  • createPattern() can now take a video element as argument too.
  • The footer element is no longer allowed in header and header is not allowed in address or footer.
  • A new control has been introduced: <input type="tel">
  • The Command API now works for all elements.
  • accesskey is now properly defined.
  • section and article now take a cite attribute.
  • A new feature called Microdata has been introduced which allows people to embed custom data structures in their HTML documents.
  • Using the Microdata model three predefined vocabularies have also been included: vCard, vEvent, and a model for licensing.
  • Drag and drop has been updated to work with the Microdata model.
  • The last of the parsing quirks has been defined.
  • textLength has been added as member of the textarea element.
  • The rp element now takes phrasing content rather than a single character.
  • location.reload() is now defined.
  • The hashchange event now fires asynchronously.
  • Rules for compatibility with XPath 1.0 and XSLT 1.0 have been added.
  • The spellcheck IDL attribute now maps to a DOMString.
  • hasFeature() support has been reduced to a minimum.
  • The Audio() constructor sets the autobuffer attribute.
  • The td element is no longer allowed in thead.
  • The input element and DataTransfer object now have a files IDL attribute.
  • The datagrid and bb have been removed due to their design not being agreed upon.
  • The cue range API has been removed from the media elements.
  • Support for WAI-ARIA has been integrated.
  • On top of this list quite a few minor clarifications, typos, issues specific to implementors, and other small problems have been resolved.

    In addition, the following parts of HTML5 have been taken out and will likely be further developed at the IETF:

  • Definition of URLs.
  • Definition of Content-Type sniffing.

    5.8. Changes from 12 February 2009 to 23 April 2009

  • A new global attribute called spellcheck has been added.
  • Defined that ECMAScript this in the global object returns a WindowProxy object rather than the Window object.
  • The value IDL attribute for input elements in the File Upload state is now defined.
  • Definition of designMode was changed to be more in line with legacy implementations.
  • The drawImage() method of the 2D drawing API can now take a video element as well.
  • The way media elements load resources has been changed.
  • document.domain is now IPv6-compatible.
  • The video element gained an autobuffer boolean attribute that serves as a hint.
  • You are now allowed to specify the meta element with a charset attribute in XML documents if the value of that attribute matches the encoding of the document. (Note that it does not specify the value, it is just a talisman.)
  • The bufferingRate and bufferingThrottled members of media elements have been removed.
  • The media element resource selection algorithm is now asynchronous.
  • The postMessage() API now takes an array of MessagePort objects rather than just one.
  • The second argument of the add() method on the select element and the options member of the select element is now optional.
  • The action, enctype, method, novalidate, and target attributes on input and button elements have been renamed to formaction, formenctype, formmethod, formnovalidate, and formtarget.
  • A "storage mutex" concept has been added to deal with separate pages trying to change a storage object (document.cookie and localStorage) at the same time. The Navigator gained a getStorageUpdates() method to allow it to be explicitly released.
  • A syntax for SVG similar to MathML is now defined so that SVG can be included in text/html resources.
  • The placeholder attribute has been added to the textarea element.
  • Added a keygen element for key pair generation.
  • The datagrid element was revised to make the API more asynchronous and allow for unloaded parts of the grid.

    In addition, several parts of HTML5 have been taken out and will be further developed by the Web Applications Working Group as standalone specifications:

  • WebSocket API
  • WebSocket protocol
  • Server-Sent Events
  • Web Storage (localStorage and sessionStorage)
  • Web SQL Database

    5.9. Changes from 10 June 2008 to 12 February 2009

  • The data member of ImageData objects has been changed from an array to a CanvasPixelArray object.
  • Shadows are now required from implementations of the canvas element and its API.
  • Security model for canvas is clarified.
  • Various changes to the processing model of canvas have been made in response to implementation and author feedback. E.g. clarifying what happens when NaN and Infinity are passed and fixing the definitions of arc() and arcTo().
  • innerHTML in XML was slightly changed to improve round-tripping.
  • The toDataURL() method on the canvas element now supports setting a quality level when the media type argument is image/jpeg.
  • The poster attribute of the video element now affects its intrinsic dimensions.
  • The behavior of the type attribute of the link element has been clarified.
  • Sniffing is now allowed for link when the expected type is an image.
  • A section on URLs is introduced dealing with how URL values are to be interpreted and what exactly authors are required to do. Every feature of the specification that uses URLs has been reworded to take the new URL section into account.
  • It is now explicit that the href attribute of the base element does not depend on xml:base.
  • It is now defined what the behavior should be when the base URL changes.
  • URL decomposition IDL attributes are now more aligned with Internet Explorer.
  • The xmlns attribute with the value http://www.w3.org/1999/xhtml is now allowed on all HTML elements.
  • data-* attributes and custom attributes on the embed element now have to match the XML Name production and cannot contain a colon.
  • WebSocket API is introduced for bidirectional communication with a server.
  • The default value of volume on media elements is now 1.0 rather than 0.5.
  • event-source was renamed to eventsource because no other HTML element uses a hyphen.
  • A message channel API has been introduced augmenting postMessage().
  • A new element named bb has been added. It represents a user agent command that the user can invoke.
  • The addCueRange() method on media elements has been modified to take an identifier which is exposed in the callbacks.
  • It is now defined how to mutate a DOM into an infoset.
  • The parent attribute of the Window object is now defined.
  • The embed element is defined to do extension sniffing for compatibility with servers that deliver Flash as text/plain. (This is marked as an issue in the specification to figure out if there is a better way to make this work.)
  • The embed can now be used without its src attribute.
  • getElementsByClassName() is defined to be ASCII case-insensitive in quirks mode for consistency with CSS.
  • In HTML documents localName no longer returns the node name in uppercase.
  • data-* attributes are defined to be always lowercase.
  • The opener attribute of the Window object is not to be present when the page was opened from a link with target="_blank" and rel="noreferrer".
  • The top attribute of the Window object is now defined.
  • The a element now allows nested flow content, but not nested interactive content.
  • It is now defined what the header element means to document summaries and table of contents.
  • What it means to fetch a resource is now defined.
  • Patterns are now required for the canvas element.
  • The autosubmit attribute has been removed from the menu element.
  • Support for outerHTML and insertAdjacentHTML() has been added.
  • xml:lang is now allowed in HTML when lang is also specified and they have the same value. In XML lang is allowed if xml:lang is also specified and they have the same value.
  • The frameElement attribute of the Window object is now defined.
  • An event loop and task queue is now defined detailing script execution and events. All features have been updated to be defined in terms of this mechanism.
  • If the alt attribute is omitted a title attribute, an enclosing figure element with a legend element descendant, or an enclosing section with an associated heading must be present.
  • The irrelevant attribute has been renamed to hidden.
  • The definitionURL attribute of MathML is now properly supported. Previously it would have ended up being all lowercase during parsing.
  • User agents must treat US-ASCII as Windows-1252 for compatibility reasons.
  • An alternative syntax for the DOCTYPE is allowed for compatibility with some XML tools.
  • Data templates have been removed (consisted of the datatemplate, rule and nest elements).
  • The media elements now support just a single loop attribute.
  • The load() method on media elements has been redefined as asynchronous. It also tries out files in turn now rather than just looking at the type attribute of the source element.
  • A new member called canPlayType() has been added to the media elements.
  • The totalBytes and bufferedBytes attributes have been removed from the media elements.
  • The Location object gained a resolveURL() method.
  • The q element has changed again. Punctuation is to be provided by the user agent again.
  • Various changes were made to the HTML parser algorithm to be more in line with the behavior Web sites require.
  • The unload and beforeunload events are now defined.
  • The IDL blocks in the specification have been revamped to be in line with the upcoming Web IDL specification.
  • Table headers can now have headers. User agents are required to support a headers attribute pointing to a td or th element, but authors are required to only let them point to th elements.
  • Interested parties can now register new http-equiv values.
  • When the meta element has a charset attribute it must occur within the first 512 bytes.
  • The StorageEvent object now has a storageArea attribute.
  • It is now defined how HTML is to be used within the SVG foreignObject element.
  • The notification API has been dropped.
  • How [[Get]] works for the HTMLDocument and Window objects is now defined.
  • The Window object gained the locationbar, menubar, personalbar, scrollbars, statusbar and toolbar attributes giving information about the user interface.
  • The application cache section has been significantly revised and updated.
  • document.domain now relies on the Public Suffix List. [PSL]
  • A non-normative rendering section has been added that describes user agent rendering rules for both obsolete and conforming elements.
  • A normative section has been added that defines when certain selectors as defined in the Selectors and the CSS3 Basic User Interface Module match HTML elements. [SELECTORS] [CSS-UI]
  • Web Forms 2.0, previously a standalone specification, has been fully integrated into HTML5 since the last publication. The following changes were made to the forms chapter:

  • Support for XML submission has been removed.
  • Support for form filling has been removed.
  • Support for filling of the select and datalist elements through the data attribute has been removed.
  • Support for associating a field with multiple forms has been removed. A field can still be associated with a form it is not nested in through the form attribute.
  • The dispatchChangeInput() and dispatchFormChange() methods have been removed from the select, input, textarea, and button elements.
  • Repetition templates have been removed.
  • The inputmode attribute has been removed.
  • The input element in the File Upload state no longer supports the min and max attributes.
  • The allow attribute on input elements in the File Upload state is no longer authoritative.
  • The pattern and accept attributes for textarea have been removed.
  • RFC 3106 is no longer explicitly supported.
  • The submit() method now just submits; it no longer ensures the form controls are valid.
  • The input element in the Range state now defaults to the middle, rather than the minimum value.
  • The size attribute on the input element is now conforming (rather than deprecated).
  • object elements now partake in form submission.
  • The type attribute of the input element gained the values color and search.
  • The input element gained a multiple attribute which allows for either multiple e-mails or multiple files to be uploaded depending on the value of the type attribute.
  • The input, button and form elements now have a novalidate attribute to indicate that the form fields should not be required to have valid values upon submission.
  • When the label element contains an input it may still have a for attribute as long as it points to the input element it contains.
  • The input element now has an indeterminate IDL attribute.
  • The input element gained a placeholder attribute.
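
The change to the Range state default listed above can be illustrated with a minimal sketch. The helper name defaultRangeValue is hypothetical and not part of any DOM API; the fallback limits of 0 and 100 follow the defaults the HTML specification gives for range controls, and snapping to the step attribute is deliberately left out.

```javascript
// Hypothetical helper, for illustration only: computes the initial value of
// an input element in the Range state from its min and max attributes.
// Assumption: step snapping is ignored in this sketch.
function defaultRangeValue(min = 0, max = 100) {
  // When the maximum is below the minimum, fall back to the minimum.
  if (max < min) {
    return min;
  }
  // Otherwise start in the middle of the range rather than at the minimum.
  return min + (max - min) / 2;
}
```

For example, defaultRangeValue(10, 20) yields the midpoint of the two limits, whereas the earlier behavior would have started the control at 10.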
  • 5.10. Changes from 22 January 2008 to 10 June 2008

  • Implementation and authoring details around the ping attribute have changed.
  • <meta http-equiv=content-type> is now a conforming way to set the character encoding.
  • The API for the canvas element has been cleaned up. Text support has been added.
  • globalStorage is now restricted to the same-origin policy and renamed to localStorage. Related event dispatching has been clarified.
  • postMessage() API changed. Only the origin of the message is exposed, no longer the URL. It also requires a second argument that indicates the origin of the target document.
  • The drag and drop API has been clarified. The dataTransfer object now has a types attribute indicating the type of data being transferred.
  • The m element is now called mark.
  • Server-sent events have changed and been clarified. They use a new format so that older implementations are not broken.
  • The figure element no longer requires a caption.
  • The ol element has a new reversed attribute.
  • Character encoding detection has changed in response to feedback.
  • Various changes have been made to the HTML parser section in response to implementation feedback.
  • Various changes to the editing section have been made, including adding queryCommandEnabled() and related methods.
  • The headers attribute has been added for td elements.
  • The table element has a new createTBody() method.
  • MathML support has been added to the HTML parser section. (SVG support is still awaiting input from the SVG WG.)
  • Author-defined attributes have been added. Authors can add attributes to elements in the form of data-name and can access these through the DOM using dataset[name] on the element in question.
  • The q element has changed to require punctuation inside rather than having the browser render it.
  • The target attribute can now have the value _blank.
  • The showModalDialog API has been added.
  • The document.domain API has been defined.
  • The source element now has a new pixelratio attribute useful for videos that have some kind of encoding error.
  • bufferedBytes, totalBytes and bufferingThrottled IDL attributes have been added to the video element.
  • The media begin event has been renamed to loadstart for consistency with the Progress Events specification.
  • A charset attribute has been added to the script element.
  • The iframe element has gained the sandbox and seamless attributes which provide sandboxing functionality.
  • The ruby, rt and rp elements have been added to support ruby annotation.
  • A showNotification() method has been added to show notification messages to the user.
  • Support for beforeprint and afterprint events has been added.
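
The author-defined attributes entry above pairs a data-name attribute with a dataset[name] lookup. The sketch below shows one way that name mapping can work. It follows the rule used in later versions of the HTML specification, where a hyphen followed by a lowercase letter becomes an uppercase letter; the draft described in this changelog may have differed. The function name dataAttributeToDatasetKey is hypothetical, and the input is lowercased first on the assumption that an HTML parser would already have done so.

```javascript
// Hypothetical helper, for illustration only: maps a data-* attribute name
// to the key under which it would appear on element.dataset.
// Assumption: uses the mapping rule from later HTML specification versions.
function dataAttributeToDatasetKey(attributeName) {
  // HTML parsers lowercase attribute names, so normalize the input first.
  const name = attributeName.toLowerCase();
  // Only attributes with the "data-" prefix are author-defined attributes.
  if (!name.startsWith("data-")) {
    return null;
  }
  // Drop the prefix, then turn each "-x" pair into an uppercase "X".
  return name.slice(5).replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}
```

Under these assumptions, an attribute written as data-date-of-birth would be read in script as element.dataset.dateOfBirth, while an attribute without the data- prefix has no dataset key at all.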
  • Acknowledgments

    The editors would like to thank Ben Millard, Bruce Lawson, Cameron McCormack, Charles McCathieNevile, Dan Connolly, David Håsäther, Dennis German, Frank Ellermann, Frank Palinkas, Futomi Hatano, Gordon P. Hemsley, Henri Sivonen, James Graham, Jens Meiert, Jeremy Keith, Jürgen Jeka, Krijn Hoetmer, Leif Halvard Silli, Maciej Stachowiak, Marcos Caceres, Mark Pilgrim, Martijn Wargers, Martyn Haigh, Masataka Yakura, Michael Smith, Ms2ger, Olivier Gendrin, Øistein E. Andersen, Philip Jägenstedt, Philip Taylor, Randy Peterman, Toby Inkster, and Yngve Spjeld Landro for their contributions to this document, and all the people who have contributed to HTML5 over the years for improving the Web!

    References

    [CSS-UI]
    CSS3 Basic User Interface Module, T. Çelik. W3C.
    [DOCTYPE]
    Activating Browser Modes with Doctype, H. Sivonen.
    [DOM2HTML]
    Document Object Model (DOM) Level 2 HTML Specification, J. Stenback, P. Le Hégaret, A. Le Hors. W3C.
    [HTML4]
    HTML 4.01 Specification, D. Raggett, A. Le Hors, I. Jacobs, editors. W3C.
    [HTML5]
    HTML5, I. Hickson. W3C.
    HTML5 (editor's draft), I. Hickson. WHATWG.
    HTML5 (editor's draft), I. Hickson. W3C.
    [PSL]
    Public Suffix List, Mozilla Foundation.
    [SELECTORS]
    Selectors, D. Glazman, T. Çelik, I. Hickson. W3C.
    [XHTML1]
    XHTML™ 1.1 - Module-based XHTML (Second Edition), S. McCarron, M. Ishikawa. W3C.
    [XML]
    Extensible Markup Language (XML) 1.0 (Fifth Edition), T. Bray, J. Paoli, C. Sperberg-McQueen, E. Maler, F. Yergeau. W3C.
    Namespaces in XML 1.0 (Third Edition), T. Bray, D. Hollander, A. Layman, R. Tobin, H. S. Thompson. W3C.