1.0 Document Structure
Every HTML document shares the same four-part skeleton: a DOCTYPE declaration, a root
<html> element, a <head> metadata section, and a
<body> content section.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Page Title</title>
</head>
<body>
<!-- visible content goes here -->
</body>
</html>
From this markup the browser builds a tree of nodes called the Document Object Model
(DOM). Every element, text run, and comment becomes a node in that tree, and each attribute is stored on its element node.
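To see the tree for yourself, a page can walk its own DOM and log each element by depth. This is only a minimal sketch: the function name walk and the id greeting are hypothetical, and it visits element nodes only (text and comment nodes are skipped for brevity).

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>DOM walk</title>
</head>
<body>
<p id="greeting">Hello</p>
<script>
  // walk is a hypothetical helper: it logs each element's tag name,
  // indented by its depth in the tree. node.children holds element
  // nodes only, so text and comment nodes are not visited here.
  function walk(node, depth) {
    console.log("  ".repeat(depth) + node.nodeName);
    for (const child of node.children) {
      walk(child, depth + 1);
    }
  }
  // document.documentElement is the root <html> element.
  walk(document.documentElement, 0);
</script>
</body>
</html>
```

Open the page and check the browser console; the indentation mirrors the parent/child structure of the markup.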
1.1 The DOCTYPE Declaration
The first line of any HTML document must be the DOCTYPE declaration:
<!DOCTYPE html>
This single line tells the browser to render the page in standards mode, following the
HTML Living Standard (commonly called HTML5). Without it, browsers fall back to "quirks mode", a legacy compatibility mode with
different layout rules that will likely break modern CSS.
DOCTYPE notes
- Case-insensitive: <!DOCTYPE HTML> and <!doctype html> are equivalent.
- It is not an HTML tag, and it is not a processing instruction either; it is a declaration (a required preamble) that selects the browser's rendering mode.
- Older HTML versions (4.01, XHTML 1.0) required long DOCTYPE strings referencing a DTD. HTML5 replaced all of them with <!DOCTYPE html>.
- Always include it. Even if the page appears to render correctly without it in some browsers, omitting it is a standards violation.
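One way to confirm which mode a page is in is to read document.compatMode, which is "CSS1Compat" in standards mode and "BackCompat" in quirks mode. The page below is an illustrative sketch; the id mode and the variable name label are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Rendering mode check</title>
</head>
<body>
<p id="mode"></p>
<script>
  // document.compatMode is "CSS1Compat" in standards mode
  // and "BackCompat" in quirks mode.
  const label = document.compatMode === "CSS1Compat" ? "standards mode" : "quirks mode";
  document.getElementById("mode").textContent = "Rendering in " + label;
</script>
</body>
</html>
```

Deleting the first line of this file and reloading should switch the reported mode to quirks mode.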
1.2 The html Element
The <html> element is the root of the document. It wraps everything
except the DOCTYPE. Its most important attribute is lang:
<html lang="en">
...
</html>
The lang attribute declares the primary language of the document. Screen
readers use it to select the correct voice; search engines use it for language-specific
indexing. Common values are "en", "fr", "de",
"ja", and "zh".
1.3 The head Element
The <head> element holds machine-readable metadata. Its children
are not displayed as page content.
Common head children
| Element | Purpose |
| <title> | Text shown in the browser tab and used as the default bookmark name. |
| <meta charset> | Declares the character encoding. Always use UTF-8. |
| <meta name="viewport"> | Controls layout on mobile devices. Set content="width=device-width, initial-scale=1". |
| <meta name="description"> | Short page summary used by search engines in results snippets. |
| <link rel="stylesheet"> | Attaches an external CSS file. |
| <link rel="icon"> | Sets the favicon. |
| <script> | Loads or embeds JavaScript. Prefer placing scripts at end of body or using defer. |
| <style> | Embeds CSS directly in the document. |
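Put together, a head section using each of these elements might look like the sketch below. The file names styles.css, favicon.ico, and app.js and the description text are placeholders, not files this guide provides.

```html
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="A short summary of the page for search results.">
<title>Page Title</title>
<!-- styles.css and favicon.ico are hypothetical file names -->
<link rel="stylesheet" href="styles.css">
<link rel="icon" href="favicon.ico">
<style>
  /* a small page-specific rule embedded directly in the document */
  body { margin: 0; }
</style>
<!-- app.js is hypothetical; defer postpones execution until parsing finishes -->
<script src="app.js" defer></script>
</head>
```

Placing the charset declaration first is a common convention, since the browser needs to know the encoding before it reads the rest of the head.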
1.4 The body Element
The <body> element contains all visible page content: headings,
paragraphs, images, links, forms, and everything else the user sees and interacts with.
Two attributes on <body> are commonly used in older sites:
onload to run initialization code once the page has fully loaded (including images and stylesheets), and
id to expose a CSS hook for page-level selectors. Modern code typically
uses document.addEventListener('DOMContentLoaded', ...) instead of
onload; that event fires earlier, as soon as the DOM has been parsed.
<body id="page" onload="init()">
<header><h1>Site Title</h1></header>
<main>
<p>Primary content</p>
</main>
<footer>© 2026</footer>
</body>
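For comparison, the same body can register its initialization code with DOMContentLoaded rather than the onload attribute. This is a sketch only; init is a hypothetical function name and its body is a placeholder.

```html
<body id="page">
<header><h1>Site Title</h1></header>
<main>
<p>Primary content</p>
</main>
<footer>© 2026</footer>
<script>
  // init is a hypothetical initialization function.
  function init() {
    console.log("DOM parsed; body id is", document.body.id);
  }
  // Register the listener instead of using the onload attribute.
  // The script runs while the document is still being parsed, so the
  // listener is attached before DOMContentLoaded fires.
  document.addEventListener("DOMContentLoaded", init);
</script>
</body>
```

Keeping the handler in script rather than in an attribute separates behavior from markup and allows more than one listener to be attached.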
1.5 Nesting Rules
HTML has a small number of critical nesting constraints:
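- Block-level elements may not be nested inside inline elements. E.g., a <div> inside a <span> is invalid. The main exception is <a>, which HTML5 allows to wrap block-level content as long as that content is not interactive.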
- <p> cannot contain block elements. If you nest a <div> inside a <p>, the browser closes the <p> before opening the <div>.
- List items (<li>) must be direct children of <ul>, <ol>, or <menu>.
- Table cells (<td>, <th>) must be direct children of <tr>, which must be inside <thead>, <tbody>, <tfoot>, or directly inside <table>.
- Interactive elements may not be nested inside each other. E.g., a <button> inside an <a>, or an <a> inside a <button>, is invalid.
Browsers silently repair most violations, but the repaired DOM may not match the structure you intended, which can break CSS selectors and scripts.
Always write valid HTML.
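The fragments below illustrate a few of these constraints side by side, each invalid pattern followed by one possible valid alternative. The text content and the /next URL are placeholders.

```html
<!-- Invalid: <div> inside <p>. The parser closes the <p> early. -->
<p>Intro text <div>boxed note</div></p>

<!-- Valid: use sibling elements instead. -->
<p>Intro text</p>
<div>boxed note</div>

<!-- Invalid: <li> outside a list. -->
<div><li>Orphan item</li></div>

<!-- Valid: <li> is a direct child of <ul>. -->
<ul>
<li>List item</li>
</ul>

<!-- Invalid: interactive elements nested in each other. -->
<a href="/next"><button>Next</button></a>

<!-- Valid: use the link on its own and style it with CSS if needed. -->
<a href="/next">Next</a>
```

Running a page through an HTML validator is a practical way to catch these mistakes before the browser has to repair them.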