Understanding how browsers work is a game-changer for becoming a better developer, problem solver, and all-around tech badass.
Step-by-Step: How the Browser Works
1. When we hit a domain, e.g. codespoetry.com, the browser first has to resolve it to an IP address. So it goes through a DNS lookup, checking each of these in order until one of them has the answer:
- Browser Cache
- OS Cache
- Router
- ISP DNS server
- Authoritative DNS server
If the domain is valid, it is resolved to an IP address like 123.12.13.14.
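The lookup order above can be sketched as a chain of caches. This is only an illustration: the layers are hypothetical in-memory dictionaries, not real resolvers, and real caches also expire entries after a TTL, which is left out here.

```python
from typing import Optional, Tuple

# Hypothetical stand-ins for each lookup layer, in the order they are consulted.
LOOKUP_CHAIN = [
    ("browser cache", {}),
    ("OS cache", {}),
    ("router", {}),
    ("ISP DNS server", {}),
    ("authoritative DNS server", {"codespoetry.com": "123.12.13.14"}),
]


def resolve(domain: str) -> Optional[Tuple[str, str]]:
    """Return (ip, name of the layer that answered), or None if nobody knows the domain."""
    for index, (name, records) in enumerate(LOOKUP_CHAIN):
        ip = records.get(domain)
        if ip is not None:
            # Fill the layers that missed, so the next lookup stops earlier.
            for _, earlier_records in LOOKUP_CHAIN[:index]:
                earlier_records[domain] = ip
            return ip, name
    return None


print(resolve("codespoetry.com"))  # first lookup falls through to the last layer
print(resolve("codespoetry.com"))  # second lookup is answered by the browser cache
```

The second call returning early is the whole point of the chain: the closer the cache is to the browser, the cheaper the answer.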
2. Then the browser opens a TCP connection to that IP address (followed by a TLS handshake on top of it for HTTPS) and sends the HTTP request over it. The HTTP request contains some necessary info, as below:
GET / HTTP/1.1
Host: codespoetry.com
User-Agent: Chrome/123.0.0
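To make the ordering concrete, here is a minimal sketch using Python's standard `socket` and `ssl` modules. The helper names `build_request` and `fetch_first_bytes` are made up for this example, and a real browser sends many more headers. `fetch_first_bytes` needs network access, so it is only defined here, not called.

```python
import socket
import ssl


def build_request(host: str, path: str = "/", user_agent: str = "Chrome/123.0.0") -> bytes:
    """Serialize the request shown above; every line ends with CRLF."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
    ]
    # An empty line tells the server that the headers are finished.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_first_bytes(host: str, port: int = 443) -> bytes:
    """Open TCP, wrap it in TLS, send the request, return the first chunk of the response."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as tcp:    # TCP handshake
        with context.wrap_socket(tcp, server_hostname=host) as tls:   # TLS handshake
            tls.sendall(build_request(host))                          # HTTP request
            return tls.recv(4096)


print(build_request("codespoetry.com").decode("ascii"))
```

Note the order inside `fetch_first_bytes`: the connection exists before a single byte of HTTP is sent.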
3. After receiving the HTTP/HTTPS request, the server responds with an HTML Document:
HTTP/1.1 200 OK
Content-Type: text/html

<html>
<head><title>Welcome</title></head>
<body>Hello world</body>
</html>
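Before it can render anything, the browser has to split that response into a status line, headers, and body. A rough sketch of that split, assuming a complete, uncompressed response is already in memory (`parse_response` is a hypothetical helper, and real responses may be chunked or compressed):

```python
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><head><title>Welcome</title></head><body>Hello world</body></html>"
)


def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into (status, reason, headers, body)."""
    # The first empty line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(status), reason, headers, body


status, reason, headers, body = parse_response(RAW_RESPONSE)
print(status, reason, headers["content-type"], len(body))
```

The `Content-Type` header is what tells the browser to hand the body to the HTML parser in the next step.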
4. The browser starts rendering the HTML, which involves a few steps:
Parsing: DOM
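- HTML is parsed top to bottom.
- Tags become nodes in a tree (DOM).
- Assets are fetched (CSS, JS, images).
- The browser pauses HTML parsing when it finds a script (`<script>`, unless defer or async is used), because scripts are parser-blocking. A stylesheet (`<link rel="stylesheet">`) does not stop the parser, but it is render-blocking and it delays any script that comes after it.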
DOM construction is incremental. The HTML response is turned into tokens, the tokens into nodes, and the nodes into the DOM tree. A single DOM node starts with a startTag token and ends with an endTag token, and it holds all relevant information about its HTML element, as described by the tokens. Nodes are connected into a DOM tree based on the token hierarchy: if another pair of startTag and endTag tokens appears between a startTag and its endTag, you have a node inside a node, which is how the hierarchy of the DOM tree is defined.
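The token-to-tree idea can be tried out with Python's built-in `html.parser`, which reports start tags, end tags, and text as it reads. The `Node` and `TreeBuilder` classes below are a simplified sketch written for this example, not how a real browser engine is implemented: they ignore attributes, void elements such as `<br>`, and the error recovery that real HTML parsers perform.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root  # the node whose startTag is still open

    def handle_starttag(self, tag, attrs):
        # A startTag opens a new node inside the currently open one.
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # An endTag closes the current node, so we step back up to its parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    label = node.tag + (f' "{node.text}"' if node.text else "")
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
# feed() can be called once per network chunk, which mirrors incremental construction.
builder.feed("<html><head><title>Welcome</title></head>")
builder.feed("<body>Hello world</body></html>")
builder.close()
print("\n".join(outline(builder.root)))
```

Feeding the document in two pieces gives the same tree as feeding it in one, because the tree is extended token by token.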
Note: A script with the defer attribute is downloaded in parallel with HTML parsing, but executes only after the HTML has been fully parsed. A script with the async attribute is also downloaded in parallel, but executes as soon as its download finishes, even if the HTML is still being parsed.
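The difference is easier to see on a timeline. The toy model below is an assumption-heavy sketch: time units are arbitrary, script execution itself takes no time, and `execution_order` plus the file names are hypothetical. It only captures the three rules: a plain script pauses the parser, an async script runs when its download finishes, and deferred scripts run in document order after parsing.

```python
def execution_order(scripts, parse_duration):
    """scripts: (name, mode, parser_position, download_time) tuples in document order.

    parser_position is when the parser would reach the tag if it were never paused.
    """
    paused = 0      # total time the parser has spent waiting on blocking scripts
    events = []     # (time, label)
    deferred = []   # (time the download is ready, name)
    for name, mode, position, download in scripts:
        seen_at = position + paused
        ready_at = seen_at + download
        if mode == "blocking":
            paused += download              # the parser waits for the download
            events.append((ready_at, name))
        elif mode == "async":
            events.append((ready_at, name))  # runs as soon as it has arrived
        elif mode == "defer":
            deferred.append((ready_at, name))
        else:
            raise ValueError(f"unknown mode: {mode}")

    parsing_done = parse_duration + paused
    events.append((parsing_done, "parsing finished"))
    clock = parsing_done
    for ready_at, name in deferred:          # document order, after parsing
        clock = max(clock, ready_at)
        events.append((clock, name))

    events.sort(key=lambda event: event[0])  # stable sort keeps ties in insertion order
    return [label for _, label in events]


order = execution_order(
    [
        ("a.js", "defer", 1, 10),
        ("b.js", "async", 2, 3),
        ("c.js", "blocking", 4, 2),
    ],
    parse_duration=8,
)
print(order)
```

In this run `a.js` appears first in the document but executes last, because defer always waits for the parser to finish.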
Style Calculation: CSSOM
- CSS is parsed into another tree called CSSOM, similar to the DOM
- DOM and CSSOM are merged into the render tree
The DOM contains all the content of the page. The CSSOM contains all the information on how to style the DOM. The CSSOM is a tree like the DOM, but it is built differently: while DOM construction is incremental, CSSOM construction is not. CSS is render-blocking: the browser blocks page rendering until it has received and processed all the CSS. The reason is that rules can be overridden by later rules, so the content can't be rendered until the CSSOM is complete. The render tree captures both the content and the styles: the DOM and CSSOM trees are combined into the render tree. To construct the render tree, the browser checks every node, starting from the root of the DOM tree, and determines which CSS rules apply to it.
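Here is a small sketch of that merge, under heavy simplifications: the DOM is a nested dictionary, the "CSSOM" is a flat list of tag-selector rules in source order, and there is no specificity or inheritance. It also skips nodes styled with `display: none`, which browsers leave out of the render tree. All names are made up for this example.

```python
DOM = {
    "tag": "body",
    "children": [
        {"tag": "h1", "children": []},
        {"tag": "p", "children": []},
        {"tag": "script", "children": []},
    ],
}

# Rules in source order. The last h1 rule overrides the color set by the first one.
RULES = [
    ("h1", {"color": "black", "font-size": "32px"}),
    ("script", {"display": "none"}),
    ("h1", {"color": "red"}),
]


def computed_style(tag, rules):
    """Apply every matching rule in order, so later declarations win."""
    style = {}
    for selector, declarations in rules:
        if selector == tag:
            style.update(declarations)
    return style


def build_render_tree(node, rules):
    """Walk the DOM from the root and attach styles; drop nodes that are not displayed."""
    style = computed_style(node["tag"], rules)
    if style.get("display") == "none":
        return None
    children = [build_render_tree(child, rules) for child in node["children"]]
    return {
        "tag": node["tag"],
        "style": style,
        "children": [child for child in children if child is not None],
    }


render_tree = build_render_tree(DOM, RULES)
print([child["tag"] for child in render_tree["children"]])
print(render_tree["children"][0]["style"])
```

If the browser had rendered after reading only the first rule, the heading would have been painted black and then repainted red, which is exactly what waiting for the full CSSOM avoids.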
Here is the MDN reference to learn more about the critical rendering path.
Layout
- Each node is given a size and coordinates (where it appears in the viewport)
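As a rough illustration, the simplest possible layout stacks block boxes vertically. The sketch below assumes every box spans the full viewport width and already knows its height, which real layout has to compute from content, fonts, and CSS. The `layout` function is hypothetical.

```python
def layout(blocks, viewport_width):
    """blocks: (name, height) pairs in document order. Returns one box per block."""
    boxes = []
    y = 0
    for name, height in blocks:
        boxes.append({"name": name, "x": 0, "y": y, "width": viewport_width, "height": height})
        y += height  # the next block starts where this one ends
    return boxes


boxes = layout([("h1", 40), ("p", 20), ("p", 20)], viewport_width=800)
for box in boxes:
    print(box)
```

Because each `y` depends on everything above it, changing the height of the first box moves every box after it. That dependency is why layout changes are expensive.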
Painting
- Nodes are drawn on the screen pixel by pixel
Composition
- The browser splits the page into layers (for things like z-index and transforms) and combines them in the correct order, so only the layers that changed need repainting
Now the actual first render happens, and we see something on the screen.
5. JavaScript Execution
- The Browser parses and runs JS
- JS can:
  - Change the DOM
  - Respond to user events (clicks, inputs)
  - Make new network requests (AJAX/fetch)
Here, I wrote an article on JavaScript execution context.
6. Ongoing stuff
- User interacts
- Events fire
- DOM changes
- CSS changes
- Browser re-calculates layout/paint
Render Flow Visualized:
URL → DNS → HTTP Request → HTML Response
↓
Parse HTML → DOM Tree
↓
Fetch CSS → CSSOM Tree
↓
DOM + CSSOM → Render Tree
↓
Layout → Paint → Composite → Show on Screen
All of this work is done by browser engines: a rendering engine for HTML and CSS, paired with a JavaScript engine. Here are the key browser engines under the hood.
Browser | Engine (rendering + JavaScript) |
---|---|
Chrome, Edge | Blink + V8 |
Firefox | Gecko + SpiderMonkey |
Safari | WebKit + JavaScriptCore |
Here is the MDN reference on how browsers work and a deep dive.