Browsers do a lot of heavy lifting to give users a seamless experience: making network requests over unreliable connections, rendering content, handling user interactions, and managing data, storage, passwords, and more. In this article, let's discuss the high-level structure of a typical browser and how it optimizes things along the way.
-
User Interface: Includes the address bar, back/forward buttons, bookmarking menu, etc. This covers every part of the browser display except the window where you see the requested page.
-
Browser Engine: Marshals actions between the UI and the rendering engine.
-
Rendering Engine: Responsible for displaying requested content. For example, the rendering engine parses HTML and CSS, and displays the parsed content on the screen.
-
Networking: For network calls such as HTTP requests, using different implementations for different platforms (behind a platform-independent interface).
-
UI Backend: Used for drawing basic widgets like combo boxes and windows. This backend exposes a generic interface that is not platform specific. Underneath it uses operating system user interface methods.
-
JavaScript Engine: Interpreter used to parse and execute JavaScript code.
-
Data Storage: This is a persistence layer. The browser may need to save data locally, such as cookies. Browsers also support storage mechanisms such as localStorage, IndexedDB, and the File System API.
-
User Interface Interaction: This includes the back and forward buttons, bookmarks, and the refresh and stop buttons. The browser UI must stay responsive and should not block the rendering of the content. Most of these interactions are handled by the browser's own UI code rather than by the page's JavaScript engine.
-
Rendering Engine: This includes parsing HTML and CSS and rendering the parsed content on the screen. The rendering engine parses the HTML to build the DOM tree and parses the CSS to build the style rules (the CSSOM). The two are combined into the render tree, which contains the visual information of the elements that will appear on the screen. The render tree is then laid out to compute the position and size of each element, and finally painted to display the content on the screen.
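The stages above (parse, build the render tree, lay out, paint) can be sketched as a toy model. This is only an illustration under heavy simplifying assumptions, not how a real engine works: the names `Node`, `build_dom`, `build_render_tree`, `layout`, and `paint` are hypothetical, styles are matched by tag name only, layout just stacks text boxes vertically at a fixed height, and "painting" records draw commands in a list.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)
    text: str = ""


class DomBuilder(HTMLParser):
    """Builds a tiny DOM tree from well-formed HTML (toy parser)."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def build_dom(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def build_render_tree(node, styles):
    """Combine DOM nodes with style rules; display:none nodes are dropped."""
    style = styles.get(node.tag, {})
    if style.get("display") == "none":
        return None
    kids = [build_render_tree(child, styles) for child in node.children]
    return {"tag": node.tag, "text": node.text, "style": style,
            "children": [k for k in kids if k is not None]}


def layout(render_node, y=0, line_height=20):
    """Give every box that has text a vertical position; return the next free y."""
    if render_node["text"]:
        render_node["y"] = y
        y += line_height
    for child in render_node["children"]:
        y = layout(child, y, line_height)
    return y


def paint(render_node, out):
    """Record one draw command per laid-out box instead of drawing on a screen."""
    if "y" in render_node:
        out.append((render_node["y"], render_node["tag"], render_node["text"]))
    for child in render_node["children"]:
        paint(child, out)
    return out


html = "<body><h1>Title</h1><script>x()</script><p>Hello</p></body>"
styles = {"script": {"display": "none"}}  # stand-in for parsed CSS rules
tree = build_render_tree(build_dom(html), styles)
layout(tree)
commands = paint(tree, [])
print(commands)
```

In this sketch the `script` element exists in the DOM but never reaches the render tree, which mirrors the idea that the render tree only holds what is actually displayed.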
-
Networking: Handles network calls such as HTTP requests. A server may support different versions of HTTP (for example HTTP/1.1, HTTP/2, or HTTP/3), and the browser should be able to handle whichever one is used. The browser should also handle different types of content, such as HTML, CSS, JavaScript, images, and videos, as efficiently as possible.
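One way to picture how different content types are handled is a dispatch on the response's `Content-Type` header. The sketch below is a simplification: the `HANDLERS` table, the subsystem names, and the `route_response` function are hypothetical, and real browsers apply further checks (such as content sniffing) before deciding what to do with a response.

```python
def parse_content_type(header):
    """Split 'text/html; charset=utf-8' into ('text/html', {'charset': 'utf-8'})."""
    media_type, *params = [part.strip() for part in header.split(";")]
    parsed = {}
    for param in params:
        if "=" in param:
            key, value = param.split("=", 1)
            parsed[key.strip().lower()] = value.strip().strip('"')
    return media_type.lower(), parsed


# Hypothetical mapping from media type to the subsystem that consumes it.
HANDLERS = {
    "text/html": "html parser",
    "text/css": "css parser",
    "text/javascript": "javascript engine",
    "application/javascript": "javascript engine",
}


def route_response(content_type_header):
    """Pick a consumer for a response based on its declared media type."""
    media_type, _ = parse_content_type(content_type_header)
    if media_type in HANDLERS:
        return HANDLERS[media_type]
    top_level = media_type.split("/", 1)[0]
    if top_level == "image":
        return "image decoder"
    if top_level in ("audio", "video"):
        return "media pipeline"
    return "download manager"  # assumption: unknown types are offered as downloads


print(route_response("text/html; charset=utf-8"))
print(route_response("image/png"))
```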
-
JavaScript Interpreter: Used to parse and execute JavaScript code. This must be done in isolation from the host environment as the code can be malicious. The JavaScript engine should be fast and efficient.
Google Chrome uses the V8 JavaScript engine, which is written in C++. Firefox uses SpiderMonkey, Safari uses JavaScriptCore, and earlier versions of Microsoft Edge used Chakra (it now uses V8). These engines apply many techniques to make code performant, such as just-in-time compilation and efficient garbage collection.
-
Device Interaction: Browsers can access device hardware such as the camera, microphone, GPS etc. These are exposed as Web APIs and are accessed using JavaScript.
-
Storage: Browsers can store data locally using cookies, localStorage, sessionStorage, IndexedDB, the File System API, etc. This data is kept in a sandboxed environment and is scoped to the site that stored it, so other websites cannot read it.
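The isolation between websites can be illustrated with a store that is partitioned by origin, where an origin is the combination of scheme, host, and port. The sketch below is a minimal in-memory model: `OriginStorage`, `origin_of`, and the example URLs are hypothetical, and it models localStorage-style scoping only (cookies follow their own domain and path rules).

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) tuple that identifies a web origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


class OriginStorage:
    """Toy key-value store partitioned by origin (hypothetical, in-memory)."""

    def __init__(self):
        self._buckets = {}

    def set_item(self, page_url, key, value):
        self._buckets.setdefault(origin_of(page_url), {})[key] = value

    def get_item(self, page_url, key):
        return self._buckets.get(origin_of(page_url), {}).get(key)


store = OriginStorage()
store.set_item("https://example.com/login", "theme", "dark")
print(store.get_item("https://example.com:443/settings", "theme"))  # same origin
print(store.get_item("https://other.example/", "theme"))  # different origin
```

The first lookup succeeds because both URLs share one origin; the second returns `None` because a page from a different origin gets its own empty bucket.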
-
DevTools: Browsers come with developer tools to help developers debug their code. This includes inspecting the DOM, network activity, storage, performance, privacy, SEO etc. DevTools also allow developers to run JavaScript code in the context of the page.
-
Extensions: Browsers can be extended with add-ons that provide extra functionality, for example ad blockers and password managers. These extensions can access the browser's APIs, subject to user permissions and the browser's security model.
-
Cache: Browsers cache resources to reduce load times. This includes caching images, CSS, JavaScript, etc. Caching is controlled by both the browser and the server, using Cache-Control headers.
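To make the role of the Cache-Control header concrete, here is a small sketch of how a cache might decide whether a stored response can be reused. It is deliberately incomplete: `parse_cache_control` and `can_reuse` are hypothetical helpers, only the `max-age`, `no-cache`, and `no-store` directives are considered, and the heuristic freshness and revalidation that real browsers perform are left out.

```python
def parse_cache_control(header):
    """Parse 'public, max-age=3600' into {'public': True, 'max-age': '3600'}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives


def can_reuse(cache_control_header, age_seconds):
    """Decide whether a stored response may be served without contacting the server."""
    directives = parse_cache_control(cache_control_header)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if isinstance(max_age, str) and max_age.isdigit():
        return age_seconds < int(max_age)
    return False  # no explicit lifetime: this sketch stays conservative


print(can_reuse("public, max-age=3600", age_seconds=120))
print(can_reuse("max-age=60", age_seconds=120))
```

The first call reports a fresh response (120 seconds old with a one-hour lifetime), while the second reports a stale one that would need to be fetched or revalidated.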