Back

B-Trees and B+Trees in Databases - High Level Design

MD Rashid Hussain
MD Rashid Hussain
Oct-2024  -  8 minutes to read
Image source attributed to: https://www.exasol.com

Databases are the backbone of modern applications. They are the storage layer for the data that the application needs to store and retrieve. The data is stored in tables, and the tables are stored in files. The files are stored on disk.

The data is stored in the form of rows and columns. The rows are the individual records, and the columns are the individual fields in the record.

A good database system should be able to store and retrieve data quickly and efficiently. The data should be stored in such a way that it can be retrieved quickly and efficiently. Meaning thereby, we want to minimize the time it takes to retrieve/manipulate the data, and also minimize the amount of space it takes to store the data.

There are certain ways to search for data in a table:

  1. Full Table scan Here we scan the entire table to find the data we are looking for, one by one, for each entry in the database. This is the slowest way to search for data.
  2. Optimized full table scan Here we try to reduce the search time by Paralellizing the search into multiple machines or multiple threads, or partitioning the table into multiple chunks based on key.
  3. Create an Index Here we try to reduce the search time by creating an index on the table. An index is a data structure that is used to speed up the search of data in a table. It is like a table of contents in a book. It is a separate data structure that is stored in memory and is used to quickly locate the data in the table.

In all the scenarios above, we work to reduce the search space. There is a saying

When you need to work with a table with a billion rows, you need to avoid working with a billion rows

  • Find any way to avoid working with large datasets
  • Try to segment and eliminate things which is not going to give you the results

Indexing is a way to reduce the search space. An index is a data structure that keeps track of the values in a table.

It is a way to organize the data in such a way that it can be searched quickly and efficiently. Search only the things that you are sure, that the value youre looking at, is in that space.

Lets say you want to distribute the data such that you half the search space at each step. You can use a binary search tree. Good start! You can search/insert/delete for a value in O(log n) time.

A simple binary search tree

flowchart TB A[5] B[2] C[7] D[1] E[3] F[6] G[8] A --> B A --> C B --> D B --> E C --> F C --> G

Notice, all the values to the left of a node are smaller than the node, and all the values to the right of a node are greater than the node.

But the problem is that the binary search tree is not balanced in real-life scenarios. It can be skewed to one side, and in that case, the search time increases to O(n).

Take this example: You have a table with an auto-increment id as the key.

  • Insert first record: key = 1, added the root of the tree
  • Insert second record: key = 2, inserted to the right of 1 (2 is bigger than 1)
  • Insert third record: key = 3, inserted to the right of 2 (3 is bigger than 2)
  • ... and so on

This way, the tree becomes a full table linear scan.

Also, in case of range queries, you got the first value, then you have to hop back and forth for the next data untill you hit the end of range. It would be good if we should be able to balance this tree, introducing B-Tree

B-Tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children.

A B-Tree with order 3

flowchart TB A["25"] B["10"] C["35, 50"] D["5"] E["15, 20"] F["30"] G["39"] H["60, 70"] A --> B A --> C B --> D B --> E C --> F C --> G C --> H

Notice, the values are sorted in the node, and the children are sorted in the node. The values in the left child are smaller than the value in the node, and the values in the right child are greater than the value in the node.

We can have a B-Tree with a higher order, say 100. This way, we can have a lot of values in a single node, and we can reduce the number of hops to find the value.

Searching

  • Start from the root
  • If the value is less than the root, go to the left child
  • If the value is greater than the root, search in the right elements of the node
  • Similarly go down the tree untill you find the value
  • If the value is present, it will be found in one of the internal or the leaf nodes

B-Trees often suck in range queries. You have to hop back and forth to find the next value in the range. This is where B+ Trees come into play.

B+Trees are a variation of B-Trees. They are optimized for range queries. They are used in databases and file systems. They are used to store and retrieve data quickly and efficiently.

The main difference between a B-Tree and a B+Tree is that the data is stored in the leaf nodes of the tree. The internal nodes of the tree are used to store pointers to the leaf nodes. This makes it easier to search for data in the tree. The leaf nodes form a linked list, which makes it easier for linear sequential scan data for range queries in the tree.

A B+Tree of order 3

                [25]
           /           \
       [10]          	[35,      50]
      /    \         	/     |      \
[5,10]->[15,20]->[25,30]->[35,40]->[50,60,70]

Notice, the values are sorted in the leaf nodes, and the leaf nodes are linked together. The internal nodes are used to store pointers to the leaf nodes.

Searching in a B+Tree is similar to searching in a B-Tree. You start from the root and go down the tree untill you find the value. If the value is present, it will be found in one of the leaf nodes.

B+Trees are optimized for range queries. You can easily scan the leaf nodes of the tree to find the values in a range. This makes them ideal for databases and file systems.

The operating system stores data in the form of pages in the disk. A page is a fixed-size block of data that is read from or written to the disk. The size of a page is typically four or eight kilobytes.

When you read data from the disk, you read a page at a time. When you write data to the disk, you write a page at a time.

This is where the B+Tree shines. The leaf nodes of the B+Tree are stored in the pages of the disk. This makes it easy to read and write data from the disk. You can read a page at a time and search for the value in the page.

The internal nodes of the B+Tree are stored in memory. This makes it easy to search for the value in the tree. You can search for the value in the internal nodes and follow the pointers to the leaf nodes.

The B+Tree is a good way to reduce the search space and speed up the search for data in a table.

But in case of B-Trees, the internal nodes also store the data, which makes it difficult to search for the value in the tree. You have to search for the value in the internal nodes and follow the pointers to the leaf nodes.

B+Trees are a variation of B-Trees that are optimized for range queries. They are used in databases and file systems to store and retrieve data quickly and efficiently. They are a good way to reduce the search space and speed up the search for data in a table.