How do sql indexes work

 

How do sql indexes work


In this video we will discuss how indexes actually work and help improve the performance of our SQL queries. We will discuss how both the index types work - Clustered and Non-clustered.

Clustered Index Structure

Consider the following Employees table

sample employees table

EmployeeId is the primary key, so by default a clusterd index on the EmployeeId column is created. This means employee data is sorted by EmployeeId column and physically stored in a series of data pages in a tree like structure that looks like the following.

how sql server indexing works

  • The nodes at the bottom of the tree are called data pages or leaf nodes and contain the actual data rows, in our case employee rows.
  • These employee rows are sorted by EmployeeId column, because EmployeeId is the primary key and by default a clusterd index on this column is created.
  • For our example, let's say in Employees table we have 1200 rows and let's assume in each data page we have 200 rows.
  • So, in the first data page we have 1 to 200 rows, in the second 201 to 400, in the third 401 to 600, so on and so forth.
  • The node at the top of the tree is called Root Node.
  • The nodes between the root node and the leaf nodes are called intermediate levels.
  • The root and and the intermediate level nodes contain index rows.
  • Each index row contains a key value (in our case Employee Id) and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf node. 
  • So this tree like structure has a series of pointers that helps the query engine find data quickly.
  • How SQL Server finds a row by ID

    For example, let's say we want to find Employee row with EmployeeId = 1120

    database index tutorial

    1. So the database engine starts at the root node and it picks the index node on the right, because the database engine knows it is this node that contains employee IDs from 801 to 1200.
    2. From there, it picks the leaf node that is present on the extreme right, because employee data rows from 1001 to 1200 are present in this leaf node.
    3. The data rows in the leaf node are sorted by Employee ID, so it's easy for the database engine to find the employee row with Id = 1120.

    Notice in just 3 operations, SQL Server is able to find the data we are looking for. It's making use of the clustered index we have on the table. Let's look at this in action.

How SQL Server finds a row by ID

For example, let's say we want to find Employee row with EmployeeId = 1120

database index tutorial

  1. So the database engine starts at the root node and it picks the index node on the right, because the database engine knows it is this node that contains employee IDs from 801 to 1200.
  2. From there, it picks the leaf node that is present on the extreme right, because employee data rows from 1001 to 1200 are present in this leaf node.
  3. The data rows in the leaf node are sorted by Employee ID, so it's easy for the database engine to find the employee row with Id = 1120.

Notice in just 3 operations, SQL Server is able to find the data we are looking for. It's making use of the clustered index we have on the table. Let's look at this in action.

The actual execution plan is below.

clustered index seek

Notice, the operation is Clustered Index Seek, meaning the database engine is using the clustered index on the employee Id column to find the employee row with Id = 932000

  • Number of rows read = 1
  • Actual number of rows for all executions = 1

Number of rows read, is the number of rows SQL server has to read to produce the query result. In our case Employee Id is unique, so we expect 1 row and that is represented by Actual number of rows for all executions.

With the help of the index, SQL server is able to directly read that 1 specific employee row we want. Hence, both, Number of rows read and Actual number of rows for all executions is 1.

So the point is, if there are thousands or even millions of records, SQL server can easily and quickly find the data we are looking for, provided there is an index that can help the query find data.



Share this

Previous
Next Post »