Image courtesy of Grant Cochrane / FreeDigitalPhotos.net
SQL: Story of Indexes
I know many people know everything about this topic but as this is one of my favorite topics, I will make sure that I will give you some additional information or knowledge.
My first thought about Indexes is that ‘It is a solution’. I assume that you will ask me ‘What was the problem behind it?’
The problem lies in the way our regular data is stored. We know that data is stored on the heap.
What is the Heap?
A heap file is an unordered set of records where Records can be inserted and deleted. Also, Records are uniquely identified by a record-id. We can use this record-id to retrieve the specific data.
Heap file supports sequential scan of records. We can run our SQL query to retrieve the data but efficiency will always be on the lower side.
If we make some explicit correction and introduce some kind of order to this unordered set of data, we can reach a certain level of efficiency.
What do we need to do?
We need to locate a column based on some specific requirements and used it as a reference to retrieve the entire row of data. When we fire a SQL query to search for a row of data by providing a conditional value, SQL Server first finds that value in this special column, and retrieve the corresponding entire row of data.This column is normally called as Index.
What are the specific requirements?
Indexes can be created on most columns in a table except if the data-type is large object (LOB) data types, such as image, text, and varchar(max). It may also uniquely identify the row. if not, the output result might contain more that one row.
How Index-Seek happens?
We need some knowledge of B-tree architecture to understand Index-seek.
Indexed column values are arranged as per leaf nodes shown above. Each of the group is referenced as pages and also pointed as index level pages. When a SQL query is run having some conditional value on index. The query engine starts from the root level and follows down till the leaf node, narrowing down the result set. Once query engine reaches the index level, it identifies and retrieves the exact result set expected. For example, let’s search for a value 60 in indexed column, the query engine starts from the root level and traverse down towards the right side and finally reaches the leaf node level. Here, the query engine identifies the value 60.
This operation is called index-seek where instead of going for sequential scan of the whole table, a B-tree traversal is done to seek the value on an indexed column.
Next in line:
1) Clustered Index and Non-Clustered Index