News

Tip of the Week by Tiago Oliveira - Queries in databases with a lot of information

November 7, 2020
In large database scenarios we should be careful about how queries are executed.

In large database scenarios we should be careful about how queries are executed.

I share some scenarios to consider:

Queries without upper limit of information return
Let's say we have a query that lets you get all the records in a table. If the table has 100 records OK, if it has 1000 records beware, if it has 10 million chaos.

Action - Ensure that there is always a maximum number of records to return regardless of the customer's request. Limiting for example the maximum number of 100 records to return.

Queries that perform a full table scan

For those who don't know, a full table scan is an action that requires access to all the information in a table. Let's say we have a table with 10 million customer info records. If we try to get the list of Oporto customers, the database will have to go through all 10 million records in the table to find Oporto customers. Scrolling through all records in a table is called a “full table scan”. A query that does full table scan will increase the execution time as the table volume grows, so these problems are not usually discovered at the beginning of implementations, because at that time we do not work with large volumes of information.

Action - To address the execution of “full table scans”, we must create indexes, which are more or less organized shortcuts that make it easier to obtain information. As an example for the scenario above, we would create an index with the customer's city. The database would create a city-organized structure with a pointer to each record in the table. So when we asked for Oporto customers, the database would go first to the index to get the eligible customers, avoiding having to go through all 10 million records.

We can validate the execution of actions performed by the database after executing a query, obtaining its execution plan. An execution plan shows the sequence of actions performed by the database to obtain the desired information.

It will be in the execution plan that we can get the information whether or not a full table scan is being performed.

queries_table_1.webp

 

More info:

https://en.wikipedia.org/wiki/Full_table_scan

https://en.wikipedia.org/wiki/Database_index