Improper indexing in databases can significantly slow down query performance, leading to frustrating user experiences and inefficient application behavior. Indexes are essential for optimizing data retrieval, but when used incorrectly, they can cause more harm than good. Below are three practical examples illustrating how improper indexing can create performance bottlenecks.
Context: In an e-commerce application, customer orders are often searched by ‘customer_id’. Without an index on that column, the database must scan the entire ‘orders’ table for every such query, leading to slow response times.
To illustrate:
Table: orders (order_id, customer_id, product_id, order_date)
Query:
SELECT * FROM orders WHERE customer_id = 12345;
Result:
Without an index on customer_id, the database performs a full table scan, which can take considerable time as the number of records grows.
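The difference between a full scan and an indexed lookup can be observed from the query plan. The following is a minimal sketch, assuming SQLite through Python's built-in sqlite3 module; the index name idx_orders_customer_id, the plan helper, and the generated rows are illustrative assumptions, and other database engines expose their plans through different commands.

```python
import sqlite3

# In-memory database with the orders table from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "product_id INTEGER, order_date TEXT)"
)
# Made-up sample rows, only so the table is not empty.
conn.executemany(
    "INSERT INTO orders (customer_id, product_id, order_date) VALUES (?, ?, ?)",
    [(i % 1000, i % 50, "2023-01-01") for i in range(10_000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 12345"


def plan(sql):
    """Return the human-readable plan steps SQLite reports for a statement."""
    # Each EXPLAIN QUERY PLAN row is (id, parent, notused, detail).
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index, the plan is a scan of the whole table.
plan_before = plan(QUERY)

# Hypothetical index name; any name works.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")

# With the index, the plan becomes a search that uses it.
plan_after = plan(QUERY)

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan of orders and the second a search using the new index.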
Notes:
An index on customer_id would optimize this query, enabling the database to quickly locate all orders for a specific customer.
Context: A financial application maintains a table of transactions with indexes on multiple columns, including ‘transaction_id’, ‘user_id’, and ‘timestamp’. However, redundant indexes increase the cost of write operations, because every index must be updated on each write, and they can also slow down read queries.
For example:
Table: transactions (transaction_id, user_id, amount, timestamp)
Indexes: a single-column index on user_id, a single-column index on timestamp, and a composite index on user_id and timestamp
Query:
SELECT * FROM transactions WHERE user_id = 67890 AND timestamp > '2023-01-01';
Result:
The composite index can serve this query, but the overlapping single-column indexes add write overhead and give the query planner more candidates to weigh. If it picks a less suitable index, the query runs slower.
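Notes:
The composite index on user_id and timestamp can also serve lookups by user_id alone, so the separate index on user_id is a candidate for removal.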
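To check whether a single-column index is really redundant, one option is to drop it in a test environment and confirm that the composite index still serves the same lookups. The sketch below assumes SQLite through Python's sqlite3 module; the index names (idx_tx_user, idx_tx_ts, idx_tx_user_ts) and the plan helper are hypothetical, and planner behavior differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "transaction_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "amount REAL, timestamp TEXT)"
)
# Hypothetical index names for the three indexes in the example.
conn.execute("CREATE INDEX idx_tx_user ON transactions (user_id)")
conn.execute("CREATE INDEX idx_tx_ts ON transactions (timestamp)")
conn.execute("CREATE INDEX idx_tx_user_ts ON transactions (user_id, timestamp)")


def plan(sql):
    """Return the plan steps SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# With all three indexes present, the planner has to pick one of them.
plan_all = plan(
    "SELECT * FROM transactions "
    "WHERE user_id = 67890 AND timestamp > '2023-01-01'"
)

# Drop the single-column index whose column leads the composite index.
conn.execute("DROP INDEX idx_tx_user")

# A lookup by user_id alone can still use the composite index,
# because user_id is its leading column.
plan_by_user = plan("SELECT * FROM transactions WHERE user_id = 67890")

print(plan_all)
print(plan_by_user)
```

The index on timestamp is left in place here, since queries that filter on timestamp alone cannot use a composite index whose leading column is user_id.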
Context: In a social media application, user posts are stored in a database, and an index is created on the ‘visibility’ column, which only has a few distinct values (e.g., public, private, friends). This non-selective index does not help with performance and can actually degrade it.
Consider the following:
Table: posts (post_id, user_id, content, visibility)
Query:
SELECT * FROM posts WHERE visibility = 'public';
Result:
Since the visibility column has low cardinality, the database retrieves a vast number of rows, and the index does not sufficiently narrow down the results, causing a performance hit.
Notes:
Consider indexing a more selective column, such as user_id, or a combination of user_id and timestamp, to enhance query efficiency.
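One way to judge whether a column is worth indexing is to measure how much a typical filter on it narrows the table. The sketch below assumes SQLite through Python's sqlite3 module and uses made-up data; the selectivity and matched_fraction helpers are hypothetical names, and the sketch covers only visibility and user_id because those are the columns listed for the posts table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "post_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "content TEXT, visibility TEXT)"
)
# Made-up data: 9,000 posts spread over 500 users and 3 visibility levels.
levels = ["public", "private", "friends"]
conn.executemany(
    "INSERT INTO posts (user_id, content, visibility) VALUES (?, ?, ?)",
    [(i % 500, "post %d" % i, levels[i % 3]) for i in range(9_000)],
)


def selectivity(column):
    """Distinct values divided by total rows; closer to 1 is more selective."""
    # The column name is interpolated directly, so pass trusted names only.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM posts"
    ).fetchone()
    return distinct / total


def matched_fraction(where, params):
    """Fraction of the table that a WHERE clause matches."""
    matched = conn.execute(
        f"SELECT COUNT(*) FROM posts WHERE {where}", params
    ).fetchone()[0]
    total = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return matched / total


# A visibility filter matches a third of the table in this data set.
public_fraction = matched_fraction("visibility = ?", ("public",))
# A user_id filter matches only a handful of rows.
user_fraction = matched_fraction("user_id = ?", (42,))

print(public_fraction, user_fraction)
print(selectivity("visibility"), selectivity("user_id"))
```

When a filter still matches a large share of the table, as the visibility filter does here, reading through the index saves little over scanning the table directly.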