Optimising database performance is crucial for maintaining the efficiency and responsiveness of applications. Certain SQL queries, when not carefully crafted, can significantly degrade database performance, especially as data volumes grow. This blog post explores common SQL queries that can lead to performance issues and how to manage and optimise these queries using DBmarlin.
1. Unoptimised JOINs:
Example:
1
2
3
| SELECT * FROM orders
JOIN customers ON orders.customer_id = customers.id
WHERE customers.city = 'New York';
|
- Issue: Unoptimised JOINs can lead to large intermediate result sets, causing excessive memory and CPU usage. Indexes on join columns are crucial to improve performance.
- Solution: Ensure that join columns are indexed and consider using INNER JOIN instead of LEFT OUTER JOIN when possible to reduce unnecessary data retrieval.
1
| CREATE INDEX IDX_ORDER_CUST_ID ON orders.customer_id;
|
Here is an example of the impact of adding an index to a join column shown in DBmarlin. The red bars are the time spent sorting which is very CPU intensive. After creating the index, sorting is no longer needed since the index is already sorted.
2. Sub-queries in WHERE clause:
Example:
1
2
| SELECT * FROM employees
WHERE department_id IN (SELECT id FROM departments WHERE name = 'Sales');
|
- Issue: Subqueries in the WHERE clause can result in inefficient execution plans, especially if the subquery returns a large number of rows.
- Solution: Use JOINs or EXISTS instead of IN with subqueries for better performance.
1
2
3
| SELECT e.* FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE d.name = 'Sales';
|
3. Non-sargable queries:
Non-Sargable refers to a type of SQL query that cannot efficiently utilise indexes to speed up the retrieval of results. SARGable stands for “Search ARGument ABLE,” which means that a query can use an index effectively.
Example:
1
2
| SELECT * FROM orders
WHERE YEAR(order_date) = 2023;
|
- Issue: Non-Sargable queries often result in full table scans or index scans, leading to slower performance.
- Solution: Rewrite queries to use indexed columns directly.
1
2
| SELECT * FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';
|
4. Wildcard searches with leading percent:
Example:
1
2
| SELECT * FROM users
WHERE email LIKE '%@example.com';
|
- Issue: Leading wildcard searches result in full table scans, significantly degrading performance.
- Solution: Avoid leading wildcards or use full-text search capabilities if available.
1
| SELECT id, name, email FROM users;
|
5. ‘SELECT *’ queries:
Example:
1
2
| SELECT * FROM orders
WHERE order_date > '2024-06-01';
|
- Issue: Retrieving all columns can lead to excessive data transfer and processing, especially if many columns are unnecessary.
- Solution: Specify only the required columns in the SELECT statement.
1
| SELECT id, order_date FROM orders;
|
6. Cartesian products:
Example:
1
| SELECT * FROM products, categories;
|
- Issue: Missing join conditions can create Cartesian products, resulting in a combinatorial explosion of rows.
- Solution: Always include appropriate join conditions.
1
2
| SELECT * FROM products p
JOIN categories c ON p.category_id = c.id;
|
7. Functions on indexed columns:
Example:
1
2
| SELECT * FROM employees
WHERE LOWER(first_name) = 'john';
|
- Issue: Functions on indexed columns can negate the benefits of indexing, leading to full table scans.
- Solution: Store data in a normalised format or use functional indexes if supported.
8. No WHERE clause - query returning too many rows:
Example:
1
| SELECT id, product_name, product_category FROM products;
|
- Issue: Returning all the rows in a table when you only need a subset of them is bad-practice. If the table is only small then the overhead many only be small but as data volumes grow this won’t scale well. This will impact on the network and application if you have to transfer and process data that isn’t required by the application.
- Solution: Always use a WHERE clause to restrict the result set to only the records that are actually needed.
1
2
| SELECT id, product_name FROM products
WHERE product_category = 'Phone';
|
How DBmarlin Can Help
DBmarlin offers powerful tools to identify and optimise performance-degrading queries:
- Query Performance Monitoring: Feature: Monitor real-time query performance, highlighting slow and resource-intensive queries. Benefit: Quickly identify problematic queries and take corrective actions.
- Query Analytics: Feature: Analyse execution plans and query performance metrics. Benefit: Gain insights into how queries are executed and identify optimisation opportunities.
- Automated Recommendations: Feature: Receive automated recommendations for query and index optimisation. Benefit: Implement best practices and optimise query performance effortlessly.
- Historical Data Analysis: Feature: Compare current query performance with historical data to detect performance regressions. Benefit: Ensure continuous performance improvements and avoid performance degradation over time.
Conclusion
Understanding and optimising SQL queries is essential for maintaining database performance. By recognising common performance-degrading queries and utilising tools like DBmarlin, you can ensure efficient and responsive database operations. Monitor query performance, implement optimisation strategies, and leverage DBmarlin’s insights to keep your database running smoothly.