Top 10 causes of database performance problems and how to troubleshoot
When a client tells you that their database is running slowly, it can be impossible to know where to start. It’s just like the old joke when someone asks for directions and is given the answer that ‘you don’t want to start here!’—you really don’t want to start diagnosing this issue without knowing where to look.
Ideally, you’d start with a monitoring tool, such as DBmarlin, already in place that has been building up a history of performance data so you can see the trend over time. This means that you can start to overlay today’s performance with your baseline performance and start to unpick what today’s issues really are.
Here is my top 10 things to look for:
- Which users are affected? - Is this a single user issue or are all users affected?
- Changes Have there been any software releases or schema changes on the system today?
- Disk space - Are any of your filesystems or drives full?
- Redo/Archive logs - has your log production jumped?
- CPU - has your database CPU consumption jumped?
- Disk I/O - Is your database suddenly doing far more I/O than before?
- SQL - what are your top resource-consuming queries?
- Locks - have you any blocking locks?
- Object sizes - Have any key tables or indexes suddenly jumped in size?
- Sessions - has the number of sessions suddenly spiked?
Taking these in turn, let’s expand a bit.
Which users are affected? Is this a single user issue or are all users affected?
- If this is a single user problem, not only is it easier to identify what is going on by talking to them, but also in reality this is a lower priority than something affecting all users. So, prioritise accordingly and grab a nice cup of coffee if you can. ☕️
- If all users are affected, then it’s a much bigger problem and you might be firefighting the problem under pressure. In this case, ensuring that you have all the information readily available and presented clearly before the event will ease the stress.
- DBmarlin lets you drill down into Users, Clients, Sessions, Programs and Database or Schema so you can easily see who is affected.
Releases - have there been any software releases or changes on the system today?
- Knowing about the software releases is key to understanding variable system performance, especially in today’s world of agile and rapid release cycles.
- Have any new database objects been dropped in?
- Have your key queries changed execution paths because of new objects or changes in object size? (i.e. have 3,000 row tables becomes 3,000,000 overnight?)
- DBmarlin can help you here with it’s change tracking function showing changes along the timeline on the landing page for your database.
Disk - are any of your filesystems or drives full?
- It may seem obvious, but servers do not react well to having key filesystems full, and neither do databases. This doesn’t just include tablespace and transaction log area, but audit file space is required as looking at trace files you may find that suddenly the app is core dumping and causing unexpected issues.
- Remember to check for space in:
- Server root mounts and drives
- Software mounts and drives
- Online transaction/redo log areas
- Archived transaction/redo log areas
- Database Temporary locations
- Audit trail locations
- Trace file locations
Transaction logs - has your log production jumped?
- An unexpected increase in transaction load can stress the transaction log system and cause issues with archived/saved logs too.
- Each RDBMS vendor has its own name for these logs.
- Oracle Redo/Archive logs
- MS-SQL Transaction Log
- PostgreSQL WAL logs
- MySQL binary logs
- Ensure your logs are sized correctly for the transaction throughput so that log switches do not affect your system.
CPU - has your database CPU consumption jumped?
- Knowing your system’s performance during key times of day is critical to diagnosing issues.
- DBmarlin’s Time Comparison feature allows you to easily see Total DB Time differences between two periods. By default it will compare the last hour with the previous one allowing you to see if something has just happened.
Disk - is your database suddenly doing far more I/O than before?
- Again - having an historical view is key to troubleshooting.
- DBmarlin’s Time Comparison feature allows you to easily see the Top Waits in your database and what has changed. Again - by default, it will compare the last hour with the previous one allowing you to see if something has just happened.
SQL - what are your top consuming queries?
- Being able to quickly determine what your top resource-consuming queries are is critical to getting to the bottom of that issue.
- DBmarlin’s ability to show you a graphical representation of what is occurring now on the landing page of your database, but once again, the Time Comparison feature allows you to pick a baseline time from this hour last week, when everything was peachy.
Locks - have you any blocking locks?
- Blocking locks can quickly hang up your key applications.
- DBmarlin can help start you in the right direction by showing the wait time on the Wait Events pie chart on your database’s landing page.
Object sizes? Have any key tables or indexes suddenly jumped in size?
- Keeping a record of the size of your database objects is a subject all on its own. Being able to track them over time allows you to see if there has been a sudden increase, altering your execution plans and killing the performance of your application’s key queries.
- DBmarlin’s Change History facility allows you to see if a release could be causing your current issue. After selecting the change you are interested in, examine the Statements to find the ones with multiple execution plans, and use the Execution Plans drop-down to compare the plans.
Sessions - has the number of sessions suddenly spiked?
- Being able to see at a glance if you are being hit by an application server that has an issue is of real value here.
- On DBmarlin’s database landing page, the DB Time will be the first thing you see in the top left corner indicating the trend. If you see it rocketing upwards, change the data of the pie chart to Sessions and Clients to identify the source of the issue.
In Summary
So in summary, troubleshooting is rarely a simple exercise but it can be made a lot easier if you have a structured approach to troubleshooting supported by a tool which visualises database performance. Having been involved in many major incidents over the years, the technical teams will communicate well and reach a common understanding of issues and deciding next steps, but I have often found it very useful to have pictures to show ‘The Management’ as a way of increasing their understanding of complex issues and gaining their buy-in as to where focus should be put.
As in common with standalone tuning exercises, don’t try and boil the ocean in fixing things - you are here to restore service. Having an historical comparison available to show you ‘what good looks like’ is incredibly useful here.
And - when you’ve restored service - stop!
Ready to try DBmarlin?
If you would like to find out more about DBmarlin and why we think it is special, try one of the links below.
- Get hands-on without an installation at play.dbmarlin.com
- Download DBmarlin from www.dbmarlin.com, with one FREE standard edition license, which is free forever for 1 target database.
- Follow the latest news on our LinkedIn Community at linkedin.com/showcase/dbmarlin
- Join our community on Slack at join-community.dbmarlin.com