Kelsey Uebelhor is Director of Product Marketing at VividCortex.
Editor's Note: In this three-part series of articles, we look at various approaches to database monitoring that can improve app performance and availability, online customer experience, and engineering team productivity. In this article, we address poor app performance.
Many components contribute to an app's performance, but the database sits at the foundation of the technology stack. When an app performs poorly, the cause is often a problem at the database level, such as server stalls, bad query performance, or poor latency.
These are just three of the common database-related challenges engineering teams encounter every day. Good database performance monitoring can help DBAs, developers, and engineers quickly diagnose and resolve these issues. But what do you need to know so you can choose the right approach?
To answer this question, you need to start by understanding where application monitoring ends and database monitoring begins.
Application Monitoring is Not Enough
Today, businesses build modern apps by deliberately making their multi-tier architecture mostly stateless, which makes those apps easy to manage and scale. But this also makes the apps highly demanding: they send countless, and sometimes arbitrary, queries against the databases and assume those queries will perform well. Because the databases are the “stateful” tier, all the heavy lifting is delegated to them.
While the primary concern for most businesses is application performance, that does not mean you can focus only on monitoring the application using Application Performance Monitoring (APM) tools. APM tools can help you identify slow application transactions, but they typically cannot help to diagnose or resolve the problem. Issue identification is obviously important, but to quickly diagnose and fix the problem you need to drill down into the database.
Database Monitoring Essentials
Database monitoring involves much more than graphing counters and CPU utilization trends over time. In a complex, modern architecture, databases can be a leading cause of system performance problems, and getting to the source of issues requires digging a bit deeper. Effective database monitoring rests on three capabilities: query monitoring and workload analysis, drill-down and zoom-in, and anomaly detection.
- Query Monitoring and Workload Analysis: The database’s sole reason for existing is to run a workload, that is, to execute queries. Therefore, the only way to know if the database is healthy and performing well is to measure what matters: whether its workload is completing successfully, quickly, and consistently. Queries need to be measured in aggregate as well as individually, and they need to be deeply analyzed. Query workloads are huge and complex datasets. DBAs and engineering teams need to be able to: drill down to individual statement executions to examine the specifics; automate capture and analysis of query execution plans (EXPLAIN plans); aggregate and segment queries to find the big problems fast; and compare workload changes over separate time periods.
- Drill-Down and Zoom-In: To monitor large, distributed, hybrid systems effectively, monitoring tools must present an aggregate view and enable rapid zoom-in and drill-down to the finest level of detail. Without the high-level view, monitoring isn’t scalable; but without the deep dive, you can’t solve problems effectively. You need rapid drill-down and zoom-in across multiple dimensions, including hosts, tags, users, roles, and time ranges. For high-volume, highly scalable environments you need at least one-second granularity and microsecond precision, as query latency is often sub-millisecond. You must also be able to drill down to individual queries, disks, or processes.
- Anomaly Detection: For database monitoring, the volume of data is typically several orders of magnitude greater than that generated by basic system monitoring. Humans cannot begin to process this volume and complexity of data. Traditional monitoring tools let you generate alerts on static thresholds, such as “CPU exceeded 90 percent.” But this approach does not scale. Systems are constantly changing; what’s normal at 2 a.m. is very different from what you should expect at 2 p.m., and what was once a meaningful threshold can become inconsequential in another context. That’s why modern monitoring tools offer some form of anomaly detection. For database monitoring this capability is particularly important, due to the variability and volume of the data being collected. With anomaly detection, you can do things like automatic baselining and seasonality prediction, and you can use “lateral logic” metrics such as “time-to-disk-full” instead of simply alerting when the disk is already full.
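To make the "aggregate and segment queries" idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular monitoring product works: the `digest` and `summarize` helpers, the sample statements, and the latencies are all hypothetical. It collapses statements that differ only in their literal values into one fingerprint, then ranks each fingerprint by the total time it consumed.

```python
import re
from collections import defaultdict

def digest(sql: str) -> str:
    """Reduce a statement to a rough fingerprint by replacing literals with '?'.

    Hypothetical helper: a real tool would use a proper SQL tokenizer.
    """
    sql = re.sub(r"'[^']*'", "?", sql)           # quoted string literals
    sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()

def summarize(samples):
    """Group (sql, latency_seconds) samples by digest, ranked by total time."""
    groups = defaultdict(list)
    for sql, latency in samples:
        groups[digest(sql)].append(latency)
    rows = [
        {"digest": d, "count": len(v), "total_s": sum(v), "max_s": max(v)}
        for d, v in groups.items()
    ]
    # The families that consume the most total time come first.
    return sorted(rows, key=lambda r: r["total_s"], reverse=True)

# Made-up samples: three lookups on orders, one slower lookup on users.
samples = [
    ("SELECT * FROM orders WHERE id = 42", 0.0004),
    ("SELECT * FROM orders WHERE id = 77", 0.0006),
    ("SELECT name FROM users WHERE email = 'a@example.com'", 0.0300),
    ("select * from orders where id = 9", 0.0005),
]

for row in summarize(samples):
    print(row)
```

Ranking by total time rather than by count is a deliberate choice in this sketch: a query family that runs rarely but slowly can outweigh one that runs constantly but cheaply, and the per-family rows are the starting point for drilling down to individual executions.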
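The “time-to-disk-full” metric mentioned in the anomaly detection item can be approximated in a few lines. The Python sketch below assumes disk usage grows roughly linearly over the sampled window, which is a simplification; the function name, sample values, and one-hour alert window are hypothetical, and it does no baselining or seasonality modeling.

```python
def seconds_until_full(samples, capacity_bytes):
    """Estimate seconds from the last sample until the disk is full.

    samples: list of (unix_time_seconds, used_bytes), oldest first.
    Returns None when there is too little data or usage is flat or shrinking.
    Assumption: growth is approximately linear over the sampled window.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Least-squares slope: bytes of growth per second.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    _, last_used = samples[-1]
    return max(0.0, (capacity_bytes - last_used) / slope)

# Made-up readings: usage grows by 60 bytes every 60 seconds.
readings = [(0, 100), (60, 160), (120, 220), (180, 280)]
eta = seconds_until_full(readings, capacity_bytes=1000)

ALERT_WINDOW_S = 3600  # hypothetical policy: warn if full within an hour
if eta is not None and eta < ALERT_WINDOW_S:
    print(f"warning: disk projected to fill in {eta:.0f} seconds")
```

Alerting on the projected time remaining, rather than on a fixed percentage, means the same rule can fire early on a fast-growing volume and stay quiet on a nearly full but static one.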
As the engine of high-performance systems, databases need instrumentation that enables fast discovery of issues and the fine-tuning necessary to maintain uptime and scalability. Facing constant growth and complexity, the job for developers, engineers, and DBAs is only getting tougher. Having complete visibility into performance metrics helps users better understand the many ways databases affect overall app performance and availability, and what to optimize to keep apps running at peak performance.
In part two of the series, “How Database Monitoring Can Eliminate Problems Before Customers Notice,” we will discuss how you can use proactive monitoring to prevent issues before they ever occur.
Opinions expressed in the article above do not necessarily reflect the opinions of Data Center Knowledge and Penton.