Vic Nyman is the co-founder and COO of BlueStripe Software. Vic has more than 20 years of experience in systems management and APM, and has held leadership positions at Wily Technology, IBM Tivoli, and Relicore/Symantec.
If you’ve ever watched a home improvement show, there’s one truism that always comes out: “Use the right tool for the right job.” With that context in mind, it seems logical for a server administrator to use a server management tool when trying to manage their servers.
But servers don’t operate in vacuums, and they don’t operate simply for the sake of running. Infrastructure systems are designed and deployed to run business services through complex sets of applications and transactions. In these complex systems, a server management tool is no longer the “right tool” for managing server performance and availability. Server management requires a broader view that places the server being managed in the context of the applications and business services that are being provided.
So what to use? The best tool for managing server performance and availability in production distributed applications is a transaction monitoring tool that spans the servers, the distributed applications, and the transactions those applications deliver. Server administrators need a tool that shows them which distributed applications run on their servers, which applications rely on those servers, what each server’s back-end dependencies are, where a server has failed or poorly performing connections, and what rogue applications and processes are running.
Know Which Applications are Running on the Server
Systems administrators need to know which applications are deployed on the managed server and what connections they’re serving or making. Operating system resource metrics alone can’t measure the performance of these connections. A list of applications and their connections provides a reality check on how the server is actually behaving.
Of course, the list shouldn’t just be created and shown in a vacuum. Tracking how each application on a managed server performs, and which servers its individual processes communicate with, is critical to understanding whether a server is configured properly. It also helps system administrators determine whether there’s a performance issue and whether other administrators should be alerted.
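To make the idea concrete, here is a minimal sketch of grouping connection data by application. It assumes connection records have already been collected by some monitoring agent; the `Connection` record, its field names, and the sample values are hypothetical and not taken from any particular product.

```python
from collections import defaultdict
from typing import NamedTuple


class Connection(NamedTuple):
    """Hypothetical record for one observed connection on the managed server."""
    process: str        # application process on the managed server
    direction: str      # "inbound" (serving) or "outbound" (making)
    peer: str           # remote host:port
    response_ms: float  # measured response time for the connection


def summarize_by_application(connections):
    """Group connections per process; report peers and mean response time."""
    grouped = defaultdict(list)
    for conn in connections:
        grouped[conn.process].append(conn)

    summary = {}
    for process, conns in grouped.items():
        summary[process] = {
            "peers": sorted({c.peer for c in conns}),
            "mean_response_ms": sum(c.response_ms for c in conns) / len(conns),
        }
    return summary


# Illustrative sample data only.
SAMPLE = [
    Connection("httpd", "inbound", "10.0.0.5:51234", 12.0),
    Connection("httpd", "outbound", "10.0.1.9:8080", 48.0),
    Connection("java", "outbound", "10.0.2.3:1521", 30.0),
]
```

Calling `summarize_by_application(SAMPLE)` yields one entry per application with the peers it talks to and an average response time, which is the kind of per-application view that operating system metrics cannot supply.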
Know What Servers and Ports Rely On the Server
From a server perspective, front-end connections are the specific servers and ports sending requests to the applications running on the server. To understand the impact of a server on the broader service, it’s critical to know which services rely on the server in the first place.
Once front-end connections are known, seeing the performance of each request to the server is the key to understanding a server’s impact on business services. Resource-centric server monitoring tools can’t even see the broader context, let alone identify which system initiated a given request.
Having this knowledge empowers server administrators and IT Operations teams:
- Server owners can instantly tell if any of their servers are involved in a given application.
- IT Operations teams use visibility into front-end connections to know which servers (and applications) would be impacted if a given server were to go off-line.
- These features allow any IT team member to quickly know whether or not they should be involved in a bridge call to fix an outage.
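The impact question in the list above can be sketched as a walk over observed caller-to-callee relationships. This is only an illustration under stated assumptions: the `(caller, callee)` pairs and the host names are hypothetical, and a real tool would discover them from live traffic rather than a hard-coded list.

```python
from collections import defaultdict, deque


def impacted_by_outage(calls, failed):
    """Return every system that directly or indirectly sends requests to `failed`.

    `calls` is an iterable of (caller, callee) pairs, i.e. observed
    front-end connections.
    """
    callers = defaultdict(set)
    for caller, callee in calls:
        callers[callee].add(caller)

    impacted = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            # Skip the failed server itself in case the call graph has a cycle.
            if caller != failed and caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted


# Illustrative topology only.
CALLS = [
    ("web01", "app01"),
    ("web02", "app01"),
    ("app01", "db01"),
    ("batch01", "db02"),
]
```

With this sample, an outage of `db01` reaches `app01`, `web01`, and `web02`, while the owner of `batch01` can tell at a glance that the bridge call does not concern them.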
Know the Back End Dependencies
Back-end dependencies for a server are exactly what they sound like – the processes, servers, and systems that the managed server calls in the course of executing its functions. These systems can be anything from a secure Lightweight Directory Access Protocol (LDAP) system to a large database or a mainframe.
The server administrator/owner will want to know specifically which servers and processes any given server is talking to, when they’re called, and the performance of those calls:
- Knowing which other systems any given server is dependent on (application-by-application) provides a checklist of where to check for problems when an application slows down at the server.
- Having response time data linked to specific technical protocols (such as SQL or MQSeries) provides additional data needed to solve problems.
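As a rough sketch of that checklist, the function below averages response times per back-end system and protocol and lists the slow ones first. The input tuples, the threshold, and the host names are assumptions made for illustration; they do not come from any specific monitoring product.

```python
from statistics import mean


def slow_dependencies(calls, threshold_ms):
    """Build a slowest-first checklist of back-end dependencies.

    `calls` is an iterable of (backend, protocol, response_ms) tuples.
    Only dependencies whose mean response time exceeds `threshold_ms`
    are returned.
    """
    by_dependency = {}
    for backend, protocol, response_ms in calls:
        by_dependency.setdefault((backend, protocol), []).append(response_ms)

    averages = {dep: mean(times) for dep, times in by_dependency.items()}
    slow = [(dep, avg) for dep, avg in averages.items() if avg > threshold_ms]
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Illustrative measurements only.
BACKEND_CALLS = [
    ("db01", "SQL", 20.0),
    ("db01", "SQL", 40.0),
    ("mq01", "MQ", 250.0),
    ("ldap01", "LDAP", 5.0),
]
```

Sorting slowest first keeps the list usable as a checklist: when an application slows down at the server, the administrator starts with the dependency at the top.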
See Problem Connections and Failed Connections
About this time, you’re thinking: “Failed connections? I’ve got that covered with my network monitoring system.” If every problem connection came down to a hub or router that stopped working, then a network tool would be sufficient. Unfortunately, failed connections occur all the time on working network channels.
More than just seeing network connections, you have to understand the application flow and track each process-to-process connection for proper monitoring. When application connections fail but the network connections are working, a network tool cannot know that a problem even exists, let alone solve it.
Connection monitoring can quickly isolate two specific causes:
- DNS Issues – DNS server changes can take a while to propagate. Until they do, each request can pick up extra DNS resolution delay, a hidden latency that can add up to a big slow-down.
- Load Changes – understanding changes in the volume of requests between any two infrastructure systems allows IT Operations to cut through the haze and know whether something is even wrong.
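Both checks can be approximated with very little code. The sketch below times a name lookup using Python’s standard `socket.getaddrinfo` and computes the relative change in request volume against a baseline. The function names and the idea of passing in a substitute resolver are illustrative assumptions, not features of any particular monitoring tool.

```python
import socket
import time


def dns_lookup_ms(hostname, resolver=socket.getaddrinfo):
    """Time a single name resolution, in milliseconds.

    `resolver` defaults to the system resolver; a substitute can be passed
    in for testing without network access.
    """
    start = time.perf_counter()
    resolver(hostname, None)
    return (time.perf_counter() - start) * 1000.0


def load_change(baseline_requests, current_requests):
    """Relative change in request volume between two systems.

    Returns 0.5 for a 50% increase over baseline, -0.5 for a 50% drop.
    """
    if baseline_requests == 0:
        raise ValueError("baseline_requests must be non-zero")
    return (current_requests - baseline_requests) / baseline_requests
```

A lookup time that creeps up after a DNS change, or a request volume that moves sharply against its baseline, gives IT Operations a concrete number to look at before anyone starts guessing.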
Handling Rogue Applications and Processes
When anti-virus is running in the datacenter, signature file updates are usually configured to occur in off-peak hours (typically the middle of the night). You’d like to think the attitude could be “set it and forget it,” but too many times, for one reason or another, all the servers start updating their signature files at 2:45 p.m., right before the peak login rush.
If IT Operations isn’t able to detect when a rogue application starts depleting critical resources, well, that’s the definition of a “spin your wheels” problem. The biggest risk, naturally, isn’t being hit by an application process that can be planned for (like ensuring virus updates occur at the right time). No, the problem with rogue processes is that you don’t actually know that they are depleting your resources.
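One simple way to surface this kind of problem is to flag any process that consumes heavy resources outside the window in which it is expected to. The sketch below assumes CPU samples are already being collected; the process names, the window table, and the threshold are hypothetical.

```python
from datetime import time


def in_window(moment, window):
    """True if `moment` falls inside the (start, end) time-of-day window."""
    start, end = window
    if start <= end:
        return start <= moment <= end
    # Window wraps past midnight, e.g. 23:00 to 02:00.
    return moment >= start or moment <= end


def find_rogue(samples, allowed_windows, cpu_limit):
    """Flag heavy CPU use by unknown processes or outside an allowed window.

    `samples` is an iterable of (process, time_of_day, cpu_percent).
    `allowed_windows` maps a process name to its permitted heavy-use window.
    """
    rogue = []
    for process, moment, cpu in samples:
        if cpu < cpu_limit:
            continue
        window = allowed_windows.get(process)
        if window is None or not in_window(moment, window):
            rogue.append((process, moment, cpu))
    return rogue


# Illustrative data only.
WINDOWS = {"av_update": (time(1, 0), time(4, 0))}
SAMPLES = [
    ("av_update", time(2, 30), 85.0),     # heavy, but inside its window
    ("av_update", time(14, 45), 90.0),    # heavy at 2:45 p.m.
    ("httpd", time(14, 45), 20.0),        # below the limit
    ("unknown_job", time(14, 50), 75.0),  # heavy, with no window defined
]
```

Here `find_rogue(SAMPLES, WINDOWS, 70.0)` reports the 2:45 p.m. signature update and the unknown job, and leaves the scheduled overnight update alone.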
Transaction Monitoring – Tying Together Infrastructure, Applications and Transactions
IT Operations and server owners need to manage their servers from a service perspective. The best place to find these features is an end-to-end transaction monitoring solution. That way, we can manage our servers’ performance from the point-of-view of our users.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.