In my last installment, I discussed a few different areas where data center monitoring automation can not only make life in the data center more convenient but also become a force multiplier. I ran out of space, however, before I ran out of ideas (the story of my life). The one thing I didn't cover was the automation you can implement in response to an alert.
As a data center professional, you probably have a solid understanding of monitoring and alerting already, but to truly appreciate how automation can relieve an enormous burden, it may be helpful to review a few examples.
What follows are some clippings from my garden of automation: alert responses that have had a huge impact on the environments where they were implemented.
Example 1: Disk Full
Disk-full alerting is a simple concept with a deceptively large number of moving parts, so I want to break it down into specifics. First, get the alert right. As my fellow SolarWinds Head Geek Thomas LaRock and I discussed in a recent episode of SolarWinds Lab, simplistic disk alerts help nobody. If you have a 2TB disk, alerting when it's 90 percent used means the alert fires while 204.8GB of disk space is still free.
A good solution to this problem is to check for both percent used and remaining space. A better solution is to include logic in the alert that tests for the total space of the drive, so that drives with less than 1TB of space have one set of criteria and drives with 1TB or more have another. These tests should all be in the same alert, if possible, because who wants to manage hundreds of alert rules? Either way, you want to ensure you are monitoring disk space in a way that is reasonable for the volumes in question, and create only the alerts you need.
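As a rough illustration of that tiered logic, here is a minimal Python sketch. The threshold values and the names `Thresholds`, `pick_thresholds` and `should_alert` are assumptions made up for this example, not settings from any particular monitoring product; tune them for your own volumes.

```python
from dataclasses import dataclass

GB = 1024 ** 3  # binary units, matching the 2TB = 2048GB arithmetic above
TB = 1024 ** 4


@dataclass(frozen=True)
class Thresholds:
    max_percent_used: float  # alert only when usage exceeds this percentage...
    min_free_bytes: int      # ...and free space has dropped below this amount


# Hypothetical criteria: one set for drives under 1TB, one for 1TB and up.
SMALL_DRIVE = Thresholds(max_percent_used=90.0, min_free_bytes=20 * GB)
LARGE_DRIVE = Thresholds(max_percent_used=95.0, min_free_bytes=50 * GB)


def pick_thresholds(total_bytes):
    """Choose the criteria that apply to a drive of this size."""
    return LARGE_DRIVE if total_bytes >= TB else SMALL_DRIVE


def should_alert(total_bytes, used_bytes):
    """Return True when both the percent-used and free-space tests fail."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    limits = pick_thresholds(total_bytes)
    percent_used = 100.0 * used_bytes / total_bytes
    free_bytes = total_bytes - used_bytes
    return (percent_used > limits.max_percent_used
            and free_bytes < limits.min_free_bytes)
```

With these example numbers, the 2TB disk at 90 percent used stays quiet, because more than 200GB is still free. Keeping both tiers in one function mirrors the advice to keep the tests in a single alert rule.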
Next, clear unnecessary files out of various directories. For the purpose of this article, I'll just say that all systems have a temporary directory and that you can delete all files out of that folder with impunity. The challenge in doing so usually comes down to a problem of impersonation. Many monitoring solutions run on the server as the system account. As a result, performing certain actions requires the script to impersonate a privileged user account. There are a variety of ways to do so, which is why I'll leave the problem here for you to solve in a way that best fits your individual environment.
Once the impersonation issue is resolved, there's another challenge specific to the disk-full alert: knowing that the correct directories for the specific server are being targeted. The best approach is to use a common shared folder that maps to all servers and place a script file there. That script can be set up to first detect the proper directories and then clear them out with all the necessary safeguards and checks in place to avoid accidental damage.
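To give a feel for what "safeguards and checks" might look like, here is a minimal Python sketch of such a cleanup script. The function name `clear_old_files`, the age cutoff and the dry-run default are assumptions for illustration; the impersonation and directory-detection steps are deliberately left out.

```python
import os
import time
from pathlib import Path


def clear_old_files(directory, min_age_seconds=24 * 3600, dry_run=True):
    """Delete regular files directly inside `directory` that are older than
    `min_age_seconds`. Returns the list of files removed (or that would be
    removed when dry_run is True)."""
    root = Path(directory).resolve()
    # Safeguard: refuse a missing directory or a filesystem root.
    if not root.is_dir() or root == Path(root.anchor):
        raise ValueError(f"refusing to clean {root}")

    cutoff = time.time() - min_age_seconds
    removed = []
    for entry in sorted(root.iterdir()):
        # Safeguard: never follow symlinks or descend into subdirectories.
        if entry.is_symlink() or not entry.is_file():
            continue
        # Safeguard: leave recently modified files alone; they may be in use.
        if entry.stat().st_mtime > cutoff:
            continue
        if not dry_run:
            try:
                entry.unlink()
            except OSError:
                continue  # locked or permission denied: skip, do not abort
        removed.append(entry)
    return removed
```

Running it first with `dry_run=True` and logging the result gives you a chance to confirm the right directory is being targeted before anything is deleted.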
Example 2: Restart an IIS Application Pool
Sadly, restarting application pools is often the easiest and best fix for website-related issues. I'm not saying that running appcmd stop... and then appcmd start... from the server command line is a quick kludge that ignores the bigger issues. I'm saying that often, resetting the application pool is the fix.
If your web team finds itself in this situation, waking a human being to do the honors is absolutely your most expensive option. But automatically restarting the application pool becomes slightly more challenging because one server could be running multiple websites, which in turn have multiple application pools. Or you could have one big application pool controlling multiple websites. It all depends on how the server and websites were configured and you have no way of knowing.
If your monitoring solution can monitor the application pool, it will provide the name for you. Most mature monitoring solutions do so already. Once you have the name, you can do the following:
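One way to wire this up, sketched in Python under a few assumptions: appcmd.exe lives in its default location, the pool name arrives from the alert, and the helper names `apppool_commands` and `restart_apppool` are made up for this example.

```python
import subprocess

# Assumed default location of appcmd.exe; adjust for your servers.
APPCMD = r"C:\Windows\System32\inetsrv\appcmd.exe"


def apppool_commands(pool_name):
    """Build the stop and start command lines for one application pool."""
    target = f"/apppool.name:{pool_name}"
    return [
        [APPCMD, "stop", "apppool", target],
        [APPCMD, "start", "apppool", target],
    ]


def restart_apppool(pool_name, run=subprocess.run):
    """Stop, then start, the named pool.

    `run` is injectable so the logic can be exercised without IIS present.
    """
    stop_cmd, start_cmd = apppool_commands(pool_name)
    # The pool may already be stopped, so a failed stop is not fatal.
    run(stop_cmd, check=False)
    # A failed start is a real problem: let it raise so the alert escalates.
    run(start_cmd, check=True)
```

Because the pool name comes from the monitoring system, the same action works whether the server hosts one large pool or many small ones.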
Example 3: Restart IIS
Running a close second behind restarting application pools is resetting IIS. Doing so is, of course, the nuclear option of website fixes since you are bouncing all websites and all connections. Even though it's drastic, it's a necessary step in some cases.
As with restarting application pools, getting a human involved in this incredibly simple action is a waste of everyone's time and the company's money. It's far better to automatically restart and then recheck the website a minute or two later. If all is well, the server logs can be investigated in the morning as part of a postmortem. If the website is still down, it's time to send in the troops.
You can restart the IIS web server in a number of ways:
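The restart-then-recheck flow described above might look like the following Python sketch. It assumes the script runs on the web server itself with rights to call iisreset; the function names, the 90-second wait and the "recovered"/"escalate" return values are choices made for this example.

```python
import subprocess
import time
import urllib.error
import urllib.request


def site_is_up(url, timeout=10):
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


def restart_iis_and_recheck(url, wait_seconds=90, run=subprocess.run,
                            check_site=site_is_up, sleep=time.sleep):
    """Bounce IIS, wait, then test the site once more.

    Returns "recovered" if the site responds afterwards, else "escalate".
    """
    run(["iisreset", "/restart"], check=True)
    sleep(wait_seconds)  # give the sites a minute or two to come back
    return "recovered" if check_site(url) else "escalate"
```

Only the "escalate" result needs to page a human; a "recovered" result can simply be logged for the morning postmortem.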
Example 4: Restart a Server
If restarting the IIS service is the nuclear option, restarting the entire server is akin to nuclear Armageddon. Yet we all know there are times when restarting the server is the best option, given a certain set of conditions that you can monitor. Assuming your monitoring solution doesn't support a built-in capability for this function, some options include the following:
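As one possible option, the Python sketch below builds a delayed reboot command for a remote host. It assumes Windows targets are reachable with the built-in shutdown command and Linux targets over ssh with sudo rights; the function name `reboot_command` and the default delay and message are hypothetical.

```python
def reboot_command(host, os_family, delay_seconds=60,
                   reason="Automated restart from monitoring alert"):
    """Return the command line that reboots `host` after a short delay.

    The delay leaves time to cancel if the alert turns out to be wrong.
    """
    if os_family == "windows":
        # shutdown /r = restart, /m = remote machine, /t = delay, /c = comment
        return ["shutdown", "/r", "/m", rf"\\{host}",
                "/t", str(delay_seconds), "/c", reason]
    if os_family == "linux":
        # Linux shutdown takes its delay in whole minutes.
        minutes = max(1, round(delay_seconds / 60))
        return ["ssh", host, "sudo", "shutdown", "-r", f"+{minutes}"]
    raise ValueError(f"unsupported os_family: {os_family!r}")
```

The returned list can be handed to `subprocess.run(cmd, check=True)` by whatever alert action your monitoring tool supports.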
Example 5: Restart a Service
Occasionally, services stop. Sometimes they are even services that you, as a data center professional who needs to monitor your infrastructure, care about, such as SNMP. So, you are cutting dozens of service-down alerts. Have you thought about restarting them? In some cases, a restart doesn't really help much. But in far more situations it does. Computers are funny things. After all, "Screws fall out all the time. The world is an imperfect place." (From The Breakfast Club.)
Sometimes, they just need a gentle nudge. If this is the case, you can do the following:
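A gentle nudge with a limit on retries could be sketched in Python like this. It assumes `net stop` / `net start` on Windows and `systemctl restart` on Linux; the function names and the caller-supplied `is_running` check (for example, a fresh poll from your monitoring tool) are hypothetical.

```python
import subprocess


def restart_commands(service, os_family):
    """Return the command lines that restart `service` on the local host."""
    if os_family == "windows":
        return [["net", "stop", service], ["net", "start", service]]
    if os_family == "linux":
        return [["systemctl", "restart", service]]
    raise ValueError(f"unsupported os_family: {os_family!r}")


def nudge_service(service, os_family, is_running,
                  run=subprocess.run, max_attempts=2):
    """Restart the service up to `max_attempts` times.

    Returns the attempt number that brought it back, or None if every
    attempt failed and a human should take over.
    """
    for attempt in range(1, max_attempts + 1):
        for cmd in restart_commands(service, os_family):
            # Stopping an already-stopped service fails; that is expected.
            run(cmd, check=False)
        if is_running(service):
            return attempt
    return None
```

Capping the attempts keeps a service that genuinely cannot start from being restarted forever while the alert stays silent.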
Example 6: Back Up a Network-Device Configuration
Everything I've gone over so far covers direct remediation-type actions. But in some cases, automation can be defensive and informational. Network-device configuration backups are a good example, in that they don't fix anything, but instead gather additional information to help you fix the issue faster.
It's important to note that between 40 and 80 percent of all corporate-network downtime is the result of unauthorized or uncontrolled changes to network devices. These changes aren't always malicious. Often, the change simply went unreviewed by another set of eyes, or an otherwise simple error slipped past the team.
So, having the ability to spontaneously pull a device configuration based on an event trigger is super helpful. To do so, you can use the following approach:
There are two general cases when you may want to execute this automatic action. The first is when your monitoring solution receives a config change trap. Although the details of SNMP traps are beyond the scope of this article, you can configure your network devices to send spontaneous alerts on the basis of certain events. One of these events is a configuration change. The second is when the behavior of a device changes drastically, such as when ping success drops below 75 percent or ping latency increases. In the second case, often the device is in the process of becoming unavailable. But in some situations, it's wobbly, and there's a chance to grab the configuration before it drops completely.
In both of those situations, having the latest configuration provides valuable forensic information that can help troubleshoot the issue. It also gives you a chance to restore the absolutely last-known-good configuration, if necessary. And if it leads you to think, "Well, if I have the last-known-good configuration, why can't I just push that one back?" then you, my friend, have caught the automation bug! Run with it.
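The two triggers, and the forensic comparison against the last-known-good copy, can be sketched in Python as follows. The event dictionary shape, the latency factor of 2 and the function names are assumptions for illustration; how the configuration text is actually retrieved from the device is left to your tooling.

```python
import difflib

PING_SUCCESS_FLOOR = 75.0  # percent, the figure used in the text above
LATENCY_FACTOR = 2.0       # hypothetical: "increased" means double the baseline


def should_pull_config(event):
    """Decide whether a monitoring event should trigger a config backup.

    `event` is a hypothetical dict such as
    {"type": "ping_stats", "success_percent": 70.0,
     "latency_ms": 40.0, "baseline_ms": 35.0}.
    """
    if event.get("type") == "config_change_trap":
        return True
    if event.get("type") == "ping_stats":
        if event.get("success_percent", 100.0) < PING_SUCCESS_FLOOR:
            return True
        baseline = event.get("baseline_ms")
        latency = event.get("latency_ms")
        if baseline and latency and latency > LATENCY_FACTOR * baseline:
            return True
    return False


def config_diff(last_known_good, current):
    """Return unified-diff lines between two configuration texts."""
    return list(difflib.unified_diff(
        last_known_good.splitlines(), current.splitlines(),
        fromfile="last-known-good", tofile="current", lineterm=""))
```

An empty diff says the configuration is not the culprit; a non-empty one points the on-call engineer straight at what changed.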
Example 7: Reset a User Session
Somewhere in the murky past, the first computer went online and became Node 1 in the vast network we now call the Internet. The next thing that probably happened, mere seconds later, was that the first user forgot to log off their session and left it hanging.
For any system that supports remote connections, whether in the form of telnet/ssh, drive mappings or RDP sessions, having the ability to monitor and manage remote-connection user sessions can make running weekly, if not daily, restarts unnecessary. Or at least much smoother.
For Linux, use the who command to discover current sessions, or get greater granularity by remotely running netstat -tnpa | grep 'ESTABLISHED.*sshd'. Once you have the process ID, you can kill it. For Windows, you can get the active sessions on a system using the query session command.
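To show how the discovery step might feed an automated action, here is a small Python sketch that parses who output and flags long-lived sessions. It assumes the common Linux layout of user, terminal, date, time and optional host in parentheses; other systems format who differently, and the 24-hour cutoff and function names are made up for this example.

```python
from datetime import datetime, timedelta


def parse_who(output):
    """Parse `who` output lines such as
    'alice    pts/0        2017-02-13 09:15 (10.0.0.5)'."""
    sessions = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # blank or unexpected line
        sessions.append({
            "user": parts[0],
            "tty": parts[1],
            "login": f"{parts[2]} {parts[3]}",
            "host": parts[4].strip("()") if len(parts) > 4 else None,
        })
    return sessions


def stale_sessions(sessions, now, max_age=timedelta(hours=24)):
    """Return the sessions whose login time is older than `max_age`."""
    stale = []
    for session in sessions:
        started = datetime.strptime(session["login"], "%Y-%m-%d %H:%M")
        if now - started > max_age:
            stale.append(session)
    return stale
```

The resulting list tells you which terminals to investigate or reset; what to do with each one remains a policy decision for your environment.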
Example 8: Clear DNS Cache
At times, a server and/or application will misbehave because it can't contact an external system. This misbehavior is either because the DNS cache (the list of known systems and their IP addresses) is corrupt, or because the remote system has moved. In either case, a really easy fix is to clear the DNS cache and let the server attempt to contact the system at its new location.
In Windows, use the command ipconfig /flushdns. In Linux, the command varies from one distribution to another, so it's possible that sudo /etc/init.d/nscd restart will do the trick, or /etc/init.d/dns-clean, or perhaps another command. Research may be necessary for this one.
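Because the right command depends on the platform, an alert action can pick it at run time. The Python sketch below only knows the commands mentioned above; the function name `flush_dns_command` and the candidate list are assumptions, and you would extend the list after doing the research for your own distributions.

```python
import os
import platform

# Candidate Linux scripts taken from the text above: (script path, arguments).
LINUX_CANDIDATES = [
    ("/etc/init.d/nscd", ["restart"]),
    ("/etc/init.d/dns-clean", []),
]


def flush_dns_command(system=None, exists=os.path.exists):
    """Return the command line that clears the DNS cache, or None.

    `system` defaults to the local platform; `exists` is injectable so the
    selection logic can be checked on any machine.
    """
    system = system or platform.system()
    if system == "Windows":
        return ["ipconfig", "/flushdns"]
    if system == "Linux":
        for script, args in LINUX_CANDIDATES:
            if exists(script):
                return ["sudo", script, *args]
    return None  # unknown platform or no known script: needs research
```

A None result is a signal to fall back to a human rather than guess at a command.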
Hopefully at least a few of the things I've shared here, and in this series on automation as a whole, have inspired you to give automation a try in your data center. If so, or if you're already well on your way to automating all the things, I'd love to hear about your experiences and perspective in the comments section.
Leading article image courtesy of Leonardo Rizzi under a Creative Commons license
Leon Adato, SolarWinds Head Geek and long-time IT systems management and monitoring expert, discusses all things data center in this ongoing series.
Automation's Impact on Data Center Monitoring Alerts was last modified: February 13th, 2017 by Leon Adato
Read this article:
Automation's Impact on Data Center Monitoring Alerts - The Data Center Journal