Server and Network Dashboard – Managing your servers can streamline the performance of your team by allowing them to complete complex tasks faster. Plus, it can enable them to detect problems early on before they get out of hand and compromise your business. As a result, the risk of experiencing operational setbacks is drastically lower.
But the only way to make the most of your server management is to perform it correctly. To help you do so, this article shares tips on improving your server management and works through a question about networking, visualization, and Windows.
We have a Network Operations Center with a dozen large, widescreen displays showing us various performance graphs, server and network equipment alarms, and status pages. A lot of the pages were obviously not designed for viewing on a static display. Does anyone have a similar setup where they have found a particular tool or package that excels at displaying data? I’m thinking that a bit of custom programming and maybe something that can scroll text, show dials, flashing lights, and whatnot would produce what I’m looking for, but I don’t know where to start. If anyone has any dos or don’ts or success with particular products, that would be a big help.
UPDATE: It seems what I am looking for is a dashboard creation tool.
Computers are far better than I at analyzing data. I personally prefer systems like OpsView that digest situations and offer a multifaceted interface. Monitoring stats are filtered for abnormal conditions, and individual alerts are delivered to admins responsible for the system. There’s an overall health dashboard that’s viewable by helpdesk and management that gives an impression of how bad an outage is and whether anyone who can fix it is working on it yet. They put it on rotation on the big screen as something you can see at a glance, not something you stare at all day. Scrolling text and flashing lights aren’t how salaried employees should interface with your monitoring systems.
Conrad Albrecht-Buehler has a Google Techtalk (“Making Monitoring Suck Less”) that discusses the merits and shortcomings he sees in current dashboard UI design, and proposes some improvements. I don’t know if he’s published code or even his thesis. The general idea is simple:
- You define situation monitoring as capturing a set of signals about a state. Load, free disk space, network traffic, or even higher level things like forum posts per hour.
- Then you define a heed function that maps the wide input signal from 0 to 1, with 0 being “ignore” and 1 being “zomg!”. In terms of Nagios, he replaces the WARNING state with a WARNING integer.
- Finally you define an aggregator to summarize and prioritize those WARNING signals.
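The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the talk: the signal names, the thresholds, and the choice of a clamped linear mapping are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


def linear_heed(ignore_at: float, alarm_at: float) -> Callable[[float], float]:
    """Build a heed function that maps a raw signal onto the range 0..1.

    Values at or beyond `ignore_at` map to 0.0 ("ignore"), values at or
    beyond `alarm_at` map to 1.0, and values in between scale linearly.
    `alarm_at` may be smaller than `ignore_at` for signals where low is
    bad, such as free disk space.
    """
    if ignore_at == alarm_at:
        raise ValueError("ignore_at and alarm_at must differ")

    def heed(value: float) -> float:
        fraction = (value - ignore_at) / (alarm_at - ignore_at)
        return max(0.0, min(1.0, fraction))  # clamp to 0..1

    return heed


# Hypothetical signals and thresholds, chosen only for illustration.
HEED_FUNCTIONS: Dict[str, Callable[[float], float]] = {
    "load": linear_heed(ignore_at=1.0, alarm_at=8.0),
    "free_disk_pct": linear_heed(ignore_at=20.0, alarm_at=5.0),
}


def aggregate(signals: Dict[str, float]) -> List[Tuple[str, float]]:
    """Score every signal and return them with the most urgent first."""
    scored = [(name, HEED_FUNCTIONS[name](value)) for name, value in signals.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in aggregate({"load": 4.5, "free_disk_pct": 2.0}):
        print(f"{name}: {score:.2f}")
```

A real aggregator would probably do more than sort, for example weighting signals by how important the host is, but the shape is the same: raw signals in, a prioritized list of heed scores out.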
As far as specific tools you’d use to write your own monitoring system: Nagios scripts have a decent interface (this is probably where you’d glue in a HEED mapping if you like it), signals can be stored with rrdtool, which can also generate graphs from them, and there’s a Django app called Graphite that renders rrd databases. There’s also NagVis:
NagVis is a visualization addon for the well known network management system Nagios.
NagVis can be used to visualize Nagios Data, e.g. to display IT processes like a mail system or a network infrastructure.
What I have done is get as much of it into a web browser as I can. Then I use Firefox, IE Tab, and Tab Mix Plus to display the data.
Tab Mix Plus allows you to auto-update and rotate the tabs on a schedule.
IE Tab allows you to display IE windows inside the tabs so that Tab Mix Plus can do the auto-rotate and update.
Then you can display all of the MRTG, Cacti, Nagios, WhatsUp Gold, and wireless monitors you want, and it auto-rotates, auto-updates and is shiny…:)
We have a developer that builds WPF apps for fun so when I want shiny he builds those for me.
We had too many displays and not enough useful info, so we totally cheated. We found an interesting LCARS-based screen saver (looks like the displays from Star Trek) and ran it on one of the idle displays. That was the one the bosses watched most.
I wrote my own Nagios visualisation after finding out that none of the easily found versions can handle hundreds of hosts with tens of thousands of checks. (To release the code I need a few people who want to try it outside of my environment so I can convince the bosses)
Even the few that might not break required manual configuration that our Nagios config generator couldn’t be perverted to do.
My visualisations are used on OS X and Linux. Oddly, the only OS X browser with a working fullscreen mode is Opera; neither Safari (and that includes WebKit) nor Firefox has one.
A few general tips though:
- Big fonts, to the point of automating layouts so they get bigger if there’s less to display
- Use sorting so the biggest problems are first
- Do your best to minimise the maintenance needed; better to be warned about a system not yet in production than to find out a year in that it was never added to the displays
- SVG can be wonderful, although they seem to get corrupted over time (we use a simple graphic of a state as an additional visual cue)
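
The first two tips can be sketched as follows. This is not the unreleased visualisation code mentioned above, just a rough illustration: the screen height, the pixel limits, the line spacing factor, and the `(host, severity)` tuple shape are all assumptions.

```python
from typing import List, Tuple


def font_size_px(
    item_count: int,
    screen_height_px: int = 1080,   # assumed display height
    max_px: int = 160,              # assumed upper limit for readability
    min_px: int = 24,               # assumed lower limit for a wall display
    line_spacing: float = 1.4,      # assumed line height as a multiple of font size
) -> int:
    """Pick the largest font that lets `item_count` lines fit on screen."""
    if item_count <= 0:
        return max_px
    fitted = int(screen_height_px / (item_count * line_spacing))
    return max(min_px, min(max_px, fitted))


def layout(problems: List[Tuple[str, int]]) -> Tuple[List[Tuple[str, int]], int]:
    """Sort problems so the most severe come first and choose a font size.

    Each problem is a hypothetical (host, severity) pair where a higher
    severity number means a bigger problem.
    """
    ordered = sorted(problems, key=lambda problem: problem[1], reverse=True)
    return ordered, font_size_px(len(ordered))


if __name__ == "__main__":
    ordered, size = layout([("web01", 1), ("db01", 3), ("mail01", 2)])
    print(size, ordered)
```

With only a handful of problems the font hits the upper limit, so a quiet day is readable from across the room; as the list grows the text shrinks until it reaches the lower limit, at which point the least severe entries fall off the bottom of the screen first.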