Use case studies

An event control team in a big company
Description An event control team in a large corporation monitors thousands of network components in many data centers across the world. Event handling (escalation and notification) is done manually, and all procedures are written in a Word document or an Excel spreadsheet. When an operator gets an event on their console, they look up the procedure in the spreadsheet and act on it: either ignoring the event as irrelevant, or emailing or calling the appropriate technical support team.
Problem Event control operators must be on call 24/7. An Excel spreadsheet is a poor place to keep event handling procedures: it offers no data integrity, no validation, no change history, and little security. Actions taken by the operators are either not reported at all or must be reported manually (possibly in a Word document or yet another spreadsheet), so reporting is difficult or impossible. Mistakes are also easy to make, e.g. ignoring an alert that should have been escalated or notifying the wrong person.
Solution Automate the event handling procedures using AlertGrid. Event control operators then only need to adjust the rules in the system. They no longer need to be on call all the time, which reduces costs. Because the rules are automated, the chance of human error is drastically reduced. Full reports on how each event was handled, along with other statistics, are available immediately. Self-service also becomes possible: people in the organization can configure notification policies for their own resources (devices, applications) using the AlertGrid portal.
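To make the move from spreadsheet to automated rules concrete, here is a minimal sketch of a first-match rule table with an audit trail. It is not AlertGrid's API or rule format: the `Rule` class, the `handle_event` function, the event fields, and the team names are all hypothetical, and escalating unmatched events is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """One row of the old spreadsheet, expressed as data (hypothetical shape)."""
    name: str
    matches: Callable[[Dict], bool]  # predicate over an incoming event
    action: str                      # "ignore", "email" or "call"
    target: str = ""                 # team to notify, empty for "ignore"


def handle_event(event: Dict, rules: List[Rule], audit_log: List[Dict]) -> str:
    """Apply the first matching rule and record the decision for reporting."""
    for rule in rules:
        if rule.matches(event):
            audit_log.append({"event": event["id"], "rule": rule.name,
                              "action": rule.action, "target": rule.target})
            return rule.action
    # Assumption: an event that no rule covers is escalated rather than dropped.
    audit_log.append({"event": event["id"], "rule": None,
                      "action": "escalate", "target": "event-control"})
    return "escalate"


# Illustrative rules only; real procedures would come from the team's own policies.
RULES = [
    Rule("lab noise", lambda e: e["site"] == "lab", "ignore"),
    Rule("router down", lambda e: e["type"] == "link_down", "call", "network-support"),
    Rule("disk warning", lambda e: e["type"] == "disk_full", "email", "storage-support"),
]
```

Because every decision is appended to `audit_log`, the reports that previously had to be written by hand fall out of the handling step itself.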
Online reservations system
Description A cron job or a Windows scheduled task triggers a script once every 5 minutes. The script checks whether reservations made in the system have been confirmed on time. If not, the application automatically cancels the reservation.
Problem The script may not be triggered at all for some reason, or it may fail before completing its tasks. No one knows about the outage until it affects many customers and they flood customer support with complaints.
Solution After the script completes its work, it sends a signal to AlertGrid. An administrator gets an SMS if this signal does not arrive every 5 minutes as expected. Note that the signal may carry extra information, e.g. the number of reservations that have been confirmed or not confirmed. It is then possible to build extra rules around this information.
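The missed-signal check described here can be sketched as follows. This only models the logic locally; it does not call AlertGrid. The function names, the payload field names, the one-minute grace period, and the cancellation limit are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def build_signal(confirmed: int, cancelled: int) -> dict:
    """Payload the script would send after each run (hypothetical field names)."""
    return {"source": "reservation-check", "confirmed": confirmed, "cancelled": cancelled}


def signal_overdue(last_received: datetime, now: datetime,
                   interval: timedelta = timedelta(minutes=5),
                   grace: timedelta = timedelta(minutes=1)) -> bool:
    """True when the expected signal has not arrived within interval + grace.

    The grace period is an assumption of this sketch; it avoids alerting
    when a run finishes a few seconds later than usual.
    """
    return now - last_received > interval + grace


def too_many_cancellations(signal: dict, limit: int = 20) -> bool:
    """Example of an extra rule built on the data carried by the signal."""
    return signal["cancelled"] > limit
```

The key design point is that the alert fires on silence: the script reports success, and the absence of that report is what gets escalated, so a job that never starts is caught just like one that crashes midway.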
License server monitoring
Description Information about license usage is pulled from a license server every 5 minutes to generate statistics.
Problem There is no notification if the tool or script that pulls the data stops working. There is also no notification if the license usage on a given server for a given application crosses a limit (e.g. 80%).
Solution A signal is sent to AlertGrid each time the statistics are pulled. The signal carries key information about license usage. It is now possible to notify an administrator when the signal is not received within a given amount of time, or when the reported license usage exceeds a set threshold.
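As a rough illustration, the two conditions (a missing signal and usage over a threshold) could be evaluated together like this. The `license_alerts` function, the signal fields, and the 10-minute silence limit are hypothetical; only the 80% figure comes from the example above.

```python
from typing import List, Optional


def license_alerts(signal: Optional[dict], seconds_since_last_signal: float,
                   max_silence_seconds: float = 600,
                   usage_limit: float = 0.80) -> List[str]:
    """Return the reasons an administrator should be notified, if any."""
    alerts = []
    # Condition 1: the collector has been silent for too long.
    if seconds_since_last_signal > max_silence_seconds:
        alerts.append("no signal from the license statistics collector")
    # Condition 2: the latest reported usage exceeds the threshold.
    if signal is not None and signal["total"] > 0:
        usage = signal["in_use"] / signal["total"]
        if usage > usage_limit:
            alerts.append(f"{signal['application']} on {signal['server']}: "
                          f"{usage:.0%} of licenses in use")
    return alerts
```

For example, a signal reporting 45 of 50 licenses in use yields a single alert mentioning 90% usage, while a signal at exactly 80% yields none because the check is strictly "exceeds".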
AlertGrid (self-monitoring)
Description Multiple AlertGrid agents execute user-defined workflows (rules) against received signals, or on a scheduled basis.
Problem If an agent fails, rule processing stops, which results in undelivered notifications for AlertGrid's clients.
Solution After processing all workflows, each AlertGrid agent sends a signal to… AlertGrid! This signal is then picked up and processed in the next iteration in the same way as any other signal. Because there are multiple agents, an AlertGrid administrator is informed immediately if one or two of them stop working. We can also easily monitor all the necessary statistics and react when they meet certain conditions. So, AlertGrid is a happy customer of AlertGrid :)
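The idea that surviving agents notice a silent peer can be sketched in a few lines. This is an illustration of the principle, not AlertGrid's internal implementation; the `silent_agents` function, the agent names, and the timestamp representation (seconds) are assumptions.

```python
from typing import Dict, List


def silent_agents(last_signal_at: Dict[str, float], now: float,
                  max_age_seconds: float) -> List[str]:
    """Names of agents whose last self-monitoring signal is older than allowed."""
    return sorted(agent for agent, ts in last_signal_at.items()
                  if now - ts > max_age_seconds)


# Any agent that is still running can evaluate this check on its next
# iteration, so a stopped agent is reported by its peers.
last_seen = {"agent-a": 1000.0, "agent-b": 940.0, "agent-c": 700.0}
stopped = silent_agents(last_seen, now=1010.0, max_age_seconds=120)
```

With these sample values, only `agent-c` is reported, since its last signal is 310 seconds old.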

Did you know?

AlertGrid is not limited to the IT world. You can use it in many scenarios: whenever you need instant notifications about something important.