htdocs/cms/features.xhtml

   1 <!--#include virtual="/doctype.inc" -->
   2   <head>
   3     <title>
   4       CMS Features
   5     </title>
   6 <!--#include virtual="/style.inc" -->
   7   </head>
   8   <body>
   9     <div id="container">
  10       <div id="main">
  11 <!--#include virtual="/header.inc" -->
  12         <div id="contents">
  13           <h1 class="top">
  14             CMS Features
  15           </h1>
  16           <h2>
  17             Problem Specification
  18           </h2>
  19           <h3>
  20             Original Problem
  21           </h3>
  22           <p>
  23             This is the original specification given to us when we
  24             started the project. The i-scream central monitoring system
  25             meets this specification, and aims to extend it further.
  26             This is, however, where it all began.
  27           </p>
  28           <h3>
  29             Centralised Machine Monitoring
  30           </h3>
  31           <p>
  32             The Computer Science department has a number of different
  33             machines running a variety of different operating systems.
  34             One of the tasks of the systems administrators is to make
  35             sure that the machines don't run out of resources. This
  36             involves watching processor loads, available disk space,
  37             swap space, etc.
  38           </p>
  39           <p>
  40             It isn't practical to monitor a large number of machines by
  41             logging on and running commands such as 'uptime' on the
  42             Unix machines, or by using Performance Monitor on NT
  43             servers. Thus this project is to write monitoring software
  44             for each supported platform which reports resource usage
  45             back to one centralised location. System administrators
  46             would then be able to monitor all machines from this
  47             centralised location.
  48           </p>
  49           <p>
  50             Once this basic functionality is implemented it could
  51             usefully be expanded to include logging of resource usage
  52             to identify long-term trends/problems, and alerter services
  53             which can directly contact sysadmins (or even the general
  54             public) to bring attention to problem areas. Ideally it
  55             should be possible to run multiple instances of the
  56             reporting tool (with all instances being updated in
  57             real time) and to be able to run the reporting tool both
  58             as a standalone application and embedded in a web page.
  59           </p>
  60           <p>
  61             This project will require you to write code for the Unix
  62             and Win32 APIs using C, and to understand how the underlying
  63             operating systems manage resources. It will also require
  64             some network/distributed systems code and a GUI front end
  65             for the reporting tool. It is important for students
  66             undertaking this project to understand the importance of
  67             writing efficient and small code as the end product will
  68             really be most useful when machines start to run out of
  69             processing power/memory/disk.
  70           </p>
  71           <p>
  72             John Cinnamond (email jc), whose idea this is, will provide
  73             technical support for the project.
  74           </p>
  75           <h2>
  76             Features
  77           </h2>
  78           <h3>
  79             Key Features of The System
  80           </h3>
  81           <ul>
  82             <li>A centrally stored, dynamically reloaded, system-wide
  83             configuration system
  84             </li>
  85             <li>A fully extensible monitoring system: nothing except
  86             the Host (which generates the data) and the Clients (which
  87             view it) knows any details about the data being sent,
  88             allowing the data to be modified without changes to the
  89             server architecture.
  90             </li>
  91             <li>Central server and reporting tools all Java-based for
  92             multi-platform portability
  93             </li>
  94             <li>Distribution of core server components over CORBA to
  95             allow appropriate components to run independently and to
  96             allow new components to be written to conform with the
  97             CORBA interfaces.
  98             </li>
  99             <li>Use of CORBA to create a hierarchical set of data entry
 100             points to the system allowing the system to handle event
 101             storms and remote office locations.
 102             </li>
 103             <li>One location for all system messages, despite the
 104             system being distributed.
 105             </li>
 106             <li>XML data protocol used to make data processing and
 107             analysis easily extensible
 108             </li>
 109             <li>A stateless server which can be moved and restarted at
 110             will, while Hosts, Clients, and reporting tools are
 111             unaffected and simply reconnect when the server is
 112             available again.
 113             </li>
 114             <li>Simple and open end protocols to allow easy extension
 115             and platform porting of Hosts and Clients.
 116             </li>
 117             <li>Self-monitoring, as all data queues within the system
 118             can be monitored and raise alerts to warn of event storms
 119             and impending failures (should any occur).
 120             </li>
 121             <li>A variety of web-based information displays based on
 122             Java/SQL reporting and PHP on-the-fly page generation to
 123             show the latest alerts and data
 124             </li>
 125             <li>Large, Helpdesk-style overhead monitor displays for
 126             the latest alerting information
 127             </li>
 128           </ul>
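          <p>
            As a rough illustration of the extensibility point in the list
            above, the sketch below handles an XML packet generically, so
            the processing code never needs to know which values a Host
            sends. It is written in Python purely for brevity (the server
            itself is Java based), and the element and attribute names
            (packet, machine_name, load1, free) are invented for this
            example; they are not taken from the real i-scream protocol.
          </p>

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Turn an XML element tree into a flat dict of dotted names to values.

    Nothing here depends on particular tag names, so new kinds of data can
    be added by a Host without changing this code.
    """
    name = prefix + element.tag
    result = {}
    # Attributes are stored under "<name>.attributes.<key>".
    for key, value in element.attrib.items():
        result[name + ".attributes." + key] = value
    # Non-empty text content is stored under the element's dotted name.
    text = (element.text or "").strip()
    if text:
        result[name] = text
    # Recurse into children, extending the dotted prefix.
    for child in element:
        result.update(flatten(child, name + "."))
    return result


# Hypothetical packet; the schema is an assumption made for this sketch.
SAMPLE = (
    "<packet machine_name='host1'>"
    "<load><load1>0.42</load1></load>"
    "<disk><free>1024</free></disk>"
    "</packet>"
)

values = flatten(ET.fromstring(SAMPLE))
```

          <p>
            Note that this simple flattening lets repeated sibling tags
            overwrite one another; a fuller version would need some way of
            numbering them.
          </p>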
 129           <h3>
 130             An Overview of the i-scream Central Monitoring System
 131           </h3>
 132           <p>
 133             The i-scream system monitors status and performance
 134             information obtained from machines feeding data into it and
 135             then displays this information in a variety of ways.
 136           </p>
 137           <p>
 138             This data is obtained through the running of small
 139             applications on the reporting machines. These applications
 140             are known as "Hosts". The i-scream system provides a range
 141             of Hosts which are designed to be small and lightweight in
 142             their configuration and operation. See the website and
 143             appropriate documentation to locate currently available
 144             Host applications. These Hosts are simply told where to
 145             contact the server, at which point they are totally
 146             autonomous. They are able to obtain configuration from the
 147             server, detect changes in their configuration, send data
 148             packets (via UDP) containing monitoring information, and
 149             send so-called "Heartbeat" packets (via TCP) periodically
 150             to indicate to the server that they are still alive.
 151           </p>
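          <p>
            The two kinds of Host traffic described above can be sketched
            as follows. This is only a minimal illustration in Python, not
            one of the supplied Hosts: the XML layout, the function names
            and the literal heartbeat message are all assumptions made for
            the example, and the real heartbeat exchange may well be a
            longer dialogue.
          </p>

```python
import socket
from xml.sax.saxutils import escape, quoteattr


def build_packet(machine_name, load1, disk_free_kb):
    """Build a small XML payload (element names are hypothetical)."""
    return (
        "<packet machine_name=%s>"
        "<load1>%s</load1><disk_free_kb>%s</disk_free_kb>"
        "</packet>"
        % (quoteattr(machine_name), escape(str(load1)), escape(str(disk_free_kb)))
    ).encode("utf-8")


def send_data_packet(server_addr, payload):
    """Send one monitoring packet over UDP; delivery is not guaranteed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, server_addr)


def send_heartbeat(server_addr, message=b"HEARTBEAT\n", timeout=5.0):
    """Open a TCP connection and send a heartbeat message.

    The message text is a placeholder, not the real protocol. A connection
    failure raises OSError, which a Host could treat as "server unavailable,
    try again later".
    """
    with socket.create_connection(server_addr, timeout=timeout) as sock:
        sock.sendall(message)
```

          <p>
            A Host built along these lines would call send_data_packet on
            a short interval and send_heartbeat on a longer one, catching
            OSError from the heartbeat so that it can simply reconnect
            when the server becomes available again.
          </p>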
 152           <p>
 153             This data is then fed into the i-scream server. The server
 154             splits the data two ways. First, it places the data in a
 155             database system, typically MySQL based, for later
 156             extraction and processing by the i-scream report generation
 157             tools. Second, it passes the data on to real-time "Clients"
 158             which handle the data as it enters the system. The system
 159             itself has an internal real-time client called the "Local
 160             Client", which runs a series of Monitors that can analyse
 161             the data. One of these Monitors also feeds the data off to
 162             a file repository, which is updated as new data comes in
 163             for each machine. This data is then read and displayed by
 164             the i-scream web services to provide a web interface to the
 165             data. The system also allows TCP connections by non-local
 166             clients (such as the i-scream supplied Conient). These
 167             applications provide a real-time view of the data as it
 168             flows through the system.
 169           </p>
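          <p>
            The two-way split described above might look roughly like the
            following sketch. It is an illustration only: the Dispatcher
            class and its method names are hypothetical, Python is used
            for brevity where the real server is Java, and an in-memory
            SQLite database stands in for the MySQL database mentioned in
            the text.
          </p>

```python
import sqlite3


class Dispatcher:
    """Store each incoming packet, then pass it to every real-time client."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS packets (machine TEXT, payload TEXT)"
        )
        self._clients = []

    def register(self, client):
        """Add a callable taking (machine, payload) as a real-time client."""
        self._clients.append(client)

    def handle(self, machine, payload):
        # Path 1: keep the packet for later report generation.
        self._db.execute("INSERT INTO packets VALUES (?, ?)", (machine, payload))
        self._db.commit()
        # Path 2: hand the packet to each connected client as it arrives.
        for client in list(self._clients):
            try:
                client(machine, payload)
            except Exception:
                # A failing client is dropped so it cannot stall the others.
                self._clients.remove(client)
```

          <p>
            In this picture the Local Client and any remote TCP clients
            would each be one registered callable.
          </p>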
 170           <p>
 171             The final section of the system links the Local Client
 172             Monitors to an alerting system. These Monitors can be
 173             configured to detect changes in the data past threshold
 174             levels. When a threshold is breached an alert is raised.
 175             The alert is then escalated, as it persists, through
 176             four live levels: NOTICE, WARNING, CAUTION and CRITICAL.
 177             The alerting system keeps an eye on the level and, when a
 178             certain level is reached, certain alerting mechanisms fire
 179             through whatever medium they are configured to use.
 180           </p>
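          <p>
            The escalation behaviour described above can be sketched as a
            small state machine. The four level names come from the text;
            everything else is an assumption made for illustration. In
            particular, this sketch escalates after a fixed number of
            consecutive breaching readings, whereas the real Monitors may
            escalate on elapsed time, and the class and function names are
            hypothetical.
          </p>

```python
LEVELS = ["NOTICE", "WARNING", "CAUTION", "CRITICAL"]


class ThresholdMonitor:
    """Track one value against a threshold and escalate while it persists."""

    def __init__(self, threshold, readings_per_level=3):
        self.threshold = threshold
        self.readings_per_level = readings_per_level
        self._breaches = 0  # consecutive readings above the threshold

    def update(self, value):
        """Return the current alert level, or None if there is no alert."""
        if value <= self.threshold:
            # Back under the threshold: the alert is cleared.
            self._breaches = 0
            return None
        self._breaches += 1
        # Move up one level per block of readings, capped at CRITICAL.
        index = min((self._breaches - 1) // self.readings_per_level,
                    len(LEVELS) - 1)
        return LEVELS[index]


def should_fire(level, fire_at):
    """True if an alerter configured for fire_at should act on level."""
    return level is not None and LEVELS.index(level) >= LEVELS.index(fire_at)
```

          <p>
            An e-mail alerter might, for example, be configured with a
            fire_at of WARNING, while a more intrusive mechanism waits for
            CRITICAL.
          </p>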
 181         </div>
 182 <!--#include virtual="/footer.inc" -->
 183       </div>
 184 <!--#include virtual="/menu.inc" -->
 185     </div>
 186   </body>
 187 </html>