
1 Introduction

PerformanceVisor (PVSR) monitors a wide range of IT equipment and applications, such as networks, computers, and services, and presents the collected data in a web UI.

 

PVSR is designed to serve all simple and complex facets of performance monitoring:

·       Diverse, distributed measurement system

o   Distributed collection server architecture, deployable on any combination of Linux nodes

o   Handling of measurement server groups, automatic load-balancing with failover functionality and self-testing ability within the groups

o   Easy-to-extend measurement protocol – collector server architecture:

§  Several data collectors, see section 14.1 for more details

§  Framework supporting user-developed collection servers, with easy-to-implement, documented APIs (programming interfaces)

o   Variable measurement intervals, from 15 seconds to hourly cycles; users can change the configured interval length later

o   Irregular measurements: some data collectors in the system (for example MQTT) do not collect data periodically; instead, a value may arrive in the system at any time

o   Real-time monitoring capability

o   Collector specific diagnostics pages

o   Support for segregated/firewalled network architecture. Measured devices only need to be accessible by the collection servers they are assigned to. Other servers including the central server may be in an isolated network, with inter-server communication (i.e. that between the central servers and the collectors) implemented only by a pair of secure and relatively narrow-bandwidth connection paths (SSH and SFTP).

o   Other than for starting up, the measurement servers do not require the central Oracle server, so no measurement is lost even during an Oracle database shutdown

o   Full support of SNMPv3

·       Distributed server architecture: like the measurement collection servers, the other server modules of the application also form a distributed system with remote manageability (e.g. start and stop functions). This results in a very scalable system, from simple deployments that occasionally monitor a couple of nodes to systems performing tens of thousands of measurements on several thousand devices

·       Built upon the robust, market-leading database management system (Oracle), which stores both configuration and measurement data. Data retention and compression options are selectable in the user interface on a measurement-by-measurement basis. The partitioned database feature of Oracle Enterprise Edition is also supported for even higher performance.

·       The standard SOAP web service interface of PVSR allows both the configuration and the querying of measurements and alarms, which enables easy integration of the system into other applications.

·       Simple configurability

o   Template aided object creation: equipment, alarms, and charts

o   Automatic discovery of the measurements available on the equipment, displayed in a filterable list

o   Automatic creation of charts

o   HP Network Node Manager database synchronization

o   Bulk creation, modification and deletion

·       Customizable object hierarchy:

o   Base structure: a hierarchy of sites, equipment and measurements, where sites can be nested to form a tree of arbitrary depth. This hierarchy is created by the administrators and used by the measurements, the reporting module, and also by the users when browsing in the UI.

o   Virtual structure: besides the main hierarchy, it is possible to create arbitrary additional hierarchies of virtual sites, equipment and charts, which allow for presenting data along alternative hierarchical aspects.

o   The base and virtual hierarchies can be displayed simultaneously (overlapping) or individually

o   Public and private virtual structures: public objects created by administrators are seen by every entitled user, while each user may also have additional private virtual objects.

o   Customizable personal and publicly available menu systems

·       Easy-to-use chart system:

o   JavaScript-based charts that allow rich client-side interactivity: zooming, minimum/maximum settings, showing/hiding elements and trend lines

o   An arbitrary subset of the measurements can be added to a single public and/or private chart. Moreover, the values of the measurements can be used in arbitrary calculations, for example to show the sum or the difference of two measurements.

o   Measurements from multiple pieces of equipment can be collected on a single page, and the user can immediately save the resulting charts as a piece of virtual equipment.

o   Any time period of measurements can be displayed.

o   Like nodes in the measurement hierarchy, charts may also be public or private.

o   Chart scaling (minimum and maximum values visible) can be specified in several ways (fixed value, automatic, percentage, etc.).

o   Chart templates can be used for the creation of charts, simplifying the selection, assignment and naming of their measurements.

·       Threshold subsystem:

o   A feature-rich language is provided for defining threshold conditions.

o   If a threshold condition is violated, the application generates an alarm by sending an e-mail message and/or an SNMP trap and/or executing an arbitrary command. A condition can use constant values as well as the average and standard deviation of previous measurements. Besides the current value, a condition can also use trend-based forecasted values, such as the disk usage expected one week from now.

o   Thresholds can be created from templates for easier management, and certain parameters of the thresholds created this way can later be modified together in a single operation.

o   It is possible to define time windows for threshold conditions (e.g. only on weekdays between 8 A.M. and 4 P.M.)

o   A threshold condition may reference measurements that belong to different equipment or even different measurement server types (such as SNMP and ASCII)

o   Filterable alarm window containing online or historical data, which groups violations of the same threshold condition into alarm intervals.

o   Coloring of the virtual and base object hierarchy based on current threshold violation status.

o   Threshold violation events and violation values may be displayed on the measurement charts.

o   Alarms can be marked as acknowledged by users.

o   Alarm chart showing the number of alarms, grouped by alarm level.

o   The visibility of thresholds is configurable on a per-user basis.
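The statistical and trend-based conditions described above can be illustrated with a minimal Python sketch. This is not PVSR's threshold language or its forecasting method, neither of which is specified in this section; it assumes a simple least-squares linear trend, and the names linear_forecast and deviates_from_history are hypothetical.

```python
from statistics import mean, stdev

def linear_forecast(samples, horizon):
    """Fit a least-squares line through (time, value) samples and
    evaluate it `horizon` time units after the last sample.
    Assumption: a linear trend; PVSR's actual method is not documented here."""
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    t_mean, v_mean = mean(times), mean(values)
    denom = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in samples) / denom
    intercept = v_mean - slope * t_mean
    return slope * (times[-1] + horizon) + intercept

def deviates_from_history(history, current, k=3.0):
    """True if `current` lies more than k standard deviations
    away from the average of the previous measurements."""
    return abs(current - mean(history)) > k * stdev(history)

# Disk usage in percent, sampled once per day (time unit: days).
disk_usage = [(0, 50.0), (1, 52.0), (2, 54.0), (3, 56.0)]
forecast = linear_forecast(disk_usage, horizon=7)  # one week from the last sample
violated = forecast > 65.0                         # hypothetical limit of 65 %
```

With such a condition the alarm fires on the projected value, before the current value itself has crossed the limit.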

·       Receiving events:

o   The system is able to receive and correlate traps and Syslog messages.

o   The system displays the received events together with the threshold alarms.

o   Depending on the settings, it can handle event raise and clear messages and assign them to the equipment and measurements within PVSR

·       Mobile app: PVSR has a mobile application running on Android and iOS. It can display the current alarms and the values measured over the last couple of hours, and the user can also start a real-time measurement session.

·       Reporting subsystem:

o   Summary reports can be defined on arbitrary segments of the main hierarchy of sites and equipment. Reports can be ordered, filtered and displayed in a table or chart format, showing the minimum, maximum, sum (integral), average and count of the measurement data.

o   A report can aggregate values over time and along the hierarchy. It is also possible to aggregate the number and fill factor of threshold violations under any site or equipment. The aggregation can also be narrowed by time criteria (e.g. only weekdays and working hours) or by criteria on the hierarchy.

o   Diverse data collection methods and complex measurement definitions allow the calculation of daily, weekly, monthly and yearly SLA (service level agreement) reports.

o   Access to any technical and/or SLA report may be limited to users individually authorized by the administrators.

o   The system can send reports automatically in HTML and XLS formats for the time interval and the object specified by the user. Since a separate e-mail address can be set for each such setting, reports can also be sent to recipients who are not users of the PVSR system.

o   Users can generate not only pre-configured reports but also on-demand reports, aggregating any measurements using any aggregation method over any time period
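The aggregation with time criteria described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the PVSR reporting engine: the function names in_working_hours and summarize are hypothetical, the working-hours window (weekdays, 8:00 to 16:00) is an example, and "sum" here is a plain sum of the sampled values rather than a time-weighted integral.

```python
from datetime import datetime

def in_working_hours(ts, start_hour=8, end_hour=16):
    """Time criterion: weekdays only (Monday=0 .. Friday=4), start_hour <= hour < end_hour."""
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour

def summarize(samples, time_filter=None):
    """Aggregate (timestamp, value) samples into min, max, sum, average and count,
    optionally keeping only the samples accepted by `time_filter`."""
    values = [v for ts, v in samples if time_filter is None or time_filter(ts)]
    if not values:
        return None  # nothing matched the criteria
    return {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "avg": sum(values) / len(values),
        "count": len(values),
    }

samples = [
    (datetime(2024, 1, 1, 9, 0), 10.0),   # Monday, working hours
    (datetime(2024, 1, 1, 20, 0), 50.0),  # Monday, evening
    (datetime(2024, 1, 2, 15, 0), 20.0),  # Tuesday, working hours
    (datetime(2024, 1, 6, 10, 0), 30.0),  # Saturday
]
working = summarize(samples, in_working_hours)
overall = summarize(samples)
```

Here the filtered summary covers only the two working-hours samples, while the unfiltered one covers all four.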

·       Equipment templates:

o   Equipment templates make it possible to create one or several pieces of equipment and their measurements in a single step, after the specification of a minimal number of parameters. Templates may also configure the Availability Agent and set threshold values for the new objects. The system is able to discover the related equipment on its own, and it can also synchronize its own database with HP Network Node Manager

o   Templates can be used for all active and discoverable equipment (such as SNMP, Oracle, Unix/Linux)

o   Template based equipment requests can be placed in a job queue; the application automatically monitors this queue and creates each piece of equipment as it becomes available.

·       Multi-layered security system:

o   A five-level access-control scheme, where the access to sites, equipment, measurements, reports, charts and also to alarms (threshold types) can be controlled individually on a user-by-user basis. Naturally, to view a report on a given site or on a particular piece of equipment, the user must have access to both the report and the target object.

o   Secure communication channels

§  The server components of the application communicate all data and command sequences through SSH and SFTP connections, with password or public key based authentication

§  Database connections used by the servers can be encrypted.

§  The user interface access may be restricted to secure HTTPS communication only.

·       Modification monitoring: the system keeps an audit trail of all changes together with the date and user information. The changes are kept in a journal that administrators can access later

·       Help system: for each defined measurement type, as well as for public and private charts, a help URL can be specified, which is shown as a link on the UI. When the link is clicked, the help content opens in a separate window.

·       Data export: most data can be exported and formatted into Microsoft XLS files:

o   Charts (base, public and private) along with selectable threshold information

o   Reports in table and chart format

o   Threshold violation interval lists

·       All user interface text (messages, commands, labels, tooltips) is displayed from translation files, so the user interface is easy to internationalize (English and Hungarian are currently available).