5.5 Server events

Some of the PVSR modules, for example the application manager module (or modules, if the secondary application manager is installed too), periodically perform various verifications in connection with the database and the machines running PVSR. Under this menu item, the results of these verifications and the log entries created when modules are started or stopped can be viewed. Depending on the setting of the ADMIN_EMAIL_ADDRESSES configuration parameter, an e-mail message is also sent about the event. An SNMP trap is also sent if the ADMIN_SNMP_TRAP_ADDRESSES parameter is set.

The application remembers for each user which messages the user has already seen, and marks the unseen ones with a yellow star:

Figure 17. Server events

If PVSR determines that a new server message has been produced since the user last visited the Server events page, it displays a notification icon at the top of the page next to the alarm summary. The color of the icon is red if there are new error messages, or orange if there are only non-error level messages. When the mouse pointer is placed over the icon, the number of new messages is displayed, and if the user clicks on the icon, the Server events page is displayed.

The events appear in reverse chronological order in a browsable table. The lines are colored as follows:

· Normal: server status change after server start or stop. If the status of the server changes to Unknown then it is displayed as an error message
· Red: occurrence of an error
· Orange: warning level message or a previous error which has been cleared; in this case the clearing message is displayed as well. It is important to note that not all of the possible error messages have clearing pairs, and if an error is cleared it does not necessarily mean that the error scenario is resolved. For example, if an error was sent by the application manager module and that module is restarted, then the restart message will clear the error message.
  However, when the application manager next checks the same condition, it might send a new error message because the problem still exists. Transient states like this can only exist for a couple of minutes
· Green: end of error

The application manager performs the following verifications:

· From each PVSR directory of each machine it tries to access the Oracle database with the local settings.
· It examines the logs, tmp, tmp/done and tmp/sftp directories under each PVSR directory, as well as the root (/) directory of each machine, and signals a server event if the free space on the disk partition holding these directories is less than 1000 MB. The notification can be turned off separately for every server and partition pair with the DISABLE_LOW_DISK_SPACE_ERROR parameter in the CONFIG_INI.pm file, for example:
    $DISABLE_LOW_DISK_SPACE_ERROR{localhost}{/opt}=1;
· On the machine running the data loader (SQLLDR) it examines whether the number of files in the tmp/done directory exceeds the value set for the application manager component (see server configuration).
· On the machine running the data loader (SQLLDR) it examines whether the sqlldr program is available.
· It searches for Oracle error messages in the log file of the data loader (SQLLDR) component.
· It examines the time differences among the servers and signals if any difference is bigger than 10 seconds.
· It examines the amount of free space in the tablespaces of PVSR and signals if any of them has less than 5% free. The notification can be turned off separately for every tablespace with the DISABLE_LOW_TABLESPACE_SPACE_ERROR parameter in the CONFIG_INI.pm file, for example:
    $DISABLE_LOW_TABLESPACE_SPACE_ERROR{PVSR_DATA}=1;
  If the value is 2 instead of 1, PVSR checks the available space not against the current size of the tablespace but against its possible maximum size, taking the automatic tablespace extension feature into account. The check can be disabled for all tablespaces in a single line, using the __all__ key.
  This way PVSR will not even try to determine any tablespace usage, so its Oracle user will not require the SELECT_CATALOG_ROLE Oracle security role:
    $DISABLE_LOW_TABLESPACE_SPACE_ERROR{__all__}=1;
· If the Oracle database server is running on the same machine as the application manager module and the ORACLE_HOME parameter in the PVSR CONFIG_INI.pm file points to the ORACLE_HOME of the server, then the module checks the size of the SQL*Net log file. If the file is too big, it sends a server event. In that case it is highly recommended to stop the Oracle Listener, delete or move the file, and then restart the Oracle Listener
· If the SQLLDR, threshold, report or data migration module is running on the same server as the application manager component and its data file has not been modified in the past two hours, then the application manager raises an alarm
· It tests the connection between the quick evaluation threshold module and the collector modules

The data moving module performs the following verifications:

· Whether any Oracle error occurred during the last cycle

The data loader (SQLLDR) module performs the following verifications:

· For every running data collector, it checks for every collection interval completed by the collector:
  o Whether a new result file was created in time: if the collector cannot wait for the new file to appear (according to the parameter set for the collector server) then the data loader raises an alarm
  o Whether the file contains enough successful measurement results (see the REQUIRED_SUCCESSFUL_MEASUREMENTS parameter)
  o Whether the collector has skipped one or more collection cycles; if it has, the data loader raises an alarm.
    For example, this could happen if the data collection cycle is 1 minute and each collection takes 1 minute and 10 seconds to finish, since sooner or later the collector will skip a cycle
· If the data loading cycle takes more than the specified amount of time (see 5.3) two times in succession
· If the application manager (the primary one, if there are two) is running on the same server as the data loader component and its data file has not been modified in the past two hours, then the data loader raises an alarm
· The module calculates the number of lines in every measurement file using the wc command line utility. If wc does not return a valid result, the module sends a server event and automatically restarts itself. The automatic restart should correct the problem; if it does not, the module must be stopped and started manually

The threshold module performs the following verifications:

· It looks for gaps in the received quick evaluation data, which can typically happen if the data is received over UDP
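
The effect of the DISABLE_LOW_TABLESPACE_SPACE_ERROR values described above for the application manager can be illustrated with a short Python sketch. This is not PVSR code: the function name, its arguments and the way the free percentage is calculated are assumptions made for illustration. Only the 5% limit and the meaning of the values 1 and 2 are taken from the description.

```python
def tablespace_is_low(used_mb, current_size_mb, max_size_mb,
                      setting=None, limit_pct=5.0):
    """Return True if a low-space server event would be raised.

    Hypothetical helper; `setting` stands for the value of
    DISABLE_LOW_TABLESPACE_SPACE_ERROR for this tablespace (None = not set).
    """
    if setting == 1:
        # Notification turned off for this tablespace
        return False
    # With value 2 the free space is measured against the maximum size
    # the tablespace can reach through automatic extension
    total_mb = max_size_mb if setting == 2 else current_size_mb
    free_pct = 100.0 * (total_mb - used_mb) / total_mb
    return free_pct < limit_pct
```

With 980 MB used in a tablespace that is currently 1000 MB but may grow to 4000 MB, this sketch reports low space by default (2% free), but not with the value 2 (75.5% free of the maximum size).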
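
The skipped collection cycle example given for the data loader (a 1 minute cycle where each collection takes 1 minute and 10 seconds) can be reproduced with a small simulation. The scheduling model below is an assumption, since the collector's exact behavior is not described here: a late collection starts as soon as the previous one finishes, and a cycle is dropped when the collector is still busy at the time the following cycle becomes due. The function name is hypothetical.

```python
def skipped_cycles(interval_s, duration_s, cycles):
    """Return the indexes of the scheduled cycles that are never started.

    Assumed model, not taken from PVSR: cycle k is due at k * interval_s.
    """
    skipped = []
    free_at = 0  # time at which the collector finishes its current collection
    for k in range(cycles):
        due = k * interval_s
        if free_at >= due + interval_s:
            # Still busy when the following cycle is already due: drop this one
            skipped.append(k)
            continue
        start = max(free_at, due)  # start late if the previous run overran
        free_at = start + duration_s
    return skipped
```

Under this model, skipped_cycles(60, 70, 14) returns [6, 13]: the delay grows by 10 seconds per cycle until a whole cycle is lost, which matches the "sooner or later" in the description. With a 50 second collection time no cycle is skipped.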
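
The gap check of the threshold module can be pictured in the same way. How PVSR actually identifies a gap is not documented in this section, so the sketch assumes that every quick evaluation sample carries a timestamp and that samples are expected at a fixed interval. Both the function name and these assumptions are illustrative.

```python
def find_gaps(timestamps, interval_s):
    """Return (last_seen, next_seen) pairs with missing samples in between.

    Hypothetical helper; timestamps are seconds, interval_s is the
    expected spacing between consecutive samples.
    """
    ordered = sorted(timestamps)  # UDP does not guarantee arrival order
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > interval_s:
            gaps.append((prev, cur))
    return gaps
```

For samples received at 0, 10, 20, 40 and 50 seconds with a 10 second interval, the sketch reports one gap, between 20 and 40. Samples that merely arrive out of order are not reported.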