Hitachi Energy
NEM Supervision Service
The NEM Supervision Service monitors NEM server resources and generates notifications / alarms when resource usage passes the configured thresholds.
Thresholds can be configured for the following levels:
minor level,
major level,
critical level.
The supervision service generates system alarms that can be cleared by the user. If the user clears an alarm while the resource condition persists, the alarm is generated again.
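The re-raise behaviour described above can be sketched as follows. This is a minimal illustration, not the actual NEM implementation: the function name evaluate_run and its parameters are hypothetical, and the sketch deliberately leaves open what happens to an alarm once the condition disappears, because the text does not specify it.

```python
def evaluate_run(condition_active: bool, alarm_present: bool) -> bool:
    """Return whether an alarm is present after one supervision run.

    Hypothetical helper: if the monitored condition still holds and no alarm
    is present (for example because the user cleared it), the alarm is
    raised again. An existing alarm is left untouched.
    """
    if condition_active and not alarm_present:
        return True  # raise, or raise again after a user clear
    return alarm_present


# Simulated sequence of supervision runs with a persisting condition.
alarm = evaluate_run(condition_active=True, alarm_present=False)  # alarm raised
alarm = False                                                     # user clears the alarm
alarm = evaluate_run(condition_active=True, alarm_present=alarm)  # alarm raised again
```

With the default timer setting, a cleared alarm would under these assumptions reappear with the next run as long as the resource condition persists.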
The following resources are monitored:
partition usage,
database connections.
The NEM Supervision Service is implemented with a Linux systemd timer (nem-supervision.timer, set to 20 seconds) and the supervision service unit (nem-supervision.service). The status of the NEM Supervision Service is shown by the lsnem command.
The service reads the configuration file /opt/nem/etc/nem_supervision.yaml each time nem-supervision.service is executed by nem-supervision.timer. Changes can therefore be made at any time and take effect with the next run (NEM administration permissions required). The default content of the nem_supervision.yaml file is as follows:
# nem supervision logging to file
supervision_LOG: False
# nem Supervision configuration file
supervision_DEBUG: False
# Data Base connections Supervision
DB_connections_DEBUG: False
DB_LOW_LEVEL_CNT: 250
DB_HIGH_LEVEL_CNT: 270
DB_CRITICAL_LEVEL_CNT: 290
 
# Disk partition usage Supervision
DISK_partitions_DEBUG: False
DISK_partitions_db_name: 'NEM_DATABASE'
DISK_partitions: [
'/': [70, 85, 98] ,
'/var': [70, 85, 98],
'/opt': [70, 85, 98]
]
SERVICES_SUPERVISED_DEBUG: False
SERVICES_SUPERVISED_CLEAR: True
SERVICES_SUPERVISED: [
'telegraf',
'nem-influxdb',
'nem-voyager-pm',
'nem-pb-collector',
'nem-bp-pmasyncmgr'
]
Example of generated alarms:
Setting supervision_LOG to True generates a nem-supervision.log file containing result / status information for each run of nem-supervision.
The DEBUG settings can be set to True to expand the logging information.
The database connection supervision settings depend on the scaling of the system and should not be changed by the user.
The disk partition settings define the used-space percentage of each partition at which the minor, major, and critical alarms are generated.
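As an illustration of how a three-value threshold list such as [70, 85, 98] might be evaluated, the following sketch computes the used-space percentage of a partition and maps it to a severity. The function names are hypothetical, the order minor / major / critical and the use of "greater than or equal" are assumptions, and the real service may calculate usage differently (for example, the way df accounts for reserved blocks).

```python
import shutil


def used_percent(path: str) -> float:
    """Used space of the file system containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify(value: float, thresholds: list) -> "str | None":
    """Map a value to a severity, or None if no threshold is reached.

    Assumption: thresholds are ordered [minor, major, critical] and a level
    is reached when the value is greater than or equal to its threshold.
    """
    minor, major, critical = thresholds
    if value >= critical:
        return "critical"
    if value >= major:
        return "major"
    if value >= minor:
        return "minor"
    return None


# Thresholds taken from the default DISK_partitions entry for '/'.
severity = classify(used_percent("/"), [70, 85, 98])
print("/", severity)
```

The same classification could be applied to the database connection counters (DB_LOW_LEVEL_CNT, DB_HIGH_LEVEL_CNT, DB_CRITICAL_LEVEL_CNT), with a connection count in place of a percentage.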
To configure the NEM-Supervision timer:
The following systemd configuration file can be modified to execute nem_supervision at a different interval or schedule:
file: /usr/lib/systemd/system/nem-supervision.timer
add/modify to run at a fixed interval (in seconds): OnUnitActiveSec=120
modify to run based on a calendar definition: OnCalendar=*-*-* *:*:00,20,40
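To make the calendar definition concrete: OnCalendar=*-*-* *:*:00,20,40 matches every minute at seconds 00, 20 and 40. The sketch below only illustrates which wall-clock times such an expression selects. It is not systemd's scheduler, and the function name next_trigger is hypothetical.

```python
from datetime import datetime, timedelta


def next_trigger(now: datetime, seconds=(0, 20, 40)) -> datetime:
    """Return the first time strictly after `now` whose second is in `seconds`."""
    base = now.replace(microsecond=0)
    # Step forward one second at a time; a match must occur within 60 steps.
    for offset in range(1, 61):
        candidate = base + timedelta(seconds=offset)
        if candidate.second in seconds:
            return candidate
    raise ValueError("seconds must contain at least one value in 0..59")


print(next_trigger(datetime(2024, 1, 1, 12, 0, 5)))   # 2024-01-01 12:00:20
print(next_trigger(datetime(2024, 1, 1, 12, 0, 40)))  # 2024-01-01 12:01:00
```

By contrast, OnUnitActiveSec=120 is relative: it schedules the next run 120 seconds after the unit was last activated, independent of the wall clock. In both cases the actual activation time also depends on the timer's AccuracySec setting.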
To activate the changes in nem-supervision.timer, execute the following systemd commands (root permissions required):
systemctl daemon-reload
systemctl restart nem-supervision.timer