Monitoring processes with OpenNMS

Posted: 3rd July 2013 by marcel in OpenNMS

The goal is to monitor whether process XY is running on a Linux server. As with everything in OpenNMS, the configuration is actually quite simple... once you know how it works ;)

In /etc/opennms/datacollection/netsnmp.xml, add the following entries:

   <resourceType name="procIndex" label="Process Table Index (UCD-SNMP MIB)"
                 resourceLabel="${prNames} (index ${index})">
     <persistenceSelectorStrategy class="org.opennms.netmgt.collectd.PersistAllSelectorStrategy"/>
     <storageStrategy class="org.opennms.netmgt.dao.support.IndexStorageStrategy"/>
   </resourceType>

   <group name="net-snmp-proc" ifType="all">
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.1" instance="procIndex" alias="prIndex" type="integer" />
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.2" instance="procIndex" alias="prNames" type="string" />
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.3" instance="procIndex" alias="prMin" type="gauge" />
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.4" instance="procIndex" alias="prMax" type="gauge" />
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.5" instance="procIndex" alias="prCount" type="gauge" />
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.100" instance="procIndex" alias="prErrorFlag" type="integer" />
       <mibObj oid=".1.3.6.1.4.1.2021.2.1.101" instance="procIndex" alias="prErrMessage" type="string" />
   </group>

In the existing entries

<systemDef name="Net-SNMP (UCD)">

and

<systemDef name="Net-SNMP">

an include has to be added:

    <includeGroup>net-snmp-proc</includeGroup>
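
For orientation, the include sits inside the <collect> element of each systemDef. A rough sketch of what the result might look like; the sysoidMask value and the comment standing in for the existing groups are assumptions about a typical shipped netsnmp.xml, so leave whatever your file already contains untouched:

```xml
<systemDef name="Net-SNMP">
    <!-- assumption: the mask already present in the shipped file; do not change it -->
    <sysoidMask>.1.3.6.1.4.1.8072.3.</sysoidMask>
    <collect>
        <!-- existing includeGroup entries stay exactly as they are -->
        <includeGroup>net-snmp-proc</includeGroup>
    </collect>
</systemDef>
```

The same line goes into the <collect> element of the "Net-SNMP (UCD)" systemDef.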

Next we need thresholds for it:

   <threshold type="low" ds-type="procIndex" value="0"
          rearm="1.0" trigger="2" ds-label="prNames"
          triggeredUEI="uei.opennms.org/custom/proc-down"
          rearmedUEI="uei.opennms.org/custom/proc-down-rearmed" ds-name="prCount"/>
   <threshold type="high" ds-type="procIndex" value="1.0"
          rearm="0.0" trigger="2" ds-label="prNames"
          triggeredUEI="uei.opennms.org/custom/proc-error"
          rearmedUEI="uei.opennms.org/custom/proc-error-rearmed" ds-name="prErrorFlag"/>
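
The thresholds have to live inside a <group> element of the threshold configuration (thresholds.xml in a default install, as far as I know). A minimal sketch of the placement; the group name netsnmp and the rrdRepository path are assumptions and should match the group that already exists in your file:

```xml
<!-- assumption: group name and rrdRepository as found in a default thresholds.xml -->
<group name="netsnmp" rrdRepository="/var/lib/opennms/rrd/snmp/">
    <!-- existing threshold and expression entries stay as they are -->
    <!-- the two procIndex thresholds (proc-down and proc-error) go here -->
</group>
```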

Add these events to the file /etc/opennms/events/programmatic.events.xml:

   <event>
       <uei xmlns="">uei.opennms.org/custom/proc-down</uei>
       <event-label xmlns="">Process Down</event-label>
       <descr xmlns="">Threshold exceeded for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</descr>
       <logmsg dest="logndisplay">Threshold exceeded for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</logmsg>
       <severity xmlns="">Minor</severity>
       <alarm-data reduction-key="%uei%!%nodeid%!%parm[label]%" alarm-type="1" auto-clean="false" />
   </event>
   <event>
       <uei xmlns="">uei.opennms.org/custom/proc-down-rearmed</uei>
       <event-label xmlns="">Process Down - Re-Armed</event-label>
       <descr xmlns="">Threshold rearmed for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</descr>
       <logmsg dest="logndisplay">Threshold rearmed for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</logmsg>
       <severity xmlns="">Normal</severity>
       <alarm-data
              clear-key="uei.opennms.org/custom/proc-down!%nodeid%!%parm[label]%"
              reduction-key="%uei%:%nodeid%:%parm[label]%" alarm-type="2" auto-clean="true" />
   </event>
   <event>
       <uei xmlns="">uei.opennms.org/custom/proc-error</uei>
       <event-label xmlns="">Process Error</event-label>
       <descr xmlns="">Threshold exceeded for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</descr>
       <logmsg dest="logndisplay">Threshold exceeded for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</logmsg>
       <severity xmlns="">Minor</severity>
       <alarm-data reduction-key="%uei%!%nodeid%!%parm[label]%" alarm-type="1" auto-clean="false" />
   </event>
   <event>
       <uei xmlns="">uei.opennms.org/custom/proc-error-rearmed</uei>
       <event-label xmlns="">Process Error - Re-Armed</event-label>
       <descr xmlns="">Threshold rearmed for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</descr>
       <logmsg dest="logndisplay">Threshold rearmed for %service% datasource %parm[ds]% on interface %interface%, parms: %parm[all]%</logmsg>
       <severity xmlns="">Normal</severity>
       <alarm-data
              clear-key="uei.opennms.org/custom/proc-error!%nodeid%!%parm[label]%"
              reduction-key="%uei%:%nodeid%:%parm[label]%" alarm-type="2" auto-clean="true" />
   </event>

After that, OpenNMS has to be restarted:

service opennms restart

On the client side, simply add the following entry to snmpd.conf:

proc PROCESSNAME 0 1

and restart snmpd:

service snmpd restart

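The two numbers define the MAX and MIN, in that order, for the number of running instances of this process. So in my case: maximum 0 (= unlimited), minimum 1. If the count falls outside that range, snmpd sets prErrorFlag to 1 for that process.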

PS: This is taken almost 1:1 from the OpenNMS wiki: http://www.opennms.org/wiki/Process_Monitoring_and_Collection