SG BBmSAMeX
This page describes the BBmSAMeX tool, developed as part of the SA1 and JRA1 components of the SEE-GRID-SCI project and contributed by the Faculty of Electrical Engineering Banja Luka.
Introduction
BBmSAMeX stands for BBmSAM eXtensions and is a collection of extensions to the BBmSAM monitoring system. It allows easier integration and data exchange between the BBmSAM portal and operational tools, application services, or any other automated end-point.
Main components
There are two classes of extensions:
- Portal extensions - SLA XML export
- Independent extensions - Filtering service and Uptime service
SLA XML Export
In a standard configuration, the BBmSAM Portal produces SLA data in HTML format, which is easy to read but not optimized for further data analysis. This is the main reason for introducing the XML export function. Two formats are available:
- XML Format
- Microsoft Excel compatible XML format
Pure XML has the following format:
<?xml version="1.0"?>
<bbmsla>
  <service abbr="CE" datefrom="2009-05-14" dateto="2009-05-20">
    <serviceinstance>
      <sitename>BA-01-ETFBL</sitename>
      <nodename>c01.grid.etfbl.net</nodename>
      <serviceabbr>CE</serviceabbr>
      <totaltime>6.88</totaltime>
      <availability>93.35</availability>
      <downtime>0</downtime>
      <natime>0</natime>
      <reliability>93.35</reliability>
    </serviceinstance>
  </service>
  <service abbr="SE" datefrom="2009-05-14" dateto="2009-05-20">
    <serviceinstance>
      <sitename>BA-01-ETFBL</sitename>
      <nodename>c02.grid.etfbl.net</nodename>
      <serviceabbr>SE</serviceabbr>
      <totaltime>7.00</totaltime>
      <availability>100</availability>
      <downtime>0</downtime>
      <natime>0</natime>
      <reliability>100</reliability>
    </serviceinstance>
  </service>
  ...
</bbmsla>
As can be seen, everything is contained in the bbmsla element. Each service element holds serviceinstance elements, and each serviceinstance contains the site name, node name, abbreviated service name, total time, availability for the given period, declared downtime, time with no defined status (natime), and reliability (which takes declared downtime into account).
This output is meant to be consumed by external tools that further process the available SLA data (e.g., Nagios probes, "Tactical infrastructure overview", etc.).
There is also a Microsoft Excel compatible XML output. Although it is valid XML, this output is intended for people who need preformatted data for further processing in a human-readable form (results are color-coded to indicate availability, data can be sorted, etc.). It can also be opened in various other spreadsheet applications.
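As an illustration of how an external tool might consume the pure XML export, the following minimal Python sketch parses a copy of the sample document and lists nodes whose availability falls below a threshold. The function names parse_sla and below_threshold and the 95% threshold are hypothetical choices made for this example; they are not part of BBmSAMeX.

```python
import xml.etree.ElementTree as ET

# Sample copied from the pure XML export format (two service instances).
SAMPLE = """<bbmsla>
  <service abbr="CE" datefrom="2009-05-14" dateto="2009-05-20">
    <serviceinstance>
      <sitename>BA-01-ETFBL</sitename>
      <nodename>c01.grid.etfbl.net</nodename>
      <serviceabbr>CE</serviceabbr>
      <totaltime>6.88</totaltime>
      <availability>93.35</availability>
      <downtime>0</downtime>
      <natime>0</natime>
      <reliability>93.35</reliability>
    </serviceinstance>
  </service>
  <service abbr="SE" datefrom="2009-05-14" dateto="2009-05-20">
    <serviceinstance>
      <sitename>BA-01-ETFBL</sitename>
      <nodename>c02.grid.etfbl.net</nodename>
      <serviceabbr>SE</serviceabbr>
      <totaltime>7.00</totaltime>
      <availability>100</availability>
      <downtime>0</downtime>
      <natime>0</natime>
      <reliability>100</reliability>
    </serviceinstance>
  </service>
</bbmsla>"""

NUMERIC_FIELDS = ("totaltime", "availability", "downtime", "natime", "reliability")


def parse_sla(xml_text):
    """Return one dict per <serviceinstance>, with numeric fields as floats."""
    root = ET.fromstring(xml_text)
    rows = []
    for service in root.iter("service"):
        for inst in service.iter("serviceinstance"):
            row = {
                "service": service.get("abbr"),
                "datefrom": service.get("datefrom"),
                "dateto": service.get("dateto"),
                "sitename": inst.findtext("sitename"),
                "nodename": inst.findtext("nodename"),
            }
            for field in NUMERIC_FIELDS:
                row[field] = float(inst.findtext(field))
            rows.append(row)
    return rows


def below_threshold(rows, min_availability):
    """Names of nodes whose availability is under the given percentage."""
    return [r["nodename"] for r in rows if r["availability"] < min_availability]


rows = parse_sla(SAMPLE)
for r in rows:
    print(r["service"], r["nodename"], r["availability"], r["reliability"])
print("Below 95%:", below_threshold(rows, 95.0))
```

In practice the XML text would be fetched from the portal rather than embedded; the export URL is not given on this page, so it is left out here.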
Filtering service
The filtering service uses a REST-like approach to ease filtering of service instances in the infrastructure according to their monitoring status and various other parameters.
This service is located at http://c01.grid.etfbl.net/bbmsam/sam_xml.php and can be accessed as in the following example:
Request URL: http://c01.grid.etfbl.net/bbmsam/sam_xml.php/latest/country/ba/service/se
Result:
<SAMTESTDATA VOID="1" VONAME="seegrid" SERVER="c01.grid.etfbl.net" DATE="2009-05-21 10:19:33">
  <SITE SITEID="3" SITENAME="BA-01-ETFBL" COUNTRYID="2" COUNTRYNAME="Bosnia and Herzegovina">
    <NODE NODEID="166" NODENAME="c02.grid.etfbl.net">
      <SERVICE SERVICEID="2" SERVICEABBR="SE" SERVICENAME="Storage Element">
        <TEST NAME="SE-lcg-cp" ABBR="cp" ISCRITICAL="Y" TIMESTAMP="1242900443" DATE="2009-05-21 10:07:23" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <TEST NAME="SE-lcg-cr" ABBR="cr" ISCRITICAL="Y" TIMESTAMP="1242900444" DATE="2009-05-21 10:07:24" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <TEST NAME="SE-lcg-del" ABBR="del" ISCRITICAL="Y" TIMESTAMP="1242900444" DATE="2009-05-21 10:07:24" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <CRITICAL STATUSID="10" STATUSNAME="ok"/>
        <NONCRITICAL STATUSID="0" STATUSNAME="na"/>
      </SERVICE>
    </NODE>
  </SITE>
  <SITE SITEID="69" SITENAME="BA-03-ETFSA" COUNTRYID="2" COUNTRYNAME="Bosnia and Herzegovina">
    <NODE NODEID="173" NODENAME="n02.grid.etf.unsa.ba">
      <SERVICE SERVICEID="2" SERVICEABBR="SE" SERVICENAME="Storage Element">
        <TEST NAME="SE-lcg-cp" ABBR="cp" ISCRITICAL="Y" TIMESTAMP="1242900448" DATE="2009-05-21 10:07:28" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <TEST NAME="SE-lcg-cr" ABBR="cr" ISCRITICAL="Y" TIMESTAMP="1242900448" DATE="2009-05-21 10:07:28" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <TEST NAME="SE-lcg-del" ABBR="del" ISCRITICAL="Y" TIMESTAMP="1242900449" DATE="2009-05-21 10:07:29" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <CRITICAL STATUSID="10" STATUSNAME="ok"/>
        <NONCRITICAL STATUSID="0" STATUSNAME="na"/>
      </SERVICE>
    </NODE>
  </SITE>
  <SITE SITEID="26" SITENAME="BA-04-PMFSA" COUNTRYID="2" COUNTRYNAME="Bosnia and Herzegovina">
    <NODE NODEID="653" NODENAME="se.grid.pmf.unsa.ba">
      <SERVICE SERVICEID="2" SERVICEABBR="SE" SERVICENAME="Storage Element">
        <TEST NAME="SE-lcg-cp" ABBR="cp" ISCRITICAL="Y" TIMESTAMP="1242900455" DATE="2009-05-21 10:07:35" STATUSID="50" STATUSNAME="error" SUMMARYDATA=""/>
        <TEST NAME="SE-lcg-cr" ABBR="cr" ISCRITICAL="Y" TIMESTAMP="1242900455" DATE="2009-05-21 10:07:35" STATUSID="50" STATUSNAME="error" SUMMARYDATA=""/>
        <TEST NAME="SE-lcg-del" ABBR="del" ISCRITICAL="Y" TIMESTAMP="1242900455" DATE="2009-05-21 10:07:35" STATUSID="50" STATUSNAME="error" SUMMARYDATA=""/>
        <CRITICAL STATUSID="50" STATUSNAME="error"/>
        <NONCRITICAL STATUSID="0" STATUSNAME="na"/>
      </SERVICE>
    </NODE>
  </SITE>
</SAMTESTDATA>
This example shows the latest test results for storage elements in Bosnia and Herzegovina: all individual tests, together with the critical and non-critical test status IDs and names.
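A client could reduce such a response to one line per service instance by reading the CRITICAL element of each SERVICE. The sketch below is only an illustration: it runs against an abbreviated copy of the response (two sites, one test each), and the function name critical_statuses is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of a filtering service response (two sites, one test each).
SAMPLE = """<SAMTESTDATA VOID="1" VONAME="seegrid" SERVER="c01.grid.etfbl.net" DATE="2009-05-21 10:19:33">
  <SITE SITEID="3" SITENAME="BA-01-ETFBL" COUNTRYID="2" COUNTRYNAME="Bosnia and Herzegovina">
    <NODE NODEID="166" NODENAME="c02.grid.etfbl.net">
      <SERVICE SERVICEID="2" SERVICEABBR="SE" SERVICENAME="Storage Element">
        <TEST NAME="SE-lcg-cp" ABBR="cp" ISCRITICAL="Y" TIMESTAMP="1242900443" DATE="2009-05-21 10:07:23" STATUSID="10" STATUSNAME="ok" SUMMARYDATA=""/>
        <CRITICAL STATUSID="10" STATUSNAME="ok"/>
        <NONCRITICAL STATUSID="0" STATUSNAME="na"/>
      </SERVICE>
    </NODE>
  </SITE>
  <SITE SITEID="26" SITENAME="BA-04-PMFSA" COUNTRYID="2" COUNTRYNAME="Bosnia and Herzegovina">
    <NODE NODEID="653" NODENAME="se.grid.pmf.unsa.ba">
      <SERVICE SERVICEID="2" SERVICEABBR="SE" SERVICENAME="Storage Element">
        <TEST NAME="SE-lcg-cp" ABBR="cp" ISCRITICAL="Y" TIMESTAMP="1242900455" DATE="2009-05-21 10:07:35" STATUSID="50" STATUSNAME="error" SUMMARYDATA=""/>
        <CRITICAL STATUSID="50" STATUSNAME="error"/>
        <NONCRITICAL STATUSID="0" STATUSNAME="na"/>
      </SERVICE>
    </NODE>
  </SITE>
</SAMTESTDATA>"""


def critical_statuses(xml_text):
    """Return (site, node, service, critical status name) for every SERVICE."""
    root = ET.fromstring(xml_text)
    result = []
    for site in root.iter("SITE"):
        for node in site.iter("NODE"):
            for service in node.iter("SERVICE"):
                critical = service.find("CRITICAL")
                status = critical.get("STATUSNAME") if critical is not None else None
                result.append((
                    site.get("SITENAME"),
                    node.get("NODENAME"),
                    service.get("SERVICEABBR"),
                    status,
                ))
    return result


statuses = critical_statuses(SAMPLE)
for site, node, service, status in statuses:
    print(site, node, service, status)

# Service instances whose aggregated critical status is not "ok".
failing = [entry for entry in statuses if entry[3] != "ok"]
```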
Available parameters (all values are case insensitive, e.g. ba-01-etfbl == BA-01-ETFBL):
- countryid - ID of the country (/countryid/12)
- country - accepts both full names (/country/greece) and the 2-letter abbreviated form (/country/gr)
- siteid - ID of the site (/siteid/3)
- sitename - name of the site (/sitename/ba-01-etfbl)
- nodeid - ID of the node (/nodeid/3)
- nodename - FQDN of node (/nodename/c01.grid.etfbl.net)
- serviceid - ID of the service (/serviceid/1)
- service, servicename, serviceabbr - (abbreviated) name of the service (/servicename/ce)
- uptime - hide (default) or show time since last status change (/uptime/yes)
- details - show results of individual tests (default) or not (/details/no)
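Because each filter is a /name/value pair appended to the path, a request URL can be assembled mechanically. The following Python sketch shows one way to do it. It assumes that the /latest prefix seen in the examples on this page is always required, and it percent-encodes the values as a precaution; neither point is confirmed by the service description. The function name build_filter_url is hypothetical.

```python
from urllib.parse import quote

BASE_URL = "http://c01.grid.etfbl.net/bbmsam/sam_xml.php"


def build_filter_url(filters, base_url=BASE_URL):
    """Build a filtering service URL from (parameter, value) pairs.

    Assumption: every request starts with /latest, as in the examples on
    this page. Values are percent-encoded so that unexpected characters
    cannot break the path structure.
    """
    parts = [base_url, "latest"]
    for name, value in filters:
        parts.append(quote(str(name), safe=""))
        parts.append(quote(str(value), safe=""))
    return "/".join(parts)


url = build_filter_url([
    ("country", "ba"),
    ("service", "se"),
    ("sitename", "BA-03-ETFSA"),
    ("details", "no"),
    ("uptime", "yes"),
])
print(url)
```

A list of pairs is used instead of a dictionary so that the parameters appear in the URL in the order the caller supplied them.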
Examples:
http://c01.grid.etfbl.net/bbmsam/sam_xml.php/latest/country/ba/service/se/sitename/BA-03-ETFSA/details/no/uptime/yes
Returns:
<SAMTESTDATA VOID="1" VONAME="seegrid" SERVER="c01.grid.etfbl.net" DATE="2009-05-21 10:29:55">
  <SITE SITEID="69" SITENAME="BA-03-ETFSA" COUNTRYID="2" COUNTRYNAME="Bosnia and Herzegovina">
    <NODE NODEID="173" NODENAME="n02.grid.etf.unsa.ba">
      <SERVICE SERVICEID="2" SERVICEABBR="SE" SERVICENAME="Storage Element">
        <UPTIME STRICT="1347" UNSTRICT="1290183"/>
        <CRITICAL STATUSID="10" STATUSNAME="ok"/>
        <NONCRITICAL STATUSID="0" STATUSNAME="na"/>
      </SERVICE>
    </NODE>
  </SITE>
</SAMTESTDATA>
Uptime service
The Uptime Web Service is located at http://c01.grid.etfbl.net/bbmsam/uptimews.php, with the WSDL accessible at http://c01.grid.etfbl.net/bbmsam/uptimews.php?wsdl and a sample web client located at http://c01.grid.etfbl.net/bbmsam/uptimeclient.php.
This service filters service instances whose current test status is OK according to test VO, service, and required minimum time since the last status change. Time is calculated either since the last non-OK status (strict) or since the last ERROR, CRIT or MAINT status (non-strict). The result is a list of node names that satisfy the requirements, returned in random order.
Service ce in VO ops.vo.egee-see.org UP for 86400 strictly - no
grid-ce.ii.edu.mk
grid-lab-ce.ii.edu.mk
grf-see-grid-r5.grf.hr
grid01.rcub.bg.ac.yu
ce.fit.upt.al
ce02.grid.acad.bg
c01.grid.etfbl.net
...
grid-ce.feit.ukim.edu.mk
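The difference between the strict and non-strict calculations can be illustrated with a small Python sketch over a status history. This is not the service's implementation. The history format, the lowercase status names, the "warn" status, and the fallback to the earliest record when no resetting status is found are all assumptions made for the example; only OK, ERROR, CRIT and MAINT are named in the description of the service.

```python
# Statuses that reset the non-strict uptime (named in the service description).
HARD_FAILURES = {"error", "crit", "maint"}


def uptime_seconds(history, now, strict=True):
    """Seconds since the last status that resets the uptime counter.

    history: list of (unix_timestamp, status) tuples, oldest first.
    strict=True  -> any non-"ok" status resets the counter.
    strict=False -> only "error", "crit" or "maint" reset the counter.
    Returns None when the current (latest) status is not "ok".
    """
    if not history or history[-1][1] != "ok":
        return None
    for timestamp, status in reversed(history):
        if strict:
            resets = status != "ok"
        else:
            resets = status in HARD_FAILURES
        if resets:
            return now - timestamp
    # Assumption: with no resetting status on record, count from the
    # earliest known result.
    return now - history[0][0]


# Hypothetical history: "warn" is an invented example of a non-OK status
# that is not ERROR, CRIT or MAINT.
history = [(1000, "error"), (2000, "ok"), (3000, "warn"), (4000, "ok")]
strict_uptime = uptime_seconds(history, now=5000, strict=True)
non_strict_uptime = uptime_seconds(history, now=5000, strict=False)
print(strict_uptime, non_strict_uptime)
```

With this history the strict value is counted from the "warn" result and the non-strict value from the earlier "error" result, so the strict uptime is never larger than the non-strict one.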
XML SLA export
- Available through BBmSAM portal
Example output:
<bbmsla>
  <service abbr="CE" datefrom="2009-05-12" dateto="2009-05-18">
    <serviceinstance>
      <sitename>BA-01-ETFBL</sitename>
      <nodename>c01.grid.etfbl.net</nodename>
      <serviceabbr>CE</serviceabbr>
      <totaltime>7.00</totaltime>
      <availability>100</availability>
      <downtime>0</downtime>
      <natime>0</natime>
      <reliability>100</reliability>
    </serviceinstance>
  </service>
  …
</bbmsla>
EGEE compatible SAMDB XSQL data export for service instances
Available at: http://c01.grid.etfbl.net/bbmsam/sqldb/last_service_status_per_site.xsql
Parameters: vo_name, site_name
Example: http://c01.grid.etfbl.net/bbmsam/sqldb/last_service_status_per_site.xsql?vo_name=SEEGRID
<page>
  <ROWSET>
    <ROW num="1">
      <SITENAME>AEGIS01-IPB-SCL</SITENAME>
      <SERVICEABBR>CE</SERVICEABBR>
      <NODENAME>ce64.ipb.ac.rs</NODENAME>
      <LASTRESULT>error</LASTRESULT>
      <LASTSUBMISSION>2010-02-17 09:36:13</LASTSUBMISSION>
    </ROW>
    …
  </ROWSET>
</page>
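The XSQL export can be read in much the same way as the other XML outputs. The sketch below builds the query URL with the two documented parameters and turns each ROW into a dictionary. It assumes that site_name is optional and can be combined with vo_name in the same query string, which this page does not state explicitly. The function names build_url and parse_rows are hypothetical, and the sample contains only the single row shown in the example output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE = "http://c01.grid.etfbl.net/bbmsam/sqldb/last_service_status_per_site.xsql"


def build_url(vo_name, site_name=None):
    """Build the export URL; site_name is assumed to be optional."""
    params = {"vo_name": vo_name}
    if site_name is not None:
        params["site_name"] = site_name
    return BASE + "?" + urlencode(params)


# Single row copied from the example output.
SAMPLE = """<page>
  <ROWSET>
    <ROW num="1">
      <SITENAME>AEGIS01-IPB-SCL</SITENAME>
      <SERVICEABBR>CE</SERVICEABBR>
      <NODENAME>ce64.ipb.ac.rs</NODENAME>
      <LASTRESULT>error</LASTRESULT>
      <LASTSUBMISSION>2010-02-17 09:36:13</LASTSUBMISSION>
    </ROW>
  </ROWSET>
</page>"""


def parse_rows(xml_text):
    """Return one dict per <ROW>, keyed by the child element names."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in row} for row in root.iter("ROW")]


export_url = build_url("SEEGRID")
export_rows = parse_rows(SAMPLE)
print(export_url)
for row in export_rows:
    print(row["SITENAME"], row["NODENAME"], row["LASTRESULT"])
```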
EGEE compatible SAMDB XSQL data export for services
Aggregates service instance statuses across the site (when multiple instances exist).
Available at: http://c01.grid.etfbl.net/bbmsam/sqldb/service_status_per_site.xsql
Parameters: vo_name, site_name
Example: http://c01.grid.etfbl.net/bbmsam/sqldb/service_status_per_site.xsql?vo_name=SEEGRID
<page>
  <ROWSET>
    <ROW num="1">
      <SITENAME>AEGIS01-IPB-SCL</SITENAME>
      <SERVICEABBR>CE</SERVICEABBR>
      <STATUS>down</STATUS>
      <TIMESTAMP>2010-02-17 09:36:13</TIMESTAMP>
    </ROW>
    …
  </ROWSET>
</page>
Conclusion
The presented extensions allow easier integration of monitoring results into operational and application-specific tools. These tools are used in production; for bug reports or feature requests, please contact the extension developer, Mihajlo Savic (m at etfbl dot net).
