
Incidents

Incident detection

Until now, we have used the map to get a detailed picture of the status of our service. Status reporting covers that aspect by examining every entity discovered through the rules configured in the map, calculating each entity's status, and correlating all that information into a summarized view of overall health. Incidents in Service Operations, by contrast, focus on determining the causes of the detected issues and their impact, if any, on the service.

In the map we created and have been using throughout these sections of the documentation, there are two places to look when dealing with incidents. First, any detected incident is listed in the upper right corner of the service overview module, just as the status of the map is presented on the left-hand side of the screen. Second, there is a dedicated incidents viewer module, accessible from the main menu of Service Operations.

Incidents in Service Operations are associated with an impact level based on how the entities involved potentially affect a number of elements (end users, for example). For that impact level to be calculated properly, impact queries have to be defined in the map so that Service Operations can determine how many elements, and which ones, are affected by any reported issue. To illustrate this, we will define an impact query for the current map with the following steps:

  • Head to the administration section of Service Operations and load the configuration of the e-commerce map.

  • On the map, click on the application module and, in the entity details form, click on the impact header to show that subsection.

  • Set the following configuration:

    • Generate incidents to this entity. The switch activates the incident viewer.

    • Query

      from demo.ecommerce.data
        where isnotnull(clientIpAddress)
        select str(clientIpAddress) as clientIp
        select decode(true,
          uri->"addtocart", "addtocart",
          uri->"purchase", "purchase",
          uri->"product.screen", "product_details",
          uri->"category.screen", "category_details",
          uri->"view", "checkout",
          "browse") as applicationModule
        select ifthenelse(statusCode>=500, 1.0, 0.0) as applicationError
        group every 1h by applicationModule, clientIp, userAgent
        select sum(applicationError) as isError
        where isError>0
        group every 1h by applicationModule, userAgent, clientIp
        select last(isError) as detectedErrors

    • Unit: end users

Apply and then publish to save the configuration. Then, run the map to see how this configuration takes effect.
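
To make the logic of the impact query easier to follow, here is a rough Python sketch of what it appears to compute. It is not how Service Operations evaluates the query. It assumes that `->` is a substring match, that each event is a plain dictionary carrying the fields used in the query plus an `eventdate` timestamp (an assumed field name), and it folds the two identical hourly groupings into one. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Ordered (substring, module) pairs mirroring the decode(...) call in the query.
# The first matching rule wins; "browse" is the fallback value.
MODULE_RULES = [
    ("addtocart", "addtocart"),
    ("purchase", "purchase"),
    ("product.screen", "product_details"),
    ("category.screen", "category_details"),
    ("view", "checkout"),
]


def classify_module(uri):
    """Map a request URI to an application module (assumes '->' means 'contains')."""
    for needle, module in MODULE_RULES:
        if needle in uri:
            return module
    return "browse"


def detected_errors(events):
    """Sum 5xx responses per (hour, module, client IP, user agent).

    Only groups with at least one error are returned, matching 'where isError>0'.
    """
    totals = defaultdict(float)
    for ev in events:
        # where isnotnull(clientIpAddress)
        if ev.get("clientIpAddress") is None:
            continue
        # group every 1h: truncate the (assumed) eventdate field to the hour
        hour = ev["eventdate"].replace(minute=0, second=0, microsecond=0)
        key = (
            hour,
            classify_module(ev["uri"]),
            str(ev["clientIpAddress"]),
            ev["userAgent"],
        )
        # ifthenelse(statusCode>=500, 1.0, 0.0), then sum(applicationError)
        totals[key] += 1.0 if ev["statusCode"] >= 500 else 0.0
    return {key: count for key, count in totals.items() if count > 0}


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    sample = [
        {"eventdate": datetime(2024, 1, 1, 10, 5), "clientIpAddress": "10.0.0.1",
         "uri": "/cart/addtocart?id=1", "statusCode": 500, "userAgent": "UA"},
        {"eventdate": datetime(2024, 1, 1, 10, 50), "clientIpAddress": "10.0.0.2",
         "uri": "/home", "statusCode": 200, "userAgent": "UA"},
    ]
    for key, count in detected_errors(sample).items():
        print(key, count)
```

Each key in the result identifies one client (by IP address and user agent) that hit server errors in a given module during a given hour, which is consistent with the unit being set to end users.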