Incident Details

The Incident Details page provides a detailed report of the selected incident.

Complete the following steps to view the Incident Details page.
  1. On the Navigation bar, click AI Assurance > AI Analytics > Incidents. The Incidents page is displayed.
  2. Click on the Severity or Date attribute of the specific incident. The Incident Details page is displayed.
    Incident Details Page (Upper Portion Only)
The Incident Details page displays the severity level of the selected incident beside the Incident Details title and the description of the incident below the title. The Incident Details page has the following components:
  • Incident Information tile
  • Insights tile
  • Network Impact tile
  • Metrics Graphs

Incident Information Tile

The Incident Information tile displays the client impact count, AP impact count, incident category, incident sub-category, type, scope, duration, event start time, and event end time.

To view the impacted clients, click the value under Client Impact Count. The Impacted Clients dialog box is displayed. The hostname, MAC address, username, manufacturer, OS type, and network of each impacted client are displayed in the table. You can use the search option to find a specific client by MAC address, manufacturer, or network.
Note: The hostname is displayed only if the user has successfully obtained an IP address from DHCP. If not, the MAC address is displayed in the Hostname column.
Note: The username is displayed only if the user has successfully passed authentication. If not, the MAC address is displayed in the Username column.
Impacted Clients Dialog Box
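The fallback behavior described in the two notes above can be sketched as follows. This is only an illustration of the display rule, not the product's implementation; the `ImpactedClient` record and both helper functions are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImpactedClient:
    """Hypothetical record for one row of the Impacted Clients table."""
    mac: str
    hostname: Optional[str] = None  # assumed to be known only after a DHCP lease
    username: Optional[str] = None  # assumed to be known only after authentication


def display_hostname(client: ImpactedClient) -> str:
    # Show the MAC address in the Hostname column when no hostname is known.
    return client.hostname or client.mac


def display_username(client: ImpactedClient) -> str:
    # Show the MAC address in the Username column when no username is known.
    return client.username or client.mac


# Made-up sample rows: one fully identified client, one that never got an IP.
known = ImpactedClient(mac="AA:BB:CC:00:00:01", hostname="laptop-7", username="alice")
unknown = ImpactedClient(mac="AA:BB:CC:00:00:02")
```

With these sample rows, `display_hostname(unknown)` and `display_username(unknown)` both return the MAC address, which matches what the notes describe for clients that did not complete DHCP or authentication.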

To troubleshoot a client, click its Hostname in the Impacted Clients dialog box. The Troubleshooting page is displayed. For more information, refer to Wireless Client Troubleshooting and Reports.

To view the impacted APs, click the value under AP Impact Count. The Impacted APs dialog box is displayed. The impacted AP name, model, MAC address, and version are displayed in the table. You can use the search option to search for a specific AP by its name, model, or MAC address.
Impacted APs Dialog Box

To view the affected AP details, click AP Name in the Impacted APs dialog box. The AP Details Report page is displayed. For more information, refer to AP AI Analytics and Reports.

Insights Tile

The Insights tile comprises Root Cause Analysis and Recommended Action panes.

Root Cause Analysis Pane

The Root Cause Analysis pane displays the root cause of the incident. The root cause varies based on the incident type, impacted area, data events, and reason codes.

Recommended Action Pane

The Recommended Action pane displays the recommended actions to remediate the problems.

Network Impact Tile

The Network Impact tile consists of donut charts that represent the areas of the network that were impacted by the incident. Each incident category and sub-category has a different set of network impact donut charts, but it is common to see AP model, AP firmware, reason by AP, reason by event, WLAN, client OS types, radio bands, reason, and event type. These charts help answer common questions such as who is impacted, which devices are contributing, and what the reason codes are. Every donut chart is divided into segments of different colors. If you pause the pointer over any segment of the donut chart, an information box displays the impacted area of the incident and the clients or APs affected by it. Under each donut chart is a line summarizing the impact. At the center of the donut chart, the total impacted count is displayed.
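As a rough illustration of how a donut chart of this kind relates to the underlying data, the sketch below groups impacted items by one attribute and derives the center total and the per-segment share. The function name `donut_summary` and the sample values are assumptions made for this example; the product may compute or round these figures differently.

```python
from collections import Counter


def donut_summary(values):
    """Group impacted items by an attribute value.

    Returns (total, segments), where total is the count shown at the
    center of the chart and segments is a list of per-value counts
    with their percentage of the total, largest segment first.
    """
    counts = Counter(values)
    total = sum(counts.values())
    segments = [
        {"label": label, "count": n, "percent": round(100 * n / total, 1)}
        for label, n in counts.most_common()
    ]
    return total, segments


# Made-up data: the radio band of each impacted client.
total, segments = donut_summary(["5 GHz", "5 GHz", "2.4 GHz", "5 GHz"])
```

Here `total` is 4, and the first segment is the 5 GHz band with 3 of the 4 impacted clients.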

Table 1. Attributes of the Network Impact Tile
Incident Type Donut Charts Metrics Graphs
User Authentication
  • Radio: The distribution of impacted clients connected to 5 GHz and 2.4 GHz radios.
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Client Manufacturer: The distribution of device manufacturers.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • Authentication Failures: A time-series chart that shows the failure percentage over time. The chart includes data for 6 hours before and 6 hours after (if available) the incident.
  • Clients: A time-series line chart showing three color-coded client metrics grouped by: new clients, connected clients, and impacted clients.
  • Failures: A time-series chart with three types of raw failure counts: Authentication Failures, Authentication Attempts, and Total Failures, which includes the total of all types of connection failures (authentication, association, EAP, DHCP, and so on) that were observed during this period.
EAP
  • Radio: The distribution of impacted clients who connected to 5 GHz and 2.4 GHz radios.
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Client Manufacturer: The distribution of device manufacturers.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • EAP Failures: A time-series chart that shows the failure percentages over time. The chart includes data for 6 hours before and 6 hours after (if available) the incident.
  • Clients: Three types of time-series data: a line for new clients, a line for connected clients, and an area chart for impacted clients.
  • Failures: A time-series chart with three types of raw failure counts: EAP Failures, EAP Attempts, and Total Failures, which includes the total of all types of connection failures (authentication, association, EAP, DHCP, and so on) that were observed during this period.
Association
  • Radio: The distribution of impacted clients who connected to 5 GHz and 2.4 GHz radios.
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Client Manufacturer: The distribution of device manufacturers.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • Configuration Change: Chart with drop-down table displaying configuration changes that are relevant to the specific incident.
  • Association Failures: A time-series chart that shows the failure percentages over time. The chart includes data for 6 hours before and 6 hours after (if available) the incident.
  • Clients: Three types of time-series data: a line for new clients, a line for connected clients, and an area chart for impacted clients.
  • Failures: A time-series chart with three types of raw failure counts: Association Failures, Association Attempts, and Total Failures, which includes the total of all types of connection failures (authentication, association, EAP, DHCP, and so on) that were observed during this period.
DHCP
  • Radio: The distribution of impacted clients connected to 5 GHz and 2.4 GHz radios.
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Client Manufacturer: The distribution of device manufacturers.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • DHCP Failures: A time-series chart that shows the failure percentages over time. The chart includes data for 6 hours before and 6 hours after (if available) the incident.
  • Clients: Three types of time-series data: a line for new clients, a line for connected clients, and an area chart for impacted clients.
  • Failures: A time-series chart with three types of raw failure counts: DHCP Failures, DHCP Attempts, and Total Failures, which includes the total of all types of connection failures (authentication, association, EAP, DHCP, and so on) that were observed during this period.
RADIUS
  • Radio: The distribution of impacted clients who connected to 5 GHz and 2.4 GHz radios.
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Client Manufacturer: The distribution of device manufacturers.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • Configuration Change: Chart with drop-down table displaying configuration changes that are relevant to the specific incident.
  • RADIUS Failures: A time-series chart that shows the failure percentages over time. The chart includes data for 6 hours before and 6 hours after (if available) the incident.
  • Clients: Three types of time-series data: a line for new clients, a line for connected clients, and an area chart for impacted clients.
  • Failures: A time-series chart with three types of raw failure counts: RADIUS Failures, RADIUS Attempts, and Total Failures, which includes the total of all types of connection failures (authentication, association, EAP, DHCP, and so on) that were observed during this period.
Time to Connect
  • Radio: The distribution of impacted clients who connected to 5 GHz and 2.4 GHz radios.
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Client Manufacturer: The distribution of device manufacturers.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • Configuration Change: Chart with drop-down table displaying configuration changes that are relevant to the specific incident.
  • Time to Connect Failures: A time-series chart that shows the failure percentages over time. The chart includes data for 6 hours before and 6 hours after (if available) the incident.
  • Clients: Three types of time-series data: a line for new clients, a line for connected clients, and an area chart for impacted clients.
  • Time to Connect (By stage): A time-series chart that displays the time to connect based on various stages of the connection such as authentication, association, EAP, RADIUS, and DHCP. Pause the pointer over the graph for more information.
RSSI
  • WLAN: The different WLANs to which the impacted clients are connected.
  • OS: The operating systems impacted by the incident.
  • AP Model: The AP model impacted by the incident.
  • AP Version: The AP version impacted by the incident.
  • Radio: The distribution of impacted clients who connected to 5 GHz and 2.4 GHz radios.
  • RSSI Quality by Clients: Three types of time-series data: a line for new clients, a line for connected clients, and an area chart for impacted clients.
  • RSSI Distribution: The RSSI distribution over a period of time.
Network Latency
  • Ping Latency: Average time, in milliseconds, for the controller nodes to transmit and receive the packets. Maximum, average, and minimum latency trends are plotted on the graph.
  • Controller-1: CPU, memory and input-output usage of the controller node over time is displayed.
  • Controller-2: CPU, memory and input-output usage of the other controller node over time is displayed.
Reboot
  • AP Model: Distribution of impacted AP models.
  • AP Firmware: Distribution of impacted AP versions.
  • Reason by AP: Distribution of reasons for failure that caused the AP reboot.
  • Reason by Event: Distribution of reasons for failure that caused the AP reboot and triggered related events.
  • Reboot by System: A time-series chart that displays the number of reboot events.
  • Connected Clients: A time-series chart that displays the number of clients connected at that point in time.
  • Rebooted APs: A time-series chart that displays the number of APs that were rebooted at a point in time.
SmartZone CPU overload insight
  • SZ Applications: Distribution of CPU usage by individual SmartZone applications.
  • SZ Applications Group: Distribution of CPU usage by individual SmartZone application groups.
  • Normalized CPU Usage: A time-series chart that displays the CPU usage in percentage.
  • Memory and I/O Usage: A time-series chart that displays the memory and I/O usage in percentage. You can select the check boxes to display only one or both of the usage metrics.
  • CPU Usage by Application Groups: A time-series chart that displays the CPU usage in percentage for the various SmartZone application groups. You can select the check boxes to display only one or more of the usage metrics.
High AP-controller connection failures
  • AP Model: Displays the percentage of failures that impacted various AP models.
  • AP Firmware: Displays the number of failures that impacted various AP firmware versions.
  • Event Type: Displays the percentage of failures that were caused by various events.
  • Reason: Lists the reasons that caused the incident.
  • AP-Controller Disconnections: A time-series chart that displays the number of disconnections between the AP and controller over time.
  • Event Count: A time-series chart that displays the total event count for the following events: Heartbeat Lost, Connection Lost, Reboot By System, and Reboot By User. When an event is generated for any of these conditions, it is plotted in this graph. You can select the check boxes to display only one or more of the events.
Channel Distribution
  • AP Distribution by Channel: Heatmap that displays the AP count over time, across channels.
  • Rogue Distribution by Channel: A time-series chart that displays the number of rogue APs across channels.
VLAN Mismatch
  • Impacted Switch: Displays the number of switches impacted by VLAN mismatch
  • Mismatched VLANs: Displays the number of VLANs that are mismatched

The incident identifies incorrect VLAN configurations between switches and wired devices, which can impair data transmission.

  • Impacted Switches table: Displays detailed information about the switch name, MAC address, mismatched VLANs, mismatched ports, and mismatched device information where the VLAN mismatch occurred.

    Mismatched VLAN numbers are highlighted in red.

Memory Utilization
The incident identifies memory leaks within the switch. The time-series chart displays high memory utilization by a switch against the configured threshold. Pause the pointer over the graph to see the switch memory used against the threshold at a given time.

The Detected Time identifies when the memory leak happened. Based on the configured threshold, a Projected Time is calculated and plotted on the graph. The Projected Time is a prediction of when the switch will run out of available memory. Contact RUCKUS Support for assistance.

You can select the check boxes to display only the Memory Used or the Threshold graph.
PoE Power
  • Impacted Switch: Displays the number of switches impacted by the denial of PoE power
  • Impacted PoE Port: Displays the number of PoE ports that are impacted by the denial of PoE power.
The Impacted Switches table displays detailed information about the switch (name, MAC address, port) for which PoE power was denied.
AP PoE Underpowered
  • AP Model: Displays the percentage of failure due to insufficient PoE power that impacted an AP model
  • AP Firmware: Displays the percentage of failure due to insufficient PoE power that impacted an AP firmware version
  • AP PoE Impact: Displays the number of APs impacted at a time, due to insufficient power available on the PoE port
  • Impacted AP: Displays the list of APs impacted by failure due to insufficient PoE power within the network
AP Ethernet Auto-negotiation
  • AP Model: Displays the percentage of failure due to Ethernet WAN link mismatch that impacted an AP model
  • AP Firmware: Displays the percentage of failure due to Ethernet WAN link mismatch that impacted an AP firmware version
  • APs WAN Throughput Impact: Displays the number of APs impacted at a time, due to Ethernet WAN link mismatch
  • Impacted AP: Displays the list of APs impacted by failure due to Ethernet WAN link mismatch within the network
SZ Cluster
  • WLAN: The different WLANs to which the impacted clients are connected.
  • Reason: The breakdown of various failure reasons experienced by the impacted clients.
  • Client Manufacturer: The distribution of device manufacturers.
  • Radio: The distribution of impacted clients connected to 5 GHz and 2.4 GHz radios.
Time Incidents: A time-series chart that shows when the controller cluster sends data with an incorrect timestamp.
Airtime Busy
  • Average Airtime Busy: Displays the average percentage of airtime busy and peak percentage of airtime busy. Pause the pointer over the sliced segments of the donut chart to view average airtime Rx, Tx, Idle, and Busy over the incident period.
  • Rogue APs: Displays the distribution of rogue APs across channels. It also displays the channel with the highest number of rogue APs. Pause the pointer over the sliced segments of the donut chart to view the number of rogue APs in each channel. If the donut chart is empty, you may need to enable Rogue AP detection in the zone.
  • Rx PHY Errors: Displays the distribution of PHY errors across impacted APs over the incident period.
  • AP Model: Displays the distribution of impacted AP models.
Airtime Utilization for <radio band>: Displays a timeline graph to indicate average airtime utilization for the respective radio band (2.4 GHz, 5 GHz, or 6 GHz) over the incident period. The data is aggregated over impacted APs only. Pause the pointer at any instance on the timeline graph to view the airtime utilization details at a specific date and time during the incident.
Airtime Tx
  • Average Airtime Tx: Displays the average time and peak time spent by the radio in sending the data. Pause the pointer over the sliced segments of the donut chart to view average airtime Rx, Tx, Idle, and Busy values over the incident period.
  • Average % of Mgmnt. frames: Displays the distribution of data frames versus management frames over the incident period.
  • Average % of MC/BC traffic: Displays the distribution of Unicast traffic, Multicast traffic, and Broadcast traffic over the incident period.
  • Average Peak No. of Clients per AP: Displays the number of clients per AP. Pause the pointer over the sliced segments of the donut chart to view the number of APs in each bin.
    • Less than 30 APs over the incident period.
    • 31 through 50 APs over the incident period.
    • More than 50 APs over the incident period.
Airtime Utilization for <radio band>: Displays a timeline graph to indicate average airtime utilization for the respective radio band (2.4 GHz, 5 GHz, or 6 GHz) over the incident period. The data is aggregated over impacted APs only. Pause the pointer at any instance on the timeline graph to view the airtime utilization details at a specific date and time during the incident period.
Airtime Rx
  • Average Airtime Rx: Displays the average and peak time spent by the radio in receiving the data. Pause the pointer over the sliced segments of the donut chart to view average airtime Rx, Tx, Idle, and Busy values over the incident period.
  • Average Peak No. of Clients per AP: Displays the number of APs in each bin. Pause the pointer over the sliced segments of the donut chart to view the number of APs in each bin.
    • Less than 30 APs over the incident period.
    • 31 through 50 APs over the incident period.
    • More than 50 APs over the incident period.
  • AP Model: Displays the distribution of impacted AP models.
Airtime Utilization for <radio band>: Displays a timeline graph to indicate average airtime utilization for the respective radio band (2.4 GHz, 5 GHz, or 6 GHz) over the incident period. The data is aggregated over impacted APs only. Pause the pointer at any instance on the timeline graph to view the airtime utilization details at a specific date and time during the incident.
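
The Average Peak No. of Clients per AP charts in the Airtime Tx and Airtime Rx rows group APs into three bins. A minimal sketch of that grouping is shown below, assuming the bins are defined by each AP's peak client count. The table lists "Less than 30" and "31 through 50", so a value of exactly 30 is not covered; this sketch places it in the middle bin as an assumption. The function names and sample data are hypothetical.

```python
BIN_LABELS = ("Less than 30", "31 through 50", "More than 50")


def clients_per_ap_bin(peak_clients: int) -> str:
    """Return the bin label for one AP's peak client count."""
    if peak_clients < 30:
        return BIN_LABELS[0]
    if peak_clients <= 50:
        # Assumption: exactly 30 is counted here, since the text leaves it open.
        return BIN_LABELS[1]
    return BIN_LABELS[2]


def aps_per_bin(peak_clients_by_ap: dict) -> dict:
    """Count how many APs fall into each bin (one donut segment per bin)."""
    bins = {label: 0 for label in BIN_LABELS}
    for peak in peak_clients_by_ap.values():
        bins[clients_per_ap_bin(peak)] += 1
    return bins


# Made-up peak client counts for four impacted APs.
sample = {"AP-1": 12, "AP-2": 45, "AP-3": 50, "AP-4": 73}
result = aps_per_bin(sample)
```

For the sample data, one AP lands in the first bin, two in the middle bin, and one in the last bin.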

Metrics Graphs

At the bottom of the page are various graphs representing the related metrics and impacted network areas for the time before, during, and after the incident. These data, used in conjunction with Insights, are intended to assist in troubleshooting and remediation. For descriptions of the various graphs, refer to attributes in Table 1.
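
Several of the failure charts in Table 1 cover 6 hours before and 6 hours after the incident (if available) and plot a failure percentage alongside raw failure and attempt counts. The sketch below shows one way to derive such a window and percentage. It assumes that "if available" means the window cannot extend past the current time, and that the percentage is failures divided by attempts; neither detail is stated in this guide. All names are hypothetical.

```python
from datetime import datetime, timedelta

PADDING = timedelta(hours=6)  # from Table 1: 6 hours before and after


def chart_window(start: datetime, end: datetime, now: datetime):
    """Return the (from, to) time range for a chart around an incident."""
    window_start = start - PADDING
    # Assumption: data after the incident is only available up to "now".
    window_end = min(end + PADDING, now)
    return window_start, window_end


def failure_percentage(failures: int, attempts: int):
    """Return failures as a percentage of attempts, or None without attempts."""
    if attempts == 0:
        return None  # nothing to plot for this interval
    return 100.0 * failures / attempts


# Made-up incident that ended two hours before the page was opened.
incident_start = datetime(2024, 1, 1, 12, 0)
incident_end = datetime(2024, 1, 1, 13, 0)
opened_at = datetime(2024, 1, 1, 15, 0)
window = chart_window(incident_start, incident_end, opened_at)
```

In this example the window starts at 06:00 and is cut off at 15:00, because the full 6 hours after the incident have not yet elapsed.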

Mute and Unmute an Incident

Click the icon at the top right of the Incident Details page. The Mute Incident dialog box is displayed. By default, the incident is unmuted. To mute the incident, toggle the Switch to ON. When an incident is muted, it is hidden in the Incidents Table, and notifications via email and webhook are also muted. To unmute the incident, toggle the Switch to OFF. When an incident is unmuted, it is visible in the Incidents Table, and notifications via email and webhook are enabled.
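
The effect of muting can be summarized with a small model. This is a sketch of the behavior described above, not the product's code; the `Incident` record, the `show_muted` option, and both functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record with only the fields needed here."""
    incident_id: str
    muted: bool = False  # incidents are unmuted by default


def visible_incidents(incidents, show_muted=False):
    """Return the incidents that would be listed in the Incidents Table.

    show_muted is an assumed option standing in for the separate
    procedure that displays muted incidents.
    """
    return [i for i in incidents if show_muted or not i.muted]


def should_notify(incident: Incident) -> bool:
    # Email and webhook notifications are sent only for unmuted incidents.
    return not incident.muted


incidents = [Incident("inc-1"), Incident("inc-2", muted=True)]
listed = [i.incident_id for i in visible_incidents(incidents)]
```

With the sample data, only `inc-1` is listed and only `inc-1` would trigger notifications.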

To view the muted incident in the Incidents Table, refer to View the Muted Incident in the Incidents Table.