Monitoring Wired Network Health

On the Health page, the Wired tab summarizes switch performance by providing insight into key metrics and showing aggregate values for critical network performance indicators.

The Wired tab is designed to facilitate KPI and service level agreement (SLA) tracking, allowing for real-time monitoring and management of network health. KPIs are organized into four categories: Overview, Connection, Performance, and Infrastructure. Each category provides access to the health data relevant to its KPIs, enabling you to pinpoint and address issues efficiently. Additionally, you can configure custom SLA thresholds for a subset of these KPIs, so that monitoring and alerting are tailored to specific network requirements.

Note: You can view and manage network data only for the domains to which you have access, as determined by the resource groups created and the role assigned to you as a user. For more information, refer to User Management and Roles and Resource Groups.

Accessing the Health Page for Wired Network

To access the Health page for the wired portion of your networks, navigate to AI Assurance > Network Assurance > Health > Wired tab. The Wired page is displayed.
Note: Data is displayed for switches running FastIron firmware version 10.0.10d or later, or SmartZone version 7.x or later.
Health - Wired Tab

The Network Hierarchy filter and the Date and Time filter are displayed in the upper-right corner of the Content panel. These filters control the elements displayed within the Content panel. For more information, refer to Network Hierarchy Filter and Date and Time filter.

High-level Network Health Metrics

The multi-colored metrics panel provides real-time data on critical network performance indicators such as DHCP Success, Congested Uplink Ports, Multicast Storm Ports, and High CPU. Each tile displays an aggregated value, enabling quick assessment and proactive management of network health. Note that the colors of the tiles (green, red, yellow, and gray) are fixed and do not indicate the health status of the metrics. Clicking the More details option on a tile opens a sidebar with details about the switches most impacted with respect to that metric.
Note: The Congested Uplink Ports and Multicast Storm Ports health metrics are calculated independently of any custom SLA thresholds defined in the KPI charts further down the page.
  • DHCP Success: Displays the percentage of successful DHCP bindings for switches that have DHCP snooping configured. A DHCP connection attempt is deemed successful when the device has received an IP address from the DHCP server. It is possible for a single device to have multiple DHCP connection attempts.

    Click the More details option to open the DHCP sidebar, which displays the top 5 switches with the highest DHCP failure rates in a donut chart and the Top 10 Impacted Switches in a table.

    The donut chart displays the distribution of switches with the highest number of DHCP failures. The table displays the details of impacted switches. You can access additional switch details by clicking the Name attribute in the table to view the switch report.

    Top Switches with High DHCP Failure Rate
  • Congested Uplink Ports: Displays the percentage of network traffic effectively transmitted through the uplink ports of wired switches. Click the More details option to open the Uplink Usage sidebar, which displays the top 5 switches with the highest number of congested ports in a donut chart and the list of Impacted Uplink Ports in a table. You can access additional switch details by clicking the Name attribute in the table to view the switch report.
    Congested Switches with High Uplink Usage
  • Multicast Storm Ports: Displays the percentage of ports experiencing a multicast storm, where the multicast traffic level exceeds the multicast traffic threshold defined by the system.
    Click the More details option to open the Multicast Storm Ports sidebar, which displays the top 5 switches with the highest number of uplink ports carrying an excessive amount of multicast traffic in a donut chart and the list of Impacted Uplink Ports in a table. You can access additional switch details by clicking the Name attribute in the table to view the switch report.
    Switches with Ports Experiencing Multicast Storm
  • High CPU: Displays the percentage of switches with high CPU utilization in the network. Click the More details option to open the High CPU sidebar, which displays the top 5 switches with the highest CPU usage in a donut chart and a table. The donut chart displays the distribution of switches with the highest CPU usage, and the table displays the details of impacted switches. You can access additional switch details by clicking the Name attribute in the table to view the switch report.
    Switches with High CPU Usage
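
As a rough illustration of how an aggregate tile value and a "most impacted" ranking relate to per-switch data, the following minimal sketch computes a DHCP success percentage and ranks switches by failure rate. The `DhcpCounts` record, the switch names, and the counter values are hypothetical and exist only for this example; they are not a product API, and the product's actual aggregation may differ.

```python
from dataclasses import dataclass


@dataclass
class DhcpCounts:
    """Hypothetical per-switch DHCP attempt counters (not a product API)."""
    switch: str
    attempts: int
    successes: int


def dhcp_success_pct(rows):
    """Aggregate success percentage across all switches; None if no attempts."""
    attempts = sum(r.attempts for r in rows)
    if attempts == 0:
        return None
    return 100.0 * sum(r.successes for r in rows) / attempts


def top_impacted(rows, n=5):
    """Names of the n switches with the highest DHCP failure rate."""
    with_attempts = [r for r in rows if r.attempts > 0]
    ranked = sorted(
        with_attempts,
        key=lambda r: (r.attempts - r.successes) / r.attempts,
        reverse=True,
    )
    return [r.switch for r in ranked[:n]]


# Made-up sample data for illustration only.
rows = [
    DhcpCounts("sw-a", attempts=200, successes=190),
    DhcpCounts("sw-b", attempts=50, successes=30),
    DhcpCounts("sw-c", attempts=100, successes=100),
]
print(dhcp_success_pct(rows))   # aggregate value, as a tile might show it
print(top_impacted(rows, n=2))  # switches ranked by failure rate
```

Because a single device can make several DHCP attempts, the sketch counts attempts rather than devices.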

Health KPIs and SLAs

Health KPIs for wired networks are categorized into four sections:
  • Overview
  • Connection
  • Performance
  • Infrastructure
When you open a sub-tab, only one KPI metric is displayed. If the View more option is present, click it to display all KPI metrics available in that category.

Each category provides access to specific health data relevant to its KPIs. Additionally, custom SLA thresholds can be configured for select KPIs, enabling tailored monitoring and alerting based on specific network requirements.

The content area is graphically divided into three sections: a pill-shaped box that shows either the average percentage for the metric or the success rate of the KPI metric meeting a threshold within the larger sample set; a time-series graph that shows the metric value as a percentage over time; and a bar chart.

The time-series graph is interactive: you can zoom in on a specific time period by dragging the mouse across the graph. Click the Reset Zoom button to revert to the full time period selected in the Date and Time filter at the top of the page. Note that the numbers related to the time-series graph change as you zoom in or out of a time range, and these changes are reflected in the pill-shaped box. However, the bar chart on the right remains fixed based on the time range selected at the top of the page.

There are two types of bar charts: a view-only bar chart that provides information about threshold trends, and a configurable bar chart that allows you to set the threshold for a metric. The threshold you set is the goal against which the metric is measured. You can change the SLA threshold by adjusting the slider below the bar chart. As you change the goal, the percentage is recalculated. Click Apply to set the new threshold for the metric, or click Reset to revert to the default threshold value.
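
To make the relationship between the goal and the recalculated percentage concrete, here is a minimal sketch, assuming that a sample is compliant when its value is at or below the threshold. The function name `compliance_pct`, the sample values, and the inclusive comparison are assumptions made for illustration; the product's exact calculation is not specified here.

```python
def compliance_pct(samples, threshold):
    """Percentage of samples at or below the threshold (the goal).

    Assumption: lower values are better and the comparison is inclusive.
    Returns None when there are no samples.
    """
    if not samples:
        return None
    met = sum(1 for value in samples if value <= threshold)
    return 100.0 * met / len(samples)


# Hypothetical uplink utilization samples, in percent.
utilization = [12.0, 35.5, 48.0, 71.2, 88.9, 93.4, 60.0, 20.1]

# Moving the slider corresponds to re-evaluating with a different goal.
for goal in (50, 70, 90):
    print(goal, compliance_pct(utilization, goal))
```

With this sample data, raising the goal from 50 to 90 increases the compliant share, which mirrors how the displayed percentage changes as the slider moves.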

Overview Tab

The Overview sub-tab displays information on the Uplink Port Utilization Compliance KPI metric.
KPI under the Overview Sub-Tab
  • Uplink Port Utilization Compliance: Measures the amount of network traffic effectively transmitted through the uplink ports of wired switches. It shows the percentage of uplink ports that meet the configured congestion threshold. It also provides information about the number of uplink ports operating in compliance with the configured congestion threshold goal.

    The time-series graph displays the percentage of uplink port utilization compliance over time, while the bar chart on the right shows the distribution of uplink port utilization across the number of ports relative to the configured SLA threshold.

Connection Tab

The Connection sub-tab displays information on the Authentication KPI metric.
KPI under the Connection Sub-Tab
  • Authentication: Measures the rate of successful authentication events out of the total authentication attempts on network devices. An authentication attempt is deemed successful when a device is authenticated and granted access to the network. This metric tracks authentication events, categorizing them into successful and failed attempts.

    The time-series graph displays the percentage of successful authentication attempts over time. The bar chart on the right captures the daily percentage of successful authentications over the last seven days of the selected time range.
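
The daily breakdown described above can be sketched as follows, assuming each authentication event is available as a (timestamp, succeeded) pair. The event format, the function name `daily_success_pct`, and the sample events are hypothetical and are not taken from the product.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def daily_success_pct(events, end_day, days=7):
    """Map each of the last `days` days (ending at end_day) to a success
    percentage, or None for days without any authentication attempts."""
    totals = defaultdict(lambda: [0, 0])  # day -> [successes, attempts]
    for timestamp, succeeded in events:
        counts = totals[timestamp.date()]
        counts[1] += 1
        if succeeded:
            counts[0] += 1

    result = {}
    for offset in range(days - 1, -1, -1):  # oldest day first
        day = end_day - timedelta(days=offset)
        successes, attempts = totals.get(day, (0, 0))
        result[day] = 100.0 * successes / attempts if attempts else None
    return result


# Made-up events for illustration only.
events = [
    (datetime(2024, 5, 6, 9, 0), True),
    (datetime(2024, 5, 6, 9, 5), False),
    (datetime(2024, 5, 7, 14, 30), True),
    (datetime(2024, 5, 7, 15, 0), True),
]
for day, pct in daily_success_pct(events, end_day=date(2024, 5, 7)).items():
    print(day, pct)
```

Days with no attempts are reported as None rather than 0, so that an idle day is not mistaken for a day on which every authentication failed.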

Performance Tab

The Performance sub-tab displays information on port utilization compliance and uplink port utilization compliance KPI metrics.
KPIs under the Performance Sub-Tab
  • Port Utilization Compliance: Measures the amount of traffic effectively transmitted through the ports within the network's capacity. It shows the percentage of ports that meet the configured congestion threshold. It also provides information about the number of ports operating in compliance with the configured congestion threshold goal.

    The time-series graph displays the percentage of port utilization compliance over time, while the bar chart on the right shows the distribution of port utilization across the number of ports relative to the configured SLA threshold.

  • Uplink Port Utilization Compliance: Measures the amount of network traffic effectively transmitted through the uplink ports of wired switches. It shows the percentage of uplink ports that meet the configured congestion threshold. It also provides information about the number of uplink ports operating in compliance with the configured congestion threshold goal.

    The time-series graph displays the percentage of uplink port utilization compliance over time, while the bar chart on the right shows the distribution of uplink port utilization across the number of ports relative to the configured SLA threshold.

Infrastructure Tab

The Infrastructure sub-tab displays information on the memory compliance, CPU compliance, temperature compliance, and PoE utilization compliance KPI metrics.
KPIs under the Infrastructure Sub-Tab
  • Memory Compliance: Measures the percentage of switches in the network with memory utilization below the specified threshold. This metric indicates the number of switches that are compliant with the predefined limit set in the SLA that specifies the acceptable level of memory utilization for switches.

    The time-series graph displays the percentage of switches with memory utilization below the threshold over time. The bar chart on the right shows the distribution of switches within different ranges of memory utilization relative to the configured SLA threshold.

  • CPU Compliance: Measures the percentage of switches in the network with CPU utilization below the specified threshold. This metric indicates the number of switches that are compliant with the predefined limit set in the SLA that specifies the acceptable level of CPU usage.

    The time-series graph displays the percentage of switches with CPU utilization below the threshold over time. The bar chart on the right shows the distribution of switches within different ranges of CPU utilization relative to the configured SLA threshold.

  • Temperature Compliance: Measures the percentage of switches within safe temperature operating conditions.

    The time-series graph displays the percentage of switches within safe temperature operating conditions over time. The bar chart on the right displays the daily percentage of switches within safe temperature operating conditions over the last seven days.

  • PoE Utilization Compliance: Measures the percentage of switches in the network consuming power within the allocated budget. This metric indicates the number of switches that are compliant with the predefined limit set in the SLA that specifies the acceptable level of PoE utilization.
    The time-series graph displays the percentage of switches with PoE utilization below the configured threshold over time. The bar chart on the right displays the distribution of switches within different ranges of PoE utilization relative to the configured SLA threshold.
    Note: The PoE Utilization Compliance KPI metric under Wired Network Health applies to switches running FastIron firmware version 10.0.10d or later. However, the PoE Utilization Compliance KPI metric under Wireless Network Health applies to switches across all firmware versions.
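
The bar charts in this section show how switches are distributed across utilization ranges. As a minimal sketch of that idea, the following groups per-switch utilization values into fixed-width ranges. The 20-percent range width, the function name `utilization_histogram`, and the sample values are assumptions for illustration; the ranges the interface actually uses are not stated here.

```python
def utilization_histogram(values, bucket_width=20):
    """Count utilization values (0-100 percent) per fixed-width range.

    Assumption: bucket_width divides 100 evenly. A value of exactly 100
    is counted in the last range, and negative values in the first.
    """
    n_buckets = 100 // bucket_width
    counts = [0] * n_buckets
    for value in values:
        index = int(value // bucket_width)
        index = max(0, min(index, n_buckets - 1))
        counts[index] += 1
    return {
        f"{i * bucket_width}-{(i + 1) * bucket_width}%": count
        for i, count in enumerate(counts)
    }


# Hypothetical memory utilization per switch, in percent.
memory = [15, 42, 55, 61, 78, 83, 97, 100]
for bucket, count in utilization_histogram(memory).items():
    print(bucket, count)
```

Reading the histogram against a chosen threshold shows at a glance how many switches sit near the limit, which a single compliance percentage does not reveal.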