When SCOM shows multiple computer objects in a critical state, it is very time-consuming to open Health Explorer in the GUI for each computer and then drill down to the individual problems.
I adapted a portion of Cory Delamarter's amazing work on recursing the state hierarchy to show the monitors that have errors. His code did not check for ExternalRollupMonitoringState. Mine does, so it also walks the hierarchies behind external rollup monitors and gives a more complete picture.
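In the example below, fill in your own values. $scomManagementGroup is passed to New-SCOMManagementGroupConnection, which expects the name of a management server in your management group, and $scomGroup is the computer group to start with.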
$ErrorActionPreference = "Stop"
Set-StrictMode -Version Latest
$scomManagementGroup = "XXX_XXX"
$scomGroup = "YYYY YYYY"
function Write-SCOMMonitoringState {
    [CmdletBinding()]
    param (
        [Parameter(Position=0,ValueFromPipeline=$True,ValueFromPipelinebyPropertyName=$True)]
        [PSObject] $MonitoringNode,
        [Parameter(Position=1)]
        [int] $Depth
    )
    # Indent the output according to the depth in the hierarchy
    [string] $whitespace = " " * $Depth
    if ($MonitoringNode.Item.HealthState -ne "Uninitialized") {
        # Pick a colour for the state; anything left white is not printed
        switch ($MonitoringNode.Item.HealthState) {
            "Warning" {
                $color = "Yellow"
            }
            "Error" {
                # Monitors with SQLServer in the name are shown in grey rather than red
                if ($MonitoringNode.Item.MonitorName -like "*SQLServer*") {
                    $color = "Gray"
                } else {
                    $color = "Red"
                }
            }
            default {
                $color = "White"
            }
        }
        if ($color -ne "White") {
            $message = $whitespace + "[" + $MonitoringNode.Item.HealthState + "] --- " + $MonitoringNode.Item.MonitorDisplayName + " / " + $MonitoringNode.Item.MonitorName
            Write-Host $message -ForegroundColor $color
            # External rollup monitors have their own hierarchies, so recurse into those too
            if ($MonitoringNode.Item -is [Microsoft.EnterpriseManagement.Monitoring.ExternalRollupMonitoringState]) {
                Write-SCOMMonitoringStateHierarchy -MonitoringHierarchy $MonitoringNode.Item.GetExternalMonitoringStateHierarchies() -Depth ($Depth + 2)
            }
        }
    }
}
function Write-SCOMMonitoringStateHierarchy {
    [CmdletBinding()]
    param (
        [Parameter(Position=0,ValueFromPipeline=$True,ValueFromPipelinebyPropertyName=$True)]
        [PSObject] $MonitoringHierarchy,
        [Parameter(Position=1)]
        [int] $Depth=0
    )
    foreach ($Node in $MonitoringHierarchy) {
        # Only work on nodes that have something in them
        if ($Node) {
            # Write the state out to the screen for the object that we were handed
            Write-SCOMMonitoringState -MonitoringNode $Node -Depth $Depth
            # Sort the child nodes in alphabetical order
            $SortedChildNodes = $Node.ChildNodes | Sort-Object -Property Item
            # Loop through the child nodes and either recurse or write state
            foreach ($ChildNode in $SortedChildNodes) {
                if ($ChildNode.ChildNodes.Count -ne 0) {
                    # It has child nodes so recurse
                    Write-SCOMMonitoringStateHierarchy -MonitoringHierarchy $ChildNode -Depth ($Depth + 2)
                } else {
                    # It has no child nodes so write the state to the screen
                    Write-SCOMMonitoringState -MonitoringNode $ChildNode -Depth ($Depth + 2)
                }
            }
        }
    }
}
Import-Module OperationsManager
New-SCOMManagementGroupConnection $scomManagementGroup # Bound to -ComputerName: a management server in the group
$computers = Get-SCOMGroup $scomGroup |
    Get-SCOMClassInstance |
    Where-Object { $_.HealthState -eq "Error" } |
    Sort-Object { $_."[Microsoft.Windows.Computer].NetbiosComputerName".Value }
foreach ($computer in $computers) {
    # Print the computer name, then its unhealthy monitors indented beneath it
    $computer."[Microsoft.Windows.Computer].NetbiosComputerName".Value
    Write-SCOMMonitoringStateHierarchy -MonitoringHierarchy $computer.GetMonitoringStateHierarchy() -Depth 2
    Write-Host "" # Blank line between computer objects
}
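If you want to see how the recursion behaves without a SCOM connection, here is a minimal sketch of the same walk over mock data. It is not part of the script above: New-MockNode and Get-UnhealthyMonitor are hypothetical helper names, the mock nodes only imitate the Item and ChildNodes properties that the real hierarchy nodes expose, and the external rollup step is left out. Instead of writing coloured text, it returns one object per Warning or Error monitor.

```powershell
# Sketch only: mock nodes stand in for SCOM's monitoring hierarchy nodes.
# New-MockNode and Get-UnhealthyMonitor are hypothetical names, not SCOM cmdlets.
Set-StrictMode -Version Latest

function New-MockNode {
    param (
        [string] $Name,
        [string] $HealthState,
        [object[]] $ChildNodes = @()
    )
    # Same shape the real script relies on: an Item plus a list of ChildNodes
    [PSCustomObject]@{
        Item       = [PSCustomObject]@{ MonitorName = $Name; HealthState = $HealthState }
        ChildNodes = $ChildNodes
    }
}

function Get-UnhealthyMonitor {
    param (
        [Parameter(Mandatory)] [PSObject] $Node,
        [int] $Depth = 0
    )
    # Emit this node only if it is in a Warning or Error state
    if ($Node.Item.HealthState -in 'Warning', 'Error') {
        [PSCustomObject]@{
            Depth       = $Depth
            HealthState = $Node.Item.HealthState
            MonitorName = $Node.Item.MonitorName
        }
    }
    # Visit the children in name order, one level deeper
    foreach ($child in ($Node.ChildNodes | Sort-Object { $_.Item.MonitorName })) {
        Get-UnhealthyMonitor -Node $child -Depth ($Depth + 2)
    }
}

# Mock hierarchy: one healthy branch and one failing branch with a failing leaf
$tree = New-MockNode -Name 'Mock.EntityHealth' -HealthState 'Error' -ChildNodes @(
    (New-MockNode -Name 'Mock.Performance' -HealthState 'Success'),
    (New-MockNode -Name 'Mock.Availability' -HealthState 'Error' -ChildNodes @(
        (New-MockNode -Name 'Mock.ServiceCheck' -HealthState 'Error')
    ))
)

$report = @(Get-UnhealthyMonitor -Node $tree)

# Render the objects as indented lines, similar in spirit to the Write-Host output
$report | ForEach-Object { (' ' * $_.Depth) + "[$($_.HealthState)] --- $($_.MonitorName)" }
```

Because the result is a list of objects rather than console text, it could also be piped to Export-Csv or Out-File if you want to keep a copy.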
In the main script I colour warnings in yellow and errors in red, and put (most) SQL Server errors, meaning those whose monitor name contains "SQLServer", in grey (because they're the ones that I would look at). As you can see from the sample below, the output is extremely easy to understand and is useful for generating daily health check reports.