In the RHV web portal, select “Compute” -> “Clusters” -> select a cluster -> click “Edit” -> click “Migration Policy”. This tab contains a control called “Resilience Policy”. This note explains how “Resilience Policy” controls VM live migration during an outage.
“Resilience Policy” controls how virtual machines running on a host that the RHV web portal shows in the “Non operational” state are live migrated to other hosts in the same cluster.
A host in an outage may appear in either the “Non operational” state or the “Non responsive” state. The difference between the two is as follows:
“Non operational”:
Something is wrong with the configuration of the KVM host, but the RHEVM engine can still communicate with it.
For example, the KVM host may have lost connectivity to one of the cluster components, such as a storage domain.
“Non responsive”:
The RHEVM engine cannot communicate with the KVM host via vdsm. In simple terms, the communication path from the RHEVM engine to the KVM host is broken. Possible causes include a network split, a dead vdsm, or a firewall.
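The distinction between the two states can be sketched as a small decision function. This is only an illustration of the descriptions above, not the engine's actual implementation: the names HostState, classify_host, and resilience_policy_applies, as well as the two boolean inputs, are hypothetical and do not correspond to any RHV API.

```python
from enum import Enum


class HostState(Enum):
    """Host states as labelled in this note (hypothetical model, not an RHV API)."""
    UP = "Up"
    NON_OPERATIONAL = "Non operational"
    NON_RESPONSIVE = "Non responsive"


def classify_host(engine_reaches_vdsm: bool, cluster_requirements_met: bool) -> HostState:
    """Map two simplified health checks to a host state.

    engine_reaches_vdsm: the engine can talk to vdsm on the KVM host.
    cluster_requirements_met: the host can reach every cluster component,
        for example all storage domains.
    """
    # No engine-to-host communication at all: the host is "Non responsive".
    if not engine_reaches_vdsm:
        return HostState.NON_RESPONSIVE
    # The engine can talk to the host, but the host fails a cluster requirement.
    if not cluster_requirements_met:
        return HostState.NON_OPERATIONAL
    return HostState.UP


def resilience_policy_applies(state: HostState) -> bool:
    """Per this note, the policy governs VMs on "Non operational" hosts."""
    return state is HostState.NON_OPERATIONAL


if __name__ == "__main__":
    # A host that lost one storage domain but still answers the engine.
    state = classify_host(engine_reaches_vdsm=True, cluster_requirements_met=False)
    print(state.value, resilience_policy_applies(state))
```

The sketch checks engine-to-host communication first because, without it, the engine has no way to evaluate the host's configuration at all.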