Alarm after SDRS recommendation is automatically applied
If you operate a vSphere storage cluster, you know that the Storage DRS automation level can be set to No Automation (Manual Mode) or Fully Automated. In fully automated mode, the cluster applies recommendations automatically; in manual mode, you have to apply them yourself. By default, an alarm at cluster level (name: Storage DRS recommendation) is triggered when a recommendation arises.
As you can see in the screenshot, the alarm definition includes a rule that sets the alarm back to normal when pending storage recommendations were applied. And here is the problem: this no longer works in vCenter 6.7. The alarm in a fully automated storage cluster does not return to normal automatically, as it does in 6.5.
According to VMware support, this works as designed, which is hard to believe. There were some changes in alarm management, including alarm naming. This leads to the following strange behavior:
- In a fully automated storage cluster, the default alarm Storage DRS recommendation stays in warning status even after the recommendation was applied. It has to be reset to green manually.
- In a storage cluster in manual mode, recommendations have to be applied manually, and the alarm is then set to normal automatically.
There is no plan to change this behavior back to 6.5-style in 6.7 U2.
Currently there are at least three options when running automatic mode:
- Leave it as it is.
- Set the cluster to manual mode. The alarm is cleared automatically, but recommendations are not applied automatically.
- Disable the alarm Storage DRS recommendation.
At first glance, the last option seems to be a good solution, but it has a problem. When a recommendation is generated but cannot be applied (e.g. because of insufficient space on other datastores within the cluster), there will be no alarm on this cluster. With the alarm enabled, Storage DRS recommendation would stay on warning level because the recommendation couldn’t be applied.
When disabling this alarm, I strongly recommend setting the Datastore Disk Usage percentage in the alarm Datastore usage on disk to the same value as the Space threshold in the storage cluster configuration. That way you at least get alarms when datastores in the storage cluster reach the threshold. Furthermore, you should regularly have a look at the cluster faults:
This can also be queried by PowerCLI:
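A minimal sketch of such a query, assuming an active Connect-VIServer session. It reads the Storage DRS faults from the PodStorageDrsEntry property of each datastore cluster, as exposed by the vSphere API; the output column names (Cluster, Reason, Message) are chosen here for illustration only.

```powershell
# Sketch only: requires a PowerCLI session already connected via Connect-VIServer.
# Lists Storage DRS faults for every datastore cluster in the inventory.
foreach ($cluster in Get-DatastoreCluster) {
    # ExtensionData exposes the underlying StoragePod API object
    $entry = $cluster.ExtensionData.PodStorageDrsEntry

    foreach ($drsFault in $entry.DrsFault) {
        foreach ($byVm in $drsFault.FaultsByVm) {
            foreach ($fault in $byVm.Fault) {
                # Column names are illustrative, not part of the API
                [pscustomobject]@{
                    Cluster = $cluster.Name
                    Reason  = $drsFault.Reason
                    Message = $fault.LocalizedMessage
                }
            }
        }
    }
}
```

If no faults are recorded, the loops produce no output. The result can be piped to Format-Table or Export-Csv for a regular report.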