Some GPU cluster VMs crashed

Resolved
Partial outage
Started 6 months ago
Lasted about 1 hour

Affected

Virtual Machine Hosting
GPUs
Updates
  • Resolved

    This incident has been resolved.

  • Identified

    Due to a problem during network maintenance, VMs on the departmental GPU cluster briefly lost access to their disks, which caused some of them to crash. Affected CPU VMs will be rebooted. Affected GPU VMs will be shut down and can be started again from XO.