University of Cambridge Computer Laboratory - GN09 cooling fault – Incident details

GN09 cooling fault

Resolved
Major outage
Started about 1 month ago
Lasted about 3 hours

Affected

Datacentres

Major outage from 10:55 AM to 2:21 PM

GN09

Major outage from 10:55 AM to 2:21 PM

Virtual Machine Hosting

Major outage from 10:55 AM to 1:44 PM, Partial outage from 10:55 AM to 1:44 PM, Major outage from 1:44 PM to 2:21 PM

GPUs

Major outage from 10:55 AM to 2:21 PM

Secondary VM Hosts

Partial outage from 10:55 AM to 1:44 PM, Major outage from 1:44 PM to 2:21 PM

Updates
  • Resolved

    This incident has been resolved.

    Please contact service-desk@cst.cam.ac.uk if you are experiencing any ongoing issues.

  • Update

    Caelum users are free to turn their servers on again via the Caelum Console.

    VM users are free to turn their VMs on again via Xen Orchestra.

  • Update

    The chiller is operational again. We will start to bring affected services back online. This will take time and we will provide another update when this is complete.

  • Identified

    Cooling for GN09 is currently inoperable. Engineers are on the way, but due to climbing temperatures, we will have to shut down all research and teaching servers shortly.