University of Cambridge Computer Laboratory - William Gates Building planned power outage – Maintenance details

William Gates Building planned power outage

Completed
Scheduled for August 17, 2024 at 7:00 AM – 1:34 PM

Affects

Internal Services

Under maintenance from 7:00 AM to 1:34 PM

Caelum Console (server management)

Under maintenance from 7:00 AM to 1:34 PM

Other Internal Services

Under maintenance from 7:00 AM to 1:34 PM

Datacentres

Under maintenance from 7:00 AM to 1:34 PM

GN09

Under maintenance from 7:00 AM to 1:34 PM

Virtual Machine Hosting

Under maintenance from 7:00 AM to 1:34 PM

Updates
  • Update
    August 19, 2024 at 4:50 PM
    Completed

    The issue affecting power control of servers in GN09 rack 6 has been rectified.

  • Completed
    August 17, 2024 at 1:34 PM
    Completed

    All infrastructure is believed to be operational again after this morning's electrical work, with the exception of power control for a small number of research servers in GN09 rack 6, where a PDU has developed a hardware fault. Power is still being supplied to these servers, but the PDU outlets cannot be switched off or on remotely. Servers can still be turned off or on via their BMCs, but if you need a server power-cycled, please contact service-desk@cl.cam.ac.uk.

  • Update
    August 17, 2024 at 11:54 AM
    In progress

    The electrical supply has been restored. We are in the process of restoring infrastructure and then GN09 servers. This is likely to take an hour or two.

  • In progress
    August 17, 2024 at 7:00 AM
    In progress

    Maintenance is now in progress.

  • Update
    August 17, 2024 at 7:00 AM
    Planned

    The William Gates Building will be without power for part of Saturday 17th August 2024, due to further planned work on the electrical switchgear for the connection to the building's new solar panels. This additional shutdown is needed to rectify a problem with one of the components installed during the January shutdown.

    Nearly all IT services in the William Gates Building will be unavailable for most of the day.

    Telephones, office networking and Wi-Fi will be unavailable all day (although the building is likely to be closed in any case). Please make sure that all computers in offices are fully shut down, not just asleep, when you leave on Friday.

    We will start shutting down servers at 8am, ready for the power to be turned off at around 10am. We expect the power to come back on at approximately 1pm, but it will then take some time to bring all systems back into operation.

    We will unfortunately need to shut down all servers in GN09 except for a very small number of critical services, such as filer and network infrastructure, which will be powered from a temporary generator. The cooling system will be offline for several hours, and temperatures would otherwise climb to unsafe levels. The shutdown includes nearly all research servers and all GPU servers (including GPU VMs). GN09 holds almost all of our server hardware; if you are unsure where your server is located, it is probably in GN09 and will probably be affected. (A very small number of research systems are in the West Cambridge Data Centre and will not be affected.)

    The outage is not expected to affect core infrastructure, administrative systems or small VMs, as these are hosted in the West Cambridge Data Centre. However, there is a risk that access to filer from these systems will be disrupted: we don't plan to turn filer off, but it is in GN09, its temporary electrical supply is at risk, and we may have to turn it off if it gets too hot. Where a service is replicated between multiple sites, only one instance of the service may be available (this affects most core services, such as LDAP, Active Directory and VPN2).

    VMs hosted by the department will stay running unless they are on the GPU VM clusters. This applies both to VMs with GPUs and to VMs with a lot of CPU cores; these generally have names that contain "gpu", "cpu" or "dev".

    Services hosted outside the department (for example, by UIS) will not be affected. These include Moodle, CamSIS, HPC, Exchange email, Fastmail email and the main departmental (CST) website.

  • Planned
    August 16, 2024 at 11:09 AM
    Planned

    Reminder that the William Gates Building's electrical supply will be turned off tomorrow.

    Please fully shut down office PCs before you leave today.

    Research and teaching servers in GN09 will be turned off tomorrow morning, except where already agreed.