University of Cambridge Computer Laboratory - William Gates Building planned power outage – Maintenance details

William Gates Building planned power outage

Completed
Scheduled for 17 August, 2024 at 07:00 – 13:34

Affects (all under maintenance from 07:00 to 13:34)

  • Internal Services
  • Caelum Console (server management)
  • Other Internal Services
  • Datacentres
  • GN09
  • Virtual Machine Hosting

Updates
  • Update
    19 August, 2024 at 16:50
    Completed

    The issue affecting power control of servers in GN09 rack 6 has been rectified.

  • Completed
    17 August, 2024 at 13:34

    All infrastructure is believed to be operational again after this morning's electrical work, with the exception of power control for a small number of research servers in GN09 rack 6, where a PDU has developed a hardware fault. Power is still being supplied, but it cannot be turned off or on remotely. Servers can still be turned off or on via their BMCs, but if you need a server power-cycled, please contact service-desk@cl.cam.ac.uk.

  • Update
    17 August, 2024 at 11:54
    In progress

    The electrical supply has been restored. We are in the process of restoring infrastructure and then GN09 servers. This is likely to take an hour or two.

  • In progress
    17 August, 2024 at 07:00

    Maintenance is now in progress.

  • Update
    17 August, 2024 at 07:00
    Planned

    The William Gates Building will be without power for part of Saturday 17th August 2024, because of further planned work on the electrical switchgear serving the connection to the building's new solar panels. This additional shutdown is needed to rectify a problem with one of the components installed during the January shutdown.

    Nearly all IT services in the William Gates Building will be unavailable for most of the day.

    Telephones, office networking and Wi-Fi will be unavailable all day (but the building is likely to be closed in any case). Please make sure that all computers in offices are fully shut down, not just asleep, when you leave on Friday.

    We will start shutting down servers at 8am ready for the power to be turned off at around 10am. We expect the power to come back on at approximately 1pm but it will then take some time to bring all systems back into operation.

    We will unfortunately need to shut down all servers in GN09 except for a very small number of critical services, such as filer and network infrastructure, which will be powered from a temporary generator. The cooling system will be offline for several hours, and temperatures would otherwise climb to unsafe levels. The shutdown includes nearly all research servers and all GPU servers (including GPU VMs). GN09 holds almost all of our server hardware; if you are unsure where your server is located, it is probably in GN09 and will probably be affected. (A very small number of research systems are in the West Cambridge Data Centre and will not be affected.)

    The outage is not expected to affect core infrastructure, administrative systems or small VMs, as these are hosted in the West Cambridge Data Centre. However, there is a risk that access to filer from these systems will be disrupted; we don't plan to turn filer off, but it is in GN09, its temporary electrical supply is at risk, and we may have to turn it off if it gets too hot. Where a service is replicated between multiple sites, only one instance of the service may be available (this affects most core services such as LDAP, Active Directory and VPN2).

    VMs hosted by the department will stay running unless they are on the GPU VM clusters (this applies both to VMs with GPUs, and VMs with a lot of CPU cores - generally with names that contain "gpu", "cpu" or "dev").

    Services hosted outside the department, for example by UIS, will not be affected. These include Moodle, CamSIS, HPC, Exchange email, Fastmail email and the main departmental (CST) website.
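
    As a rough illustration of the naming rule for VMs on the GPU VM clusters given above, the following Python sketch flags VM names that contain "gpu", "cpu" or "dev". The function name, the case-insensitive matching and the example VM names are assumptions made for illustration. The notice only says such VMs "generally" follow this naming, so treat the result as a hint rather than a definitive answer.

```python
# Hypothetical helper: guess from a VM's name whether it is on the GPU VM clusters.
# The substrings come from the notice; case-insensitive matching is an assumption.
CLUSTER_NAME_HINTS = ("gpu", "cpu", "dev")


def probably_on_gpu_vm_cluster(vm_name: str) -> bool:
    """Return True if the name contains one of the hint substrings."""
    lowered = vm_name.lower()
    return any(hint in lowered for hint in CLUSTER_NAME_HINTS)


if __name__ == "__main__":
    # Example names are made up for illustration only.
    for name in ("proj-gpu-01", "bigcpu-sim", "web-frontend"):
        if probably_on_gpu_vm_cluster(name):
            print(f"{name}: probably on the GPU VM clusters, expect it to be shut down")
        else:
            print(f"{name}: probably stays running")
```

    A name that matches is only likely to be on the GPU VM clusters; if in doubt, check with the service desk.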

  • Planned
    16 August, 2024 at 11:09

    Reminder that the William Gates Building's electrical supply will be turned off tomorrow.

    Please fully shut down office PCs before you leave today.

    Research and teaching servers in GN09 will be turned off tomorrow morning, except where already agreed.