Completed
WCDC remedial works: ATS replacement

Status
Resolved after 38 minutes
Started
October 24, 2023 at 1:30 PM
Completed
October 24, 2023 at 2:08 PM
Affects
Network
Datacentres
WCDC
  • Completed
    October 24, 2023 at 2:07 PM

    Maintenance has completed successfully.

  • In progress
    October 24, 2023 at 1:30 PM

    Maintenance is now in progress.

  • Planned
    October 24, 2023 at 1:30 PM

    We are planning to replace an Automatic Transfer Switch (ATS) in one of our racks in the West Cambridge Data Centre (WCDC), to reduce the likely impact of future partial power outages.

    This device automatically switches other devices (servers, network and management infrastructure) between the data centre's two resilient electrical supplies, so that they remain powered during an outage of one supply. It would not have helped with last week's power outages, which affected both supplies simultaneously, but in general the power systems in WCDC are designed to limit outages to a single supply at a time. We know that the ATS we are currently using is not as reliable as it should be, and we have a replacement ready.
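
    As a rough illustration of the switching behaviour described above, here is a minimal sketch in Python. It is a toy model, not the logic of the actual device: the function name, the supply labels "A" and "B", and the idea of a preferred supply are all assumptions made for the example, and real units also deal with transfer timing, which is ignored here.

```python
from typing import Optional


def select_supply(a_live: bool, b_live: bool, preferred: str = "A") -> Optional[str]:
    """Return the supply that feeds the load, or None if the load is unpowered.

    Hypothetical model: the labels "A" and "B" and the `preferred` parameter
    are illustrative assumptions, not taken from the real ATS.
    """
    live = {"A": a_live, "B": b_live}
    if preferred not in live:
        raise ValueError("preferred must be 'A' or 'B'")

    # Stay on the preferred supply while it is healthy.
    if live[preferred]:
        return preferred

    # Otherwise fall back to the other supply if that one is still live.
    other = "B" if preferred == "A" else "A"
    return other if live[other] else None


if __name__ == "__main__":
    # Loss of one supply: the load is transferred and stays powered.
    print(select_supply(a_live=False, b_live=True))
    # Loss of both supplies at once: no ATS can help.
    print(select_supply(a_live=False, b_live=False))
```

    The second call corresponds to last week's situation: with both supplies down there is nothing to transfer to, so the load loses power regardless of the ATS.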

    The replacement will result in a loss of power to a few servers (those without their own dual-input power supplies). Each of these is part of a resilient service, so no user-visible outage is expected:

    • adsrv07 (one of three Active Directory servers for DC.CL.CAM.AC.UK)
    • adsrv03 (one of three Active Directory servers for AD.CL.CAM.AC.UK)
    • sxp12 (one of two DHCP servers)

    It will also result in a loss of networking to the following for about 10 minutes, as the 1 Gbps switches will be power-cycled:

    • verex01
    • cctv01
    • tfc-app{1,2,4,5}
    • management of servers in WCDC (BMCs etc.)

    Besides that, no user-visible outage is expected.