Under maintenance from 6:30 AM to 2:12 PM
- Completed: 11 April, 2023 at 14:12
The second floor switches have been fixed (in one case, replaced with spare hardware). All is now believed to be back to normal. As usual, contact sys-admin if anything seems wrong.
- Update (In progress): 11 April, 2023 at 12:43
The WC2D switch serving part of the SC corridor is up for the moment, now that we have manually rolled back its failed firmware update, but will reboot again (~15 minute outage) later this afternoon to redo the update.
Preparation of a replacement switch for WC2B (a few rooms on SN/SW) is ongoing.
- Update (In progress): 11 April, 2023 at 11:13
One of the WC2B switches serving about 50% of connections in the northwest corner of the second floor (SN, SW) appears to have completely failed; a replacement will be set up and installed but this will take a couple of hours.
The WC2D switch problem (part of SC) is still under investigation.
- Update (In progress): 11 April, 2023 at 10:44
A switch serving part of the SN and SW corridor also has a fault under investigation.
Servers in GN09 should now be back to normal.
- In progress: 11 April, 2023 at 10:33
A fault affecting the office network on parts of the SC corridor, arising after the power outage, is being investigated.
- Completed: 11 April, 2023 at 10:30
Maintenance has completed successfully
- Update (In progress): 11 April, 2023 at 10:20
The power maintenance has been completed; power was restored at around 10:49 (after a brief previous restoration).
The office network (including wifi and phones) is now starting back up in sequence: ground floor first, then first floor, then second floor. Each switch will go through an update process, which may take 20 minutes or so. We expect office networking to be coming back up now, but it may take a little longer in some parts of the building.
Servers in GN09 that were shut down can be restarted using the Caelum Console, https://console.caelum.cl.cam.ac.uk/ . Some servers known to be required are being started for you now.
- Update (In progress): 11 April, 2023 at 07:22
Generator transfer and power outage commencing shortly.
- Update (In progress): 11 April, 2023 at 07:03
There is a slight delay to the power outage, as the engineer from UK Power Networks has been delayed. Please leave machines off for now.
- Update (In progress): 11 April, 2023 at 06:31
Server shutdown in GN09 will begin shortly. Power outage expected in 30 minutes.
- In progress: 11 April, 2023 at 06:30
Maintenance is now in progress
- Update (Planned): 11 April, 2023 at 06:30
Power to the William Gates Building will be shut down for approximately 3 hours on the morning of Tuesday 11th April, for routine maintenance of the 11kV substation.
Generators will be connected to maintain power to the GN09 UPS-protected circuits and core University/Janet network infrastructure only.
Servers in GN09 which are powered solely from mains circuits will be shut down prior to the work and will remain powered off until the maintenance has completed. A list of affected servers is being prepared and will be published as soon as possible.
Servers in GN09 powered by the UPS are at risk of disruption as well, because the process of transferring the UPS to a generator supply runs a small risk of tripping RCDs protecting individual rack circuits within GN09. (Last time we tested this, one circuit out of approximately 48 tripped.)
Core services such as filer and the GN09 core network are expected to remain up, since they are protected by two independent UPSes.
The office network, wifi and phones will be down for the duration of the maintenance; the UPS batteries in the wiring cupboards will not last long enough to sustain service. However, we anticipate that the building will be closed during this work anyway, as there will be no lighting or other basic services.
We will take the opportunity to upgrade firmware on the office network switches, so there is a small chance of further disruption once the power returns if there is any problem with the firmware update on a switch.
Regardless, once power returns it may take 30 minutes or more for the office network, wifi and phones to be restored to service. This is because the switches take a long time to start up, especially during a firmware update, plus we may have to manually turn wiring cupboard circuits back on throughout the building.
- Update (Planned): 10 April, 2023 at 23:27
Reminder: the William Gates Building's electrical supply will be shut down at 08:00 BST. Please shut down computers in offices before that time. Affected servers in the datacentre, GN09, will be shut down for you starting at 07:30. The list of affected servers in GN09 can be found at http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/20230411AffectedMachines .
- Planned: 05 April, 2023 at 10:41
The list of affected machines in GN09 that will be powered down prior to the electrical maintenance can be found at:
http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/20230411AffectedMachines