Under maintenance from 6:30 AM to 2:12 PM
- Completed: April 11, 2023 at 2:12 PM
The second floor switches have been fixed (in one case, replaced with spare hardware). All is now believed to be back to normal. As usual, contact sys-admin if anything seems wrong.
- Update (In progress): April 11, 2023 at 12:43 PM
The WC2D switch serving part of the SC corridor is up for the moment, now that we have manually rolled back its failed firmware update. It will reboot again later this afternoon (an outage of about 15 minutes) to redo the update.
Preparation of a replacement switch for WC2B (a few rooms on SN/SW) is ongoing.
- Update (In progress): April 11, 2023 at 11:13 AM
One of the WC2B switches serving about 50% of connections in the northwest corner of the second floor (SN, SW) appears to have completely failed. A replacement will be set up and installed, but this will take a couple of hours.
The WC2D switch problem (part of SC) is still under investigation.
- Update (In progress): April 11, 2023 at 10:44 AM
A switch serving part of the SN and SW corridor also has a fault under investigation.
Servers in GN09 should now be back to normal.
- In progress: April 11, 2023 at 10:33 AM
A fault affecting the office network on parts of the SC corridor, arising after the power outage, is being investigated.
- Completed: April 11, 2023 at 10:30 AM
Maintenance has completed successfully.
- Update (In progress): April 11, 2023 at 10:20 AM
The power maintenance has been completed; power was restored at around 10:49 (after a brief previous restoration).
The office network (including wifi and phones) is now starting back up in sequence: ground floor first, then first floor, then second floor. Each switch will go through an update process which may take 20 minutes or so. We expect office networking to be coming back up now, but it may take a little longer in some parts of the building.
Servers in GN09 that were shut down can be restarted using the Caelum Console, https://console.caelum.cl.cam.ac.uk/ . Some servers known to be required are being started for you now.
- Update (In progress): April 11, 2023 at 7:22 AM
Generator transfer and power outage commencing shortly.
- Update (In progress): April 11, 2023 at 7:03 AM
The power outage is slightly delayed because the engineer from UK Power Networks has been delayed. Please leave machines off for now.
- Update (In progress): April 11, 2023 at 6:31 AM
Server shutdown in GN09 will begin shortly. Power outage expected in 30 minutes.
- In progress: April 11, 2023 at 6:30 AM
Maintenance is now in progress.
- Update (Planned): April 11, 2023 at 6:30 AM
Power to the William Gates Building will be shut down for approximately 3 hours on the morning of Tuesday 11th April, for routine maintenance of the 11kV substation.
Generators will be connected to maintain power to the GN09 UPS-protected circuits and core University/Janet network infrastructure only.
Servers in GN09 which are powered solely from mains circuits will be shut down prior to the work and will remain powered off until the maintenance has completed. A list of affected servers is being prepared and will be published as soon as possible.
Servers in GN09 powered by the UPS are at risk of disruption as well, because the process of transferring the UPS to a generator supply runs a small risk of tripping RCDs protecting individual rack circuits within GN09. (Last time we tested this, one circuit out of approximately 48 tripped.)
Core services such as filer and the GN09 core network are expected to remain up, since they are protected by two independent UPSes.
The office network, wifi and phones will be down for the duration of the maintenance; the UPS batteries in the wiring cupboards will not last long enough to sustain service. However, we anticipate that the building will be closed during this work anyway, as there will be no lighting or other basic services.
We will take the opportunity to upgrade firmware on the office network switches, so there is a small chance of further disruption once the power returns if there is any problem with the firmware update on a switch.
Regardless, once power returns it may take 30 minutes or more for the office network, wifi and phones to be restored to service. This is because the switches take a long time to start up, especially during a firmware update, and because we may have to manually turn wiring cupboard circuits back on throughout the building.
- Update (Planned): April 10, 2023 at 11:27 PM
Reminder: the William Gates Building's electrical supply will be shut down at 08:00 BST. Please shut down computers in offices before that time. Affected servers in the datacentre, GN09, will be shut down for you starting at 07:30. The list of affected servers in GN09 can be found at http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/20230411AffectedMachines .
- Planned: April 05, 2023 at 10:41 AM
The list of affected machines in GN09 that will be powered down prior to the electrical maintenance can be found at:
http://www.wiki.cl.cam.ac.uk/rowiki/SysInfo/20230411AffectedMachines