Internal Services

95.20% uptime

Caelum Console (server management)

99.30% uptime
Sep 2023: 100.0% uptime
Oct 2023: 100.0% uptime
Nov 2023: 97.90% uptime

Other Internal Services

91.11% uptime
Sep 2023: 100.0% uptime
Oct 2023: 74.20% uptime
Nov 2023: 100.0% uptime

External Services

91.11% uptime
Sep 2023: 100.0% uptime
Oct 2023: 74.20% uptime
Nov 2023: 100.0% uptime

Network

90.84% uptime
Sep 2023: 100.0% uptime
Oct 2023: 73.39% uptime
Nov 2023: 100.0% uptime
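The monthly percentages above can be turned into absolute downtime. A minimal sketch, assuming uptime is measured over the full calendar month (`downtime_hours` is an illustrative helper, not part of the monitoring system):

```python
# Convert a monthly uptime percentage into hours of downtime.
# Sketch only: assumes uptime is measured across the whole calendar month.
def downtime_hours(uptime_pct: float, days_in_month: int) -> float:
    total_hours = days_in_month * 24
    return (1 - uptime_pct / 100) * total_hours

# Network, October 2023: 73.39% uptime over 31 days
print(round(downtime_hours(73.39, 31), 1))  # ~198 hours of downtime
```

On these figures, October's 73.39% network uptime corresponds to roughly eight days of accumulated downtime, consistent with the WCDC power incident reported below.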

Datacentres

95.56% uptime

GN09

100.0% uptime
Sep 2023: 100.0% uptime
Oct 2023: 100.0% uptime
Nov 2023: 100.0% uptime

WCDC

91.11% uptime
Sep 2023: 100.0% uptime
Oct 2023: 74.18% uptime
Nov 2023: 100.0% uptime
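The category figures on this page appear to be simple means of their component uptimes; that equal-weighting is an assumption on our part (the monitoring service may weight by checks or by time), but it reproduces the published numbers:

```python
# Check that a category's uptime matches the mean of its components.
# The equal-weight assumption is ours, not documented by the status page.
def category_uptime(components):
    return sum(components) / len(components)

# Datacentres: GN09 (100.0%) and WCDC (91.11%); page shows 95.56%
print(category_uptime([100.0, 91.11]))                # ~95.56
# Data Storage components (100.0, 91.04, 100.0, 91.11); page shows 95.54%
print(category_uptime([100.0, 91.04, 100.0, 91.11]))  # ~95.54
```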

Virtual Machine Hosting

95.56% uptime

Main VM Pool (WCDC)

91.11% uptime
Sep 2023: 100.0% uptime
Oct 2023: 74.20% uptime
Nov 2023: 100.0% uptime

GPUs

100.0% uptime
Sep 2023: 100.0% uptime
Oct 2023: 100.0% uptime
Nov 2023: 100.0% uptime

Secondary VM Hosts

100.0% uptime
Sep 2023: 100.0% uptime
Oct 2023: 100.0% uptime
Nov 2023: 100.0% uptime

Xen Orchestra

91.11% uptime
Sep 2023: 100.0% uptime
Oct 2023: 74.20% uptime
Nov 2023: 100.0% uptime

Data Storage

95.54% uptime

Filer

100.0% uptime
Sep 2023: 100.0% uptime
Oct 2023: 100.0% uptime
Nov 2023: 100.0% uptime

Archive Server

91.04% uptime
Sep 2023: 100.0% uptime
Oct 2023: 73.99% uptime
Nov 2023: 100.0% uptime

Data Replication

100.0% uptime
Sep 2023: 100.0% uptime
Oct 2023: 100.0% uptime
Nov 2023: 100.0% uptime

Other Secondary Storage Systems

91.11% uptime
Sep 2023: 100.0% uptime
Oct 2023: 74.20% uptime
Nov 2023: 100.0% uptime

Fastmail email

100.0% uptime

Third Party: Fastmail → General Availability

Operational

Third Party: Fastmail → Mail delivery

Operational

Third Party: Fastmail → Web client and mobile app

Operational

Third Party: Fastmail → Mail access (IMAP/POP)

Operational

Third Party: Fastmail → Login & sessions

Operational

Third Party: Fastmail → Contacts (CardDAV)

Operational

Notice history

Nov 2023

William Gates Building planned power outage

Oct 2023

WCDC remedial works: ATS replacement
  • Completed
    October 24, 2023 at 2:07 PM

    Maintenance has completed successfully.

  • In progress
    October 24, 2023 at 1:30 PM

    Maintenance is now in progress.

  • Planned
    October 24, 2023 at 1:30 PM

    We are planning to replace an Automatic Transfer Switch in one of our racks in the West Cambridge Data Centre, to reduce the likely impact of future partial power outages.

    This device automatically switches other devices (servers, network and management infrastructure) between the data centre's two resilient electrical supplies to allow these to remain powered during an outage of one of the supplies. Though this would not have helped with last week's power outages (which affected both supplies simultaneously), in general the power systems in WCDC are designed to limit outages to a single supply at once. We know that the old ATS that we are currently using is not as reliable as it should be, and have a replacement ready.

    The replacement will result in a loss of power to a few servers (those without their own dual-input power supplies), all of which are part of a resilient service so no user-visible outage is expected:

    • adsrv07 (one of three Active Directory servers for DC.CL.CAM.AC.UK)
    • adsrv03 (one of three Active Directory servers for AD.CL.CAM.AC.UK)
    • sxp12 (one of two DHCP servers)

    It will also result in loss of networking to a few things for about 10 minutes, as the 1Gbps switches will be power-cycled:

    • verex01
    • cctv01
    • tfc-app{1,2,4,5}
    • management of servers in WCDC (BMCs etc.)

    Besides that, no user-visible outage is expected.

Major outage: West Cambridge Data Centre
  • Resolved

    Power to our racks in the West Cambridge Data Centre has been stable since the major electrical incident on 18th October, and following replacement of the Automatic Transfer Switch in our core infrastructure rack, our resilience to future partial power outages is improved.

    The data centre as a whole is however running with reduced power capacity, so the central HPC systems are mostly unavailable. The HPC team currently estimates a return to full service no later than Wednesday 1st November. All HPC users should already be receiving updates from the team by email.

    UIS's remedial works to restore the data centre's power capacity are ongoing. We have had no specific information about these other than that the next such works are tentatively scheduled for next week.

    We are closing this incident now as we have no information to suggest that we will experience further disruption; however, due to ongoing repairs to the electrical distribution infrastructure, things may of course change.

  • Monitoring (update)

    UIS have announced that the main circuit breaker on power supply B will be replaced next week, provisionally on Tuesday at 10am lasting no more than an hour. All our systems should automatically switch over to the alternate power supply (A) and be able to operate from only that supply during this work. However we know that a few older systems may reboot during the transition, or may experience a few minutes' network outage as their local switch reboots.

    We will add a separate scheduled-maintenance incident with more information once UIS has confirmed the timing.

  • Monitoring (update)

    There are indications from a third party that UIS is planning remedial work for Monday, which may again be disruptive. We will update this incident page as more information becomes available.

  • Monitoring (update)

    UIS has said that all University services have been restored (with the exception of Research Computing Services / HPC, which is expected to be online again this afternoon). We likewise believe that all departmental services are online and stable.

    Please contact sys-admin@cl.cam.ac.uk if anything is not as it should be.

    UIS are planning further remedial work in the data centre next week, which may impact services.

    Thank you for your patience during this incident.

  • Monitoring

    All services have been brought back up, with the exception (as before) of some personal or group virtual servers, where we are not sure which are supposed to be running. If anything is down that should not be (or vice versa), please contact sys-admin.

    Services should still be considered at risk; we will continue to monitor.

  • Identified (update)

    UIS has provided no further information to the University IT community, but we have learned via an external party (JISC) that the second outage at 19:32 was a deliberate controlled power-down and that as of 20:48 no further outage is expected. We are therefore starting to restore services now.

  • Identified (update)

    We observe that the power has come back on in our WCDC racks, but with no announcement from UIS we are holding off on restoring services for a little while, as we believe the power may still be unreliable.

    Those machines that automatically powered themselves back on already may experience network outages, as we are taking this opportunity to do a little bit of maintenance that would ordinarily be service-affecting.

  • Identified

    The data centre lost power again.

  • Monitoring (update)

    UIS has noted that although power was restored, there are ongoing electrical problems in the data centre and services should still be considered at risk of further disruption. They have an engineer en route to investigate.

  • Monitoring

    All services are now believed to be back up, with the possible exception of a few individuals' or research groups' virtual servers. There is also a chance that some things did not start properly if their servers came up before the network and storage infrastructure was ready for them. We are continuing to check services, but please contact sys-admin@cl.cam.ac.uk if you are experiencing any problems.

  • Identified

    The UIS West Cambridge Data Centre lost power at around 16:00. Power was restored at around 16:45. Much of our infrastructure was affected and is now restarting. We hope that most services will be back online very soon.

Disruption to Morello cluster network
  • Resolved

    This incident has been resolved (though the entire cluster was power-cycled after the earlier maintenance due to a datacentre-wide power outage).

    A further incident will be opened for the planned replacement of the temporary switch.

  • Monitoring

    We believe the temporary network setup is working. The faulty switch has been powered down. Please contact sys-admin if anything is not right.

  • Identified (update)

    Repatching of servers is complete; all Morello systems in rack D9 are now connected to the same port on temporary switch wcdc-d9-sw2 that they were previously connected to on wcdc-d9-sw1.

    There will now be a brief outage to routing as we shut off the layer-3 functionality of wcdc-d9-sw1 and allow wcdc-d10-sw1 to take over.

    (The temporary setup involves D9 being daisy-chained from rack D10. The temporary switch is layer-2 only, so all layer-3 functionality (routing, DHCP) will be handled by wcdc-d10-sw1.)

  • Identified (update)

    Repatching of Morello servers over to the new temporary switch is about to commence.

  • Identified (update)

    Replacement of the faulty switch is likely to begin at 12:45. We hope to set up the new temporary switch in parallel with the faulty one to minimise disruption. Once the new switch is configured, each system will be replugged from the old switch to the new, hopefully with only a few seconds of disruption to each.

    A permanent replacement is being obtained (under warranty) and after that arrives we will plan another short outage to install it.

  • Identified (update)

    Maintenance on the rack D10 switch is believed to have been successful, though verification work is ongoing.

    The replacement of the rack D9 switch will probably take place tomorrow (18 Oct) late morning / early afternoon.

    For the time being, we believe there is no service-affecting outage.

  • Identified

    The Morello cluster network requires urgent disruptive maintenance; it will imminently experience an outage affecting roughly half of the machines.

    The other machines will experience a longer outage tomorrow, as the switch serving those machines has a fault and needs replacement.

    This only affects the Morello cluster; if you don't know what that is, this incident does not affect you.

Sep 2023

No notices reported this month

Sep 2023 to Nov 2023