University of Cambridge Computer Laboratory - Notice history

100% - uptime

Caelum Console (server management) - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Request Tracker - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 98.65% uptime

Other Internal Services - Operational

Apr 2025 · 100.0% uptime
May 2025 · 99.98% uptime
Jun 2025 · 99.59% uptime

External Services - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Network - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

GN09 - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

WCDC - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Main VM Pool (WCDC) - Operational

Apr 2025 · 100.0% uptime
May 2025 · 99.95% uptime
Jun 2025 · 99.59% uptime

GPUs - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Secondary VM Hosts - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Xen Orchestra - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Filer - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Archive Server - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Data Replication - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Other Secondary Storage Systems - Operational

Apr 2025 · 100.0% uptime
May 2025 · 100.0% uptime
Jun 2025 · 100.0% uptime

Third Party: Fastmail → General Availability - Operational

Third Party: Fastmail → Mail delivery - Operational

Third Party: Fastmail → Web client and mobile app - Operational

Third Party: Fastmail → Mail access (IMAP/POP) - Operational

Third Party: Fastmail → Login & sessions - Operational

Third Party: Fastmail → Contacts (CardDAV) - Operational

Notice history

Jun 2025

GPU VM cluster maintenance
  • Completed
    22 June, 2025 at 14:50
    Maintenance has completed successfully.
  • Update (In progress)
    22 June, 2025 at 14:33

    Some capacity to run user VMs is now back online. You may try to start your VM again via Xen Orchestra if you need it. If it fails to start, try again after half an hour when there should be more capacity available.

  • Update (In progress)
    22 June, 2025 at 13:51

    Storage server maintenance is complete. The shared server dev-gpu-2 is coming back up. VM hypervisor upgrades are now beginning so personal dev-* VMs will remain down.

  • In progress
    22 June, 2025 at 13:00
    Maintenance is now in progress.
  • Planned
    22 June, 2025 at 13:00

    The GPU VM cluster that hosts the dev-gpu-* and dev-cpu-* virtual machines, together with its associated storage server, requires some urgent software updates and hardware maintenance to rectify a couple of known problems. We propose to do this on Sunday; however, the work can be rescheduled if that timing would be particularly disruptive (contact mas90 as soon as possible if so).

    All dev-gpu-* and dev-cpu-* VMs plus the shared servers dev-gpu-1, dev-gpu-2, dev-cpu-1, dev-gpu-acs and dev-cpu-acs must be shut down during this maintenance, as the storage server that holds VM disks and home directories will be unavailable for a short time. Capacity to host VMs will gradually be restored during the maintenance as each VM host is updated and brought back online.

VM storage server repair (xene-pool1)
  • Completed
    17 June, 2025 at 16:49
    Maintenance has completed successfully.
  • In progress
    17 June, 2025 at 16:00
    Maintenance is now in progress.
  • Planned
    17 June, 2025 at 16:00

    Following on from the earlier unscheduled VM storage outage, we need to replace a failed memory module in the storage server that backs one of our departmental VM pools in order to restore performance and reliability.

    This requires us to shut down all VMs on xene-pool1, which will affect the following departmental services:

    • cl-student-ssh - Undergraduate SSH server

    • MSA (partial outage, one of two servers affected)

    • Request Tracker

    • VPN2 (partial outage, one of two servers affected and new connections are already steered towards the other server)

    • Departmental database server (SQL Server / svr-win-db / db-*)

    • Windows Remote Desktop service

    • dbwebserver

    • WSUS (Windows Updates)

    And it will affect the following user VMs:

    • cl-teaching-ecad

    • dev-compilers0

    • egress

    • knot

    • lmserv-mentor

    • svr-papers

    • svr-www-ecad

    • svr-yg386-web

    These will be shut down soon after 5pm and will remain off for approximately an hour. The at-risk window is given as 2.5 hours to allow for uncertainty about the exact timing.

    We will take the opportunity to do some routine maintenance (software and firmware updates) of the storage system at the same time, in order to avoid a future need to do more scheduled maintenance.

VM storage fault (xene-pool1)
  • Resolved
    This incident has been resolved. However, the same VMs will need to be shut down again when a replacement part arrives; this will be communicated separately.
  • Monitoring
    The fault has been mitigated and affected VMs are now back online. The VMs will have to be shut down again within a few days to replace a failed hardware component. Some users connected to VPN2 may be disconnected shortly, as one of the VPN gateway servers needs rebooting even though it is still partially working. Otherwise, please contact service-desk@cst.cam.ac.uk if any problems remain.
  • Investigating
    Overnight a hardware fault took down the storage server that backs one of our main departmental VM pools (xene-pool1). All VMs running on that pool failed, which included the departmental database server, dbwebserver, Request Tracker, cl-student-ssh, part of the MSA service and the Windows Remote Desktop service.

May 2025

Legacy CUPS printing from Macs disrupted
  • Resolved
    This incident has been resolved.
  • Monitoring

    We implemented a fix and are currently monitoring the result.

    Nevertheless, we suggest setting up DS-Print on your devices anyway, as it will fully replace the legacy CUPS server soon.

  • Investigating

    We are investigating reports that printing to legacy printers from Macs is currently disrupted, possibly due to a Bonjour problem. We suggest using DS-Print as a workaround.

Apr 2025

No notices reported this month
