Internal Services

100.0% uptime

Caelum Console (server management)

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Request Tracker

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Other Internal Services

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

External Services

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Network

99.99% uptime
Apr 2024: 99.96% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime
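
As a rough check on the Network figures above, the sketch below converts a monthly uptime percentage into implied downtime minutes and recombines monthly figures into a quarterly one. It assumes uptime is measured over every calendar minute of the month and that the quarterly figure is weighted by month length; the status page does not state how its figures are computed, and the function names are hypothetical.

```python
import calendar


def minutes_in_month(year: int, month: int) -> int:
    """Total calendar minutes in the given month."""
    days = calendar.monthrange(year, month)[1]
    return days * 24 * 60


def downtime_minutes(uptime_percent: float, year: int, month: int) -> float:
    """Downtime implied by an uptime percentage (assumes calendar-minute measurement)."""
    return minutes_in_month(year, month) * (100.0 - uptime_percent) / 100.0


def quarter_uptime(monthly: dict[int, float], year: int) -> float:
    """Combine monthly uptime percentages, weighting each month by its length (an assumption)."""
    total = sum(minutes_in_month(year, m) for m in monthly)
    up = sum(minutes_in_month(year, m) * pct / 100.0 for m, pct in monthly.items())
    return 100.0 * up / total


if __name__ == "__main__":
    # Figures taken from the Network entry: Apr 99.96%, May 100.0%, Jun 100.0%.
    print(round(downtime_minutes(99.96, 2024, 4), 1))
    print(round(quarter_uptime({4: 99.96, 5: 100.0, 6: 100.0}, 2024), 2))
```

Under these assumptions, 99.96% for April 2024 corresponds to roughly 17 minutes of downtime, and the three monthly figures combine to 99.99% when rounded to two decimal places, matching the headline Network figure.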

Datacentres

100.0% uptime

GN09

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

WCDC

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Virtual Machine Hosting

100.0% uptime

Main VM Pool (WCDC)

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

GPUs

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Secondary VM Hosts

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Xen Orchestra

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Data Storage

100.0% uptime

Filer

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Archive Server

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Data Replication

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Other Secondary Storage Systems

100.0% uptime
Apr 2024: 100.0% uptime
May 2024: 100.0% uptime
Jun 2024: 100.0% uptime

Fastmail email

100.0% uptime

Third Party: Fastmail → General Availability

Operational

Third Party: Fastmail → Mail delivery

Operational

Third Party: Fastmail → Web client and mobile app

Operational

Third Party: Fastmail → Mail access (IMAP/POP)

Operational

Third Party: Fastmail → Login & sessions

Operational

Third Party: Fastmail → Contacts (CardDAV)

Operational

Notice history

Jun 2024

No notices reported this month

May 2024

WGB emergency network maintenance
  • Completed
    May 06, 2024 at 10:42 PM
    Maintenance has completed successfully.
  • In progress
    May 06, 2024 at 10:15 PM
    Maintenance is now in progress.
  • Planned
    May 06, 2024 at 10:15 PM

    We will be updating the software on the core router/switch in the William Gates Building (gatwick) to try to mitigate the ongoing crashes (https://cl.instatus.com/clvva1e4b43187b8n2hqywstc0). This upgrade cannot be performed "live", so there will be an outage of approximately 20-30 minutes affecting the William Gates Building office network and filer. Other servers in GN09 should be largely unaffected.

WGB network problem under investigation
  • Resolved
    This incident has been resolved.
  • Monitoring

    The core switch/router in the William Gates Building (gatwick) appears to have crashed and rebooted, possibly a recurrence of the issues seen a month ago (https://cl.instatus.com/clur1lte237417blopt08gvelk). Networking should have returned (initially via one switch of the redundant pair that constitutes gatwick, whilst the other switch restarts).

Apr 2024

gatwick (WGB core network) crashed
  • Resolved
    This incident has been resolved.
  • Monitoring
    Update

    gatwick crashed and rebooted again at around 06:22, again triggered by a routine configuration update.

    Earlier in the night we had attempted to install a software update, but because of an unrelated issue the routers refused to perform an 'In Service Software Upgrade' (i.e. the upgrade would have caused more disruption), so we chose to roll back and delay this update until Cisco published their notes about this particular version.

  • Monitoring
    Update

    The routers have remained stable since the crash, but we're going to do some further testing out-of-hours, and install a software update. There may be some further disruption whilst that happens.

  • Monitoring
    Update

  • Monitoring

    The core router and switch in the William Gates Building (gatwick) seemingly crashed and rebooted at around 14:27.

    This appears to have been due to a software bug triggered by a routine configuration change. Although gatwick is a virtual switch/router comprising two independent physical systems, it seems that the entire virtual switch/router (both physical systems) rebooted simultaneously.

    This type of device takes a long time to reboot; in this case there would have been a little over 12 minutes during which the William Gates Building office network was cut off from the University network and the internet (followed by a further few minutes of instability). This would also have affected connectivity to filer and a few other core services hosted in the WGB.

    Investigation is ongoing into the reason for this outage.

Apr 2024 to Jun 2024