Research Developer Cloud - Flexible HPC access currently down

Incident Report for NeSI Status

Resolved

The cause of this issue was a coordinated crash of our controller infrastructure. We have escalated this to our support partners to perform root cause analysis and identify any relevant workarounds. Unfortunately, at this point we do not know what triggered the crash.
Posted Nov 05, 2024 - 11:34 NZDT

Update

It appears that no compute hosting services were impacted by this outage; only access-related services were affected.
Posted Nov 03, 2024 - 20:04 NZDT

Monitoring

A fix has been implemented and services have been restored.
Posted Nov 03, 2024 - 19:57 NZDT

Identified

The issue has been identified and services are being restored.
Posted Nov 03, 2024 - 19:30 NZDT

Update

We are aware of the ongoing incident with Research Developer Cloud and Flexible HPC. This incident has occurred outside of business support hours, so it is being investigated on a best-efforts basis. It has been escalated for a critical response.
Posted Nov 03, 2024 - 19:24 NZDT

Investigating

There is an ongoing outage for the Research Developer Cloud and Flexible HPC hosted services. We are currently investigating.
Posted Nov 03, 2024 - 12:30 NZDT

This incident affected: Flexible High Performance Cloud Services (Virtual Compute Service, Bare Metal Compute Service, FlexiHPC Dashboard (web interface), FlexiHPC CLI interface, Public API of the FlexiHPC Service) and NeSI OnDemand, Flexible High Performance Cloud.