Monitoring - Summary: We have successfully deployed a system-wide mitigation across the Mahuika environment to address the recently disclosed ssh-keysign vulnerability. While the cluster remains fully operational and secure, the security measures implemented may affect certain user workflows, specifically those relying on system call tracing and process debugging.
What You Need to Know (User Impact): To effectively neutralize this vulnerability, we have implemented strict restrictions on ptrace capabilities across the environment (including on compute nodes). If your research or development workflow involves attaching to running processes, you will likely experience permission errors (e.g., Operation not permitted).
Potentially Affected Tools:
• Debuggers: gdb, Arm Forge (DDT), TotalView, or attaching to processes via IDEs.
• System Tracers: strace, ltrace.
• Performance Profilers: Intel VTune, Valgrind, Linux perf, and various MPI profiling utilities that rely on ptrace to hook into active jobs.
Note on Standard Execution: Running a program directly under a debugger (e.g., gdb ./my_program) may still work depending on the exact scope of the applied kernel restrictions, but attaching to an already running process (e.g., gdb -p ) will be blocked.
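To see which ptrace behaviour applies on the node you are using, the sketch below reads the Linux Yama `ptrace_scope` sysctl and prints its documented meaning. This assumes the restriction is exposed through Yama, which this notice does not confirm; the mitigation may use a different mechanism, in which case the file may be absent or may not reflect the actual policy. The function names are illustrative only.

```python
from pathlib import Path

# Assumption: the ptrace restriction is visible through the Yama LSM sysctl.
# The notice does not say which mechanism is used on Mahuika.
YAMA_PTRACE_SCOPE = Path("/proc/sys/kernel/yama/ptrace_scope")

# Meanings as described in the Linux kernel's Yama documentation.
SCOPE_MEANINGS = {
    0: "classic: you can attach to your own processes",
    1: "restricted: only descendant processes (or ones that opted in) can be traced",
    2: "admin-only: ptrace requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach and PTRACE_TRACEME are disabled",
}


def describe_ptrace_scope(raw):
    """Translate the raw sysctl text into a short description."""
    try:
        value = int(raw.strip())
    except ValueError:
        return f"unrecognised value: {raw!r}"
    return SCOPE_MEANINGS.get(value, f"unknown scope {value}")


def read_ptrace_scope(path=YAMA_PTRACE_SCOPE):
    """Return the sysctl file contents, or None if it cannot be read."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_ptrace_scope()
    if raw is None:
        print("Yama ptrace_scope is not readable here; another mechanism may be in use.")
    else:
        print(f"ptrace_scope = {raw.strip()}: {describe_ptrace_scope(raw)}")
```

Under this assumption, a value of 1 would match the behaviour described above: launching a program under a debugger works, while attaching to an unrelated running process is refused.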
Next Steps & Support: For now we have implemented the strictest mitigation and will evaluate this further next week. We understand that process profiling and debugging are critical components of HPC development. If this mitigation breaks an essential part of your workflow, please contact the support team. We can discuss alternative profiling methods. Thank you for your cooperation as we work to maintain a secure environment for all users.
May 16, 2026 - 12:59 NZST
This page shares the system status of REANNZ's advanced computing platform and storage services, including impacts of any known outages or planned maintenance work.
Please note that these services are currently only supported within business hours, so there may be delays in communicating outages outside of these hours despite best efforts.
Because the security world loves catchy names almost as much as it loves ruining a perfectly good week, we have two new vulnerabilities to bring to your attention: CopyFail and DirtyFrag. While they sound like a forgotten 90s hacker movie duo, they are very real, and your cloud servers need an update to stay secure.
How to Mitigate: Please update your Research Developer Cloud instances at your earliest convenience. You have two paths to a clean bill of health:
* The Fresh Start (New Images): We have already created and deployed shiny new, pre-patched images. You can simply spin up new instances using these mitigated images and migrate your workloads over.
* The Hands-On Approach (Patch In-Place): If you prefer to keep your current instances running, please apply the recommended OS-level updates and mitigations directly to your existing servers. See links to relevant information below.
Need a hand? If things go sideways during the update, or if you just need some assistance navigating the mitigations, our support team is ready to help. Please reach out to us!
Stay secure (and un-fragged), REANNZ Advanced Computing Team
Completed -
The scheduled maintenance has been completed.
May 14, 13:30 NZST
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
May 14, 09:30 NZST
Scheduled -
Description: This maintenance applies mitigations for recent Linux CVEs and fixes excessive kernel messages, as outlined below:
- Apply mitigations for the Copy.Fail and Dirty Frag Linux vulnerabilities on OpenStack control plane servers (controllers and hypervisors)
- Apply the fix for excessive logging messages on OpenStack controllers and compute nodes
Why:
- To apply mitigations for the Copy.Fail and Dirty Frag Linux CVEs on OpenStack controllers and hypervisors
- To reduce unnecessary error messages in the logs of controllers and compute nodes (this has been confirmed non-critical by StackHPC)
Potential impact:
- For the CVE mitigations, restarting the controllers might make the OpenStack dashboard and APIs unavailable while each node is being rebooted. No impact is expected for hypervisors, as these won't be rebooted.
- No impact is expected from the excessive logging fix.
We have tested the mitigations and already deployed them on some hosts, with no adverse impacts observed.
Duration:
- Work will start at 9:30 AM on 14 May 2026 and should be completed before 2 PM the same day.
May 13, 17:01 NZST
Completed -
The scheduled maintenance has been completed.
May 4, 18:00 NZST
Update -
Scheduled maintenance is still in progress. We will provide updates as necessary.
May 4, 15:41 NZST
In progress -
Scheduled maintenance is currently in progress. We will provide updates as necessary.
May 4, 15:30 NZST
Scheduled -
We will be updating our proxy. my.nesi.org.nz and our identification management system may be affected for a short period of time.
May 1, 13:54 NZST