Slow I/O Performance

Monitoring

With the implementation of a new auto-cleaning process on the scratch/nobackup filesystem last month, the storage capacity issues related to this incident have now been resolved.

Unfortunately, some workloads may still experience reduced read performance depending on other activity on the system. If you notice highly variable or poor I/O performance, please report it to support, as we may be able to assist with workarounds.

Several pieces of work are in progress to improve I/O performance on the new platform, based on lessons from the first six months of operation, including an expansion of SSD/NVMe capacity. We will make an announcement soon with additional details, and will also update the documentation to reflect important considerations for users.
Posted Jan 15, 2026 - 11:57 NZDT

Update

There was a period of I/O stalls this morning while we dealt with some storage hardware failures. That issue is now resolved; however, space reclamation continues in the backend and is having a detrimental impact on read performance. We are working with WEKA support to explore mitigation options. Apologies for the performance impact. If your jobs are affected and need a runtime limit extension, please reach out to support.
Posted Nov 24, 2025 - 12:44 NZDT

Identified

Our storage system is currently very full, which is forcing the backend object storage to undertake urgent housekeeping in the form of defragmentation. This increased load is having a detrimental effect on I/O performance, especially read I/O, and is likely to continue for some days. We are urgently looking at ways to mitigate this. Researchers can help alleviate the situation in the short term by cleaning up any unwanted files and data as soon as possible.
Posted Nov 21, 2025 - 12:29 NZDT
This incident affects: Data Transfer and HPC Storage.