The filesystem has been stable today; however, several users have reported degraded interactive IO performance. We expect this is caused by heavy metadata load from the continuing background integrity check. Based on current progress, we unfortunately expect this to continue into next week.
There has been no major impact to jobs, though some workloads paused when trying to write to the filesystem during the incident, and as a result a few jobs have timed out. If you see this and need help resolving it, please contact support.
Posted Aug 28, 2025 - 16:08 NZST
Update
We are continuing to monitor for any further issues.
Posted Aug 28, 2025 - 00:16 NZST
Monitoring
Full filesystem functionality was restored at approx 11pm NZST. The issue appears to have been triggered by a brief backend network disruption - WEKA support are investigating why the filesystem didn't recover automatically. Ongoing data integrity checks may impact IO performance for a while longer.
Thankfully there seem to be no widespread job impacts; however, we will check this more thoroughly in the morning and contact any users whose work may have been impacted. Apologies again for the disruption (and goodnight)!
Posted Aug 28, 2025 - 00:16 NZST
Investigating
We have identified an ongoing issue with our high performance filesystems. This is impacting scratch/nobackup, project and home, and is likely impacting any new logins to the HPC and OnDemand services. At present, existing jobs are continuing to run and complete; however, we anticipate there may be job failures as a result of this problem. We are currently awaiting urgent vendor support. Apologies for the inconvenience and disruption. We'll update as soon as we know more.
Posted Aug 27, 2025 - 21:19 NZST
This incident affects: Submit new HPC Jobs, NeSI OnDemand, and HPC Storage.