Rename directories in multi-gpu regression tests

When the regression tests on 4 and 8 GPUs run at the same time, one of them fails because files are missing:

ℹ Copying files...
✔  Done
🚀  Launched job 2684546
✔  Wrote job ID 2684546 to file hpcrocket4GPU.log
ℹ Collecting files...
❌ FileNotFoundError: multigpu_test/output/4GPU/
✔  Done
ℹ Cleaning files...
❌ ResourceNotFound: resource '' not found
               ╷          ╷              
  ID           │ Name     │ State        
╶──────────────┼──────────┼─────────────╴
  2684546      │ Regr4GPU │ ❌ FAILED    
  2684546.bat+ │ batch    │ ❌ FAILED    
  2684546.ext+ │ extern   │ ✔ COMPLETED  
               ╵          ╵              

I think the cause is the clean stage in the rocketGPU.yml file.

clean:
  - multigpu_test/*

Both tests used the same directory name, so the clean stage of one test may have deleted the data of the other before it was collected. Therefore, I renamed the directories so that each test uses its own.
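
As a rough sketch of the idea, the clean stage of each test would then only match its own directory. The directory names below are hypothetical placeholders, not the names used in this merge request, and the two YAML documents stand in for the 4-GPU and 8-GPU configurations:

```yaml
# Hypothetical clean stage for the 4-GPU test (directory name is an assumption).
clean:
  - multigpu_test_4gpu/*
---
# Hypothetical clean stage for the 8-GPU test (directory name is an assumption).
clean:
  - multigpu_test_8gpu/*
```

With disjoint patterns like these, the cleanup of one job should no longer be able to remove files that the other job still needs to collect.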
