
Error when running a PyTorch job: RuntimeError: unable to write to file


Running a PyTorch job fails with the error RuntimeError: unable to write to file </torch_xxx>

https://github.com/huaweicloud/dls-example/issues/26
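The file named in the error message sits directly under `/`, so a quick check of the free space on the root filesystem can help confirm whether disk space is the limiting factor. The following is a minimal sketch using only the Python standard library; the helper name `free_space_report` is hypothetical and not part of PyTorch.

```python
import shutil


def free_space_report(path="/"):
    """Return total, used, and free space (in MiB) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {
        "total_mib": usage.total // mib,
        "used_mib": usage.used // mib,
        "free_mib": usage.free // mib,
    }


# Hypothetical usage: inspect the root directory of the container.
report = free_space_report("/")
print(report)
```

If `free_mib` is close to zero inside the container, the write failure is likely a disk space problem rather than a permissions problem.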

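PyTorch saves the temporary files for its shared memory in /torch_xxx, that is, under the root directory of the container. The problem occurs when the container runs out of disk space. For now, it can be worked around by temporarily disabling PyTorch's shared memory feature with the following code.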

Add it at the very top of train.py:

import sys
import torch
from torch.utils.data import dataloader
from torch.multiprocessing import reductions
from multiprocessing.reduction import ForkingPickler

# Note: this relies on private PyTorch internals; attribute names may differ between versions.
# Keep a reference to the original collate function.
default_collate_func = dataloader.default_collate

def default_collate_override(batch):
    # Make the DataLoader build batches without shared memory.
    dataloader._use_shared_memory = False
    return default_collate_func(batch)

setattr(dataloader, 'default_collate', default_collate_override)

# Remove the reducers that pass storages between processes through shared memory.
for t in torch._storage_classes:
    if sys.version_info[0] == 2:
        if t in ForkingPickler.dispatch:
            del ForkingPickler.dispatch[t]
    else:
        if t in ForkingPickler._extra_reducers:
            del ForkingPickler._extra_reducers[t]

#### The original train code follows
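To illustrate what the final loop in the workaround does, here is a minimal sketch of the same mechanism using only the standard library. It registers a custom reducer on `ForkingPickler` for a stand-in class (`fractions.Fraction`, chosen purely for illustration), then deletes it from `_extra_reducers` the same way the workaround does for the storage classes. The names `reduce_fraction` and `roundtrip` are hypothetical, and `_extra_reducers` is a private CPython attribute, so treat this as a demonstration rather than a supported API.

```python
import pickle
from fractions import Fraction
from multiprocessing.reduction import ForkingPickler


def reduce_fraction(obj):
    # Custom reducer: the object is rebuilt on the receiving side as a tagged string.
    return str, ("reduced:" + str(obj),)


def roundtrip(obj):
    # Serialize with ForkingPickler (as multiprocessing does) and load the result back.
    return pickle.loads(bytes(ForkingPickler.dumps(obj)))


# With the reducer registered, ForkingPickler uses it instead of the default pickling.
ForkingPickler.register(Fraction, reduce_fraction)
with_reducer = roundtrip(Fraction(1, 2))

# Deleting the entry restores the default behaviour, as in the workaround's loop.
if Fraction in ForkingPickler._extra_reducers:
    del ForkingPickler._extra_reducers[Fraction]
without_reducer = roundtrip(Fraction(1, 2))

print(with_reducer, without_reducer)
```

In the workaround, the deleted entries are PyTorch's reducers for its storage classes, so after the loop those objects are pickled in the ordinary way when sent between processes instead of going through shared memory files.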

 
