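Here are the pitfalls I ran into recently while installing the apex package, recorded step by step.
First, try following the official instructions. Do not run pip install apex directly; clone the repository from GitHub and install it locally: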
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
At this point the build may fail with No module named 'packaging':
ModuleNotFoundError: No module named 'packaging'
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: 'D:\SOFTWARE\Anaconda\Anaconda3\envs\Dunit\python.exe' 'D:\SOFTWARE\Anaconda\Anaconda3\envs\Dunit\lib\site-packages\pip\_vendor\pep517\in_process\_in_process.py' get_requires_for_build_wheel 'C:\Users\admin\AppData\Local\Temp\tmpsb751068'
cwd: C:\Users\admin\apex
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
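The error above means the build step could not import packaging. As a minimal sketch, the following check reports which build-time modules are not importable from the interpreter that runs it, so run it with the Python of the environment you are building in. The helper name missing_build_deps is hypothetical and is not part of pip or apex.

```python
import importlib.util


def missing_build_deps(names=("packaging",)):
    """Return the top-level module names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_build_deps()
    if missing:
        print("Missing build dependencies:", ", ".join(missing))
    else:
        print("All build dependencies are importable.")
```

If packaging is listed as missing, install it before retrying the build.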
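Install the packaging package first with conda install packaging, then rerun the install from the apex directory. Note the added --no-build-isolation flag, which lets the build use the packages already installed in the current environment: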
cd apex
pip install -v --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--deprecated_fused_adam" --global-option="--xentropy" --global-option="--fast_multihead_attn" ./
If the install reports success, you are done. In my case, however, I hit another error: unsupported operand type(s) for +: 'NoneType' and 'str'
# error
Traceback (most recent call last):
  File "setup.py", line 35, in <module>
    _, bare_metal_major, _ = get_cuda_bare_metal_version(CUDA_HOME)
  File "setup.py", line 14, in get_cuda_bare_metal_version
    raw_output = subprocess.check_output([cuda_dir + "/bin/nvcc", "-V"], universal_newlines=True)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
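This happens because the CUDA toolkit installed through conda is incomplete and does not include nvcc, so CUDA_HOME ends up as None. The fix is conda install -c nvidia cuda-nvcc, after which the installation succeeded. (See the referenced blog post.)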
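To confirm whether nvcc is actually visible before rebuilding, a small check along these lines may help. It is only a simplified sketch: find_nvcc is a hypothetical helper, and the lookup order (the CUDA_HOME and CUDA_PATH environment variables, then PATH) is an assumption for illustration, not the exact logic that PyTorch or apex's setup.py uses.

```python
import os
import shutil


def find_nvcc(env=None):
    """Return a path to nvcc, or None if it cannot be found.

    Checks <CUDA_HOME>/bin and <CUDA_PATH>/bin first, then falls back to PATH.
    """
    env = os.environ if env is None else env
    exe = "nvcc.exe" if os.name == "nt" else "nvcc"
    for var in ("CUDA_HOME", "CUDA_PATH"):
        root = env.get(var)
        if root:
            candidate = os.path.join(root, "bin", exe)
            if os.path.isfile(candidate):
                return candidate
    # An empty PATH makes shutil.which return None.
    return shutil.which("nvcc", path=env.get("PATH", ""))


if __name__ == "__main__":
    nvcc = find_nvcc()
    print("nvcc found at:", nvcc if nvcc else "not found")
```

If this prints "not found", the TypeError above is the likely outcome, and installing cuda-nvcc is the step to try.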
Finally, run the install command once more and it should work:
pip install -v --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--deprecated_fused_adam" --global-option="--xentropy" --global-option="--fast_multihead_attn" ./
One more pitfall: earlier, when I tried the original installation, it failed with this error:
AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'
or
AttributeError: module 'torch.distributed' has no attribute '_all_gather_base'
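My eventual fix was to download the 22.04dev version of apex; the referenced blog post describes how.
If none of the above helps and you still get AttributeError: module 'torch.distributed' has no attribute '_all_gather_base',
refer to the code in the following blog post; configuring the environment through vi ~/.bashrc worked for me.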
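Both AttributeError messages say that the installed torch.distributed lacks an attribute that the apex code expects. A quick way to see which of the two names are missing in the current environment is sketched below; missing_attributes is a hypothetical helper, and the check only tells you what is absent, not which version pairing to choose.

```python
import importlib


def missing_attributes(module, names):
    """Return the attribute names from `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    try:
        dist = importlib.import_module("torch.distributed")
    except ImportError:
        print("torch is not installed in this environment")
    else:
        missing = missing_attributes(dist, ["_reduce_scatter_base", "_all_gather_base"])
        print("Missing attributes:", missing if missing else "none")
```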