
Deploying DeepSeek-R1-Distill-Llama-70B on Huawei Ascend NPUs

1. Environment preparation

For environment preparation, follow the corresponding configuration in "Deploying DeepSeek-R1 on Huawei Ascend NPUs". When downloading the model weights, change the model ID to deepseek-ai/DeepSeek-R1-Distill-Llama-70B; no NPU-side weight conversion is required.

bash
modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Llama-70B --local_dir /data/models/DeepSeek-R1-Distill-Llama-70B

2. Loading the image

From the Ascend Community / Development Resources page, download the image package that matches DeepSeek-R1-Distill-Llama-70B: 1.0.0-800I-A2-py311-openeuler24.03-lts or 1.0.0-300I-Duo-py311-openeuler24.03-lts.

The component versions bundled in the image are as follows:

Component     Version
MindIE        1.0.0
CANN          8.0.0
PTA           6.0.0
MindStudio    7.0.0
HDK           24.1.0

Constraints:

  • Deploying DeepSeek-R1-Distill-Llama-70B requires at least one Atlas 800I A2 server, or one server fitted with four Atlas 300I DUO inference cards
  • When deploying on Atlas 300I DUO cards, edit config.json in the weight directory and change the "torch_dtype" field to "float16"
  • TP=8 inference is supported
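The torch_dtype change for Atlas 300I DUO can be applied with a one-line sed edit. A minimal sketch, demonstrated on a scratch copy so it can be tried safely; on the server, point CONFIG at the config.json inside the weight directory:

```shell
# Sketch: switch "torch_dtype" to "float16" for Atlas 300I DUO deployments.
# Demonstrated on a scratch file; in a real deployment set
# CONFIG=/data/models/DeepSeek-R1-Distill-Llama-70B/config.json
CONFIG=/tmp/ds70b-demo-config.json
printf '{\n  "torch_dtype": "bfloat16"\n}\n' > "$CONFIG"
# in-place replace, whatever the original dtype was
sed -i 's/"torch_dtype" *: *"[a-z0-9]*"/"torch_dtype": "float16"/' "$CONFIG"
grep torch_dtype "$CONFIG"
```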

Running the container

If you are using a root-user image (for example, one obtained from Ascend Hub) and privileged containers are acceptable, start the container with:

bash
docker run -it -d --net=host --shm-size=1g \
    --privileged \
    --name ds-70b \
    --device=/dev/davinci_manager \
    --device=/dev/hisi_hdc \
    --device=/dev/devmm_svm \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/sbin:/usr/local/sbin \
    -v /data/models/DeepSeek-R1-Distill-Llama-70B:/data/models/DeepSeek-R1-Distill-Llama-70B \
    swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:1.0.0-300I-Duo-py311-openeuler24.03-lts bash

If you are using a self-built non-root image and want to avoid the permission risks of privileged containers, use the following command, which passes through each device explicitly instead of using --privileged:

bash
docker run -it -d --net=host --shm-size=1g \
    --name ds-70b \
    --device=/dev/davinci_manager \
    --device=/dev/hisi_hdc \
    --device=/dev/devmm_svm \
    --device=/dev/davinci0 \
    --device=/dev/davinci1 \
    --device=/dev/davinci2 \
    --device=/dev/davinci3 \
    --device=/dev/davinci4 \
    --device=/dev/davinci5 \
    --device=/dev/davinci6 \
    --device=/dev/davinci7 \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/sbin:/usr/local/sbin \
    -v /data/models/DeepSeek-R1-Distill-Llama-70B:/data/models/DeepSeek-R1-Distill-Llama-70B \
    swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:1.0.0-300I-Duo-py311-openeuler24.03-lts bash
  • Set permissions on the weight directory
bash
chmod -R 750 /data/models/DeepSeek-R1-Distill-Llama-70B
  • Enter the container
bash
docker exec -it ds-70b bash
  • Set environment variables
bash
# The directory for LD_LIBRARY_PATH can be located with: find / -name libdrvdsmi_host.so
export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64:$LD_LIBRARY_PATH
# To make the following environment setup persistent, add these lines to ~/.bashrc
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.sh 
source /usr/local/Ascend/atb-models/set_env.sh
source /usr/local/Ascend/mindie/set_env.sh
  • Edit the configuration
bash
# The container already defines MIES_INSTALL_PATH, usually /usr/local/Ascend/mindie/latest/mindie-service
cd $MIES_INSTALL_PATH
vi ./conf/config.json
json
{
  ...
  "ServerConfig" : {
    ...
    "port" : 1025,            # customizable
    "managementPort" : 1026,  # customizable
    "metricsPort" : 1027,     # customizable
    ...
    "httpsEnabled" : false,
    ...
  },

  "BackendConfig" : {
    ...
    "npuDeviceIds" : [[0,1,2,3,4,5,6,7]],
    ...
    "ModelDeployConfig" : {
      "ModelConfig" : [
        {
          ...
          "modelName" : "llama",
          "modelWeightPath" : "/data/models/DeepSeek-R1-Distill-Llama-70B",
          "worldSize" : 8,
          ...
        }
      ]
    },
    ...
  }
}

The number of entries in npuDeviceIds must match worldSize.
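This consistency rule can be checked before starting the service. A rough sketch using only grep, shown here against a scratch file with the same fields; in the container, point CONFIG at $MIES_INSTALL_PATH/conf/config.json:

```shell
# Sketch: verify that the npuDeviceIds group has exactly worldSize devices.
# Demonstrated on a scratch file mirroring the fields edited above.
CONFIG=/tmp/mindie-demo-config.json
printf '%s\n' '{"npuDeviceIds": [[0,1,2,3,4,5,6,7]], "worldSize": 8}' > "$CONFIG"
devices=$(grep -o '"npuDeviceIds" *: *\[\[[0-9, ]*\]\]' "$CONFIG" | grep -o '[0-9][0-9]*' | wc -l)
world=$(grep -o '"worldSize" *: *[0-9][0-9]*' "$CONFIG" | grep -o '[0-9][0-9]*$')
if [ "$devices" -eq "$world" ]; then
    echo "config OK: $devices devices, worldSize=$world"
else
    echo "mismatch: $devices devices vs worldSize=$world" >&2
fi
```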

  • Start the service
bash
cd $MIES_INSTALL_PATH
nohup ./bin/mindieservice_daemon > output.log 2>&1 &
  • Send a request via the OpenAI-compatible API
bash
curl --location 'http://127.0.0.1:1025/v1/chat/completions' -w "Request took %{time_total}s\n" \
--header 'Content-Type: application/json' \
--data '{
    "model": "llama",
    "messages": [
        {
            "role": "user",
            "content": "Introduce yourself"
        }
    ]
}'
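The same request can be kept in a small script: build the payload once, validate it locally, then POST it. A sketch, assuming the port configured above (1025); note that "model" must match the "modelName" field in config.json:

```shell
# Sketch: reusable request script for the OpenAI-compatible endpoint above.
PORT=1025
PAYLOAD=$(cat <<'EOF'
{
  "model": "llama",
  "messages": [
    {"role": "user", "content": "Introduce yourself"}
  ]
}
EOF
)
# validate the JSON locally before sending
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"
# then POST it (requires the service from the previous step to be running):
# curl -s "http://127.0.0.1:${PORT}/v1/chat/completions" \
#      -H 'Content-Type: application/json' -d "$PAYLOAD"
```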

Q&A

  • npu-smi: error while loading shared libraries: libdrvdsmi_host.so: cannot open shared object file: No such file or directory
bash
# If running the command below inside the container fails with the error above
npu-smi info

# Locate the library on the filesystem
find / -name libdrvdsmi_host.so
# Add the directory it lives in to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64:$LD_LIBRARY_PATH
  • LLMInferEngine failed to init LLMInferModels
bash
# If this error occurs, check whether npu-smi works
npu-smi info
# If it does not, set LD_LIBRARY_PATH as described for the previous issue

References