5.1. multilora

使用方法

python3 -m vllm_utils.multilora_inference --base-model=[path of base model] --device=[device type] --lora-config=[path of lora config file]

各参数含义如下:

  • --base-model:base model存储路径;

  • --device:设备类型,默认为gcu

  • --lora-config:lora配置文件存储路径,配置文件类型需为json。格式如下:

    {
      "lora_models":[
          {"id":[lora model 1 id],"model_path":[lora model 1 path]},
          {"id":[lora model 2 id],"model_path":[lora model 2 path]}
      ],
      "prompts":[
          {"text":[prompt text 1],"lora_id":[id of lora model]},
          {"text":[prompt text 2],"lora_id":[id of lora model]}
    }
    
    • lora_models:设定各lora model的id及存储路径;

    • prompts:设定各输入prompt的文本信息和对应的lora model id,若不设置lora id,则仅使用--base-model进行推理;

示例

python3 -m vllm_utils.multilora_inference --base-model=chatglm3-6b/ --device=gcu --lora-config=chatglm_6b_lora.config
  • chatglm3-6b/为模型的本地存储路径,下载自chatglm3-6b;

  • chatglm_6b_lora.config为lora配置文件,其内容为:

    {
      "lora_models":[
          {"id":1,"model_path":"chatglm3-6b-csc-chinese-lora"}
      ],
      "prompts":[
          {"text":"请介绍下你自己,包括你的主要功能和应用场景。"},
          {"text":"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: 对下面文本纠错\n\n少先队员因该为老人让坐。 ASSISTANT:",
              "lora_id":"1"},
          {"text":"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: 对下面文本纠错\n\n下个星期,我跟我朋唷打算去法国玩儿。 ASSISTANT:",
              "lora_id":"1"}
      ]
    }
    
    • chatglm3-6b-csc-chinese-lora为lora模型的本地存储路径,下载自chatglm3-6b-csc-chinese-lora;

    • chatglm3-6b-csc-chinese-lora在本轮推理时设置其id为1;

    • prompts部分设置的三个prompt,第一个仅使用base model进行推理,其余使用base modellora model共同进行推理;