Mistral Takes Aim at Google: Open-Source 24B Multimodal Model Mistral-Small-3.1

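I spotted this one this morning: Mistral has open-sourced a model too. Open releases keep coming lately, and things are looking up for large models.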

This time Mistral has open-sourced a 24B multimodal model, aimed squarely at Gemma3-27B, which Google released a few days earlier.

HF: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503

blog: https://mistral.ai/news/mistral-small-3-1

Key features of the newly released Small 3.1 model:

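Apache 2.0 license, so it can be used freely, including commercially;
Multilingual: English, French, German, Chinese, and dozens of other languages; note that Chinese is explicitly supported;
Agent-oriented features: native function calling and JSON output;
Strong adherence to and support for system prompts;
Fairly strong reasoning ability;
128k context window, a vocabulary of 131k tokens, and the Tekken tokenizer;
Multimodal input support;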
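Can be fine-tuned for specialized domains such as legal advice and medical diagnostics, where, according to the official blog, it is expected to perform better after specialization.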
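
To make "native function calling and JSON output" a bit more concrete, here is a minimal sketch of the application side: describing a tool with a JSON schema and dispatching a JSON tool call to a local function. The OpenAI-style schema layout, the `get_weather` tool, and the hard-coded `model_output` string are all assumptions for illustration; this does not call the model and is not Mistral's official example.

```python
import json

# Hypothetical tool, for illustration only; a real one would call an API.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names to local Python callables.
TOOLS = {"get_weather": get_weather}

# Tool description in the commonly used OpenAI-style JSON schema layout
# (assumed here; check the model card for the exact format expected).
tool_schema = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

def dispatch(tool_call_json: str) -> str:
    """Parse a JSON tool call and run the matching local function."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

# Stand-in for a tool call the model might emit (hard-coded assumption).
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch(model_output)
print(result)
```

In a real agent loop, `tool_schema` would be sent along with the chat messages, and the string returned by `dispatch` would be appended as a tool message before asking the model to continue.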

Looking at the benchmark results, you can see it going head to head with Gemma3-27B and chasing it down.

[Figure: text benchmarks]

[Figure: multimodal benchmarks]

[Figure: multilingual and long-context benchmarks]

Finally, the most interesting part: the officially released model currently has no transformers version and is meant to be run directly with vLLM.

    from vllm import LLM
    from vllm.sampling_params import SamplingParams

    # Model ID taken from the Hugging Face link above.
    model_name = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"

    SYSTEM_PROMPT = "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."

    user_prompt = "Give me 5 non-formal ways to say 'See you later' in French."

    # Chat history: a system prompt followed by one user turn.
    messages = [
        {
            "role": "system",
            "content": SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": user_prompt
        },
    ]

    # note that running this model on GPU requires over 60 GB of GPU RAM
    llm = LLM(model=model_name, tokenizer_mode="mistral")

    # Low temperature keeps the answer close to deterministic.
    sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
    outputs = llm.chat(messages, sampling_params=sampling_params)

    # Print the text of the first completion for the first (only) prompt.
    print(outputs[0].outputs[0].text)
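
As a rough sanity check on the "over 60 GB of GPU RAM" note in the snippet, here is a back-of-envelope estimate of the memory taken by the weights alone. It assumes exactly 24 billion parameters stored in bf16 (2 bytes each); both numbers are simplifying assumptions, and the helper `weight_memory_gb` is just an illustrative name.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

# Assumptions: 24B parameters, bf16 storage (2 bytes per parameter).
NUM_PARAMS = 24e9
BYTES_BF16 = 2

weights_gb = weight_memory_gb(NUM_PARAMS, BYTES_BF16)
print(f"weights only: {weights_gb:.0f} GB")

# The quoted 60 GB figure leaves this much for everything else
# (KV cache, activations, runtime overhead).
headroom_gb = 60 - weights_gb
print(f"headroom within 60 GB: {headroom_gb:.0f} GB")
```

Under these assumptions the weights account for about 48 GB, so the remaining budget would go to the KV cache, activations, and framework overhead, which grow with context length and batch size.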

You are welcome to follow the WeChat official account 「NLP工作站」 and join the discussion group, so we can learn and improve together.