Usage: ./server -m ... --chat-template llama2
mistralai/Mistral-7B-Instruct-v0.2
<s>[INST] hello [/INST]response</s>[INST] again [/INST]response</s>
(Currently cannot ...
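
To make the layout of the sample string above easier to follow, here is a minimal Python sketch that rebuilds it from a list of chat messages. It is an illustration only, not the server's actual implementation: the function name `format_llama2_style` and the message dictionary shape are assumptions made for this example, and the rules are inferred solely from the sample string (one `<s>` at the start, each user turn wrapped in `[INST] ... [/INST]`, each assistant turn followed by `</s>`). System messages are not covered because the sample does not show them.

```python
def format_llama2_style(messages):
    """Rebuild the llama2-style prompt layout shown in the sample string.

    Hypothetical helper: the rules below are inferred from the sample only.
    `messages` is a list of {"role": ..., "content": ...} dictionaries.
    """
    parts = ["<s>"]  # the sample shows a single BOS marker at the very start
    for message in messages:
        role = message["role"]
        content = message["content"]
        if role == "user":
            # User turns are wrapped in [INST] ... [/INST] with inner spaces.
            parts.append(f"[INST] {content} [/INST]")
        elif role == "assistant":
            # Assistant turns follow immediately and end with </s>.
            parts.append(f"{content}</s>")
        else:
            # Other roles are not shown in the sample, so they are rejected.
            raise ValueError(f"unsupported role in this sketch: {role!r}")
    return "".join(parts)


if __name__ == "__main__":
    chat = [
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "response"},
        {"role": "user", "content": "again"},
        {"role": "assistant", "content": "response"},
    ]
    print(format_llama2_style(chat))
    # prints: <s>[INST] hello [/INST]response</s>[INST] again [/INST]response</s>
```

Comparing the printed string with the sample is a quick way to confirm which separators and spaces the template inserts between turns.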
LMM-R1 is a fork of OpenRLHF that aims to provide high-performance reinforcement learning infrastructure for large multimodal models (LMMs), so that DeepSeek-R1 can be reproduced on multimodal tasks. We currently support ...