Skip to content

Serve the AI Singapore SEA-LION model ⚛ with vLLM

License

Benachrichtigungen You must be signed in to change notification settings

aisingapore/sealion-vllm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI Singapore SEA-LION model served by vLLM inference server with Docker Compose

Requirements

Quick Start

  • Download LLaMA3 8B CPT SEA-LIONv2.1 Instruct
  • Copy the model or add a symbolic link in the models directory. The path is ./models/llama3-8b-cpt-sea-lionv2.1-instruct. For example, if the model was downloaded to ~/models/llama3-8b-cpt-sea-lionv2.1-instruct, the symbolic link is added by:
    ln -s ~/models/llama3-8b-cpt-sea-lionv2.1-instruct models/
  • Start the service.
    docker compose up
  • vLLM is deployed as a server that implements the OpenAI API protocol. By default, it starts the server at http://localhost:8000. This server can be queried in the same format as OpenAI API. For example, list the models:
    curl http://localhost:8000/v1/models
  • Test the service.
    curl http://localhost:8000/v1/completions \
      -H "Content-Type: application/json" \
      -d '{
          "model": "llama3-8b-cpt-sea-lionv2.1-instruct",
          "prompt": "Artificial Intelligence is",
          "max_tokens": 20,
          "temperature": 0.8,
          "repetition_penalty": 1.2
      }'

Customisation

  • To use another model:
    • Download the model to the models directory.
    • Update the $MODEL_NAME environment variable. For example, if the model is downloaded to ./models/foo-model-30b:
      export $MODEL_NAME=foo-model-30b

Über uns

Serve the AI Singapore SEA-LION model ⚛ with vLLM

Topics

Ressourcen

License

Stars

Watchers

Forks

Languages