
LLM-jp DPO (Direct Preference Optimization)

This repository contains the code for DPO of LLM-jp models.
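As background, DPO trains the policy to rank the preferred ("chosen") response above the rejected one relative to a frozen reference model. A minimal sketch of the per-pair loss in plain Python (the function and variable names are illustrative, not taken from this repository's train.py):

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # Log-ratios of policy vs. reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # The margin grows when the policy favors the chosen response
    # more strongly than the reference model does.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Equal log-ratios give -log(0.5) ≈ 0.693; a positive margin lowers the loss.
print(round(dpo_loss(-10.0, -12.0, -11.0, -11.0), 4))  # → 0.5981
```

The actual training loop batches these log-probabilities over token sequences, but the scalar form above captures the objective being minimized.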

Requirements

See pyproject.toml for the required packages.

Installation

```shell
poetry install
poetry shell
```

Training

The following command trains a model on 8 GPUs using the DeepSpeed ZeRO-2 configuration:

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch --config_file accelerate_configs/zero2.yaml train.py
```
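To train on fewer GPUs, `accelerate launch` accepts a `--num_processes` flag that overrides the process count. A sketch for a 4-GPU run, assuming the process count is not fixed elsewhere in zero2.yaml (check that file before relying on this):

```shell
# Hypothetical 4-GPU variant; verify against the settings in accelerate_configs/zero2.yaml.
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --num_processes 4 --config_file accelerate_configs/zero2.yaml train.py
```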