This repository contains the official code and data for the experiments carried out in our paper "From Showgirls to Performers: Fine-tuning with Gender-inclusive Language for Bias Reduction in LLMs", published at the GeBNLP Workshop at ACL 2024 in Bangkok.
The steps are as follows:
- Download dataset
- Rewrite dataset with gender-neutral terminology
- Fine-tune LLMs (GPT-2, Phi-1.5, RoBERTa) with rewritten vs. original data
- Evaluate with external metrics (RedditBias, CrowsPairs, HONEST)
Create an `external_libs` directory and clone the following repositories:
Then, install requirements, preferably into a virtual environment:
pip install -r requirements.txt
Run the following code to download the Small Heap Corpus, consisting of 250M tokens. The code will simultaneously use Vanmassenhove et al.'s (2021) NeuTral Rewriter to create a version of the corpus in which he/she pronouns are replaced with singular they.
The original and neutral corpora will be saved as `small_heap_[# tokens](-neutral)` in the `data/` directory.
The code also creates a `logs` directory to record the progress and processing time of the download.
python code/dataset_download.py --no_tokens 250000000 --log_dir logs/
The `--no_tokens` argument can be used to adjust the size of the downloaded dataset.
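To illustrate the neutral-rewriting step, here is a heavily simplified sketch of pronoun neutralization. The actual NeuTral Rewriter is a rule-based system that also handles verb agreement and disambiguation; the mapping and function below are only illustrative and are not part of this repository.

```python
import re

# Simplified illustration of pronoun neutralization. The real NeuTral
# Rewriter also adjusts verb agreement (e.g. "she has" -> "they have").
PRONOUN_MAP = {
    "he": "they", "she": "they",
    "him": "them", "his": "their",
    "her": "their",  # ambiguous: object "her" should map to "them"
    "himself": "themself", "herself": "themself",
}

def neutralize_pronouns(text: str) -> str:
    """Replace gendered third-person pronouns with singular-they forms."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        neutral = PRONOUN_MAP[word.lower()]
        # Preserve sentence-initial capitalization
        return neutral.capitalize() if word[0].isupper() else neutral

    pattern = r"\b(" + "|".join(PRONOUN_MAP) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

print(neutralize_pronouns("She said he would bring his notes himself."))
# -> They said they would bring their notes themself.
```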
This script replaces gender-marking words with gender-neutral alternatives, based on the catalogue developed in this repository: github.com/marionbartl/affixed_words
The script processes the original and neutral versions of the corpus simultaneously. After replacement, '-R' is appended to the corpus directory name.
python code/word_replacement.py --corpus data/small_heap_50M
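The replacement idea can be sketched as a simple catalogue lookup. The word pairs below are illustrative examples only; the actual catalogue comes from the affixed_words repository linked above, and the real script additionally handles both corpus versions.

```python
import re

# Illustrative subset of gender-neutral replacements; the full catalogue
# lives in the affixed_words repository.
REPLACEMENTS = {
    "showgirl": "performer",
    "chairman": "chairperson",
    "chairwoman": "chairperson",
    "policeman": "police officer",
    "stewardess": "flight attendant",
}

def replace_gendered_words(text: str) -> str:
    """Replace gender-marked nouns with gender-neutral alternatives."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        neutral = REPLACEMENTS[word.lower()]
        # Preserve sentence-initial capitalization
        return neutral.capitalize() if word[0].isupper() else neutral

    pattern = r"\b(" + "|".join(REPLACEMENTS) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

print(replace_gendered_words("The showgirl spoke to the chairman."))
# -> The performer spoke to the chairperson.
```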
python code/fine_tune.py --model_name [huggingface model identifier] --data data/fine-tuning/tiny_heap-neutral.txt
For our experiments, we ran `fine_tuning.ipynb` on Google Colab.
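A standard preprocessing step in causal-LM fine-tuning is to concatenate the tokenized corpus and cut it into fixed-length blocks. The helper below is a generic sketch of that step, not code from `fine_tune.py`; block size and names are placeholders.

```python
def chunk_tokens(token_ids: list[int], block_size: int) -> list[list[int]]:
    """Split a token stream into fixed-length blocks, dropping the
    remainder -- the usual way training examples are formed for
    causal-LM fine-tuning (e.g. with a Hugging Face Trainer)."""
    n_blocks = len(token_ids) // block_size
    return [token_ids[i * block_size:(i + 1) * block_size]
            for i in range(n_blocks)]

print(chunk_tokens(list(range(10)), 4))
# -> [[0, 1, 2, 3], [4, 5, 6, 7]]
```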
Fine-tuned models are not included, as they were too large for this repository.
We used Meade et al.'s (2022) implementation of CrowsPairs.
mkdir external_libs
cd external_libs
git clone https://github.com/McGill-NLP/bias-bench.git
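The heavy lifting (pseudo-log-likelihood scoring of each sentence pair) happens inside bias-bench; the final CrowS-Pairs metric is just the fraction of pairs where the model prefers the stereotypical sentence, with 0.5 as the unbiased ideal. A minimal sketch of that aggregation, with made-up scores:

```python
def crows_pairs_bias_score(pair_scores):
    """Fraction of sentence pairs where the model assigns a higher
    pseudo-log-likelihood to the stereotypical sentence than to its
    anti-stereotypical counterpart; an unbiased model scores 0.5."""
    preferred = sum(1 for stereo, anti in pair_scores if stereo > anti)
    return preferred / len(pair_scores)

# (stereo_pll, anti_stereo_pll) pairs -- made-up numbers for illustration
scores = [(-42.1, -45.3), (-30.2, -29.8), (-51.0, -52.4), (-38.7, -38.9)]
print(crows_pairs_bias_score(scores))  # -> 0.75
```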
We used Barikeri et al.'s (2021) implementation of RedditBias.
cd external_libs
git clone https://github.com/umanlp/RedditBias.git
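RedditBias-style evaluation compares language-model perplexities on biased versus counterfactual sentence sets with a significance test. As a sketch of the statistic involved, here is a self-contained Welch's t-statistic; the perplexity values are made up, and the actual test lives in the cloned RedditBias code.

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's t-statistic for comparing two samples with possibly
    unequal variances (e.g. perplexities on biased vs. counterfactual
    sentence sets)."""
    def mean_var(xs):
        m = sum(xs) / len(xs)
        v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance
        return m, v

    m_a, v_a = mean_var(sample_a)
    m_b, v_b = mean_var(sample_b)
    return (m_a - m_b) / math.sqrt(v_a / len(sample_a) + v_b / len(sample_b))

# Made-up perplexity values for illustration
ppl_biased = [55.0, 60.0, 58.0, 62.0]
ppl_counter = [50.0, 52.0, 49.0, 51.0]
print(round(welch_t(ppl_biased, ppl_counter), 2))  # -> 5.07
```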
For the HONEST evaluation, we used the python package from MilaNLP.
The code can be found at `code/HONEST_eval.ipynb`.
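At its core, HONEST measures the proportion of a model's template completions that contain hurtful words from a lexicon (HurtLex). The sketch below only illustrates that idea: the mini lexicon and completions are invented, and the real evaluation uses the MilaNLP `honest` package as shown in the notebook.

```python
# Made-up mini lexicon for illustration; the actual HONEST evaluation
# draws its hurtful-word lists from HurtLex via the honest package.
HURTFUL_LEXICON = {"nasty", "stupid", "ugly"}

def honest_score(completions):
    """Proportion of template completions containing a hurtful word --
    the core quantity behind the HONEST metric (lower is better)."""
    hurtful = sum(
        1 for completion in completions
        if any(word in HURTFUL_LEXICON for word in completion.lower().split())
    )
    return hurtful / len(completions)

completions = ["a doctor", "so stupid", "a teacher", "very kind"]
print(honest_score(completions))  # -> 0.25
```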