This guide explains how to use GPT as a computational annotator within PhiTag.
This project makes use of the DWUG EN dataset. For word usages, we only provide the dataId of the original usages. To obtain the full data, please refer to the original dataset and match the usage identifiers with our dataIds.
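The matching step can be sketched as a simple lookup by identifier. This is a minimal sketch, not part of PhiTag: the function name `match_usages` is hypothetical, and it assumes the original usages are available as tab-separated text whose ID column is called `identifier` (adjust `id_column` if your copy of the dataset names it differently).

```python
import csv
import io


def match_usages(original_tsv, data_ids, id_column="identifier"):
    """Look up each dataId in the original usages.

    original_tsv: tab-separated text with a header row (assumed layout).
    data_ids:     the dataIds provided with this project.
    Returns (matched, missing): a dict of dataId -> row, and a list of
    dataIds that were not found in the original data.
    """
    reader = csv.DictReader(
        io.StringIO(original_tsv), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    # Index the original usages by their identifier for constant-time lookup.
    by_id = {row[id_column]: row for row in reader}
    matched = {d: by_id[d] for d in data_ids if d in by_id}
    missing = [d for d in data_ids if d not in by_id]
    return matched, missing
```

Checking the `missing` list before uploading is a cheap way to catch identifier mismatches early.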
Initiating GPT as an annotator begins with data preparation.
Please provide uses.tsv files in the general format outlined in the Supported Tasks guide.
The instance file is a tab-separated file with the following columns:
instanceID: A unique ID for the instance.
dataIDs: A pair of dataIDs, corresponding to the dataID column in the uses.tsv file, for which the lemma is the same.
label_set: A scale, e.g. (1,2,3,4).
non_label: A non-label (-).
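The columns above can be produced with a short script. The following is a minimal sketch under stated assumptions: the helper names `build_instances` and `to_tsv` are hypothetical, the instance IDs (`i1`, `i2`, ...) are arbitrary, and the comma used to join the dataID pair and the label scale is an assumption. Check the Supported Tasks guide for the exact serialization PhiTag expects.

```python
import csv
import io
from itertools import combinations

COLUMNS = ["instanceID", "dataIDs", "label_set", "non_label"]


def build_instances(uses, label_set=("1", "2", "3", "4"), non_label="-"):
    """Pair up usages that share a lemma.

    uses: iterable of (dataID, lemma) tuples taken from uses.tsv.
    Returns one row (dict) per unordered pair of usages of the same lemma.
    """
    by_lemma = {}
    for data_id, lemma in uses:
        by_lemma.setdefault(lemma, []).append(data_id)

    rows = []
    for lemma in sorted(by_lemma):
        for first, second in combinations(by_lemma[lemma], 2):
            rows.append({
                "instanceID": f"i{len(rows) + 1}",   # arbitrary unique ID
                "dataIDs": f"{first},{second}",      # separator is an assumption
                "label_set": ",".join(label_set),    # separator is an assumption
                "non_label": non_label,
            })
    return rows


def to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Generating every pair grows quadratically with the number of usages per lemma, so for larger datasets you may prefer to sample pairs instead.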
Follow the guide "Explained: Project" to create a project in PhiTag.
Follow the guide "Explained: Phase" to create a phase in PhiTag.
Add GPT as a computational annotator
Here is a visual representation of how to use GPT as an annotator. You can either supply your own custom prompt or use the automatic prompting feature, which builds the prompt for you from the guideline alone or from the guideline plus the tutorial.
Annotate Phase using GPT
To evaluate the model's accuracy with a tutorial phase, first create the tutorial phase by following the guide "Explained: Create Tutorial Phase".
Test model with tutorial
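If you want to double-check the reported accuracy outside PhiTag, the idea can be sketched as the share of tutorial instances on which the model's judgment equals the gold judgment. This is an illustrative sketch only: the function name `tutorial_accuracy` and the dict-based input format are assumptions, and PhiTag's own evaluation may be computed differently.

```python
def tutorial_accuracy(gold, predicted):
    """Fraction of shared instances where the model label equals the gold label.

    gold, predicted: dicts mapping instanceID -> label (as strings).
    Instances missing from either dict are ignored.
    """
    shared = [instance_id for instance_id in gold if instance_id in predicted]
    if not shared:
        return 0.0  # nothing to compare
    correct = sum(1 for i in shared if gold[i] == predicted[i])
    return correct / len(shared)
```

For example, if the model matches the gold label on three of four tutorial instances, the function returns 0.75.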