Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation using GPT-4

S Yadav, T Choppa, D Schlechtweg - arXiv preprint arXiv:2407.04130, 2024 - arxiv.org
arXiv preprint arXiv:2407.04130, 2024arxiv.org
This paper explores using GPT-3.5 and GPT-4 to automate the data annotation process with
automatic prompting techniques. The main aim of this paper is to reuse human annotation
guidelines along with some annotated data to design automatic prompts for LLMs, focusing
on the semantic proximity annotation task. Automatic prompts are compared to customized
prompts. We further implement the prompting strategies into an open-source text annotation
tool, enabling easy online use via the OpenAI API. Our study reveals the crucial role of …
This paper explores using GPT-3.5 and GPT-4 to automate the data annotation process with automatic prompting techniques. The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. Automatic prompts are compared to customized prompts. We further implement the prompting strategies into an open-source text annotation tool, enabling easy online use via the OpenAI API. Our study reveals the crucial role of accurate prompt design and suggests that prompting GPT-4 with human-like instructions is not straightforwardly possible for the semantic proximity task. We show that small modifications to the human guidelines already improve the performance, suggesting possible ways for future research.
arxiv.org