AI-generated estimates of familiarity, concreteness, valence, and arousal for over 100,000 Spanish words

Q J Exp Psychol (Hove). 2024 Dec 24:17470218241306694. doi: 10.1177/17470218241306694. Online ahead of print.

Abstract

This study investigates whether artificial intelligence (AI) estimates of familiarity, valence, arousal, and concreteness are useful alternatives to word counts and human ratings in Spanish. We replicate and extend previous findings in English and show that GPT-4o is effective in estimating these word features. Validity checks even suggest that AI-generated estimates sometimes outperform traditional measurements. The ability to generate AI estimates for large numbers of words at low cost simplifies the process of obtaining word features and provides a new resource for researchers working in Spanish. We provide Excel lists of the collected word features, which can be freely used for research and teaching.

Keywords: GPT-4; Spanish; Word norms; arousal; concreteness; large language model; multiword expressions; valence.