-
Quantitative Information Extraction from Humanitarian Documents
Authors:
Daniele Liberatore,
Kyriaki Kalimeri,
Derya Sever,
Yelena Mejova
Abstract:
Humanitarian action is accompanied by a mass of reports, summaries, news, and other documents. To guide its activities, important information must be quickly extracted from such free-text resources. Quantities, such as the number of people affected, amount of aid distributed, or the extent of infrastructure damage, are central to emergency response and anticipatory action. In this work, we contribute an annotated dataset for the humanitarian domain for the extraction of such quantitative information, alongside its important context, including units it refers to, any modifiers, and the relevant event. Further, we develop a custom Natural Language Processing pipeline to extract the quantities alongside their units, and evaluate it in comparison to baseline and recent literature. The proposed model achieves a consistent improvement in performance, especially in the documents pertaining to the Dominican Republic and select African countries. We make the dataset and code available to the research community to continue the improvement of NLP tools for the humanitarian domain.
Submitted 9 August, 2024;
originally announced August 2024.
-
Automatic Detection of Moral Values in Music Lyrics
Authors:
Vjosa Preniqi,
Iacopo Ghinassi,
Julia Ive,
Kyriaki Kalimeri,
Charalampos Saitis
Abstract:
Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. The possibility to extract morality rapidly from lyrics enables a deeper understanding of our music-listening behaviours. Building on the Moral Foundations Theory (MFT), we tasked a set of transformer-based language models (BERT) fine-tuned on 2,721 synthetic lyrics generated by a large language model (GPT-4) to detect moral values in 200 real music lyrics annotated by two experts. We evaluate their predictive capabilities against a series of baselines including out-of-domain (BERT fine-tuned on MFT-annotated social media texts) and zero-shot (GPT-4) classification. The proposed models yielded the best accuracy across experiments, with an average F1 weighted score of 0.8. This performance is, on average, 5% higher than out-of-domain and zero-shot models. When examining precision in binary classification, the proposed models perform on average 12% higher than the baselines. Our approach contributes to annotation-free and effective lyrics morality learning, and provides useful insights into the knowledge distillation of LLMs regarding moral expression in music, and the potential impact of these technologies on the creative industries and musical culture.
Submitted 26 July, 2024;
originally announced July 2024.
-
A Novel Lexicon for the Moral Foundation of Liberty
Authors:
Oscar Araque,
Lorenzo Gatti,
Sergio Consoli,
Kyriaki Kalimeri
Abstract:
The moral value of liberty is a central concept in our inference system when it comes to taking a stance towards controversial social issues such as vaccine hesitancy, climate change, or the right to abortion. Here, we propose a novel Liberty lexicon evaluated on more than 3,000 manually annotated data in both in- and out-of-domain scenarios. As a result of this evaluation, we produce a combined lexicon that constitutes the main outcome of this work. This final lexicon incorporates information from an ensemble of lexicons that have been generated using word embedding similarity (WE) and compositional semantics (CS). Our key contributions include enriching the liberty annotations, developing a robust liberty lexicon for broader application, and revealing the complexity of expressions related to liberty across different platforms. Through the evaluation, we show that the difficulty of the task calls for designing approaches that combine knowledge, in an effort to improve the representations of learning systems.
Submitted 16 July, 2024;
originally announced July 2024.
-
Resilience of mobility network to dynamic population response across COVID-19 interventions: evidences from Chile
Authors:
Pasquale Casaburi,
Lorenzo Dall'Amico,
Nicolò Gozzi,
Kyriaki Kalimeri,
Anna Sapienza,
Rossano Schifanella,
T. Di Matteo,
Leo Ferres,
Mattia Mazzoli
Abstract:
The COVID-19 pandemic highlighted the importance of non-traditional data sources, such as mobile phone data, to inform effective public health interventions and monitor adherence to such measures. Previous studies showed how socioeconomic characteristics shaped population response during restrictions and how repeated interventions eroded adherence over time. Less is known about how different population strata changed their response to repeated interventions and how this impacted the resulting mobility network. We study population response during the first and second infection waves of the COVID-19 pandemic in Chile and Spain. Via spatial lag and regression models, we investigate the adherence to mobility interventions at the municipality level in Chile, highlighting the significant role of wealth, labor structure, COVID-19 incidence, and network metrics characterizing business-as-usual municipality connectivity in shaping mobility changes during the two waves. We assess network structural similarities in the two periods by defining mobility hotspots and traveling probabilities in the two countries. As a proof of concept, we simulate and compare outcomes of an epidemic diffusion occurring in the two waves. Our analysis reveals the resilience of the mobility network across waves. We test the robustness of our findings recovering similar results for Spain. Finally, epidemic modeling suggests that historical mobility data from past waves can be leveraged to inform future disease spatial invasion models in repeated interventions. This study highlights the value of historical mobile phone data for building pandemic preparedness and lessens the need for real-time data streams for risk assessment and outbreak response. Our work provides valuable insights into the complex interplay of factors driving mobility across repeated interventions, aiding in developing targeted mitigation strategies.
Submitted 29 May, 2024;
originally announced May 2024.
-
MoralBERT: A Fine-Tuned Language Model for Capturing Moral Values in Social Discussions
Authors:
Vjosa Preniqi,
Iacopo Ghinassi,
Julia Ive,
Charalampos Saitis,
Kyriaki Kalimeri
Abstract:
Moral values play a fundamental role in how we evaluate information, make decisions, and form judgements around important social issues. Controversial topics, including vaccination, abortion, racism, and sexual orientation, often elicit opinions and attitudes that are not solely based on evidence but rather reflect moral worldviews. Recent advances in Natural Language Processing (NLP) show that moral values can be gauged in human-generated textual content. Building on the Moral Foundations Theory (MFT), this paper introduces MoralBERT, a range of language representation models fine-tuned to capture moral sentiment in social discourse. We describe a framework for both aggregated and domain-adversarial training on multiple heterogeneous MFT human-annotated datasets sourced from Twitter (now X), Reddit, and Facebook that broaden textual content diversity in terms of social media audience interests, content presentation and style, and spreading patterns. We show that the proposed framework achieves an average F1 score that is between 11% and 32% higher than lexicon-based approaches, Word2Vec embeddings, and zero-shot classification with large language models such as GPT-4 for in-domain inference. Domain-adversarial training yields better out-of-domain predictions than aggregate training while achieving comparable performance to zero-shot learning. Our approach contributes to annotation-free and effective morality learning, and provides useful insights towards a more comprehensive understanding of moral narratives in controversial social debates using NLP.
Submitted 19 July, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Beyond the Headlines: Understanding Sentiments and Morals Impacting Female Employment in Spain
Authors:
Oscar Araque,
Luca Barbaglia,
Francesco Berlingieri,
Marco Colagrossi,
Sergio Consoli,
Lorenzo Gatti,
Caterina Mauri,
Kyriaki Kalimeri
Abstract:
After decades of improvements in the employment conditions of females in Spain, this process came to a sudden stop with the Great Spanish Recession of 2008. In this contribution, we analyse a large longitudinal corpus of national and regional news outlets employing advanced Natural Language Processing techniques to capture the valence of mentions of gender inequality expressed in the Spanish press. The automatic analysis of the news articles does indeed capture the known hardships faced by females in the Spanish labour market. Our approach can be straightforwardly generalised to other topics of interest. Assessing the sentiment and moral values expressed in the articles, we notice that females are, in the majority of cases, concerned more than males when there is a deterioration in the overall labour market conditions, based on newspaper articles. This behaviour has been present in the entire period of study (2000–2022) and looked particularly pronounced during the economic crisis of 2008 and the recent COVID-19 pandemic. Most of the time, this phenomenon looks to be more pronounced at the regional level, perhaps caused by a significant focus on local labour markets rather than on aggregate statistics or because, in local contexts, females might suffer more from an isolation or discrimination condition. Our findings contribute to a deeper understanding of the gender inequalities in Spain using alternative data, informing policymakers and stakeholders.
Submitted 11 February, 2024;
originally announced February 2024.
-
Political Context of the European Vaccine Debate on Twitter
Authors:
Giordano Paoletti,
Lorenzo Dall'Amico,
Kyriaki Kalimeri,
Jacopo Lenti,
Yelena Mejova,
Daniela Paolotti,
Michele Starnini,
Michele Tizzani
Abstract:
At the beginning of the COVID-19 pandemic, fears grew that making vaccination a political (instead of public health) issue may impact the efficacy of this life-saving intervention, spurring the spread of vaccine-hesitant content. In this study, we examine whether there is a relationship between the political interest of social media users and their exposure to vaccine-hesitant content on Twitter. We focus on 17 European countries using a multilingual, longitudinal dataset of tweets spanning the period before COVID, up to the vaccine roll-out. We find that, in most countries, users' endorsement of vaccine-hesitant content is the highest in the early months of the pandemic, around the time of greatest scientific uncertainty. Further, users who follow politicians from right-wing parties, and those associated with authoritarian or anti-EU stances are more likely to endorse vaccine-hesitant content, whereas those following left-wing politicians, more pro-EU or liberal parties, are less likely. Somewhat surprisingly, politicians did not play an outsized role in the vaccine debates of their countries, receiving a similar number of retweets as other similarly popular users. This systematic, multi-country, longitudinal investigation of the connection of politics with vaccine hesitancy has important implications for public health policy and communication.
Submitted 1 March, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Leave no Place Behind: Improved Geolocation in Humanitarian Documents
Authors:
Enrico M. Belliardo,
Kyriaki Kalimeri,
Yelena Mejova
Abstract:
Geographical location is a crucial element of humanitarian response, outlining vulnerable populations, ongoing events, and available resources. The latest developments in Natural Language Processing may help in extracting vital information from the deluge of reports and documents produced by the humanitarian sector. However, the performance and biases of existing state-of-the-art information extraction tools are unknown. In this work, we develop annotated resources to fine-tune the popular Named Entity Recognition (NER) tools spaCy and RoBERTa to perform geotagging of humanitarian texts. We then propose a geocoding method, FeatureRank, which links the candidate locations to the GeoNames database. We find that not only does the humanitarian-domain data improve the performance of the classifiers (up to F1 = 0.92), but it also alleviates some of the bias of the existing tools, which erroneously favor locations in Western countries. Thus, we conclude that more resources from non-Western documents are necessary to ensure that off-the-shelf NER systems are suitable for deployment in the humanitarian sector.
Submitted 6 September, 2023;
originally announced September 2023.
-
From Ukraine to the World: Using LinkedIn Data to Monitor Professional Migration from Ukraine
Authors:
Margherita Bertè,
Daniela Paolotti,
Kyriaki Kalimeri
Abstract:
Highly skilled professionals' forced migration from Ukraine was triggered by the conflict in Ukraine in 2014 and amplified by the Russian invasion in 2022. Here, we utilize LinkedIn estimates and official refugee data from the World Bank and the United Nations Refugee Agency, to understand which are the main pull factors that drive the decision-making process of the host country. We identify an ongoing and escalating exodus of educated individuals, largely drawn to Poland and Germany, and underscore the crucial role of pre-existing networks in shaping these migration flows. Key findings include a strong correlation between LinkedIn's estimates of highly educated Ukrainian displaced people and official UN refugee statistics, pointing to the significance of prior relationships with Ukraine in determining migration destinations. We train a series of multilinear regression models and apply the SHAP method, revealing that the existence of a support network is the most critical factor in choosing a destination country, while distance is less important. Our main findings show that the migration patterns of Ukraine's highly skilled workforce, and their impact on both the origin and host countries, are largely influenced by pre-existing networks and communities. This insight can inform strategies to tackle the economic challenges posed by this loss of talent and maximize the benefits of such migration for both Ukraine and the receiving nations.
Submitted 19 July, 2023;
originally announced July 2023.
-
Authority without Care: Moral Values behind the Mask Mandate Response
Authors:
Yelena Mejova,
Kyriaki Kalimeri,
Gianmarco De Francisci Morales
Abstract:
Face masks are one of the cheapest and most effective non-pharmaceutical interventions available against airborne diseases such as COVID-19. Unfortunately, they have been met with resistance by a substantial fraction of the populace, especially in the U.S. In this study, we uncover the latent moral values that underpin the response to the mask mandate, and paint them against the country's political backdrop. We monitor the discussion about masks on Twitter, which involves almost 600k users in a time span of 7 months. By using a combination of graph mining, natural language processing, topic modeling, content analysis, and time series analysis, we characterize the responses to the mask mandate of both those in favor and against them. We base our analysis on the theoretical frameworks of Moral Foundation Theory and Hofstede's cultural dimensions. Our results show that, while the anti-mask stance is associated with a conservative political leaning, the moral values expressed by its adherents diverge from the ones typically used by conservatives. In particular, the expected emphasis on the values of authority and purity is accompanied by an atypical dearth of in-group loyalty. We find that after the mandate, both pro- and anti-mask sides decrease their emphasis on care about others, and increase their attention on authority and fairness, further politicizing the issue. In addition, the mask mandate reverses the expression of Individualism-Collectivism between the two sides, with an increase of individualism in the anti-mask narrative, and a decrease in the pro-mask one. We argue that monitoring the dynamics of moral positioning is crucial for designing effective public health campaigns that are sensitive to the underlying values of the target audience.
Submitted 30 March, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Monitoring Gender Gaps via LinkedIn Advertising Estimates: the case study of Italy
Authors:
Margherita Bertè,
Kyriaki Kalimeri,
Daniela Paolotti
Abstract:
Women remain underrepresented in the labour market. Although significant advancements are being made to increase female participation in the workforce, the gender gap is still far from being bridged. We contribute to the growing literature on gender inequalities in the labour market, evaluating the potential of the LinkedIn estimates to monitor the evolution of the gender gaps sustainably, complementing the official data sources, in particular by assessing labour market patterns at a subnational level in Italy. Our findings show that the LinkedIn estimates accurately capture the gender disparities in Italy regarding sociodemographic attributes such as gender, age, geographic location, seniority, and industry category. At the same time, we assess data biases such as the digitalisation gap, which impacts the representativity of the workforce in an imbalanced manner, confirming that women are under-represented in Southern Italy. In addition to confirming the gender disparities found in the official census, LinkedIn estimates are a valuable tool for providing dynamic insights; we show an immigration flow of highly skilled women, predominantly from the South. Digital surveillance of gender inequalities with detailed and timely data is particularly significant to enable policymakers to tailor impactful campaigns.
Submitted 10 March, 2023;
originally announced March 2023.
-
Global misinformation spillovers in the online vaccination debate before and during COVID-19
Authors:
Jacopo Lenti,
Kyriaki Kalimeri,
André Panisson,
Daniela Paolotti,
Michele Tizzani,
Yelena Mejova,
Michele Starnini
Abstract:
Anti-vaccination views pervade online social media, fueling distrust in scientific expertise and increasing vaccine-hesitant individuals. While previous studies focused on specific countries, the COVID-19 pandemic brought the vaccination discourse worldwide, underpinning the need to tackle low-credible information flows on a global scale to design effective countermeasures. Here, we leverage 316 million vaccine-related Twitter messages in 18 languages, from October 2019 to March 2021, to quantify misinformation flows between users exposed to anti-vaccination (no-vax) content. We find that, during the pandemic, no-vax communities became more central in the country-specific debates and their cross-border connections strengthened, revealing a global Twitter anti-vaccination network. U.S. users are central in this network, while Russian users also become net exporters of misinformation during vaccination roll-out. Interestingly, we find that Twitter's content moderation efforts, and in particular the suspension of users following the January 6th U.S. Capitol attack, had a worldwide impact in reducing misinformation spread about vaccines. These findings may help public health institutions and social media platforms to mitigate the spread of health-related, low-credible information by revealing vulnerable online communities.
Submitted 19 December, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.
-
LibertyMFD: A Lexicon to Assess the Moral Foundation of Liberty
Authors:
Oscar Araque,
Lorenzo Gatti,
Kyriaki Kalimeri
Abstract:
Quantifying the moral narratives expressed in the user-generated text, news, or public discourses is fundamental for understanding individuals' concerns and viewpoints and preventing violent protests and social polarisation. The Moral Foundation Theory (MFT) was developed to operationalise morality in a five-dimensional scale system. Recent developments of the theory urged for the introduction of a new foundation, the Liberty Foundation. Being only recently added to the theory, there are no available linguistic resources to assess whether liberty is present in text corpora. Given its importance to current social issues such as the vaccination debate, we propose two data-driven approaches, deriving two candidate lexicons generated based on aligned documents from online news sources with different worldviews. After extensive experimentation, we contribute to the research community a novel lexicon that assesses the liberty moral foundation in the way individuals with contrasting viewpoints express themselves through written text. The LibertyMFD dictionary can be a valuable tool for policymakers to understand diverse viewpoints on controversial social issues such as vaccination, abortion, or even uprisings, as they happen and on a large scale.
Submitted 14 September, 2022;
originally announced September 2022.
-
"More Than Words": Linking Music Preferences and Moral Values Through Lyrics
Authors:
Vjosa Preniqi,
Kyriaki Kalimeri,
Charalampos Saitis
Abstract:
This study explores the association between music preferences and moral values by applying text analysis techniques to lyrics. Harvesting data from a Facebook-hosted application, we align psychometric scores of 1,386 users to lyrics from the top 5 songs of their preferred music artists as emerged from Facebook Page Likes. We extract a set of lyrical features related to each song's overarching narrative, moral valence, sentiment, and emotion. A machine learning framework was designed to exploit regression approaches and evaluate the predictive power of lyrical features for inferring moral values. Results suggest that lyrics from top songs of artists people like inform their morality. Virtues of hierarchy and tradition achieve higher prediction scores ($.20 \leq r \leq .30$) than values of empathy and equality ($.08 \leq r \leq .11$), while basic demographic variables only account for a small part in the models' explainability. This shows the importance of music listening behaviours, as assessed via lyrical preferences, alone in capturing moral values. We discuss the technological and musicological implications and possible future improvements.
Submitted 2 September, 2022;
originally announced September 2022.
-
Moral Narratives Around the Vaccination Debate on Facebook
Authors:
Mariano Gastón Beiró,
Jacopo D'Ignazi,
Maria Florencia Prado,
Victoria Perez Bustos,
Kyriaki Kalimeri
Abstract:
Vaccine hesitancy is a complex issue with psychological, cultural, and even societal factors entangled in the decision-making process. The narrative around this process is captured in our everyday interactions; social media data offer a direct and spontaneous view of people's argumentation. Here, we analysed more than 500,000 public posts and comments from Facebook Pages dedicated to the topic of vaccination to study the role of moral values and, in particular, the understudied role of the Liberty moral foundation from the actual user-generated text. We operationalise morality by employing the Moral Foundations Theory, while our proposed framework is based on recurrent neural network classifiers with a short memory and entity linking information. Our findings show that the principal moral narratives around the vaccination debate focus on the values of Liberty, Care, and Authority. Vaccine advocates urge compliance with the authorities as prosocial behaviour to protect society. On the other hand, vaccine sceptics mainly build their narrative around the value of Liberty, advocating for the right to choose freely whether to adhere or not to the vaccination. We contribute to the automatic understanding of vaccine hesitancy drivers emerging from user-generated text, providing concrete insights into the moral framing around vaccination decision-making. Especially in emergencies such as the COVID-19 pandemic, contrary to traditional surveys, these insights can be provided contemporaneously with the event, helping policymakers craft communication campaigns that adequately address the concerns of the hesitant population.
Submitted 15 March, 2023; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Modelling Moral Traits with Music Listening Preferences and Demographics
Authors:
Vjosa Preniqi,
Kyriaki Kalimeri,
Charalampos Saitis
Abstract:
Music is an essential component in our everyday lives and experiences, as it is a way that we use to express our feelings, emotions and cultures. In this study, we explore the association between music genre preferences, demographics and moral values by exploring self-reported data from an online survey administered in Canada. Participants filled in the moral foundations questionnaire and also provided their basic demographic information and music preferences. Here, we predict the moral values of the participants from their musical preferences, employing classification and regression techniques. We also explored the predictive power of features estimated from factor analysis on the music genres, as well as the generalist/specialist (GS) score for revealing the diversity of musical choices for each user. Our results show the importance of music in predicting a person's moral values (.55-.69 AUROC), while knowledge of basic demographic features such as age and gender is enough to increase the performance (.58-.71 AUROC).
Submitted 1 July, 2021;
originally announced July 2021.
-
Young Adult Unemployment Through the Lens of Social Media: Italy as a case study
Authors:
Alessandra Urbinati,
Kyriaki Kalimeri,
Andrea Bonanomi,
Alessandro Rosina,
Ciro Cattuto,
Daniela Paolotti
Abstract:
Youth unemployment rates are still at alarming levels in many countries, including Italy. Direct consequences include poverty, social exclusion, and criminal behaviours, while the negative impact on future employability and wages cannot be overlooked. In this study, we employ survey data together with social media data, and in particular likes on Facebook Pages, to analyse personality, moral values, and cultural elements of the young unemployed population in Italy. Our findings show that there are small but significant differences in personality and moral values, with unemployed males being less agreeable and unemployed females more open to new experiences. At the same time, the unemployed have a more collectivist point of view, placing more value on the in-group loyalty, authority, and purity foundations. Interestingly, topic modelling analysis did not reveal major differences in the interests and cultural elements of the unemployed. Utilisation patterns did emerge, though: the employed seem to use Facebook to connect with local activities, while the unemployed use it mostly for entertainment purposes and as a source of news, making them susceptible to mis/disinformation. We believe these findings can help policymakers get a deeper understanding of this population and of initiatives that improve both the hard and the soft skills of this fragile population.
Submitted 14 October, 2020; v1 submitted 9 October, 2020;
originally announced October 2020.
-
Falling into the Echo Chamber: the Italian Vaccination Debate on Twitter
Authors:
Alessandro Cossard,
Gianmarco De Francisci Morales,
Kyriaki Kalimeri,
Yelena Mejova,
Daniela Paolotti,
Michele Starnini
Abstract:
The reappearance of measles in the US and Europe, a disease considered eliminated in the early 2000s, has been accompanied by a growing debate on the merits of vaccination on social media. In this study we examine the extent to which the vaccination debate on Twitter is conducive to potential outreach to the vaccination hesitant. We focus on Italy, one of the countries most affected by the latest measles outbreaks. We discover that the vaccination skeptics, as well as the advocates, reside in their own distinct "echo chambers". The structure of these communities differs as well, with skeptics arranged in a tightly connected cluster, and advocates organizing themselves around a few authoritative hubs. At the center of these echo chambers we find the ardent supporters, for which we build highly accurate network- and content-based classifiers (attaining 95% cross-validated accuracy). Insights of this study provide several avenues for potential future interventions, including network-guided targeting, accounting for the political context, and monitoring of alternative sources of information.
Submitted 26 March, 2020;
originally announced March 2020.
-
Advertisers Jump on Coronavirus Bandwagon: Politics, News, and Business
Authors:
Yelena Mejova,
Kyriaki Kalimeri
Abstract:
In the age of social media, disasters and epidemics not only usher in devastation and affliction in the physical world, but also prompt a deluge of information, opinions, prognoses and advice to billions of internet users. The coronavirus epidemic of 2019-2020, or COVID-19, is no exception, with the World Health Organization warning of a possible "infodemic" of fake news. In this study, we examine the alternative narratives around the coronavirus outbreak through advertisements promoted on Facebook, the largest social media platform in the US. Using the new Facebook Ads Library, we discover advertisers from public health and non-profit sectors, alongside those from news media, politics, and business, incorporating coronavirus into their messaging and agenda. We find the virus used in political attacks, donation solicitations, business promotion, stock market advice, and animal rights campaigning. Among these, we find several instances of possible misinformation, ranging from bioweapons conspiracy theories to unverifiable claims by politicians. As we make the dataset available to the community, we hope the advertising domain will become an important part of quality control for public health communication and public discourse in general.
Submitted 2 March, 2020;
originally announced March 2020.
-
Facebook Ads as a Demographic Tool to Measure the Urban-Rural Divide
Authors:
Daniele Rama,
Yelena Mejova,
Michele Tizzoni,
Kyriaki Kalimeri,
Ingmar Weber
Abstract:
In the global move toward urbanization, making sure the people remaining in rural areas are not left behind in terms of development and policy considerations is a priority for governments worldwide. However, it is increasingly challenging to track important statistics concerning this sparse, geographically dispersed population, resulting in a lack of reliable, up-to-date data. In this study, we examine the usefulness of the Facebook Advertising platform, which offers a digital "census" of over two billion of its users, in measuring potential rural-urban inequalities. We focus on Italy, a country where about 30% of the population lives in rural areas. First, we show that the population statistics that Facebook produces suffer from instability across time and incomplete coverage of sparsely populated municipalities. To overcome this limitation, we propose an alternative methodology for estimating Facebook Ads audiences that nearly triples the coverage of the rural municipalities from 19% to 55% and makes fine-grained sub-population analysis feasible. Using official national census data, we evaluate our approach and confirm known significant urban-rural divides in terms of educational attainment and income. Extending the analysis to Facebook-specific user "interests" and behaviors, we provide further insights on the divide, for instance, finding that rural areas show a higher interest in gambling. Notably, we find that the most predictive features of income in rural areas differ from those for urban centres, suggesting researchers need to consider a broader range of attributes when examining rural wellbeing. The findings of this study illustrate the necessity of improving existing tools and methodologies to include under-represented populations in digital demographic studies -- the failure to do so could result in misleading observations, conclusions, and most importantly, policies.
Submitted 26 February, 2020;
originally announced February 2020.
-
MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction
Authors:
Oscar Araque,
Lorenzo Gatti,
Kyriaki Kalimeri
Abstract:
Moral rhetoric plays a fundamental role in how we perceive and interpret the information we receive, greatly influencing our decision-making process. Especially when it comes to controversial social and political issues, our opinions and attitudes are hardly ever based on evidence alone. The Moral Foundations Dictionary (MFD) was developed to operationalize moral values in text. In this study, we present MoralStrength, a lexicon of approximately 1,000 lemmas, obtained as an extension of the Moral Foundations Dictionary, based on WordNet synsets. Moreover, for each lemma it provides a crowdsourced numeric assessment of Moral Valence, indicating the strength with which a lemma expresses the specific value. We evaluated the predictive potential of this moral lexicon, defining three utilization approaches of increasing complexity, ranging from lemmas' statistical properties to a deep learning approach of word embeddings based on semantic similarity. Logistic regression models trained on the features extracted from MoralStrength significantly outperformed the current state-of-the-art, reaching an F1-score of 87.6% over the previous 62.4% (p-value<0.01), and an average F1-Score of 86.25% over six different datasets. Such findings pave the way for further research, allowing for an in-depth understanding of moral narratives in text for a wide range of social issues.
Submitted 4 September, 2019; v1 submitted 17 April, 2019;
originally announced April 2019.
-
Human Values and Attitudes towards Vaccination in Social Media
Authors:
Kyriaki Kalimeri,
Mariano Beiro,
Alessandra Urbinati,
Andrea Bonanomi,
Alessandro Rosino,
Ciro Cattuto
Abstract:
Psychological, political, cultural, and even societal factors are entangled in the reasoning and decision-making process towards vaccination, rendering vaccine hesitancy a complex issue. Here, administering a series of surveys via a Facebook-hosted application, we study the worldviews of people that "Liked" supportive or vaccine resilient Facebook Pages. In particular, we assess differences in political viewpoints, moral values, personality traits, and general interests, finding that those sceptical about vaccination appear to trust the government less, are less agreeable, and place more emphasis on anti-authoritarian values. Exploring the differences in moral narratives as expressed in the linguistic descriptions of the Facebook Pages, we see that pages that defend vaccines prioritise the value of the family, while the vaccine hesitancy pages focus on the value of freedom. Finally, creating embeddings based on the health-related likes on Facebook Pages, we explore common, latent interests of vaccine-hesitant people, showing a strong preference for natural cures. This exploratory analysis aims at exploring the potential of a social media platform to act as a sensing tool, providing researchers and policymakers with insights drawn from digital traces that can help design communication campaigns that build confidence, based on the values that also appeal to the socio-moral criteria of people.
Submitted 23 May, 2019; v1 submitted 1 April, 2019;
originally announced April 2019.
-
Effect of Values and Technology Use on Exercise: Implications for Personalized Behavior Change Interventions
Authors:
Yelena Mejova,
Kyriaki Kalimeri
Abstract:
Technology has recently been recruited in the war against the ongoing obesity crisis; however, the adoption of Health & Fitness applications for regular exercise is a struggle. In this study, we present a unique demographically representative dataset of 15k US residents that combines technology use logs with surveys on moral views, human values, and emotional contagion. Combining these data, we provide a holistic view of individuals to model their physical exercise behavior. First, we show which values determine the adoption of Health & Fitness mobile applications, finding that users who prioritize the value of purity and de-emphasize values of conformity, hedonism, and security are more likely to use such apps. Further, we achieve a weighted AUROC of .673 in predicting whether an individual exercises, and we also show that the application usage data allows for substantially better classification performance (.608) compared to using basic demographics (.513) or internet browsing data (.546). We also find a strong link between exercise and respondent socioeconomic status, as well as the value of happiness. Using these insights, we propose actionable design guidelines for persuasive technologies targeting health behavior modification.
Submitted 27 March, 2019;
originally announced March 2019.
-
Evaluation of Biases in Self-reported Demographic and Psychometric Information: Traditional versus Facebook-based Surveys
Authors:
Kyriaki Kalimeri,
Mariano G. Beiro,
Andrea Bonanomi,
Alessandro Rosina,
Ciro Cattuto
Abstract:
Social media in scientific research offer a unique digital observatory of human behaviours and hence great opportunities to conduct research at large scale answering complex sociodemographic questions. We focus on the identification and assessment of biases in social media administered surveys. This study aims to shed light on population, self-selection and behavioural biases, empirically comparing the consistency between self-reported information collected traditionally versus social media administered questionnaires, including demographic and psychometric attributes. We engaged a demographically representative cohort of young adults in Italy (approximately 4,000 participants) in taking a traditionally administered online survey and then, after one year, we invited them to use our ad hoc Facebook application (988 accepted) where they filled in part of the initial survey. We assess the statistically significant differences indicating population, self-selection, and behavioural biases due to the different context in which the questionnaire is administered. Our findings suggest that surveys administered on Facebook do not exhibit major biases with respect to traditionally administered surveys, either in terms of demographics or personality traits. Loyalty, authority, and social binding values were higher in the Facebook platform, probably due to the platform's intrinsic social character. We conclude that Facebook apps are valid research tools for administering demographic and psychometric surveys, provided that the entailed biases are taken into consideration. We contribute to the characterisation of Facebook apps as a valid scientific tool to administer demographic and psychometric surveys, and to the assessment of population, self-selection, and behavioural biases in the collected data.
Submitted 23 January, 2019;
originally announced January 2019.
-
Multimodal Classification of Stressful Environments in Visually Impaired Mobility Using EEG and Peripheral Biosignals
Authors:
Charalampos Saitis,
Kyriaki Kalimeri
Abstract:
In this study, we aim to better understand the cognitive-emotional experience of visually impaired people when navigating in unfamiliar urban environments, both outdoor and indoor. We propose a multimodal framework based on random forest classifiers, which predict the actual environment among predefined generic classes of urban settings, inferring on real-time, non-invasive, ambulatory monitoring of brain and peripheral biosignals. Model performance reached 93% for the outdoor and 87% for the indoor environments (expressed in weighted AUROC), demonstrating the potential of the approach. Estimating the density distributions of the most predictive biomarkers, we present a series of geographic and temporal visualizations depicting the environmental contexts in which the most intense affective and cognitive reactions take place. A linear mixed model analysis revealed significant differences between categories of vision impairment, but not between normal and impaired vision. Despite the limited size of our cohort, these findings pave the way to emotionally intelligent mobility-enhancing systems, capable of implicit adaptation not only to changing environments but also to shifts in the affective state of the user in relation to different environmental and situational factors.
Submitted 25 November, 2018;
originally announced November 2018.
-
Wearable proximity sensors for monitoring a mass casualty incident exercise: a feasibility study
Authors:
Laura Ozella,
Laetitia Gauvin,
Luca Carenzo,
Marco Quaggiotto,
Pier Luigi Ingrassia,
Michele Tizzoni,
André Panisson,
Davide Colombo,
Anna Sapienza,
Kyriaki Kalimeri,
Francesco Della Corte,
Ciro Cattuto
Abstract:
Over the past several decades, naturally occurring and man-made mass casualty incidents (MCI) have increased in frequency and number worldwide. To test the impact of such events on medical resources, simulations can provide a safe, controlled setting while replicating the chaotic environment typical of an actual disaster. A standardised method to collect and analyse data from mass casualty exercises is needed, in order to assess preparedness and performance of the healthcare staff involved. We report on the use of wearable proximity sensors to measure proximity events during an MCI simulation. We investigated the interactions between medical staff and patients, to evaluate the time dedicated by the medical staff with respect to the severity of the injury of the victims depending on the roles. We estimated the presence of the patients in the different spaces of the field hospital, in order to study the patients' flow. Data were obtained and collected through the deployment of wearable proximity sensors during a mass casualty incident functional exercise. The scenario included two areas, the accident site and the Advanced Medical Post (AMP), and the exercise lasted 3 hours. A total of 238 participants simulating medical staff and victims were involved. Each participant wore a proximity sensor and 30 fixed devices were placed in the field hospital. The contact networks show a heterogeneous distribution of the cumulative time spent in proximity by participants. We obtained contact matrices based on cumulative time spent in proximity between victims and the rescuers. Our results showed that the time spent in proximity by the healthcare teams with the victims is related to the severity of the patient's injury. The analysis of patients' flow showed that the presence of patients in the rooms of the hospital is consistent with triage code and diagnosis, and no obvious bottlenecks were found.
Submitted 18 September, 2018;
originally announced September 2018.
-
Predicting Demographics, Moral Foundations, and Human Values from Digital Behaviors
Authors:
Kyriaki Kalimeri,
Mariano G. Beiro,
Matteo Delfino,
Robert Raleigh,
Ciro Cattuto
Abstract:
Personal electronic devices including smartphones give access to behavioural signals that can be used to learn about the characteristics and preferences of individuals. In this study, we explore the connection between demographic and psychological attributes and the digital behavioural records, for a cohort of 7,633 people, closely representative of the US population with respect to gender, age, geographical distribution, education, and income. Along with the demographic data, we collected self-reported assessments on validated psychometric questionnaires for moral traits and basic human values and combined this information with passively collected multi-modal digital data from web browsing behaviour and smartphone usage. A machine learning framework was then designed to infer both the demographic and psychological attributes from the behavioural data. In a cross-validated setting, our models predicted demographic attributes with good accuracy as measured by the weighted AUROC score (Area Under the Receiver Operating Characteristic), but were less performant for the moral traits and human values. These results call for further investigation since they are still far from unveiling individuals' psychological fabric. This connection, along with the most predictive features that we provide for each attribute, might prove useful for designing personalised services, communication strategies, and interventions, and can be used to sketch a portrait of people with a similar worldview.
Submitted 21 November, 2018; v1 submitted 5 December, 2017;
originally announced December 2017.