Showing 1–7 of 7 results for author: Di Giovanni, M

  1. arXiv:2201.06293  [pdf, other]

    cs.SI

    VaccinEU: COVID-19 vaccine conversations on Twitter in French, German and Italian

    Authors: Marco Di Giovanni, Francesco Pierri, Christopher Torres-Lugo, Marco Brambilla

    Abstract: Despite the increasing limitations for unvaccinated people, in many European countries there is still a non-negligible fraction of individuals who refuse to get vaccinated against SARS-CoV-2, undermining governmental efforts to eradicate the virus. We study the role of online social media in influencing individuals' opinion towards getting vaccinated by designing a large-scale collection of Twitte…

    Submitted 4 April, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

    Comments: 9 pages, 6 figures, 3 tables. Data can be fully accessed in a Dataverse (https://doi.org/10.7910/DVN/NZUMZG) and a GitHub repository (https://github.com/DataSciencePolimi/VaccinEU)

    Journal ref: Proc. Intl. AAAI Conf. on Web and Social Media (ICWSM), 2022

  2. arXiv:2112.02721  [pdf, other]

    cs.CL cs.AI cs.LG

    NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

    Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)

    Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split…

    Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

  3. arXiv:2110.02030  [pdf, other]

    cs.CL

    Exploiting Twitter as Source of Large Corpora of Weakly Similar Pairs for Semantic Sentence Embeddings

    Authors: Marco Di Giovanni, Marco Brambilla

    Abstract: Semantic sentence embeddings are usually built in a supervised fashion by minimizing distances between pairs of embeddings of sentences labelled as semantically similar by annotators. Since big labelled datasets are rare, in particular for non-English languages, and expensive, recent studies focus on unsupervised approaches that require unpaired input sentences. We instead propose a language-independent appro…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: 9 pages, 3 figures, accepted at EMNLP2021

  4. arXiv:2101.03757  [pdf, other]

    cs.SI

    VaccinItaly: monitoring Italian conversations around vaccines on Twitter and Facebook

    Authors: Francesco Pierri, Andrea Tocchetti, Lorenzo Corti, Marco Di Giovanni, Silvio Pavanetto, Marco Brambilla, Stefano Ceri

    Abstract: We present VaccinItaly, a project which monitors Italian online conversations around vaccines, on Twitter and Facebook. We describe the ongoing data collection, which follows the SARS-CoV-2 vaccination campaign roll-out in Italy and we provide public access to the data collected. We show results from a preliminary analysis of the spread of low- and high-credibility news shared alongside vaccine-re…

    Submitted 4 May, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: To appear in the proceedings of ICWSM 2021. The repository associated to this paper is here: https://github.com/frapierri/VaccinItaly

  5. arXiv:2010.05736  [pdf, other]

    cs.CL

    EFSG: Evolutionary Fooling Sentences Generator

    Authors: Marco Di Giovanni, Marco Brambilla

    Abstract: Large pre-trained language representation models (LMs) have recently achieved many successes in NLP tasks. In 2018 BERT, and later its successors (e.g. RoBERTa), obtained state-of-the-art results in classical benchmark tasks, such as the GLUE benchmark. After that, works about adversarial attacks have been published to test their generalization properties and robustness. In th…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 13 pages, 19 figures

  6. Information disorders on Italian Facebook during COVID-19 infodemic

    Authors: Alessandro Celestini, Marco Di Giovanni, Stefano Guarino, Francesco Pierri

    Abstract: In this work we carry out an exploratory analysis of online conversations on the Italian Facebook during the recent COVID-19 pandemic. We analyze the circulation of controversial topics associated with the origin of the virus, which involve popular targets of misinformation, such as migrants and 5G technology. We collected over 1.5 M posts in Italian language and related to COVID-19, shared by nea…

    Submitted 22 July, 2020; originally announced July 2020.

    Comments: 16 pages, 13 figures, 7 tables

  7. arXiv:1904.08991  [pdf, other]

    physics.comp-ph cs.LG

    Physical Symmetries Embedded in Neural Networks

    Authors: M. Mattheakis, P. Protopapas, D. Sondak, M. Di Giovanni, E. Kaxiras

    Abstract: Neural networks are a central technique in machine learning. Recent years have seen a wave of interest in applying neural networks to physical systems for which the governing dynamics are known and expressed through differential equations. Two fundamental challenges facing the development of neural networks in physics applications are their lack of interpretability and their physics-agnostic design…

    Submitted 29 January, 2020; v1 submitted 18 April, 2019; originally announced April 2019.

    Comments: This is the same manuscript as version 1 (arXiv:1904.08991v1), which was accidentally replaced. 16 pages, 8 figures