-
AlbNews: A Corpus of Headlines for Topic Modeling in Albanian
Authors:
Erion Çano,
Dario Lamaj
Abstract:
The scarcity of available text corpora for low-resource languages like Albanian is a serious hurdle for research in natural language processing tasks. This paper introduces AlbNews, a collection of 600 topically labeled news headlines and 2600 unlabeled ones in Albanian. The data can be freely used for conducting topic modeling research. We report the initial classification scores of some traditio…
▽ More
The scarcity of available text corpora for low-resource languages like Albanian is a serious hurdle for research in natural language processing tasks. This paper introduces AlbNews, a collection of 600 topically labeled news headlines and 2600 unlabeled ones in Albanian. The data can be freely used for conducting topic modeling research. We report the initial classification scores of some traditional machine learning classifiers trained with the AlbNews samples. These results show that basic models outrun the ensemble learning ones and can serve as a baseline for future experiments.
△ Less
Submitted 6 February, 2024;
originally announced February 2024.
-
The potential of large language models for improving probability learning: A study on ChatGPT3.5 and first-year computer engineering students
Authors:
Angel Udias,
Antonio Alonso-Ayuso,
Ignacio Sanchez,
Sonia Hernandez,
Maria Eugenia Castellanos,
Raquel Montes Diez,
Emilio Lopez Cano
Abstract:
In this paper, we assess the efficacy of ChatGPT (version Feb 2023), a large-scale language model, in solving probability problems typically presented in introductory computer engineering exams. Our study comprised a set of 23 probability exercises administered to students at Rey Juan Carlos University (URJC) in Madrid. The responses produced by ChatGPT were evaluated by a group of five statistics…
▽ More
In this paper, we assess the efficacy of ChatGPT (version Feb 2023), a large-scale language model, in solving probability problems typically presented in introductory computer engineering exams. Our study comprised a set of 23 probability exercises administered to students at Rey Juan Carlos University (URJC) in Madrid. The responses produced by ChatGPT were evaluated by a group of five statistics professors, who assessed them qualitatively and assigned grades based on the same criteria used for students. Our results indicate that ChatGPT surpasses the average student in terms of phrasing, organization, and logical reasoning. The model's performance remained consistent for both the Spanish and English versions of the exercises. However, ChatGPT encountered difficulties in executing basic numerical operations. Our experiments demonstrate that requesting ChatGPT to provide the solution in the form of an R script proved to be an effective approach for overcoming these limitations. In summary, our results indicate that ChatGPT surpasses the average student in solving probability problems commonly presented in introductory computer engineering exams. Nonetheless, the model exhibits limitations in reasoning around certain probability concepts. The model's ability to deliver high-quality explanations and illustrate solutions in any programming language, coupled with its performance in solving probability exercises, suggests that large language models have the potential to serve as learning assistants.
△ Less
Submitted 9 October, 2023;
originally announced October 2023.
-
AlbNER: A Corpus for Named Entity Recognition in Albanian
Authors:
Erion Çano
Abstract:
Scarcity of resources such as annotated text corpora for under-resourced languages like Albanian is a serious impediment in computational linguistics and natural language processing research. This paper presents AlbNER, a corpus of 900 sentences with labeled named entities, collected from Albanian Wikipedia articles. Preliminary results with BERT and RoBERTa variants fine-tuned and tested with Alb…
▽ More
Scarcity of resources such as annotated text corpora for under-resourced languages like Albanian is a serious impediment in computational linguistics and natural language processing research. This paper presents AlbNER, a corpus of 900 sentences with labeled named entities, collected from Albanian Wikipedia articles. Preliminary results with BERT and RoBERTa variants fine-tuned and tested with AlbNER data indicate that model size has slight impact on NER performance, whereas language transfer has a significant one. AlbNER corpus and these obtained results should serve as baselines for future experiments.
△ Less
Submitted 15 September, 2023;
originally announced September 2023.
-
AlbMoRe: A Corpus of Movie Reviews for Sentiment Analysis in Albanian
Authors:
Erion Çano
Abstract:
Lack of available resources such as text corpora for low-resource languages seriously hinders research on natural language processing and computational linguistics. This paper presents AlbMoRe, a corpus of 800 sentiment annotated movie reviews in Albanian. Each text is labeled as positive or negative and can be used for sentiment analysis research. Preliminary results based on traditional machine…
▽ More
Lack of available resources such as text corpora for low-resource languages seriously hinders research on natural language processing and computational linguistics. This paper presents AlbMoRe, a corpus of 800 sentiment annotated movie reviews in Albanian. Each text is labeled as positive or negative and can be used for sentiment analysis research. Preliminary results based on traditional machine learning classifiers trained with the AlbMoRe samples are also reported. They can serve as comparison baselines for future research experiments.
△ Less
Submitted 14 June, 2023;
originally announced June 2023.
-
MemeGraphs: Linking Memes to Knowledge Graphs
Authors:
Vasiliki Kougia,
Simon Fetzel,
Thomas Kirchmair,
Erion Çano,
Sina Moayed Baharlou,
Sahand Sharifzadeh,
Benjamin Roth
Abstract:
Memes are a popular form of communicating trends and ideas in social media and on the internet in general, combining the modalities of images and text. They can express humor and sarcasm but can also have offensive content. Analyzing and classifying memes automatically is challenging since their interpretation relies on the understanding of visual elements, language, and background knowledge. Thus…
▽ More
Memes are a popular form of communicating trends and ideas in social media and on the internet in general, combining the modalities of images and text. They can express humor and sarcasm but can also have offensive content. Analyzing and classifying memes automatically is challenging since their interpretation relies on the understanding of visual elements, language, and background knowledge. Thus, it is important to meaningfully represent these sources and the interaction between them in order to classify a meme as a whole. In this work, we propose to use scene graphs, that express images in terms of objects and their visual relations, and knowledge graphs as structured representations for meme classification with a Transformer-based architecture. We compare our approach with ImgBERT, a multimodal model that uses only learned (instead of structured) representations of the meme, and observe consistent improvements. We further provide a dataset with human graph annotations that we compare to automatically generated graphs and entity linking. Analysis shows that automatic methods link more entities than human annotators and that automatically generated graphs are better suited for hatefulness classification in memes.
△ Less
Submitted 26 June, 2023; v1 submitted 28 May, 2023;
originally announced May 2023.
-
CSRCZ: A Dataset About Corporate Social Responsibility in Czech Republic
Authors:
Xhesilda Vogli,
Erion Çano
Abstract:
As stakeholders' pressure on corporates for disclosing their corporate social responsibility operations grows, it is crucial to understand how efficient corporate disclosure systems are in bridging the gap between corporate social responsibility reports and their actual practice. Meanwhile, research on corporate social responsibility is still not aligned with the recent data-driven strategies, and…
▽ More
As stakeholders' pressure on corporates for disclosing their corporate social responsibility operations grows, it is crucial to understand how efficient corporate disclosure systems are in bridging the gap between corporate social responsibility reports and their actual practice. Meanwhile, research on corporate social responsibility is still not aligned with the recent data-driven strategies, and little public data are available. This paper aims to describe CSRCZ, a newly created dataset based on disclosure reports from the websites of 1000 companies that operate in Czech Republic. Each company was analyzed based on three main parameters: company size, company industry, and company initiatives. We describe the content of the dataset as well as its potential use for future research. We believe that CSRCZ has implications for further research, since it is the first publicly available dataset of its kind.
△ Less
Submitted 5 January, 2023;
originally announced January 2023.
-
Batch Layer Normalization, A new normalization layer for CNNs and RNN
Authors:
Amir Ziaee,
Erion Çano
Abstract:
This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to reduce the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively puts appropriate weight on mini-batch and feature normalization based on the inverse size of mini-batches to normalize the input to a layer during the learning p…
▽ More
This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to reduce the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively puts appropriate weight on mini-batch and feature normalization based on the inverse size of mini-batches to normalize the input to a layer during the learning process. It also performs the exact computation with a minor change at inference times, using either mini-batch statistics or population statistics. The decision process to either use statistics of mini-batch or population gives BLN the ability to play a comprehensive role in the hyper-parameter optimization process of models. The key advantage of BLN is the support of the theoretical analysis of being independent of the input data, and its statistical configuration heavily depends on the task performed, the amount of training data, and the size of batches. Test results indicate the application potential of BLN and its faster convergence than batch normalization and layer normalization in both Convolutional and Recurrent Neural Networks. The code of the experiments is publicly available online (https://github.com/A2Amir/Batch-Layer-Normalization).
△ Less
Submitted 19 September, 2022;
originally announced September 2022.
-
hmBERT: Historical Multilingual Language Models for Named Entity Recognition
Authors:
Stefan Schweter,
Luisa März,
Katharina Schmid,
Erion Çano
Abstract:
Compared to standard Named Entity Recognition (NER), identifying persons, locations, and organizations in historical texts constitutes a big challenge. To obtain machine-readable corpora, the historical text is usually scanned and Optical Character Recognition (OCR) needs to be performed. As a result, the historical corpora contain errors. Also, entities like location or organization can change ov…
▽ More
Compared to standard Named Entity Recognition (NER), identifying persons, locations, and organizations in historical texts constitutes a big challenge. To obtain machine-readable corpora, the historical text is usually scanned and Optical Character Recognition (OCR) needs to be performed. As a result, the historical corpora contain errors. Also, entities like location or organization can change over time, which poses another challenge. Overall, historical texts come with several peculiarities that differ greatly from modern texts and large labeled corpora for training a neural tagger are hardly available for this domain. In this work, we tackle NER for historical German, English, French, Swedish, and Finnish by training large historical language models. We circumvent the need for large amounts of labeled data by using unlabeled data for pretraining a language model. We propose hmBERT, a historical multilingual BERT-based language model, and release the model in several versions of different sizes. Furthermore, we evaluate the capability of hmBERT by solving downstream NER as part of this year's HIPE-2022 shared task and provide detailed analysis and insights. For the Multilingual Classical Commentary coarse-grained NER challenge, our tagger HISTeria outperforms the other teams' models for two out of three languages.
△ Less
Submitted 1 July, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
Topic Segmentation of Research Article Collections
Authors:
Erion Çano,
Benjamin Roth
Abstract:
Collections of research article data harvested from the web have become common recently since they are important resources for experimenting on tasks such as named entity recognition, text summarization, or keyword generation. In fact, certain types of experiments require collections that are both large and topically structured, with records assigned to separate research disciplines. Unfortunately…
▽ More
Collections of research article data harvested from the web have become common recently since they are important resources for experimenting on tasks such as named entity recognition, text summarization, or keyword generation. In fact, certain types of experiments require collections that are both large and topically structured, with records assigned to separate research disciplines. Unfortunately, the current collections of publicly available research articles are either small or heterogeneous and unstructured. In this work, we perform topic segmentation of a paper data collection that we crawled and produce a multitopic dataset of roughly seven million paper data records. We construct a taxonomy of topics extracted from the data records and then annotate each document with its corresponding topic from that taxonomy. As a result, it is possible to use this newly proposed dataset in two modalities: as a heterogeneous collection of documents from various disciplines or as a set of homogeneous collections, each from a single research topic.
△ Less
Submitted 18 May, 2022;
originally announced May 2022.
-
Personalized musically induced emotions of not-so-popular Colombian music
Authors:
Juan Sebastián Gómez-Cañón,
Perfecto Herrera,
Estefanía Cano,
Emilia Gómez
Abstract:
This work presents an initial proof of concept of how Music Emotion Recognition (MER) systems could be intentionally biased with respect to annotations of musically induced emotions in a political context. In specific, we analyze traditional Colombian music containing politically charged lyrics of two types: (1) vallenatos and social songs from the "left-wing" guerrilla Fuerzas Armadas Revoluciona…
▽ More
This work presents an initial proof of concept of how Music Emotion Recognition (MER) systems could be intentionally biased with respect to annotations of musically induced emotions in a political context. In specific, we analyze traditional Colombian music containing politically charged lyrics of two types: (1) vallenatos and social songs from the "left-wing" guerrilla Fuerzas Armadas Revolucionarias de Colombia (FARC) and (2) corridos from the "right-wing" paramilitaries Autodefensas Unidas de Colombia (AUC). We train personalized machine learning models to predict induced emotions for three users with diverse political views - we aim at identifying the songs that may induce negative emotions for a particular user, such as anger and fear. To this extent, a user's emotion judgements could be interpreted as problematizing data - subjective emotional judgments could in turn be used to influence the user in a human-centered machine learning environment. In short, highly desired "emotion regulation" applications could potentially deviate to "emotion manipulation" - the recent discredit of emotion recognition technologies might transcend ethical issues of diversity and inclusion.
△ Less
Submitted 9 December, 2021;
originally announced December 2021.
-
Focused Contrastive Training for Test-based Constituency Analysis
Authors:
Benjamin Roth,
Erion Çano
Abstract:
We propose a scheme for self-training of grammaticality models for constituency analysis based on linguistic tests. A pre-trained language model is fine-tuned by contrastive estimation of grammatical sentences from a corpus, and ungrammatical sentences that were perturbed by a syntactic test, a transformation that is motivated by constituency theory. We show that consistent gains can be achieved i…
▽ More
We propose a scheme for self-training of grammaticality models for constituency analysis based on linguistic tests. A pre-trained language model is fine-tuned by contrastive estimation of grammatical sentences from a corpus, and ungrammatical sentences that were perturbed by a syntactic test, a transformation that is motivated by constituency theory. We show that consistent gains can be achieved if only certain positive instances are chosen for training, depending on whether they could be the result of a test transformation. This way, the positives, and negatives exhibit similar characteristics, which makes the objective more challenging for the language model, and also allows for additional markup that indicates the position of the test application within the sentence.
△ Less
Submitted 30 September, 2021;
originally announced September 2021.
-
How Many Pages? Paper Length Prediction from the Metadata
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Being able to predict the length of a scientific paper may be helpful in numerous situations. This work defines the paper length prediction task as a regression problem and reports several experimental results using popular machine learning models. We also create a huge dataset of publication metadata and the respective lengths in number of pages. The dataset will be freely available and is intend…
▽ More
Being able to predict the length of a scientific paper may be helpful in numerous situations. This work defines the paper length prediction task as a regression problem and reports several experimental results using popular machine learning models. We also create a huge dataset of publication metadata and the respective lengths in number of pages. The dataset will be freely available and is intended to foster research in this domain. As future work, we would like to explore more advanced regressors based on neural networks and big pretrained language models.
△ Less
Submitted 17 December, 2020; v1 submitted 29 October, 2020;
originally announced October 2020.
-
Degradation effects of water immersion on earbud audio quality
Authors:
Scott Beveridge,
Steffen A. Herff,
Estefanía Cano
Abstract:
Earbuds are subjected to constant use and scenarios that may degrade sound quality. Indeed, a common fate of earbuds is being forgotten in pockets and faced with a laundry cycle (LC). Manufacturers' accounts of the extent to which LCs affect earbud sound quality are vague at best, leaving users to their own devices in assessing the damage caused. This paper offers a systematic, empirical approach…
▽ More
Earbuds are subjected to constant use and scenarios that may degrade sound quality. Indeed, a common fate of earbuds is being forgotten in pockets and faced with a laundry cycle (LC). Manufacturers' accounts of the extent to which LCs affect earbud sound quality are vague at best, leaving users to their own devices in assessing the damage caused. This paper offers a systematic, empirical approach to measure the effects of laundering earbuds on sound quality. Three earbud pairs were subjected to LCs spaced 24 hours apart. After each LC, a professional microphone as well as a mid-market smartphone were used to record i) a test tone ii) a frequency sweep and iii) a music signal played through the earbuds. We deployed mixed effects models and found significant degradation in terms of RMS noise loudness, Total Harmonic Distortion (THD), as well as measures of change in the frequency responses of the earbuds. All transducers showed degradation already after the first cycle, and no transducers produced a measurable signal after the sixth LC. The degradation effects were detectable in both, the professional microphone as well as the smartphone recordings. We hope that the present work is a first step in establishing a practical, and ecologically valid method for everyday users to assess the degree of degradation of their personal earbuds.
△ Less
Submitted 1 September, 2020;
originally announced September 2020.
-
A Data-driven Neural Network Architecture for Sentiment Analysis
Authors:
Erion Çano,
Maurizio Morisio
Abstract:
The fabulous results of convolution neural networks in image-related tasks, attracted attention of text mining, sentiment analysis and other text analysis researchers. It is however difficult to find enough data for feeding such networks, optimize their parameters, and make the right design choices when constructing network architectures. In this paper we present the creation steps of two big data…
▽ More
The fabulous results of convolution neural networks in image-related tasks, attracted attention of text mining, sentiment analysis and other text analysis researchers. It is however difficult to find enough data for feeding such networks, optimize their parameters, and make the right design choices when constructing network architectures. In this paper we present the creation steps of two big datasets of song emotions. We also explore usage of convolution and max-pooling neural layers on song lyrics, product and movie review text datasets. Three variants of a simple and flexible neural network architecture are also compared. Our intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. We also wanted to identify architecture design choices which lead to high performing sentiment analysis models. To this end, we conducted a series of experiments with neural architectures of various configurations. Our results indicate that parallel convolutions of filter lengths up to three are usually enough for capturing relevant text features. Also, max-pooling region size should be adapted to the length of text documents for producing the best feature maps. Top results we got are obtained with feature maps of lengths 6 to 18. An improvement on future neural network models for sentiment analysis, could be generating sentiment polarity prediction of documents using aggregation of predictions on smaller excerpt of the entire text.
△ Less
Submitted 30 June, 2020;
originally announced June 2020.
-
Mood-based On-Car Music Recommendations
Authors:
Erion Çano,
Riccardo Coppola,
Eleonora Gargiulo,
Marco Marengo,
Maurizio Morisio
Abstract:
Driving and music listening are two inseparable everyday activities for millions of people today in the world. Considering the high correlation between music, mood and driving comfort and safety, it makes sense to use appropriate and intelligent music recommendations based on the mood of drivers and songs in the context of car driving. The objective of this paper is to present the project of a con…
▽ More
Driving and music listening are two inseparable everyday activities for millions of people today in the world. Considering the high correlation between music, mood and driving comfort and safety, it makes sense to use appropriate and intelligent music recommendations based on the mood of drivers and songs in the context of car driving. The objective of this paper is to present the project of a contextual mood-based music recommender system capable of regulating the driver's mood and trying to have a positive influence on her driving behaviour. Here we present the proof of concept of the system and describe the techniques and technologies that are part of it. Further possible future improvements on each of the building blocks are also presented.
△ Less
Submitted 25 June, 2020;
originally announced June 2020.
-
Automating Text Naturalness Evaluation of NLG Systems
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Automatic methods and metrics that assess various quality criteria of automatically generated texts are important for developing NLG systems because they produce repeatable results and allow for a fast development cycle. We present here an attempt to automate the evaluation of text naturalness which is a very important characteristic of natural language generation methods. Instead of relying on hu…
▽ More
Automatic methods and metrics that assess various quality criteria of automatically generated texts are important for developing NLG systems because they produce repeatable results and allow for a fast development cycle. We present here an attempt to automate the evaluation of text naturalness which is a very important characteristic of natural language generation methods. Instead of relying on human participants for scoring or labeling the text samples, we propose to automate the process by using a human likeliness metric we define and a discrimination procedure based on large pretrained language models with their probability distributions. We analyze the text probability fractions and observe how they are influenced by the size of the generative and discriminative models involved in the process. Based on our results, bigger generators and larger pretrained discriminators are more appropriate for a better evaluation of text naturalness. A comprehensive validation procedure with human participants is required as follow up to check how well this automatic evaluation scheme correlates with human judgments.
△ Less
Submitted 23 June, 2020;
originally announced June 2020.
-
Human or Machine: Automating Human Likeliness Evaluation of NLG Texts
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Automatic evaluation of various text quality criteria produced by data-driven intelligent methods is very common and useful because it is cheap, fast, and usually yields repeatable results. In this paper, we present an attempt to automate the human likeliness evaluation of the output text samples coming from natural language generation methods used to solve several tasks. We propose to use a human…
▽ More
Automatic evaluation of various text quality criteria produced by data-driven intelligent methods is very common and useful because it is cheap, fast, and usually yields repeatable results. In this paper, we present an attempt to automate the human likeliness evaluation of the output text samples coming from natural language generation methods used to solve several tasks. We propose to use a human likeliness score that shows the percentage of the output samples from a method that look as if they were written by a human. Instead of having human participants label or rate those samples, we completely automate the process by using a discrimination procedure based on large pretrained language models and their probability distributions. As follow up, we plan to perform an empirical analysis of human-written and machine-generated texts to find the optimal setup of this evaluation approach. A validation procedure involving human participants will also check how the automatic evaluation correlates with human judgments.
△ Less
Submitted 4 June, 2020;
originally announced June 2020.
-
Quality of Word Embeddings on Sentiment Analysis Tasks
Authors:
Erion Çano,
Maurizio Morisio
Abstract:
Word embeddings or distributed representations of words are being used in various applications like machine translation, sentiment analysis, topic identification etc. Quality of word embeddings and performance of their applications depends on several factors like training method, corpus size and relevance etc. In this study we compare performance of a dozen of pretrained word embedding models on l…
▽ More
Word embeddings or distributed representations of words are being used in various applications like machine translation, sentiment analysis, topic identification etc. Quality of word embeddings and performance of their applications depends on several factors like training method, corpus size and relevance etc. In this study we compare performance of a dozen of pretrained word embedding models on lyrics sentiment analysis and movie review polarity tasks. According to our results, Twitter Tweets is the best on lyrics sentiment analysis, whereas Google News and Common Crawl are the top performers on movie polarity analysis. Glove trained models slightly outrun those trained with Skipgram. Also, factors like topic relevance and size of corpus significantly impact the quality of the models. When medium or large-sized text sets are available, obtaining word embeddings from same training dataset is usually the best choice.
△ Less
Submitted 6 March, 2020;
originally announced March 2020.
-
Two Huge Title and Keyword Generation Corpora of Research Articles
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Recent developments in sequence-to-sequence learning with neural networks have considerably improved the quality of automatically generated text summaries and document keywords, stipulating the need for even bigger training corpora. Metadata of research articles are usually easy to find online and can be used to perform research on various tasks. In this paper, we introduce two huge datasets for t…
▽ More
Recent developments in sequence-to-sequence learning with neural networks have considerably improved the quality of automatically generated text summaries and document keywords, stipulating the need for even bigger training corpora. Metadata of research articles are usually easy to find online and can be used to perform research on various tasks. In this paper, we introduce two huge datasets for text summarization (OAGSX) and keyword generation (OAGKX) research, containing 34 million and 23 million records, respectively. The data were retrieved from the Open Academic Graph which is a network of research profiles and publications. We carefully processed each record and also tried several extractive and abstractive methods of both tasks to create performance baselines for other researchers. We further illustrate the performance of those methods previewing their outputs. In the near future, we would like to apply topic modeling on the two sets to derive subsets of research articles from more specific disciplines.
△ Less
Submitted 11 February, 2020;
originally announced February 2020.
-
Keyphrase Generation: A Multi-Aspect Survey
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Extractive keyphrase generation research has been around since the nineties, but the more advanced abstractive approach based on the encoder-decoder framework and sequence-to-sequence learning has been explored only recently. In fact, more than a dozen of abstractive methods have been proposed in the last three years, producing meaningful keyphrases and achieving state-of-the-art scores. In this s…
▽ More
Extractive keyphrase generation research has been around since the nineties, but the more advanced abstractive approach based on the encoder-decoder framework and sequence-to-sequence learning has been explored only recently. In fact, more than a dozen of abstractive methods have been proposed in the last three years, producing meaningful keyphrases and achieving state-of-the-art scores. In this survey, we examine various aspects of the extractive keyphrase generation methods and focus mostly on the more recent abstractive methods that are based on neural networks. We pay particular attention to the mechanisms that have driven the perfection of the later. A huge collection of scientific article metadata and the corresponding keyphrases is created and released for the research community. We also present various keyphrase generation and text summarization research patterns and trends of the last two decades.
△ Less
Submitted 11 October, 2019;
originally announced October 2019.
-
Efficiency Metrics for Data-Driven Models: A Text Summarization Case Study
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Using data-driven models for solving text summarization or similar tasks has become very common in the last years. Yet most of the studies report basic accuracy scores only, and nothing is known about the ability of the proposed models to improve when trained on more data. In this paper, we define and propose three data efficiency metrics: data score efficiency, data time deficiency and overall da…
▽ More
Using data-driven models for solving text summarization or similar tasks has become very common in the last years. Yet most of the studies report basic accuracy scores only, and nothing is known about the ability of the proposed models to improve when trained on more data. In this paper, we define and propose three data efficiency metrics: data score efficiency, data time deficiency and overall data efficiency. We also propose a simple scheme that uses those metrics and apply it for a more comprehensive evaluation of popular methods on text summarization and title generation tasks. For the latter task, we process and release a huge collection of 35 million abstract-title pairs from scientific articles. Our results reveal that among the tested models, the Transformer is the most efficient on both tasks.
△ Less
Submitted 14 September, 2019;
originally announced September 2019.
-
The emotions that we perceive in music: the influence of language and lyrics comprehension on agreement
Authors:
Juan Sebastián Gómez Cañón,
Perfecto Herrera,
Emilia Gómez,
Estefanía Cano
Abstract:
In the present study, we address the relationship between the emotions perceived in pop and rock music (mainly in Euro-American styles with English lyrics) and the language spoken by the listener. Our goal is to understand the influence of lyrics comprehension on the perception of emotions and use this information to improve Music Emotion Recognition (MER) models. Two main research questions are a…
▽ More
In the present study, we address the relationship between the emotions perceived in pop and rock music (mainly in Euro-American styles with English lyrics) and the language spoken by the listener. Our goal is to understand the influence of lyrics comprehension on the perception of emotions and use this information to improve Music Emotion Recognition (MER) models. Two main research questions are addressed: 1. Are there differences and similarities between the emotions perceived in pop/rock music by listeners raised with different mother tongues? 2. Do personal characteristics have an influence on the perceived emotions for listeners of a given language? Personal characteristics include the listeners' general demographics, familiarity and preference for the fragments, and music sophistication. Our hypothesis is that inter-rater agreement (as defined by Krippendorff's alpha coefficient) from subjects is directly influenced by the comprehension of lyrics.
△ Less
Submitted 25 October, 2019; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation
Authors:
Stylianos Ioannis Mimilakis,
Konstantinos Drossos,
Estefanía Cano,
Gerald Schuller
Abstract:
The goal of this work is to investigate what singing voice separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks based on the denoising autoencoder (DAE) model that are conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by the knowledge distillation, denoted the neura…
▽ More
The goal of this work is to investigate what singing voice separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks based on the denoising autoencoder (DAE) model that are conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by the knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping of the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation; one with single-layer encoder and decoder, one with multi-layer encoder and single-layer decoder, and one using skip-filtering connections (SF) with a single-layer encoding and decoding. We first train these models with realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators, exhibiting a predominant diagonal structure in their corresponding mapping functions, limiting the exploitation of inter-frequency structure of music data. In contrast, skip-filtering connections are shown to assist the DAE model in learning filtering operators that exploit richer inter-frequency structures.
△ Less
Submitted 20 October, 2019; v1 submitted 12 April, 2019;
originally announced April 2019.
-
Keyphrase Generation: A Text Summarization Struggle
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
Authors' keyphrases assigned to scientific articles are essential for recognizing content and topic aspects. Most of the proposed supervised and unsupervised methods for keyphrase generation are unable to produce terms that are valuable but do not appear in the text. In this paper, we explore the possibility of considering the keyphrase string as an abstractive summary of the title and the abstrac…
▽ More
Authors' keyphrases assigned to scientific articles are essential for recognizing content and topic aspects. Most of the proposed supervised and unsupervised methods for keyphrase generation are unable to produce terms that are valuable but do not appear in the text. In this paper, we explore the possibility of considering the keyphrase string as an abstractive summary of the title and the abstract. First, we collect, process and release a large dataset of scientific paper metadata that contains 2.2 million records. Then we experiment with popular text summarization neural architectures. Despite using advanced deep learning models, large quantities of data and many days of computation, our systematic evaluation on four test datasets reveals that the explored text summarization methods could not produce better keyphrases than the simpler unsupervised methods, or the existing supervised ones.
△ Less
Submitted 3 April, 2019; v1 submitted 29 March, 2019;
originally announced April 2019.
-
Word Embeddings for Sentiment Analysis: A Comprehensive Empirical Survey
Authors:
Erion Çano,
Maurizio Morisio
Abstract:
This work investigates the role of factors like training method, training corpus size and thematic relevance of texts in the performance of word embedding features on sentiment analysis of tweets, song lyrics, movie reviews and item reviews. We also explore specific training or post-processing methods that can be used to enhance the performance of word embeddings in certain tasks or domains. Our e…
▽ More
This work investigates the role of factors like training method, training corpus size and thematic relevance of texts in the performance of word embedding features on sentiment analysis of tweets, song lyrics, movie reviews and item reviews. We also explore specific training or post-processing methods that can be used to enhance the performance of word embeddings in certain tasks or domains. Our empirical observations indicate that models trained with multithematic texts that are large and rich in vocabulary are the best in answering syntactic and semantic word analogy questions. We further observe that influence of thematic relevance is stronger on movie and phone reviews, but weaker on tweets and lyrics. These two later domains are more sensitive to corpus size and training method, with Glove outperforming Word2vec. "Injecting" extra intelligence from lexicons or generating sentiment specific word embeddings are two prominent alternatives for increasing performance of word embedding features.
△ Less
Submitted 2 February, 2019;
originally announced February 2019.
-
Hybrid Recommender Systems: A Systematic Literature Review
Authors:
Erion Çano,
Maurizio Morisio
Abstract:
Recommender systems are software tools used to generate and provide suggestions for items and other entities to the users by exploiting various strategies. Hybrid recommender systems combine two or more recommendation strategies in different ways to benefit from their complementary advantages. This systematic literature review presents the state of the art in hybrid recommender systems of the last…
▽ More
Recommender systems are software tools used to generate and provide suggestions for items and other entities to the users by exploiting various strategies. Hybrid recommender systems combine two or more recommendation strategies in different ways to benefit from their complementary advantages. This systematic literature review presents the state of the art in hybrid recommender systems of the last decade. It is the first quantitative review work completely focused in hybrid recommenders. We address the most relevant problems considered and present the associated data mining and recommendation techniques used to overcome them. We also explore the hybridization classes each hybrid recommender belongs to, the application domains, the evaluation process and proposed future research directions. Based on our findings, most of the studies combine collaborative filtering with another technique often in a weighted way. Also cold-start and data sparsity are the two traditional and top problems being addressed in 23 and 22 studies each, while movies and movie datasets are still widely used by most of the authors. As most of the studies are evaluated by comparisons with similar methods using accuracy metrics, providing more credible and user oriented evaluations remains a typical challenge. Besides this, newer challenges were also identified such as responding to the variation of user context, evolving user tastes or providing cross-domain recommendations. Being a hot topic, hybrid recommenders represent a good basis with which to respond accordingly by exploring newer opportunities such as contextualizing recommendations, involving parallel hybrid algorithms, processing larger datasets, etc.
△ Less
Submitted 12 January, 2019;
originally announced January 2019.
-
Sentiment Analysis of Czech Texts: An Algorithmic Survey
Authors:
Erion Çano,
Ondřej Bojar
Abstract:
In the area of online communication, commerce and transactions, analyzing sentiment polarity of texts written in various natural languages has become crucial. While there have been a lot of contributions in resources and studies for the English language, "smaller" languages like Czech have not received much attention. In this survey, we explore the effectiveness of many existing machine learning a…
▽ More
In the area of online communication, commerce and transactions, analyzing sentiment polarity of texts written in various natural languages has become crucial. While there have been a lot of contributions in resources and studies for the English language, "smaller" languages like Czech have not received much attention. In this survey, we explore the effectiveness of many existing machine learning algorithms for sentiment analysis of Czech Facebook posts and product reviews. We report the sets of optimal parameter values for each algorithm and the scores in both datasets. We finally observe that support vector machines are the best classifier and efforts to increase performance even more with bagging, boosting or voting ensemble schemes fail to do so.
△ Less
Submitted 15 March, 2019; v1 submitted 9 January, 2019;
originally announced January 2019.
-
Text-based Sentiment Analysis and Music Emotion Recognition
Authors:
Erion Çano
Abstract:
Sentiment polarity of tweets, blog posts or product reviews has become highly attractive and is utilized in recommender systems, market predictions, business intelligence and more. Deep learning techniques are becoming top performers on analyzing such texts. There are however several problems that need to be solved for efficient use of deep neural networks on text mining and text polarity analysis…
▽ More
Sentiment polarity of tweets, blog posts or product reviews has become highly attractive and is utilized in recommender systems, market predictions, business intelligence and more. Deep learning techniques are becoming top performers on analyzing such texts. There are however several problems that need to be solved for efficient use of deep neural networks on text mining and text polarity analysis. First, deep neural networks need to be fed with data sets that are big in size as well as properly labeled. Second, there are various uncertainties regarding the use of word embedding vectors: should they be generated from the same data set that is used to train the model or it is better to source them from big and popular collections? Third, to simplify model creation it is convenient to have generic neural network architectures that are effective and can adapt to various texts, encapsulating much of design complexity. This thesis addresses the above problems to provide methodological and practical insights for utilizing neural networks on sentiment analysis of texts and achieving state of the art results. Regarding the first problem, the effectiveness of various crowdsourcing alternatives is explored and two medium-sized and emotion-labeled song data sets are created utilizing social tags. To address the second problem, a series of experiments with large text collections of various contents and domains were conducted, trying word embeddings of various parameters. Regarding the third problem, a series of experiments involving convolution and max-pooling neural layers were conducted. Combining convolutions of words, bigrams, and trigrams with regional max-pooling layers in a couple of stacks produced the best results. The derived architecture achieves competitive performance on sentiment polarity analysis of movie, business and product reviews.
△ Less
Submitted 6 October, 2018;
originally announced October 2018.
-
Head-up Displays (HUD) in driving
Authors:
Marcos Maroto,
Enrique Caño,
Pavel González,
Diego Villegas
Abstract:
Head Up Displays (HUDs) were designed originally to present at the usual viewpoints of the pilot the main sensor data during aircraft missions, because of placing instrument information in the forward field of view enhances pilots ability to utilize both instrument and environmental information simultaneously. The first civilian motor vehicle had a monochrome HUD that was released in 1988 by Gener…
▽ More
Head Up Displays (HUDs) were designed originally to present at the usual viewpoints of the pilot the main sensor data during aircraft missions, because of placing instrument information in the forward field of view enhances pilots ability to utilize both instrument and environmental information simultaneously. The first civilian motor vehicle had a monochrome HUD that was released in 1988 by General Motors as a technological improvement of HeadDown Display (HDD) interface, which is commonly used in automobile industry. The HUD reduces the number and duration of the drivers sight deviations from the road, by projecting the required information directly into the drivers line of vision. There are many studies about ways of presenting the information: standard oneearpiece presentation, threedimensional audio presentation, visual only or audiovisual presentation. Results have shown that using a 3D auditory display the time of acquiring targets is approximately 2.2 seconds faster than using a oneearpiece way. Nevertheless, a disadvantage is when the drivers attention unconsciously shifts away from the road and goes focused on processing the information presented by the HUD. By this reason, the time, the way and the channel are important to represent the information on a HUD. A solution is a context aware multimodal proactive recommended system that features personalized content combined with the use of car sensors to determine when the information has to be presented.
△ Less
Submitted 22 March, 2018;
originally announced March 2018.
-
Combination of Upper and Lower Probabilities
Authors:
Jose E. Cano,
Serafin Moral,
Juan F. Verdegay-Lopez
Abstract:
In this paper, we consider several types of information and methods of combination associated with incomplete probabilistic systems. We discriminate between 'a priori' and evidential information. The former one is a description of the whole population, the latest is a restriction based on observations for a particular case. Then, we propose different combination methods for each one of them. W…
▽ More
In this paper, we consider several types of information and methods of combination associated with incomplete probabilistic systems. We discriminate between 'a priori' and evidential information. The former one is a description of the whole population, the latest is a restriction based on observations for a particular case. Then, we propose different combination methods for each one of them. We also consider conditioning as the heterogeneous combination of 'a priori' and evidential information. The evidential information is represented as a convex set of likelihood functions. These will have an associated possibility distribution with behavior according to classical Possibility Theory.
△ Less
Submitted 20 March, 2013;
originally announced March 2013.
-
Run Control and Monitor System for the CMS Experiment
Authors:
M. Bellato,
L. Berti,
V. Brigljevic,
G. Bruno,
E. Cano,
S. Cittolin,
A. Csilling,
S. Erhan,
D. Gigi,
F. Glege,
R. Gomez-Reino,
M. Gulmini,
J. Gutleber,
C. Jacobs,
M. Kozlovszky,
H. Larsen,
I. Magrans,
G. Maron,
F. Meijers,
E. Meschi,
S. Murray,
A. Oh,
L. Orsini,
L. Pollet,
A. Racz
, et al. (8 additional authors not shown)
Abstract:
The Run Control and Monitor System (RCMS) of the CMS experiment is the set of hardware and software components responsible for controlling and monitoring the experiment during data-taking. It provides users with a "virtual counting room", enabling them to operate the experiment and to monitor detector status and data quality from any point in the world. This paper describes the architecture of t…
▽ More
The Run Control and Monitor System (RCMS) of the CMS experiment is the set of hardware and software components responsible for controlling and monitoring the experiment during data-taking. It provides users with a "virtual counting room", enabling them to operate the experiment and to monitor detector status and data quality from any point in the world. This paper describes the architecture of the RCMS with particular emphasis on its scalability through a distributed collection of nodes arranged in a tree-based hierarchy. The current implementation of the architecture in a prototype RCMS used in test beam setups, detector validations and DAQ demonstrators is documented. A discussion of the key technologies used, including Web Services, and the results of tests performed with a 128-node system are presented.
△ Less
Submitted 18 June, 2003;
originally announced June 2003.