
Showing 1–44 of 44 results for author: Spanakis, G

  1. arXiv:2407.13469  [pdf, other]

    cs.CL

    Fixed and Adaptive Simultaneous Machine Translation Strategies Using Adapters

    Authors: Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

    Abstract: Simultaneous machine translation aims at solving the task of real-time translation by starting to translate before consuming the full input, which poses challenges in terms of balancing quality and latency of the translation. The wait-$k$ policy offers a solution by starting to translate after consuming $k$ words, where the choice of the number $k$ directly affects the latency and quality. In appl…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted at IWSLT 2024

  2. arXiv:2407.12451  [pdf, other]

    cs.CY cs.CL cs.SI

    Across Platforms and Languages: Dutch Influencers and Legal Disclosures on Instagram, YouTube and TikTok

    Authors: Haoyang Gui, Thales Bertaglia, Catalina Goanta, Sybe de Vries, Gerasimos Spanakis

    Abstract: Content monetization on social media fuels a growing influencer economy. Influencer marketing remains largely undisclosed or inappropriately disclosed on social media. Non-disclosure issues have become a priority for national and supranational authorities worldwide, who are starting to impose increasingly harsh sanctions on them. This paper proposes a transparent methodology for measuring whethe…

    Submitted 12 August, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted for publication at the 16th International Conference on Advances in Social Networks Analysis and Mining - ASONAM-2024

  3. arXiv:2407.09202  [pdf, other]

    cs.CY cs.SI

    Influencer Self-Disclosure Practices on Instagram: A Multi-Country Longitudinal Study

    Authors: Thales Bertaglia, Catalina Goanta, Gerasimos Spanakis, Adriana Iamnitchi

    Abstract: This paper presents a longitudinal study of more than ten years of activity on Instagram consisting of over a million posts by 400 content creators from four countries: the US, Brazil, Netherlands and Germany. Our study shows differences in the professionalisation of content monetisation between countries, yet consistent patterns; significant differences in the frequency of posts yet similar user…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: submitted to Online Social Networks and Media

  4. arXiv:2405.04854  [pdf, other]

    cs.LG

    Explaining Clustering of Ecological Momentary Assessment Data Through Temporal and Feature Attention

    Authors: Mandani Ntekouli, Gerasimos Spanakis, Lourens Waldorp, Anne Roefs

    Abstract: In the field of psychopathology, Ecological Momentary Assessment (EMA) studies offer rich individual data on psychopathology-relevant variables (e.g., affect, behavior, etc.) in real-time. EMA data is collected dynamically, represented as complex multivariate time series (MTS). Such information is crucial for a better understanding of mental disorders at the individual- and group-level. More specif…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 24 pages, 12 figures, accepted at the World Conference on eXplainable Artificial Intelligence 2024

  5. arXiv:2405.00516  [pdf, other]

    cs.LG cs.AI cs.CL

    Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning

    Authors: Lucas-Andreï Thil, Mirela Popa, Gerasimos Spanakis

    Abstract: Recent advancements in language models have demonstrated remarkable improvements in various natural language processing (NLP) tasks such as web navigation. Supervised learning (SL) approaches have achieved impressive performance while utilizing significantly less training data compared to previous methods. However, these SL-based models fall short when compared to reinforcement learning (RL) appro…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: ACM 2024, Avila, Spain. 9 pages

    Report number: 9798400702433 MSC Class: 68T07 ACM Class: I.2.7; I.2.8; I.2.1

    Journal ref: Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, 2024

  6. arXiv:2404.02894  [pdf, other]

    cs.CY cs.SI

    Automated Transparency: A Legal and Empirical Analysis of the Digital Services Act Transparency Database

    Authors: Rishabh Kaushal, Jacob van de Kerkhof, Catalina Goanta, Gerasimos Spanakis, Adriana Iamnitchi

    Abstract: The Digital Services Act (DSA) is a much-awaited platform liability reform in the European Union that was adopted on 1 November 2022 with the ambition to set a global example in terms of accountability and transparency. Among other obligations, the DSA emphasizes the need for online platforms to report on their content moderation decisions ('statements of reasons' - SoRs), which is a novel transp…

    Submitted 3 May, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: accepted to FAccT 2024; camera-ready version; 19 pages

  7. arXiv:2403.19442  [pdf, other]

    cs.LG

    Exploiting Individual Graph Structures to Enhance Ecological Momentary Assessment (EMA) Forecasting

    Authors: Mandani Ntekouli, Gerasimos Spanakis, Lourens Waldorp, Anne Roefs

    Abstract: In the evolving field of psychopathology, the accurate assessment and forecasting of data derived from Ecological Momentary Assessment (EMA) is crucial. EMA offers contextually rich psychopathological measurements over time, which in practice yield Multivariate Time Series (MTS) data. Thus, many challenges arise in analysis from the temporal complexities inherent in emotional, behavioral, and con…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 9 pages, 3 figures, 2024 IEEE 40th International Conference on Data Engineering Workshops

  8. arXiv:2402.15059  [pdf, other]

    cs.CL cs.IR

    ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval

    Authors: Antoine Louis, Vageesh Saxena, Gijs van Dijck, Gerasimos Spanakis

    Abstract: State-of-the-art neural retrievers predominantly focus on high-resource languages like English, which impedes their adoption in retrieval scenarios involving other languages. Current approaches circumvent the lack of high-quality labeled data in non-English languages by leveraging multilingual pretrained language models capable of cross-lingual transfer. However, these models require substantial t…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Under review. Code is available at https://github.com/ant-louis/xm-retrievers

  9. arXiv:2402.12332  [pdf, other]

    cs.CL

    Triple-Encoders: Representations That Fire Together, Wire Together

    Authors: Justus-Jonas Erker, Florian Mai, Nils Reimers, Gerasimos Spanakis, Iryna Gurevych

    Abstract: Search-based dialog models typically re-encode the dialog history at every turn, incurring high cost. Curved Contrastive Learning, a representation learning method that encodes relative distances between utterances into the embedding space via a bi-encoder, has recently shown promising results for dialog modeling at far superior efficiency. While high efficiency is achieved through independently e…

    Submitted 13 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: accepted at ACL 2024 (main conference)

  10. arXiv:2402.01416  [pdf, other]

    cs.CL cs.AI cs.LG

    Sequence Shortening for Context-Aware Machine Translation

    Authors: Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

    Abstract: Context-aware Machine Translation aims to improve translations of sentences by incorporating surrounding sentences as context. Towards this task, two main architectures have been applied, namely single-encoder (based on concatenation) and multi-encoder models. In this study, we show that a special case of multi-encoder architecture, where the latent representation of the source sentence is cached…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Findings of the ACL: EACL 2024

  11. arXiv:2310.07491  [pdf, other]

    cs.LG

    Model-based Clustering of Individuals' Ecological Momentary Assessment Time-series Data for Improving Forecasting Performance

    Authors: Mandani Ntekouli, Gerasimos Spanakis, Lourens Waldorp, Anne Roefs

    Abstract: Through Ecological Momentary Assessment (EMA) studies, a number of time-series data is collected across multiple individuals, continuously monitoring various items of emotional behavior. Such complex data is commonly analyzed at the individual level, using personalized models. However, it is believed that additional information from similar individuals is likely to enhance these models, leading to bet…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 17 pages, 7 figures, BNAIC/BeNeLearn 2023 (Joint International Scientific Conferences on AI and Machine Learning)

  12. arXiv:2310.05688  [pdf, other]

    cs.CL

    Larth: Dataset and Machine Translation for Etruscan

    Authors: Gianluca Vico, Gerasimos Spanakis

    Abstract: Etruscan is an ancient language spoken in Italy from the 7th century BC to the 1st century AD. There are no native speakers of the language at the present day, and its resources are scarce, as there exist only around 12,000 known inscriptions. To the best of our knowledge, there are no publicly available Etruscan corpora for natural language processing. Therefore, we propose a dataset for machine…

    Submitted 9 October, 2023; originally announced October 2023.

  13. arXiv:2310.05553  [pdf, other]

    cs.CL

    Regulation and NLP (RegNLP): Taming Large Language Models

    Authors: Catalina Goanta, Nikolaos Aletras, Ilias Chalkidis, Sofia Ranchordas, Gerasimos Spanakis

    Abstract: The scientific innovation in Natural Language Processing (NLP) and more broadly in artificial intelligence (AI) is at its fastest pace to date. As large language models (LLMs) unleash a new era of automation, important debates emerge regarding the benefits and risks of their development, deployment and use. Currently, these debates have been dominated by often polarized narratives mainly led by th…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 9 pages, long paper at EMNLP 2023 proceedings

  14. arXiv:2310.05484  [pdf, other]

    cs.CL cs.CY cs.LG

    IDTraffickers: An Authorship Attribution Dataset to link and connect Potential Human-Trafficking Operations on Text Escort Advertisements

    Authors: Vageesh Saxena, Benjamin Bashpole, Gijs Van Dijck, Gerasimos Spanakis

    Abstract: Human trafficking (HT) is a pervasive global issue affecting vulnerable individuals, violating their fundamental human rights. Investigations reveal that a significant number of HT cases are associated with online advertisements (ads), particularly in escort markets. Consequently, identifying and connecting HT vendors has become increasingly challenging for Law Enforcement Agencies (LEAs). To addr…

    Submitted 9 October, 2023; originally announced October 2023.

  15. arXiv:2309.17050  [pdf, other]

    cs.CL

    Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models

    Authors: Antoine Louis, Gijs van Dijck, Gerasimos Spanakis

    Abstract: Many individuals are likely to face a legal dispute at some point in their lives, but their lack of understanding of how to navigate these complex issues often renders them vulnerable. The advancement of natural language processing opens new avenues for bridging this legal literacy gap through the development of automated legal aid systems. However, existing legal question answering (LQA) approach…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Under review. Code is available at https://github.com/maastrichtlawtech/lleqa

  16. arXiv:2309.12764  [pdf, other]

    cs.SI

    Multi-Modal Embeddings for Isolating Cross-Platform Coordinated Information Campaigns on Social Media

    Authors: Fabio Barbero, Sander op den Camp, Kristian van Kuijk, Carlos Soto García-Delgado, Gerasimos Spanakis, Adriana Iamnitchi

    Abstract: Coordinated multi-platform information operations are implemented in a variety of contexts on social media, including state-run disinformation campaigns, marketing strategies, and social activism. Characterized by the promotion of messages via multi-platform coordination, in which multiple user accounts, within a short time, post content advancing a shared informational agenda on multiple platform…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: To appear in the 5th Multidisciplinary International Symposium on Disinformation in Open Online Media (MISDOOM 2023)

    ACM Class: H.3.5; H.3.1

  17. arXiv:2306.05115  [pdf, ps, other]

    cs.CL cs.SI

    Closing the Loop: Testing ChatGPT to Generate Model Explanations to Improve Human Labelling of Sponsored Content on Social Media

    Authors: Thales Bertaglia, Stefan Huber, Catalina Goanta, Gerasimos Spanakis, Adriana Iamnitchi

    Abstract: Regulatory bodies worldwide are intensifying their efforts to ensure transparency in influencer marketing on social media through instruments like the Unfair Commercial Practices Directive (UCPD) in the European Union, or Section 5 of the Federal Trade Commission Act. Yet enforcing these obligations has proven to be highly problematic due to the sheer scale of the influencer market. The task of au…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted to The World Conference on eXplainable Artificial Intelligence, Lisbon, Portugal, July 2023

  18. arXiv:2305.02763  [pdf, other]

    cs.CY cs.CL cs.CR cs.LG

    VendorLink: An NLP approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets

    Authors: Vageesh Saxena, Nils Rethmeier, Gijs Van Dijck, Gerasimos Spanakis

    Abstract: The anonymity on the Darknet allows vendors to stay undetected by using multiple vendor aliases or frequently migrating between markets. Consequently, illegal markets and their connections are challenging to uncover on the Darknet. To identify relationships between illegal markets and their vendors, we propose VendorLink, an NLP-based approach that examines writing patterns to verify, identify, an…

    Submitted 4 May, 2023; originally announced May 2023.

  19. arXiv:2301.12847  [pdf, other]

    cs.IR cs.CL

    Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural Networks

    Authors: Antoine Louis, Gijs van Dijck, Gerasimos Spanakis

    Abstract: Statutory article retrieval (SAR), the task of retrieving statute law articles relevant to a legal question, is a promising application of legal text processing. In particular, high-quality SAR systems can improve the work efficiency of legal professionals and provide basic legal assistance to citizens in need at no cost. Unlike traditional ad-hoc information retrieval, where each document is cons…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: EACL 2023. Code is available at https://github.com/maastrichtlawtech/gdsr

  20. arXiv:2212.01159  [pdf, other]

    cs.LG

    Clustering individuals based on multivariate EMA time-series data

    Authors: Mandani Ntekouli, Gerasimos Spanakis, Lourens Waldorp, Anne Roefs

    Abstract: In the field of psychopathology, Ecological Momentary Assessment (EMA) methodological advancements have offered new opportunities to collect time-intensive, repeated and intra-individual measurements. This way, a large amount of data has become available, providing the means for further exploring mental disorders. Consequently, advanced machine learning (ML) methods are needed to understand data c…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 15 pages, 6 figures, Psychometrika

  21. arXiv:2211.07591  [pdf, other]

    cs.CL

    Imagination is All You Need! Curved Contrastive Learning for Abstract Sequence Modeling Utilized on Long Short-Term Dialogue Planning

    Authors: Justus-Jonas Erker, Stefan Schaffer, Gerasimos Spanakis

    Abstract: Inspired by the curvature of space-time (Einstein, 1921), we introduce Curved Contrastive Learning (CCL), a novel representation learning technique for learning the relative turn distance between utterance pairs in multi-turn dialogues. The resulting bi-encoder models can guide transformers as a response ranking model towards a goal in a zero-shot fashion by projecting the goal utterance and the c…

    Submitted 26 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: Accepted in ACL 2023 Findings

  22. arXiv:2204.01689  [pdf, other]

    cs.LG

    Using Explainable Boosting Machine to Compare Idiographic and Nomothetic Approaches for Ecological Momentary Assessment Data

    Authors: Mandani Ntekouli, Gerasimos Spanakis, Lourens Waldorp, Anne Roefs

    Abstract: Previous research on EMA data of mental disorders was mainly focused on multivariate regression-based approaches modeling each individual separately. This paper goes a step further towards exploring the use of non-linear interpretable machine learning (ML) models in classification problems. ML models can enhance the ability to accurately predict the occurrence of different behaviors by recognizing…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: 13 pages, 2 figures, accepted at the symposium 'Intelligent Data Analysis' (2022)

  23. arXiv:2108.11792  [pdf, other]

    cs.CL

    A Statutory Article Retrieval Dataset in French

    Authors: Antoine Louis, Gerasimos Spanakis

    Abstract: Statutory article retrieval is the task of automatically retrieving law articles relevant to a legal question. While recent advances in natural language processing have sparked considerable interest in many legal tasks, statutory article retrieval remains primarily untouched due to the scarcity of large-scale and high-quality annotated datasets. To address this bottleneck, we introduce the Belgian…

    Submitted 15 March, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: ACL 2022. Code and dataset are available at https://github.com/maastrichtlawtech/bsard

  24. arXiv:2105.05975  [pdf, other]

    cs.CL cs.LG

    Analysing The Impact Of Linguistic Features On Cross-Lingual Transfer

    Authors: Błażej Dolicki, Gerasimos Spanakis

    Abstract: There is an increasing amount of evidence that in cases with little or no data in a target language, training on a different language can yield surprisingly good results. However, currently there are no established guidelines for choosing the training (source) language. In an attempt to solve this issue, we thoroughly analyze a state-of-the-art multilingual model and try to determine what impacts good…

    Submitted 12 May, 2021; originally announced May 2021.

  25. arXiv:2012.05633  [pdf, other]

    cs.CV cs.LG

    Can we detect harmony in artistic compositions? A machine learning approach

    Authors: Adam Vandor, Marie van Vollenhoven, Gerhard Weiss, Gerasimos Spanakis

    Abstract: Harmony in visual compositions is a concept that cannot be defined or easily expressed mathematically, even by humans. The goal of the research described in this paper was to find a numerical representation of artistic compositions with different levels of harmony. We ask humans to rate a collection of grayscale images based on the harmony they convey. To represent the images, a set of special fea…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: 9 pages, ICAART 2021

  26. arXiv:2011.00244  [pdf, other]

    cs.CL

    Evaluating Bias In Dutch Word Embeddings

    Authors: Rodrigo Alejandro Chávez Mulsa, Gerasimos Spanakis

    Abstract: Recent research in Natural Language Processing has revealed that word embeddings can encode social biases present in the training data, which can affect minorities in real-world applications. This paper explores the gender bias implicit in Dutch embeddings while investigating whether English language based approaches can also be used in Dutch. We implement the Word Embeddings Association Test (WEAT…

    Submitted 3 November, 2020; v1 submitted 31 October, 2020; originally announced November 2020.

    Comments: Accepted at GeBNLP 2020, data at https://github.com/Noixas/Official-Evaluating-Bias-In-Dutch

  27. arXiv:2010.16228  [pdf, ps, other]

    cs.CL cs.AI cs.LG stat.ML

    "Thy algorithm shalt not bear false witness": An Evaluation of Multiclass Debiasing Methods on Word Embeddings

    Authors: Thalea Schlender, Gerasimos Spanakis

    Abstract: With the vast development and employment of artificial intelligence applications, research into the fairness of these algorithms has increased. Specifically, in the natural language processing domain, it has been shown that social biases persist in word embeddings and are thus in danger of amplifying these biases when used. As an example of social bias, religious biases are shown to persist i…

    Submitted 4 November, 2020; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: 15 pages, presented at BNAIC/BENELEARN 2020, data/code at https://github.com/thaleaschlender/An-Evaluation-of-Multiclass-Debiasing-Methods-on-Word-Embeddings

  28. arXiv:2005.12143  [pdf, other]

    cs.CL

    Adapting End-to-End Speech Recognition for Readable Subtitles

    Authors: Danni Liu, Jan Niehues, Gerasimos Spanakis

    Abstract: Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. Therefore, this work focuses on ASR with output compression, a task challenging for supervised approaches due to the scarcity of training data. We first investi…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: IWSLT 2020

  29. arXiv:2005.11185  [pdf, other]

    cs.CL cs.SD eess.AS

    Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

    Authors: Danni Liu, Gerasimos Spanakis, Jan Niehues

    Abstract: Encoder-decoder models provide a generic architecture for sequence-to-sequence tasks such as speech recognition and translation. While offline systems are often evaluated on quality metrics like word error rates (WER) and BLEU, latency is also a crucial factor in many practical use-cases. We propose three latency reduction techniques for chunk-based incremental inference and evaluate their efficie…

    Submitted 13 October, 2020; v1 submitted 22 May, 2020; originally announced May 2020.

    Comments: Interspeech 2020

  30. arXiv:2001.11857  [pdf, ps, other]

    cs.CL

    Hybrid Tiled Convolutional Neural Networks for Text Sentiment Classification

    Authors: Maria Mihaela Trusca, Gerasimos Spanakis

    Abstract: The tiled convolutional neural network (tiled CNN) has been applied only to computer vision for learning invariances. We adjust its architecture to NLP to improve the extraction of the most salient features for sentiment analysis. Knowing that the major drawback of the tiled CNN in the NLP field is its inflexible filter structure, we propose a novel architecture called hybrid tiled CNN that applie…

    Submitted 31 January, 2020; originally announced January 2020.

    Comments: 8 pages, 2 figures, accepted for publication in the 12th International Conference on Agents and Artificial Intelligence (ICAART 2020), Malta, 22-24 February 2020

  31. arXiv:1909.09974  [pdf, other]

    cs.LG cs.CV eess.IV

    LoGANv2: Conditional Style-Based Logo Generation with Generative Adversarial Networks

    Authors: Cedric Oeldorf, Gerasimos Spanakis

    Abstract: Domains such as logo synthesis, in which the data has a high degree of multi-modality, still pose a challenge for generative adversarial networks (GANs). Recent research shows that progressive training (ProGAN) and mapping network extensions (StyleGAN) enable both increased training stability for higher dimensional problems and better feature separation within the embedded latent space. However, t…

    Submitted 22 September, 2019; originally announced September 2019.

    Comments: accepted for poster presentation at ICMLA 2019, data+code available: https://github.com/cedricoeldorf/ConditionalStyleGAN

  32. arXiv:1909.02809  [pdf, other]

    cs.CL cs.AI

    #MeTooMaastricht: Building a chatbot to assist survivors of sexual harassment

    Authors: Tobias Bauer, Emre Devrim, Misha Glazunov, William Lopez Jaramillo, Balaganesh Mohan, Gerasimos Spanakis

    Abstract: Inspired by the recent social movement of #MeToo, we are building a chatbot to assist survivors of sexual harassment cases (designed for the city of Maastricht but can easily be extended). The motivation behind this work is twofold: properly assist survivors of such events by directing them to appropriate institutions that can offer them help and increase the incident documentation so as to gather…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: 19 pages, accepted at SoGood2019 workshop (ECMLPKDD2019)

  33. arXiv:1903.02540  [pdf, other]

    cs.LG stat.ML

    Autoregressive Convolutional Recurrent Neural Network for Univariate and Multivariate Time Series Prediction

    Authors: Matteo Maggiolo, Gerasimos Spanakis

    Abstract: Time Series forecasting (univariate and multivariate) is a problem of high complexity due to the different patterns that have to be detected in the input, ranging from high- to low-frequency ones. In this paper we propose a new model for time series prediction that utilizes convolutional layers for feature extraction, a recurrent encoder and a linear autoregressive component. We motivate the model an…

    Submitted 6 March, 2019; originally announced March 2019.

    Comments: ESAN2019 accepted paper, 6 pages

  34. arXiv:1901.11467  [pdf, other]

    cs.CL cs.AI

    Towards Controlled Transformation of Sentiment in Sentences

    Authors: Wouter Leeftink, Gerasimos Spanakis

    Abstract: An obstacle to the development of many natural language processing products is the vast amount of training examples necessary to get satisfactory results. The generation of these examples is often a tedious and time-consuming task. This paper proposes a method to transform the sentiment of sentences in order to limit the work necessary to generate more training data. This means that one…

    Submitted 31 January, 2019; originally announced January 2019.

    Comments: Accepted at ICAART 2019, 8 pages

  35. arXiv:1901.11462  [pdf, other]

    cs.CL cs.AI

    Exploring the context of recurrent neural network based conversational agents

    Authors: Raffaele Piccini, Gerasimos Spanakis

    Abstract: Conversational agents have begun to rise both in the academic (in terms of research) and commercial (in terms of applications) world. This paper investigates the task of building a non-goal driven conversational agent, using neural network generative models and analyzes how the conversation context is handled. It compares a simpler Encoder-Decoder with a Hierarchical Recurrent Encoder-Decoder arch…

    Submitted 31 January, 2019; originally announced January 2019.

    Comments: Accepted at ICAART 2019, 10 pages

  36. arXiv:1810.10395  [pdf, other]

    cs.CV cs.AI cs.LG

    LoGAN: Generating Logos with a Generative Adversarial Neural Network Conditioned on color

    Authors: Ajkel Mino, Gerasimos Spanakis

    Abstract: Designing a logo is a long, complicated, and expensive process for any designer. However, recent advancements in generative algorithms provide models that could offer a possible solution. Logos are multi-modal, have very few categorical properties, and do not have a continuous latent space. Yet, conditional generative adversarial networks can be used to generate logos that could help designers in…

    Submitted 23 October, 2018; originally announced October 2018.

    Comments: 6 pages, ICMLA18

  37. arXiv:1810.07791  [pdf, other]

    cs.CY cs.HC cs.LG cs.NE stat.ML

    MaaSim: A Liveability Simulation for Improving the Quality of Life in Cities

    Authors: Dominika Woszczyk, Gerasimos Spanakis

    Abstract: Urbanism is no longer planned on paper thanks to powerful models and 3D simulation platforms. However, current work is not open to the public and lacks an optimisation agent that could help in decision making. This paper describes the creation of an open-source simulation based on an existing Dutch liveability score with a built-in AI module. Features are selected using feature engineering and Ran…

    Submitted 13 October, 2018; originally announced October 2018.

    Comments: 16 pages

  38. arXiv:1712.03249  [pdf, other]

    cs.AI cs.CL cs.IR

    Social Emotion Mining Techniques for Facebook Posts Reaction Prediction

    Authors: Florian Krebs, Bruno Lubascher, Tobias Moers, Pieter Schaap, Gerasimos Spanakis

    Abstract: As of February 2016 Facebook allows users to express their experienced emotions about a post by using five so-called 'reactions'. This research paper proposes and evaluates alternative methods for predicting these reactions to user posts on public pages of firms/companies (like supermarket chains). For this purpose, we collected posts (and their reactions) from Facebook pages of large supermarket…

    Submitted 8 December, 2017; originally announced December 2017.

    Comments: 10 pages, 13 figures and accepted at ICAART 2018. (Dataset: https://github.com/jerryspan/FacebookR)

  39. arXiv:1710.05780  [pdf, other]

    cs.CL cs.AI cs.IR

    A retrieval-based dialogue system utilizing utterance and context embeddings

    Authors: Alexander Bartl, Gerasimos Spanakis

    Abstract: Finding semantically rich and computer-understandable representations for textual dialogues, utterances and words is crucial for dialogue systems (or conversational agents), as their performance mostly depends on understanding the context of conversations. Recent research aims at finding distributed vector representations (embeddings) for words, such that semantically similar words are relatively…

    Submitted 20 October, 2017; v1 submitted 16 October, 2017; originally announced October 2017.

    Comments: A shorter version is accepted at ICMLA2017 conference; acknowledgement added; typos corrected

  40. arXiv:1710.03323  [pdf, other]

    cs.IR cs.LG

    Massive Open Online Courses Temporal Profiling for Dropout Prediction

    Authors: Tom Rolandus Hagedoorn, Gerasimos Spanakis

    Abstract: Massive Open Online Courses (MOOCs) are attracting the attention of people all over the world. Regardless of the platform, numbers of registrants for online courses are impressive, but at the same time, completion rates are disappointing. Understanding the mechanisms of dropping out based on the learner profile arises as a crucial task in MOOCs, since it will allow intervening at the right moment in o…

    Submitted 9 October, 2017; originally announced October 2017.

    Comments: 8 pages, ICTAI17

  41. arXiv:1710.02368  [pdf, other]

    stat.ML cs.DC cs.LG

    Accumulated Gradient Normalization

    Authors: Joeri Hermans, Gerasimos Spanakis, Rico Möckel

    Abstract: This work addresses the instability in asynchronous data parallel optimization. It does so by introducing a novel distributed optimizer which is able to efficiently optimize a centralized model under communication constraints. The optimizer achieves this by pushing a normalized sequence of first-order gradients to a parameter server. This implies that the magnitude of a worker delta is smaller com…

    Submitted 6 October, 2017; originally announced October 2017.

    Comments: 16 pages, 12 figures, ACML2017

  42. arXiv:1707.00331  [pdf, other]

    cs.IR

    Reciprocal Recommender System for Learners in Massive Open Online Courses (MOOCs)

    Authors: Sankalp Prabhakar, Gerasimos Spanakis, Osmar Zaiane

    Abstract: Massive open online courses (MOOC) describe platforms where users with completely different backgrounds subscribe to various courses on offer. MOOC forums and discussion boards offer learners a medium to communicate with each other and maximize their learning outcomes. However, oftentimes learners are hesitant to approach each other for different reasons (being shy, don't know the right match, etc…

    Submitted 2 July, 2017; originally announced July 2017.

    Comments: 10 pages, accepted as full paper @ ICWL 2017

  43. arXiv:1607.01582  [pdf, other]

    cs.LG

    Bagged Boosted Trees for Classification of Ecological Momentary Assessment Data

    Authors: Gerasimos Spanakis, Gerhard Weiss, Anne Roefs

    Abstract: Ecological Momentary Assessment (EMA) data is organized in multiple levels (per-subject, per-day, etc.) and this particular structure should be taken into account in machine learning algorithms used in EMA like decision trees and their variants. We propose a new algorithm called BBT (standing for Bagged Boosted Trees) that is enhanced by an over/under-sampling method and can provide better estimates…

    Submitted 6 July, 2016; originally announced July 2016.

    Comments: to be presented at ECAI2016

  44. AMSOM: Adaptive Moving Self-organizing Map for Clustering and Visualization

    Authors: Gerasimos Spanakis, Gerhard Weiss

    Abstract: Self-Organizing Map (SOM) is a neural network model which is used to obtain a topology-preserving mapping from the (usually high dimensional) input/feature space to an output/map space of fewer dimensions (usually two or three in order to facilitate visualization). Neurons in the output space are connected with each other but this structure remains fixed throughout training and learning is achieve…

    Submitted 19 May, 2016; originally announced May 2016.

    Comments: ICAART 2016 accepted full paper