Showing 1–26 of 26 results for author: Dias, G

Searching in archive cs.
  1. arXiv:2407.21330  [pdf]

    cs.CL

    Performance of Recent Large Language Models for a Low-Resourced Language

    Authors: Ravindu Jayakody, Gihan Dias

    Abstract: Large Language Models (LLMs) have shown significant advances in the past year. In addition to new versions of GPT and Llama, several other LLMs have been introduced recently. Some of these are open models available for download and modification. Although multilingual large language models have been available for some time, their performance on low-resourced languages such as Sinhala has been poo…

    Submitted 31 July, 2024; originally announced July 2024.

  2. arXiv:2404.19359  [pdf, other]

    cs.CL cs.AI

    Evaluating Lexicon Incorporation for Depression Symptom Estimation

    Authors: Kirill Milintsevich, Gaël Dias, Kairit Sirts

    Abstract: This paper explores the impact of incorporating sentiment, emotion, and domain-specific lexicons into a transformer-based model for depression symptom estimation. Lexicon information is added by marking the words in the input transcripts of patient-therapist conversations as well as in social media posts. Overall results show that the introduction of external knowledge within pre-trained language…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted to Clinical NLP workshop at NAACL 2024

  3. arXiv:2403.00438  [pdf, other]

    cs.CL

    Your Model Is Not Predicting Depression Well And That Is Why: A Case Study of PRIMATE Dataset

    Authors: Kirill Milintsevich, Kairit Sirts, Gaël Dias

    Abstract: This paper addresses the quality of annotations in mental health datasets used for NLP-based depression level estimation from social media texts. While previous research relies on social media-based datasets annotated with binary categories, i.e. depressed or non-depressed, recent datasets such as D2S and PRIMATE aim for nuanced annotations using PHQ-9 symptoms. However, most of these datasets rel…

    Submitted 1 March, 2024; originally announced March 2024.

  4. arXiv:2402.16815  [pdf, other]

    cs.GR

    2+2D Texture for Full Positive Parallax Effect

    Authors: Alexandre Yip Gonçalves Dias, Marcelo Knörich Zuffo

    Abstract: The representation of parallax on virtual environment is still a problem to be studied. Common algorithms, such as Bump Mapping, Parallax Mapping and Displacement Mapping, treats this problem for small disparity between a real object and a simplified model. This work will introduce a new texture structure and one possible render algorithm able to display parallax for large disparities, it is an ap…

    Submitted 26 February, 2024; originally announced February 2024.

  5. arXiv:2311.15386  [pdf, other]

    eess.IV cs.LG physics.med-ph

    Spectro-ViT: A Vision Transformer Model for GABA-edited MRS Reconstruction Using Spectrograms

    Authors: Gabriel Dias, Rodrigo Pommot Berto, Mateus Oliveira, Lucas Ueda, Sergio Dertkigil, Paula D. P. Costa, Amirmohammad Shamaei, Roberto Souza, Ashley Harris, Leticia Rittner

    Abstract: Purpose: To investigate the use of a Vision Transformer (ViT) to reconstruct/denoise GABA-edited magnetic resonance spectroscopy (MRS) from a quarter of the typically acquired number of transients using spectrograms. Theory and Methods: A quarter of the typically acquired number of transients collected in GABA-edited MRS scans are pre-processed and converted to a spectrogram image representation…

    Submitted 26 November, 2023; originally announced November 2023.

  6. arXiv:2107.02983  [pdf]

    cs.CL

    SinSpell: A Comprehensive Spelling Checker for Sinhala

    Authors: Upuli Liyanapathirana, Kaumini Gunasinghe, Gihan Dias

    Abstract: We have built SinSpell, a comprehensive spelling checker for the Sinhala language which is spoken by over 16 million people, mainly in Sri Lanka. However, until recently, Sinhala had no spelling checker with acceptable coverage. Sinspell is still the only open source Sinhala spelling checker. SinSpell identifies possible spelling errors and suggests corrections. It also contains a module which aut…

    Submitted 6 July, 2021; originally announced July 2021.

  7. arXiv:2012.13436  [pdf, other]

    cs.CL

    ThamizhiUDp: A Dependency Parser for Tamil

    Authors: Kengatharaiyer Sarveswaran, Gihan Dias

    Abstract: This paper describes how we developed a neural-based dependency parser, namely ThamizhiUDp, which provides a complete pipeline for the dependency parsing of the Tamil language text using Universal Dependency formalism. We have considered the phases of the dependency parsing pipeline and identified tools and resources in each of these phases to improve the accuracy and to tackle data scarcity. Tham…

    Submitted 24 December, 2020; originally announced December 2020.

    Comments: 5 Pages, Published at ICON2020: 17th International Conference on Natural Language Processing (December 18-21, 2020)

  8. arXiv:2011.02821  [pdf]

    cs.CL

    Data Augmentation and Terminology Integration for Domain-Specific Sinhala-English-Tamil Statistical Machine Translation

    Authors: Aloka Fernando, Surangika Ranathunga, Gihan Dias

    Abstract: Out of vocabulary (OOV) is a problem in the context of Machine Translation (MT) in low-resourced languages. When source and/or target languages are morphologically rich, it becomes even worse. Bilingual list integration is an approach to address the OOV problem. This allows more words to be translated than are in the training data. However, since bilingual lists contain words in the base form, it…

    Submitted 3 February, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

  9. arXiv:2007.00999  [pdf]

    cs.DB

    ER model Partitioning: Towards Trustworthy Automated Systems Development

    Authors: Dhammika Pieris, M. C Wijegunesekera, N. G. J Dias

    Abstract: In database development, a conceptual model is created, in the form of an Entity-relationship(ER) model, and transformed to a relational database schema (RDS) to create the database. However, some important information represented on the ER model may not be transformed and represented on the RDS. This situation causes a loss of information during the transformation process. With a view to preservi…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 9 pages, 5 figures. International Journal of Advanced Computer Science and Applications, 2020

  10. arXiv:2002.12482  [pdf]

    cs.SE

    An Improved Generic ER Schema for Conceptual Modeling of Information Systems

    Authors: Dhammika Pieris, M. C Wijegunesekera, N. G. J. Dias

    Abstract: The Entity-Relationship (ER) model is widely used for creating ER schemas for modeling application domains in the field of Information Systems development. However, when an ER schema is transformed to a Relational Database Schema (RDS), some important information on the ER schema may not be represented meaningfully on the RDS. This causes a loss of information during the transformation process. Al…

    Submitted 27 February, 2020; originally announced February 2020.

    Comments: 5 pages, 5 figures, Proceedings of the Asia International Conference on Multidisciplinary Research 2019, Colombo, Sri Lanka, Vol.-1

  11. arXiv:1904.07656  [pdf, other]

    cs.CY cs.AI cs.LG

    The Verbal and Non Verbal Signals of Depression -- Combining Acoustics, Text and Visuals for Estimating Depression Level

    Authors: Syed Arbaaz Qureshi, Mohammed Hasanuzzaman, Sriparna Saha, Gaël Dias

    Abstract: Depression is a serious medical condition that is suffered by a large number of people around the world. It significantly affects the way one feels, causing a persistent lowering of mood. In this paper, we propose a novel attention-based deep neural network which facilitates the fusion of various modalities. We use this network to regress the depression level. Acoustic, text and visual modalities…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: 10 pages including references, 2 figures

  12. Measuring the Correlation of Personal Identity Documents in Structured Format

    Authors: Sachithra Dangalla, Chanaka Lakmal, Chamin Wickramarathna, Chandu Herath, Gihan Dias, Shantha Fernando

    Abstract: Personal identity documents play a major role in every citizen's life and the authorities responsible for validating them typically require human intervention to manually cross-check multiple documents belonging to an individual. The world is rapidly replacing physical documents with digital documents where every piece of data is stored digitally in a machine-readable and structured format. In thi…

    Submitted 11 November, 2021; v1 submitted 7 January, 2019; originally announced January 2019.

    Comments: 17th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2018)

  13. IDStack -- The Common Protocol for Document Verification built on Digital Signatures

    Authors: Chanaka Lakmal, Sachithra Dangalla, Chandu Herath, Chamin Wickramarathna, Gihan Dias, Shantha Fernando

    Abstract: The use of physical documents is inconvenient and inefficient in today's world, which motivates us to move towards the use of digital documents. Digital documents can solve many problems of inefficiency of data management but proving their authenticity and verifying them is still a problem. This paper presents a solution for this problem using text extraction, digital signatures and a correlation…

    Submitted 12 November, 2021; v1 submitted 7 January, 2019; originally announced January 2019.

    Comments: 35th National Information Technology Conference (NITC 2017) in partnership with South East Asia Regional Computer Confederation (SEARCC)

  14. arXiv:1807.10076  [pdf, other]

    cs.CL

    Concurrent Learning of Semantic Relations

    Authors: Georgios Balikas, Gaël Dias, Rumen Moraliyski, Massih-Reza Amini

    Abstract: Discovering whether words are semantically related and identifying the specific semantic relation that holds between them is of crucial importance for NLP as it is essential for tasks like query expansion in IR. Within this context, different methodologies have been proposed that either exclusively focus on a single lexical relation (e.g. hypernymy vs. random) or learn specific classifiers capable…

    Submitted 30 July, 2018; v1 submitted 26 July, 2018; originally announced July 2018.

    Comments: 10 pages

  15. arXiv:1607.03607  [pdf, other]

    cs.NI

    Cloud Empowered Self-Managing WSNs

    Authors: Gabriel Martins Dias, Cintia Borges Margi, Filipe C. P. de Oliveira, Boris Bellalta

    Abstract: Wireless Sensor Networks (WSNs) are composed of low powered and resource-constrained wireless sensor nodes that are not capable of performing high-complexity algorithms. Integrating these networks into the Internet of Things (IoT) facilitates their real-time optimization based on remote data visualization and analysis. This work describes the design and implementation of a scalable system architec…

    Submitted 13 July, 2016; originally announced July 2016.

    Comments: 12 pages, 4200 words, 4 figures, 2 tables, submitted to "IEEE Communications Magazine" special issue on the Internet of Things

    ACM Class: C.1.3; C.2.4

  16. arXiv:1607.03443  [pdf, other]

    cs.NI

    A Survey about Prediction-Based Data Reduction in Wireless Sensor Networks

    Authors: Gabriel Martins Dias, Boris Bellalta, Simon Oechsner

    Abstract: One of the main characteristics of Wireless Sensor Networks (WSNs) is the constrained energy resources of their wireless sensor nodes. Although this issue has been addressed in several works and got a lot of attention within the years, the most recent advances pointed out that the energy harvesting and wireless charging techniques may offer means to overcome such a limitation. Consequently, an iss…

    Submitted 12 July, 2016; originally announced July 2016.

    Comments: 37 pages, 6 figures, 3 tables. Submitted to ACM Computing Surveys

    ACM Class: C.2.4; I.2; A.1

  17. Performance Optimization of WSNs using External Information

    Authors: Gabriel Martins Dias

    Abstract: The goal of this work is to describe a self-management system that correlates data sensed by different Wireless Sensor Networks (WSNs) and adjusts the number of active nodes in each network to provide an appropriate amount of measurements. The architecture considers the factors that make the external data relevant to the local network, such as the distance between covered areas, the relation betwe…

    Submitted 12 July, 2016; originally announced July 2016.

    Comments: Published in: IEEE 14th International Symposium and Workshops on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2013 (copyright has been transferred to IEEE)

    ACM Class: D.2.11; C.2.1

  18. arXiv:1606.02193  [pdf, other]

    cs.NI cs.LG eess.SY

    Adapting Sampling Interval of Sensor Networks Using On-Line Reinforcement Learning

    Authors: Gabriel Martins Dias, Maddalena Nurchis, Boris Bellalta

    Abstract: Monitoring Wireless Sensor Networks (WSNs) are composed of sensor nodes that report temperature, relative humidity, and other environmental parameters. The time between two successive measurements is a critical parameter to set during the WSN configuration because it can impact the WSN's lifetime, the wireless medium contention and the quality of the reported data. As trends in monitored parameter…

    Submitted 12 July, 2016; v1 submitted 7 June, 2016; originally announced June 2016.

    Comments: 6 pages, 2 figures, submitted to the IEEE World Forum on Internet of Things 2016

    ACM Class: C.2.4; I.2.1

  19. arXiv:1605.09011  [pdf, other]

    eess.SY cs.NI

    A Self-Managed Architecture for Sensor Networks Based on Real Time Data Analysis

    Authors: Gabriel Martins Dias, Toni Adame, Boris Bellalta, Simon Oechsner

    Abstract: Wireless sensor networks (WSNs) have been adopted as merely data producers for years. However, the data collected by WSNs can also be used to manage their operation and avoid unnecessary measurements that do not provide any new knowledge about the environment. The benefits are twofold because wireless sensor nodes may save their limited energy resources and also reduce the wireless medium occupanc…

    Submitted 12 July, 2016; v1 submitted 29 May, 2016; originally announced May 2016.

    Comments: 3 pages, 3 figures, demo proposal, accepted in the Future Technologies Conference IEEE 2016

    ACM Class: D.2.11; H.4.3; C.2.1

  20. arXiv:1604.01275  [pdf, other]

    cs.NI

    On the importance and feasibility of forecasting data in sensors

    Authors: Gabriel Martins Dias, Boris Bellalta, Simon Oechsner

    Abstract: The first generation of wireless sensor nodes have constrained energy resources and computational power, which discourages applications to process any task other than measuring and transmitting towards a central server. However, nowadays, sensor networks tend to be incorporated into the Internet of Things and the hardware evolution may change the old strategy of avoiding data computation in the se…

    Submitted 5 April, 2016; originally announced April 2016.

    Comments: 30 pages and 12 figures. This paper has been submitted to the Transactions on Mobile Computing journal

    MSC Class: 62P30 ACM Class: C.2.4; C.2.1

  21. The Impact of Dual Prediction Schemes on the Reduction of the Number of Transmissions in Sensor Networks

    Authors: Gabriel Martins Dias, Boris Bellalta, Simon Oechsner

    Abstract: Future Internet of Things (IoT) applications will require that billions of wireless devices transmit data to the cloud frequently. However, the wireless medium access is pointed as a problem for the next generations of wireless networks; hence, the number of data transmissions in Wireless Sensor Networks (WSNs) can quickly become a bottleneck, disrupting the exponential growth in the number of int…

    Submitted 30 August, 2017; v1 submitted 29 September, 2015; originally announced September 2015.

    Comments: 30 pages, 8 figures

    MSC Class: 62P30 ACM Class: C.2.4; C.2.1

    Journal ref: Computer Communications 112C (2017) pp. 58-72

  22. arXiv:1505.03662  [pdf, other]

    cs.AI cs.CY

    Predicting Occupancy Trends in Barcelona's Bicycle Service Stations Using Open Data

    Authors: Gabriel Martins Dias, Boris Bellalta, Simon Oechsner

    Abstract: In 2008, the CEO of the company that manages and maintains the public bicycle service in Barcelona recognized that one may not expect to always find a place to leave the rented bike nearby their destination, similarly to the case when, driving a car, people may not find a parking lot. In this work, we make predictions about the statuses of the stations of the public bicycle service in Barcelona. W…

    Submitted 6 August, 2015; v1 submitted 14 May, 2015; originally announced May 2015.

    Comments: 7 pages, 7 figures, 1 table, accepted to SAI Intelligent Systems Conference 2015

    MSC Class: 68-06 ACM Class: I.2.M

  23. arXiv:1501.01254  [pdf]

    cs.CL

    Unknown Words Analysis in POS tagging of Sinhala Language

    Authors: A. J. P. M. P. Jayaweera, N. G. J. Dias

    Abstract: Part of Speech (POS) is a very vital topic in Natural Language Processing (NLP) task in any language, which involves analysing the construction of the language, behaviours and the dynamics of the language, the knowledge that could be utilized in computational linguistics analysis and automation applications. In this context, dealing with unknown words (words do not appear in the lexicon referred a…

    Submitted 6 January, 2015; originally announced January 2015.

    Comments: 7 pages

    ACM Class: I.2.7

  24. arXiv:1409.1001  [pdf, ps, other]

    cs.NI

    Towards information-centric WSN simulations

    Authors: Gabriel Martins Dias, Boris Bellalta, Simon Oechsner

    Abstract: In pursuance of integrating Wireless Sensor Networks (WSNs) with other systems, the use of techniques from other fields, such as machine learning and information processing, are becoming more common. Therefore, we faced the problem of missing network simulations that are not only focused on the packet exchange between network elements, but also in the data that is transmitted between them. In othe…

    Submitted 3 September, 2014; originally announced September 2014.

    Comments: Published in: A. Förster, C. Sommer, T. Steinbach, M. Wählisch (Eds.), Proc. of 1st OMNeT++ Community Summit, Hamburg, Germany, September 2, 2014, arXiv:1409.0093, 2014

    Report number: OMNET/2014/09

  25. arXiv:1407.2989  [pdf]

    cs.CL

    Hidden Markov Model Based Part of Speech Tagger for Sinhala Language

    Authors: A. J. P. M. P. Jayaweera, N. G. J. Dias

    Abstract: In this paper we present a fundamental lexical semantics of Sinhala language and a Hidden Markov Model (HMM) based Part of Speech (POS) Tagger for Sinhala language. In any Natural Language processing task, Part of Speech is a very vital topic, which involves analysing of the construction, behaviour and the dynamics of the language, which the knowledge could utilized in computational linguistics an…

    Submitted 10 July, 2014; originally announced July 2014.

    Comments: This paper contains 15 Pages, the paper was presented at ICONACC 2014, organized by Manipur University, Imphal, India

    ACM Class: I.2.7

    Journal ref: International Journal on Natural Language Computing (IJNLC) Vol. 3, No.3, June 2014. Pages 9-23

  26. A Centralized Mechanism to Make Predictions Based on Data From Multiple WSNs

    Authors: Gabriel Martins Dias, Simon Oechsner, Boris Bellalta

    Abstract: In this work, we present a method that exploits a scenario with inter-Wireless Sensor Networks (WSNs) information exchange by making predictions and adapting the workload of a WSN according to their outcomes. We show the feasibility of an approach that intelligently utilizes information produced by other WSNs that may or not belong to the same administrative domain. To illustrate how the predictio…

    Submitted 12 July, 2016; v1 submitted 3 July, 2014; originally announced July 2014.

    Comments: 10 pages, simulation results and figures. Published in

    ACM Class: D.2.11; C.2.1

    Journal ref: Multiple Access Communications, Lecture Notes in Computer Science, Volume 9305, pp 19-32, 2015