Showing 1–22 of 22 results for author: Rutherford, A

Searching in archive cs.
  1. arXiv:2408.15099  [pdf, other]

    cs.LG cs.AI cs.RO

    No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery

    Authors: Alexander Rutherford, Michael Beukman, Timon Willi, Bruno Lacerda, Nick Hawes, Jakob Foerster

    Abstract: What data or environments to use for training to improve downstream performance is a longstanding and very topical question in reinforcement learning. In particular, Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula enable agents to be robust to in- and out-of-distribution tasks. We ask to what extent these methods are themselves robust when app…

    Submitted 29 August, 2024; v1 submitted 27 August, 2024; originally announced August 2024.

  2. arXiv:2406.08055  [pdf, other]

    cs.CL

    Learning Job Title Representation from Job Description Aggregation Network

    Authors: Napat Laosaengpha, Thanit Tativannarat, Chawan Piansaddhayanon, Attapol Rutherford, Ekapol Chuangsuwanich

    Abstract: Learning job title representation is a vital process for developing automatic human resource tools. To do so, existing methods primarily rely on learning the title representation through skills extracted from the job description, neglecting the rich and diverse content within. Thus, we propose an alternative framework for learning job titles through their respective job description (JD) and utiliz…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: to be published in Findings of the Association for Computational Linguistics: ACL 2024

  3. arXiv:2406.06000  [pdf]

    cs.CL

    ThaiCoref: Thai Coreference Resolution Dataset

    Authors: Pontakorn Trakuekul, Wei Qi Leong, Charin Polpanumas, Jitkapat Sawatphol, William Chandra Tjhi, Attapol T. Rutherford

    Abstract: While coreference resolution is a well-established research area in Natural Language Processing (NLP), research focusing on the Thai language remains limited due to the lack of large annotated corpora. In this work, we introduce ThaiCoref, a dataset for Thai coreference resolution. Our dataset comprises 777,271 tokens, 44,082 mentions and 10,429 entities across four text genres: university essays, new…

    Submitted 9 June, 2024; originally announced June 2024.

  4. arXiv:2405.07586  [pdf, other]

    cs.CL

    Thai Universal Dependency Treebank

    Authors: Panyut Sriwirote, Wei Qi Leong, Charin Polpanumas, Santhawat Thanyawong, William Chandra Tjhi, Wirote Aroonmanakun, Attapol T. Rutherford

    Abstract: Automatic dependency parsing of Thai sentences has been underexplored, as evidenced by the lack of large Thai dependency treebanks with complete dependency structures and the lack of a published systematic evaluation of state-of-the-art models, especially transformer-based parsers. In this work, we address these problems by introducing Thai Universal Dependency Treebank (TUD), a new largest Thai t…

    Submitted 13 May, 2024; originally announced May 2024.

  5. arXiv:2311.12475  [pdf, other]

    cs.CL cs.AI

    PhayaThaiBERT: Enhancing a Pretrained Thai Language Model with Unassimilated Loanwords

    Authors: Panyut Sriwirote, Jalinee Thapiang, Vasan Timtong, Attapol T. Rutherford

    Abstract: While WangchanBERTa has become the de facto standard in transformer-based Thai language modeling, it still has shortcomings in regard to the understanding of foreign words, most notably English words, which are often borrowed without orthographic assimilation into Thai in many contexts. We identify the lack of foreign vocabulary in WangchanBERTa's tokenizer as the main source of these shortcomings…

    Submitted 28 December, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: revised to fix formatting error, content unchanged

  6. arXiv:2311.10090  [pdf, other]

    cs.LG cs.AI cs.MA

    JaxMARL: Multi-Agent RL Environments in JAX

    Authors: Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Gardar Ingvarsson, Timon Willi, Akbir Khan, Christian Schroeder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktaschel, Chris Lu, Jakob Nicolaus Foerster

    Abstract: Benchmarks play an important role in the development of machine learning algorithms. For example, research in reinforcement learning (RL) has been heavily influenced by available environments and benchmarks. However, RL environments are traditionally run on the CPU, limiting their scalability with typical academic compute. Recent advancements in JAX have enabled the wider use of hardware accelerat…

    Submitted 19 December, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

  7. arXiv:2204.07073  [pdf, other]

    cs.CY

    Longitudinal Complex Dynamics of Labour Markets Reveal Increasing Polarisation

    Authors: Shahad Althobaiti, Ahmad Alabdulkareem, Judy Hanwen Shen, Iyad Rahwan, Morgan Frank, Esteban Moro, Alex Rutherford

    Abstract: In this paper, we conduct a longitudinal analysis of the structure of labour markets in the US over seven decades of technological, economic and policy change. We make use of network science, natural language processing and machine learning to uncover structural changes in the labour market over time. We find a steady rate of both disappearance of jobs and a shift in the required work tasks, despite mu…

    Submitted 14 April, 2022; originally announced April 2022.

  8. arXiv:2202.12856  [pdf, ps, other]

    physics.soc-ph cs.CY

    The Dynamic Resilience of Urban Labour Networks

    Authors: Xiangnan Feng, Alex Rutherford

    Abstract: Understanding and potentially predicting or even controlling urban labour markets represents a great challenge for workers and policy makers alike. Cities are effective engines of economic growth and prosperity and incubate complex dynamics within their labour market, and the labour markets they support demonstrate considerable diversity. This presents a challenge to policy makers who would like t…

    Submitted 25 February, 2022; originally announced February 2022.

    Comments: 27 pages, 5 figures

  9. arXiv:2108.10755  [pdf, ps, other]

    cs.CL

    More Than Words: Collocation Tokenization for Latent Dirichlet Allocation Models

    Authors: Jin Cheevaprawatdomrong, Alexandra Schofield, Attapol T. Rutherford

    Abstract: Traditionally, Latent Dirichlet Allocation (LDA) ingests words in a collection of documents to discover their latent topics using word-document co-occurrences. However, it is unclear how to achieve the best results for languages without marked word boundaries such as Chinese and Thai. Here, we explore the use of Pearson's chi-squared test, t-statistics, and Word Pair Encoding (WPE) to produce toke…

    Submitted 24 August, 2021; originally announced August 2021.

  10. arXiv:2103.11225  [pdf]

    physics.soc-ph cs.SI

    The Network Limits of Infectious Disease Control via Occupation-Based Targeting

    Authors: Demetris Avraam, Nick Obradovich, Niccoló Pescetelli, Manuel Cebrian, Alex Rutherford

    Abstract: Policymakers commonly employ non-pharmaceutical interventions to manage the scale and severity of pandemics. Of non-pharmaceutical interventions, social distancing policies -- designed to reduce person-to-person pathogenic spread -- have risen to recent prominence. In particular, stay-at-home policies of the sort widely implemented around the globe in response to the COVID-19 pandemic have proven…

    Submitted 20 March, 2021; originally announced March 2021.

  11. scb-mt-en-th-2020: A Large English-Thai Parallel Corpus

    Authors: Lalita Lowphansirikul, Charin Polpanumas, Attapol T. Rutherford, Sarana Nutanong

    Abstract: The primary objective of our work is to build a large-scale English-Thai dataset for machine translation. We construct an English-Thai machine translation dataset with over 1 million segment pairs, curated from various sources, namely news, Wikipedia articles, SMS messages, task-based dialogs, web-crawled data and government documents. The methodology for gathering data, building parallel texts and re…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 35 pages, 4 figures

  12. arXiv:1911.07056  [pdf]

    cs.CL cs.LG

    AttaCut: A Fast and Accurate Neural Thai Word Segmenter

    Authors: Pattarawat Chormai, Ponrawee Prasertsom, Attapol Rutherford

    Abstract: Word segmentation is a fundamental pre-processing step for Thai Natural Language Processing. The current off-the-shelf solutions are not benchmarked consistently, so it is difficult to compare their trade-offs. We conducted a speed and accuracy comparison of the popular systems on three different domains and found that the state-of-the-art deep learning system is slow and moreover does not use sub…

    Submitted 16 November, 2019; originally announced November 2019.

    Comments: 14 pages, 7 figures, accepted as oral presentation at New in ML Workshop, NeurIPS 2019

  13. arXiv:1908.10842  [pdf, other]

    eess.IV cs.CV cs.LG

    Self-supervised Recurrent Neural Network for 4D Abdominal and In-utero MR Imaging

    Authors: Tong Zhang, Laurence H. Jackson, Alena Uus, James R. Clough, Lisa Story, Mary A. Rutherford, Joseph V. Hajnal, Maria Deprez

    Abstract: Accurately estimating and correcting motion artifacts is crucial for 3D image reconstruction in abdominal and in-utero magnetic resonance imaging (MRI). The state-of-the-art methods are based on slice-to-volume registration (SVR), where multiple 2D image stacks are acquired in three orthogonal orientations. In this work, we present a novel reconstruction pipeline that only needs one orientatio…

    Submitted 28 August, 2019; originally announced August 2019.

    Comments: Accepted by MICCAI 2019 workshop on Machine Learning for Medical Image Reconstruction

  14. arXiv:1808.00160  [pdf, other]

    cs.CY cs.CR econ.GN

    Mapping the Privacy-Utility Tradeoff in Mobile Phone Data for Development

    Authors: Alejandro Noriega-Campero, Alex Rutherford, Oren Lederman, Yves A. de Montjoye, Alex Pentland

    Abstract: Today's age of data holds high potential to enhance the way we pursue and monitor progress in the fields of development and humanitarian action. We study the relation between data utility and privacy risk in large-scale behavioral data, focusing on mobile phone metadata as a paradigmatic domain. To measure utility, we survey experts about the value of mobile phone metadata at various spatial and tem…

    Submitted 1 August, 2018; originally announced August 2018.

  15. arXiv:1606.06343  [pdf, other]

    cs.SI physics.soc-ph stat.ML

    Twitter as a Source of Global Mobility Patterns for Social Good

    Authors: Mark Dredze, Manuel García-Herranz, Alex Rutherford, Gideon Mann

    Abstract: Data on human spatial distribution and movement is essential for understanding and analyzing social systems. However, existing sources for this data are lacking in various ways: they are difficult to access, biased, have poor geographical or temporal resolution, or are significantly delayed. In this paper, we describe how geolocation data from Twitter can be used to estimate global mobility patterns and add…

    Submitted 20 June, 2016; originally announced June 2016.

    Comments: Presented at 2016 ICML Workshop on #Data4Good: Machine Learning in Social Good Applications, New York, NY

  16. arXiv:1606.01990  [pdf, other]

    cs.CL

    Neural Network Models for Implicit Discourse Relation Classification in English and Chinese without Surface Features

    Authors: Attapol T. Rutherford, Vera Demberg, Nianwen Xue

    Abstract: Inferring implicit discourse relations in natural language text is the most difficult subtask in discourse parsing. Surface features achieve good performance, but they are not readily applicable to other languages without semantic lexicons. Previous neural models require parses, surface features, or a small label set to work well. Here, we propose neural network models that are based on feedforwar…

    Submitted 6 June, 2016; originally announced June 2016.

  17. arXiv:1605.07866  [pdf, other]

    cs.CV

    DeepCut: Object Segmentation from Bounding Box Annotations using Convolutional Neural Networks

    Authors: Martin Rajchl, Matthew C. H. Lee, Ozan Oktay, Konstantinos Kamnitsas, Jonathan Passerat-Palmbach, Wenjia Bai, Mellisa Damodaram, Mary A. Rutherford, Joseph V. Hajnal, Bernhard Kainz, Daniel Rueckert

    Abstract: In this paper, we propose DeepCut, a method to obtain pixelwise object segmentations given an image dataset labelled with bounding box annotations. It extends the approach of the well-known GrabCut method to include machine learning by training a neural network classifier from bounding box annotations. We formulate the problem as an energy minimisation problem over a densely-connected conditional…

    Submitted 5 June, 2016; v1 submitted 25 May, 2016; originally announced May 2016.

  18. arXiv:1601.06028  [pdf, other]

    cs.CY cs.SI physics.soc-ph

    The International Postal Network and Other Global Flows As Proxies for National Wellbeing

    Authors: Desislava Hristova, Alex Rutherford, Jose Anson, Miguel Luengo-Oroz, Cecilia Mascolo

    Abstract: The digital exhaust left by flows of physical and digital commodities provides a rich measure of the nature, strength and significance of relationships between countries in the global network. With this work, we examine how these traces and the network structure can reveal the socioeconomic profile of different countries. We take into account multiple international networks of physical and digital…

    Submitted 25 January, 2016; v1 submitted 22 January, 2016; originally announced January 2016.

  19. arXiv:1412.2595  [pdf, other]

    cs.CY physics.soc-ph

    Estimating Food Consumption and Poverty Indices with Mobile Phone Data

    Authors: Adeline Decuyper, Alex Rutherford, Amit Wadhwa, Jean-Martin Bauer, Gautier Krings, Thoralf Gutierrez, Vincent D. Blondel, Miguel A. Luengo-Oroz

    Abstract: Recent studies have shown the value of mobile phone data to tackle problems related to economic development and humanitarian action. In this research, we assess the suitability of indicators derived from mobile phone data as a proxy for food security indicators. We compare the measures extracted from call detail records and airtime credit purchases to the results of a nationwide household survey c…

    Submitted 22 November, 2014; originally announced December 2014.

  20. Flooding through the lens of mobile phone activity

    Authors: David Pastor-Escuredo, Alfredo Morales-Guzmán, Yolanda Torres-Fernández, Jean-Martin Bauer, Amit Wadhwa, Carlos Castro-Correa, Liudmyla Romanoff, Jong Gun Lee, Alex Rutherford, Vanessa Frias-Martinez, Nuria Oliver, Enrique Frias-Martinez, Miguel Luengo-Oroz

    Abstract: Natural disasters affect hundreds of millions of people worldwide every year. Emergency response efforts depend upon the availability of timely information, such as information concerning the movements of affected populations. The analysis of aggregated and anonymized Call Detail Records (CDR) captured from the mobile phone infrastructure provides new possibilities to characterize human behavior d…

    Submitted 24 November, 2014; originally announced November 2014.

    Comments: Submitted to IEEE Global Humanitarian Technologies Conference (GHTC) 2014

    Journal ref: IEEE Global Humanitarian Technology Conference (GHTC), 2014 IEEE (pp. 279-286)

  21. arXiv:1304.5097  [pdf, other]

    physics.soc-ph cs.CY cs.SI

    Targeted Social Mobilisation in a Global Manhunt

    Authors: Alex Rutherford, Manuel Cebrian, Iyad Rahwan, Sohan Dsouza, James McInerney, Victor Naroditskiy, Matteo Venanzi, Nicholas R. Jennings, J. R. deLara, Eero Wahlstedt, Steven U. Miller

    Abstract: Social mobilization, the ability to mobilize large numbers of people via social networks to achieve highly distributed tasks, has received significant attention in recent times. This growing capability, facilitated by modern communication technology, is highly relevant to endeavors which require the search for individuals that possess rare information or skill, such as finding medical doctors durin…

    Submitted 6 April, 2014; v1 submitted 18 April, 2013; originally announced April 2013.

    Comments: 10 pages, 11 figures (Added Supplementary Information)

    Journal ref: PLoS One (2013) 8 (9)

  22. arXiv:1110.1409  [pdf, other]

    physics.soc-ph cs.SI

    Good Fences: The Importance of Setting Boundaries for Peaceful Coexistence

    Authors: Alex Rutherford, Dion Harmon, Justin Werfel, Shlomiya Bar-Yam, Alexander Gard-Murray, Andreas Gros, Yaneer Bar-Yam

    Abstract: We consider the conditions of peace and violence among ethnic groups, testing a theory designed to predict the locations of violence and interventions that can promote peace. Characterizing the model's success in predicting peace requires examples where peace prevails despite diversity. Switzerland is recognized as a country of peace, stability and prosperity. This is surprising because of its lin…

    Submitted 6 October, 2011; originally announced October 2011.

    Comments: paper pages 1-14, 4 figures; appendices pages 15-43, 20 figures

    Report number: NECSI 2011-10-01