Showing 1–40 of 40 results for author: Ong, E

Searching in archive cs.
  1. arXiv:2406.11779  [pdf, other]

    cs.LG cs.LO

    Compact Proofs of Model Performance via Mechanistic Interpretability

    Authors: Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, Lawrence Chan

    Abstract: We propose using mechanistic interpretability -- techniques for reverse engineering model weights into human-interpretable algorithms -- to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving lower bounds on the accuracy of 151 small transformers trained on a Max-of-$K$ task. We create 102 different computer-assisted proof strategies an…

    Submitted 21 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: accepted to ICML 2024 Workshop on Mechanistic Interpretability (Spotlight)

  2. arXiv:2405.17921  [pdf]

    cs.AI cs.CY

    Towards Clinical AI Fairness: Filling Gaps in the Puzzle

    Authors: Mingxuan Liu, Yilin Ning, Salinelat Teixayavong, Xiaoxuan Liu, Mayli Mertens, Yuqing Shang, Xin Li, Di Miao, Jie Xu, Daniel Shu Wei Ting, Lionel Tim-Ee Cheng, Jasmine Chiat Ling Ong, Zhen Ling Teo, Ting Fang Tan, Narrendar RaviChandran, Fei Wang, Leo Anthony Celi, Marcus Eng Hock Ong, Nan Liu

    Abstract: The ethical integration of Artificial Intelligence (AI) in healthcare necessitates addressing fairness, a concept that is highly context-specific across medical fields. Extensive studies have been conducted to expand the technical components of AI fairness, while tremendous calls for AI fairness have been raised from healthcare. Despite this, a significant disconnect persists between technical adva…

    Submitted 28 May, 2024; originally announced May 2024.

  3. arXiv:2404.10250  [pdf, other]

    cs.PL cs.HC cs.MM

    AniFrame: A Programming Language for 2D Drawing and Frame-Based Animation

    Authors: Mark Edward M. Gonzales, Hans Oswald A. Ibrahim, Elyssia Barrie H. Ong, Ryan Austin Fernandez

    Abstract: Creative coding is an experimentation-heavy activity that requires translating high-level visual ideas into code. However, most languages and libraries for creative coding may not be adequately intuitive for beginners. In this paper, we present AniFrame, a domain-specific language for drawing and animation. Designed for novice programmers, it (i) features animation-specific data types, operations,…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted for paper presentation at the 24th Philippine Computing Science Congress (PCSC 2024), held in Laguna, Philippines

    ACM Class: D.3.2; J.5

  4. arXiv:2403.06999  [pdf]

    cs.LG cs.AI cs.CY

    Survival modeling using deep learning, machine learning and statistical methods: A comparative analysis for predicting mortality after hospital admission

    Authors: Ziwen Wang, Jin Wee Lee, Tanujit Chakraborty, Yilin Ning, Mingxuan Liu, Feng Xie, Marcus Eng Hock Ong, Nan Liu

    Abstract: Survival analysis is essential for studying time-to-event outcomes and providing a dynamic understanding of the probability of an event occurring over time. Various survival analysis techniques, from traditional statistical models to state-of-the-art machine learning algorithms, support healthcare intervention and policy decisions. However, there remains ongoing discussion about their comparative…

    Submitted 4 March, 2024; originally announced March 2024.

  5. arXiv:2403.05235  [pdf]

    cs.LG cs.AI cs.CY

    Fairness-Aware Interpretable Modeling (FAIM) for Trustworthy Machine Learning in Healthcare

    Authors: Mingxuan Liu, Yilin Ning, Yuhe Ke, Yuqing Shang, Bibhas Chakraborty, Marcus Eng Hock Ong, Roger Vaughan, Nan Liu

    Abstract: The escalating integration of machine learning in high-stakes fields such as healthcare raises substantial concerns about model fairness. We propose an interpretable framework - Fairness-Aware Interpretable Modeling (FAIM), to improve model fairness without compromising performance, featuring an interactive interface to identify a "fairer" model from a set of high-performing models and promoting t…

    Submitted 8 March, 2024; originally announced March 2024.

  6. arXiv:2403.05229  [pdf]

    cs.AI

    Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data

    Authors: Siqi Li, Yuqing Shang, Ziwen Wang, Qiming Wu, Chuan Hong, Yilin Ning, Di Miao, Marcus Eng Hock Ong, Bibhas Chakraborty, Nan Liu

    Abstract: Survival analysis serves as a fundamental component in numerous healthcare applications, where the determination of the time to specific events (such as the onset of a certain disease or death) for patients is crucial for clinical decision-making. Scoring systems are widely used for swift and efficient risk prediction. However, existing methods for constructing survival scores presume that data or…

    Submitted 8 March, 2024; originally announced March 2024.

  7. arXiv:2401.02789  [pdf]

    q-bio.GN cs.CL

    Large Language Models in Plant Biology

    Authors: Hilbert Yuen In Lam, Xing Er Ong, Marek Mutwil

    Abstract: Large Language Models (LLMs), such as ChatGPT, have taken the world by storm and have passed certain forms of the Turing test. However, LLMs are not limited to human language and analyze sequential data, such as DNA, protein, and gene expression. The resulting foundation models can be repurposed to identify the complex patterns within the data, resulting in powerful, multi-purpose prediction tools…

    Submitted 5 January, 2024; originally announced January 2024.

  8. arXiv:2312.09230  [pdf, other]

    cs.LG cs.AI cs.CL

    Successor Heads: Recurring, Interpretable Attention Heads In The Wild

    Authors: Rhys Gould, Euan Ong, George Ogden, Arthur Conmy

    Abstract: In this work we present successor heads: attention heads that increment tokens with a natural ordering, such as numbers, months, and days. For example, successor heads increment 'Monday' into 'Tuesday'. We explain the successor head behavior with an approach rooted in mechanistic interpretability, the field that aims to explain how models complete tasks in human-understandable terms. Existing rese…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: 12 main text pages, with appendix

  9. arXiv:2312.08764  [pdf, other]

    cs.CV

    CattleEyeView: A Multi-task Top-down View Cattle Dataset for Smarter Precision Livestock Farming

    Authors: Kian Eng Ong, Sivaji Retta, Ramarajulu Srinivasan, Shawn Tan, Jun Liu

    Abstract: Cattle farming is one of the important and profitable agricultural industries. Employing intelligent automated precision livestock farming systems that can count animals, track the animals and their poses will raise productivity and significantly reduce the heavy burden on its already limited labor pool. To achieve such intelligent systems, a large cattle video dataset is essential in developing a…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Published at VCIP 2023. Dataset and code available at https://github.com/AnimalEyeQ/CattleEyeView

  10. arXiv:2311.03417  [pdf]

    cs.LG cs.AI

    Federated Learning for Clinical Structured Data: A Benchmark Comparison of Engineering and Statistical Approaches

    Authors: Siqi Li, Di Miao, Qiming Wu, Chuan Hong, Danny D'Agostino, Xin Li, Yilin Ning, Yuqing Shang, Huazhu Fu, Marcus Eng Hock Ong, Hamed Haddadi, Nan Liu

    Abstract: Federated learning (FL) has shown promising potential in safeguarding data privacy in healthcare collaborations. While the term "FL" was originally coined by the engineering community, the statistical field has also explored similar privacy-preserving algorithms. Statistical FL algorithms, however, remain considerably less recognized than their engineering counterparts. Our goal was to bridge the…

    Submitted 6 November, 2023; originally announced November 2023.

  11. arXiv:2311.02107  [pdf]

    cs.LG cs.AI cs.CY

    Generative Artificial Intelligence in Healthcare: Ethical Considerations and Assessment Checklist

    Authors: Yilin Ning, Salinelat Teixayavong, Yuqing Shang, Julian Savulescu, Vaishaanth Nagaraj, Di Miao, Mayli Mertens, Daniel Shu Wei Ting, Jasmine Chiat Ling Ong, Mingxuan Liu, Jiuwen Cao, Michael Dunn, Roger Vaughan, Marcus Eng Hock Ong, Joseph Jao-Yiu Sung, Eric J Topol, Nan Liu

    Abstract: The widespread use of ChatGPT and other emerging technology powered by generative artificial intelligence (GenAI) has drawn much attention to potential ethical issues, especially in high-stakes applications such as healthcare, but ethical discussions are yet to translate into operationalisable solutions. Furthermore, ongoing ethical discussions often neglect other types of GenAI that have been use…

    Submitted 23 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  12. arXiv:2309.00236  [pdf, other]

    cs.LG cs.CL cs.CR

    Image Hijacks: Adversarial Images can Control Generative Models at Runtime

    Authors: Luke Bailey, Euan Ong, Stuart Russell, Scott Emmons

    Abstract: Are foundation models secure against malicious actors? In this work, we focus on the image input to a vision-language model (VLM). We discover image hijacks, adversarial images that control the behaviour of VLMs at inference time, and introduce the general Behaviour Matching algorithm for training image hijacks. From this, we derive the Prompt Matching method, allowing us to train hijacks matching…

    Submitted 22 April, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

    Comments: Project page at https://image-hijacks.github.io

  13. arXiv:2305.04258  [pdf, other]

    cs.DB cs.HC

    From Unstructured to Structured: Transforming Chatbot Dialogues into Data Mart Schema for Visualization

    Authors: Mark Edward M. Gonzales, Elyssia Barrie H. Ong, Charibeth K. Cheng, Ethel Chua Joy Ong, Judith J. Azcarraga

    Abstract: Schools are among the primary avenues for public healthcare interventions. With resource limitations posing challenges to the routine conduct of health and wellness checks in Philippine public schools, the deployment of a chatbot-assisted health monitoring system may provide an alternative method. However, deriving insights from raw conversations is not straightforward due to the expressiveness of…

    Submitted 7 May, 2023; originally announced May 2023.

    Comments: Accepted for paper presentation at the 23rd Philippine Computing Science Congress (PCSC 2023), held in Cebu, Philippines

    ACM Class: H.3; H.5.2

  14. arXiv:2304.13493  [pdf]

    cs.CY cs.AI

    Towards clinical AI fairness: A translational perspective

    Authors: Mingxuan Liu, Yilin Ning, Salinelat Teixayavong, Mayli Mertens, Jie Xu, Daniel Shu Wei Ting, Lionel Tim-Ee Cheng, Jasmine Chiat Ling Ong, Zhen Ling Teo, Ting Fang Tan, Ravi Chandran Narrendar, Fei Wang, Leo Anthony Celi, Marcus Eng Hock Ong, Nan Liu

    Abstract: Artificial intelligence (AI) has demonstrated the ability to extract insights from data, but the issue of fairness remains a concern in high-stakes fields such as healthcare. Despite extensive discussion and efforts in algorithm development, AI fairness and clinical concerns have not been adequately addressed. In this paper, we discuss the misalignment between technical and clinical perspectives o…

    Submitted 26 April, 2023; originally announced April 2023.

  15. Federated and distributed learning applications for electronic health records and structured medical data: A scoping review

    Authors: Siqi Li, Pinyan Liu, Gustavo G. Nascimento, Xinru Wang, Fabio Renato Manzolli Leite, Bibhas Chakraborty, Chuan Hong, Yilin Ning, Feng Xie, Zhen Ling Teo, Daniel Shu Wei Ting, Hamed Haddadi, Marcus Eng Hock Ong, Marco Aurélio Peres, Nan Liu

    Abstract: Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medi…

    Submitted 14 April, 2023; originally announced April 2023.

  16. arXiv:2304.03779  [pdf]

    cs.LG cs.AI cs.CY

    A roadmap to fair and trustworthy prediction model validation in healthcare

    Authors: Yilin Ning, Victor Volovici, Marcus Eng Hock Ong, Benjamin Alan Goldstein, Nan Liu

    Abstract: A prediction model is most useful if it generalizes beyond the development data with external validations, but to what extent should it generalize remains unclear. In practice, prediction models are externally validated using data from very different settings, including populations from other health systems or countries, with predictably poor results. This may not be a fair reflection of the perfo…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: 12 pages, 2 figures

  17. arXiv:2303.00282  [pdf]

    cs.LG cs.AI cs.CR

    FedScore: A privacy-preserving framework for federated scoring system development

    Authors: Siqi Li, Yilin Ning, Marcus Eng Hock Ong, Bibhas Chakraborty, Chuan Hong, Feng Xie, Han Yuan, Mingxuan Liu, Daniel M. Buckland, Yong Chen, Nan Liu

    Abstract: We propose FedScore, a privacy-preserving federated learning framework for scoring system generation across multiple sites to facilitate cross-institutional collaborations. The FedScore framework includes five modules: federated variable ranking, federated variable transformation, federated score derivation, federated model selection and federated model evaluation. To illustrate usage and assess F…

    Submitted 1 March, 2023; originally announced March 2023.

  18. arXiv:2212.08541  [pdf, other]

    cs.LG cs.AI stat.ML

    Learnable Commutative Monoids for Graph Neural Networks

    Authors: Euan Ong, Petar Veličković

    Abstract: Graph neural networks (GNNs) have been shown to be highly sensitive to the choice of aggregation function. While summing over a node's neighbours can approximate any permutation-invariant function over discrete inputs, Cohen-Karlik et al. [2020] proved there are set-aggregation problems for which summing cannot generalise to unbounded inputs, proposing recurrent neural networks regularised towards…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: Accepted to the proceedings of the First Learning on Graphs Conference (LoG 2022)

  19. Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques

    Authors: Mingxuan Liu, Siqi Li, Han Yuan, Marcus Eng Hock Ong, Yilin Ning, Feng Xie, Seyed Ehsan Saffari, Victor Volovici, Bibhas Chakraborty, Nan Liu

    Abstract: Objective: The proper handling of missing values is critical to delivering reliable estimates and decisions, especially in high-stakes fields such as clinical research. The increasing diversity and complexity of data have led many researchers to develop deep learning (DL)-based imputation techniques. We conducted a systematic review to evaluate the use of these techniques, with a particular focus…

    Submitted 15 October, 2022; originally announced October 2022.

  20. arXiv:2206.04050  [pdf]

    cs.LG cs.HC

    Balanced background and explanation data are needed in explaining deep learning models with SHAP: An empirical study on clinical decision making

    Authors: Mingxuan Liu, Yilin Ning, Han Yuan, Marcus Eng Hock Ong, Nan Liu

    Abstract: Objective: Shapley additive explanations (SHAP) is a popular post-hoc technique for explaining black box models. While the impact of data imbalance on predictive models has been extensively studied, it remains largely unknown with respect to SHAP-based model explanations. This study sought to investigate the effects of data imbalance on SHAP explanations for deep learning models, and to propose a…

    Submitted 8 June, 2022; originally announced June 2022.

  21. arXiv:2205.09841  [pdf, other]

    cs.CV

    Single-cell Subcellular Protein Localisation Using Novel Ensembles of Diverse Deep Architectures

    Authors: Syed Sameed Husain, Eng-Jon Ong, Dmitry Minskiy, Mikel Bober-Irizar, Amaia Irizar, Miroslaw Bober

    Abstract: Unravelling protein distributions within individual cells is key to understanding their function and state and indispensable to developing new treatments. Here we present the Hybrid subCellular Protein Localiser (HCPL), which learns from weakly labelled data to robustly localise single-cell subcellular protein patterns. It comprises innovative DNN architectures exploiting wavelet filters and learn…

    Submitted 16 September, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

  22. arXiv:2204.08129  [pdf, other]

    cs.CV

    Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding

    Authors: Xun Long Ng, Kian Eng Ong, Qichen Zheng, Yun Ni, Si Yong Yeo, Jun Liu

    Abstract: Understanding animals' behaviors is significant for a wide range of applications. However, existing animal behavior datasets have limitations in multiple aspects, including limited numbers of animal classes, data samples and provided tasks, and also limited variations in environmental conditions and viewpoints. To address these limitations, we create a large and diverse dataset, Animal Kingdom, th…

    Submitted 3 June, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

    Comments: Accepted by CVPR2022 (Oral). Dataset: https://sutdcv.github.io/Animal-Kingdom

  23. arXiv:2202.08407  [pdf]

    cs.LG

    AutoScore-Ordinal: An interpretable machine learning framework for generating scoring models for ordinal outcomes

    Authors: Seyed Ehsan Saffari, Yilin Ning, Xie Feng, Bibhas Chakraborty, Victor Volovici, Roger Vaughan, Marcus Eng Hock Ong, Nan Liu

    Abstract: Background: Risk prediction models are useful tools in clinical decision-making which help with risk stratification and resource allocations and may lead to a better health care for patients. AutoScore is a machine learning-based automatic clinical score generator for binary outcomes. This study aims to expand the AutoScore framework to provide a tool for interpretable risk prediction for ordinal…

    Submitted 16 February, 2022; originally announced February 2022.

  24. arXiv:2201.03291  [pdf]

    cs.LG

    A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study

    Authors: Yilin Ning, Siqi Li, Marcus Eng Hock Ong, Feng Xie, Bibhas Chakraborty, Daniel Shu Wei Ting, Nan Liu

    Abstract: Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach usin…

    Submitted 10 January, 2022; originally announced January 2022.

  25. arXiv:2111.11017  [pdf]

    cs.LG

    Benchmarking emergency department triage prediction models with machine learning and large public electronic health records

    Authors: Feng Xie, Jun Zhou, Jin Wee Lee, Mingrui Tan, Siqi Li, Logasan S/O Rajnthern, Marcel Lucas Chee, Bibhas Chakraborty, An-Kwok Ian Wong, Alon Dagan, Marcus Eng Hock Ong, Fei Gao, Nan Liu

    Abstract: The demand for emergency department (ED) services is increasing across the globe, particularly during the current COVID-19 pandemic. Clinical triage and risk assessment have become increasingly challenging due to the shortage of medical resources and the strain on hospital infrastructure caused by the pandemic. As a result of the widespread use of electronic health records (EHRs), we now have acce…

    Submitted 20 March, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

  26. arXiv:2111.09649  [pdf]

    cs.CE

    HRnV-Calc: A software package for heart rate n-variability and heart rate variability analysis

    Authors: Chenglin Niu, Dagang Guo, Marcus Eng Hock Ong, Zhi Xiong Koh, Andrew Fu Wah Ho, Zhiping Lin, Chengyu Liu, Gari D. Clifford, Nan Liu

    Abstract: Objective: Heart rate variability (HRV) has been proven to be an important indicator of physiological status for numerous applications. Despite the progress and active developments made in HRV metric research over the last few decades, the representation of the heartbeat sequence upon which HRV is based has received relatively little attention. The recently introduced heart rate n-variability (HRn…

    Submitted 18 November, 2021; originally announced November 2021.

  27. arXiv:2110.02484  [pdf]

    cs.LG cs.HC

    Shapley variable importance clouds for interpretable machine learning

    Authors: Yilin Ning, Marcus Eng Hock Ong, Bibhas Chakraborty, Benjamin Alan Goldstein, Daniel Shu Wei Ting, Roger Vaughan, Nan Liu

    Abstract: Interpretable machine learning has been focusing on explaining final models that optimize performance. The current state-of-the-art is the Shapley additive explanations (SHAP) that locally explains variable impact on individual predictions, and it is recently extended for a global assessment across the dataset. Recently, Dong and Rudin proposed to extend the investigation to models from the same c…

    Submitted 5 October, 2021; originally announced October 2021.

  28. arXiv:2110.00157  [pdf, other]

    cs.CL cs.LG

    Under the Microscope: Interpreting Readability Assessment Models for Filipino

    Authors: Joseph Marvin Imperial, Ethel Ong

    Abstract: Readability assessment is the process of identifying the level of ease or difficulty of a certain piece of text for its intended audience. Approaches have evolved from the use of arithmetic formulas to more complex pattern-recognizing models trained using machine learning algorithms. While using these approaches provide competitive results, limited work is done on analyzing how linguistic variable…

    Submitted 30 September, 2021; originally announced October 2021.

    Comments: Accepted for oral presentation at PACLIC 2021

  29. arXiv:2108.00241  [pdf]

    cs.CL cs.LG

    Diverse Linguistic Features for Assessing Reading Difficulty of Educational Filipino Texts

    Authors: Joseph Marvin Imperial, Ethel Ong

    Abstract: In order to ensure quality and effective learning, fluency, and comprehension, the proper identification of the difficulty levels of reading materials should be observed. In this paper, we describe the development of automatic machine learning-based readability assessment models for educational Filipino texts using the most diverse set of linguistic features for the language. Results show that usi…

    Submitted 31 July, 2021; originally announced August 2021.

    Comments: Accepted at ICCE 2021

  30. Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies

    Authors: Feng Xie, Han Yuan, Yilin Ning, Marcus Eng Hock Ong, Mengling Feng, Wynne Hsu, Bibhas Chakraborty, Nan Liu

    Abstract: Objective: Temporal electronic health records (EHRs) can be a wealth of information for secondary uses, such as clinical events prediction or chronic disease management. However, challenges exist for temporal data representation. We therefore sought to identify these challenges and evaluate novel methodologies for addressing them through a systematic examination of deep learning solutions. Metho…

    Submitted 21 July, 2021; originally announced July 2021.

  31. AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data

    Authors: Han Yuan, Feng Xie, Marcus Eng Hock Ong, Yilin Ning, Marcel Lucas Chee, Seyed Ehsan Saffari, Hairil Rizal Abdullah, Benjamin Alan Goldstein, Bibhas Chakraborty, Nan Liu

    Abstract: Background: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among a wide variety of decision-making models for determining the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. Its current framework, however, still leaves room for…

    Submitted 13 July, 2021; originally announced July 2021.

  32. arXiv:2107.04458  [pdf, other]

    cs.LG cs.CV

    Understanding the Distributions of Aggregation Layers in Deep Neural Networks

    Authors: Eng-Jon Ong, Sameed Husain, Miroslaw Bober

    Abstract: The process of aggregation is ubiquitous in almost all deep nets models. It functions as an important mechanism for consolidating deep features into a more compact representation, whilst increasing robustness to overfitting and providing spatial invariance in deep nets. In particular, the proximity of global aggregation layers to the output layers of DNNs mean that aggregated features have a direc…

    Submitted 9 July, 2021; originally announced July 2021.

  33. AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data

    Authors: Feng Xie, Yilin Ning, Han Yuan, Benjamin Alan Goldstein, Marcus Eng Hock Ong, Nan Liu, Bibhas Chakraborty

    Abstract: Scoring systems are highly interpretable and widely used to evaluate time-to-event outcomes in healthcare research. However, existing time-to-event scores are predominantly created ad-hoc using a few manually selected variables based on clinician's knowledge, suggesting an unmet need for a robust and efficient generic score-generating method. AutoScore was previously developed as an interpretabl…

    Submitted 13 June, 2021; originally announced June 2021.

  34. arXiv:2103.07277  [pdf, other]

    cs.CL

    A Simple Post-Processing Technique for Improving Readability Assessment of Texts using Word Mover's Distance

    Authors: Joseph Marvin Imperial, Ethel Ong

    Abstract: Assessing the proper difficulty levels of reading materials or texts in general is the first step towards effective comprehension and learning. In this study, we improve the conventional methodology of automatic readability assessment by incorporating the Word Mover's Distance (WMD) of ranked texts as an additional post-processing technique to further ground the difficulty level given by a model.…

    Submitted 19 September, 2021; v1 submitted 12 March, 2021; originally announced March 2021.

  35. arXiv:2101.10537  [pdf]

    cs.CL cs.LG

    Application of Lexical Features Towards Improvement of Filipino Readability Identification of Children's Literature

    Authors: Joseph Marvin Imperial, Ethel Ong

    Abstract: Proper identification of grade levels of children's reading materials is an important step towards effective learning. Recent studies in readability assessment for the English domain applied modern approaches in natural language processing (NLP) such as machine learning (ML) techniques to automate the process. There is also a need to extract the correct linguistic features when modeling readabilit…

    Submitted 22 January, 2021; originally announced January 2021.

    Comments: 8 tables, 1 figure. Presented at the Philippine Computing Science Congress 2020

  36. arXiv:1907.05794  [pdf, other]

    cs.CV

    ACTNET: end-to-end learning of feature activations and multi-stream aggregation for effective instance image retrieval

    Authors: Syed Sameed Husain, Eng-Jon Ong, Miroslaw Bober

    Abstract: We propose a novel CNN architecture called ACTNET for robust instance image retrieval from large-scale datasets. Our key innovation is a learnable activation layer designed to improve the signal-to-noise ratio (SNR) of deep convolutional feature maps. Further, we introduce a controlled multi-stream aggregation, where complementary deep features from different convolutional layers are optimally tra…

    Submitted 23 October, 2020; v1 submitted 12 July, 2019; originally announced July 2019.

  37. arXiv:1807.01026  [pdf, other]

    cs.CV

    Deep Architectures and Ensembles for Semantic Video Classification

    Authors: Eng-Jon Ong, Sameed Husain, Mikel Bober-Irizar, Miroslaw Bober

    Abstract: This work addresses the problem of accurate semantic labelling of short videos. To this end, a multitude of different deep nets, ranging from traditional recurrent neural networks (LSTM, GRU), temporal agnostic networks (FV,VLAD,BoW), fully connected neural networks mid-stage AV fusion and others. Additionally, we also propose a residual architecture-based DNN for video classification, with state-…

    Submitted 7 October, 2018; v1 submitted 3 July, 2018; originally announced July 2018.

  38. arXiv:1707.04272  [pdf, other]

    cs.CV

    Cultivating DNN Diversity for Large Scale Video Labelling

    Authors: Mikel Bober-Irizar, Sameed Husain, Eng-Jon Ong, Miroslaw Bober

    Abstract: We investigate factors controlling DNN diversity in the context of the Google Cloud and YouTube-8M Video Understanding Challenge. While it is well-known that ensemble methods improve prediction performance, and that combining accurate but diverse predictors helps, there is little knowledge on how to best promote & measure DNN diversity. We show that diversity can be cultivated by some unexpected m…

    Submitted 13 July, 2017; originally announced July 2017.

    Comments: CVPR 2017 Youtube-8M Workshop

  39. arXiv:1702.00338  [pdf, other]

    cs.CV

    Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval

    Authors: Eng-Jon Ong, Sameed Husain, Miroslaw Bober

    Abstract: This paper addresses the problem of large scale image retrieval, with the aim of accurately ranking the similarity of a large number of images to a given query image. To achieve this, we propose a novel Siamese network. This network consists of two computational strands, each comprising of a CNN component followed by a Fisher vector component. The CNN component produces dense, deep convolutional d…

    Submitted 1 February, 2017; originally announced February 2017.

  40. arXiv:cs/0108019  [pdf, ps, other]

    cs.DC

    Scalable Unix Commands for Parallel Processors: A High-Performance Implementation

    Authors: E. Ong, E. Lusk, W. Gropp

    Abstract: We describe a family of MPI applications we call the Parallel Unix Commands. These commands are natural parallel versions of common Unix user commands such as ls, ps, and find, together with a few similar commands particular to the parallel environment. We describe the design and implementation of these programs and present some performance results on a 256-node Linux cluster. The Parallel Unix…

    Submitted 27 August, 2001; originally announced August 2001.

    Comments: 9 pages, 2 figures

    Report number: ANL/MCS-P885-0601 ACM Class: D.1.3

    Journal ref: in Recent Advances in Parallel Virtual Machine and Message Passing Interface, eds. Y. Cotronis and J. Dongarra, Lecture Notes in Computer Science, Vol. 2131, Springer-Verlag, pp. 410-418, Sept. 2001.