Showing 1–50 of 298 results for author: Wong, A

Searching in archive cs.
  1. arXiv:2407.11511  [pdf, other]

    cs.AI cs.CL cs.LG

    Reasoning with Large Language Models, a Survey

    Authors: Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back

    Abstract: Scaling up language models to billions of parameters has opened up possibilities for in-context learning, allowing instruction tuning and few-shot learning on tasks that the model was not specifically trained for. This has achieved breakthrough performance on language tasks such as translation, summarization, and question-answering. Furthermore, in addition to these associative "System 1" tasks, r…

    Submitted 16 July, 2024; originally announced July 2024.

  2. arXiv:2407.09285  [pdf, other]

    cs.CV

    MetaFood CVPR 2024 Challenge on Physically Informed 3D Food Reconstruction: Methods and Results

    Authors: Jiangpeng He, Yuhao Chen, Gautham Vinod, Talha Ibn Mahmud, Fengqing Zhu, Edward Delp, Alexander Wong, Pengcheng Xi, Ahmad AlMughrabi, Umair Haroon, Ricardo Marques, Petia Radeva, Jiadong Tang, Dianyi Yang, Yu Gao, Zhaoxiang Liang, Yawei Jueluo, Chengyu Shi, Pengyu Wang

    Abstract: The increasing interest in computer vision applications for nutrition and dietary monitoring has led to the development of advanced 3D reconstruction techniques for food items. However, the scarcity of high-quality data and limited collaboration between industry and academia have constrained progress in this field. Building on recent advancements in 3D reconstruction, we host the MetaFood Workshop…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Technical report for MetaFood CVPR 2024 Challenge on Physically Informed 3D Food Reconstruction. arXiv admin note: substantial text overlap with arXiv:2407.01717

  3. arXiv:2407.00242  [pdf, other]

    cs.CL

    EHRmonize: A Framework for Medical Concept Abstraction from Electronic Health Records using Large Language Models

    Authors: João Matos, Jack Gallifant, Jian Pei, A. Ian Wong

    Abstract: Electronic health records (EHRs) contain vast amounts of complex data, but harmonizing and processing this information remains a challenging and costly task requiring significant clinical expertise. While large language models (LLMs) have shown promise in various healthcare applications, their potential for abstracting medical concepts from EHRs remains largely unexplored. We introduce EHRmonize,…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: submitted for review, total of 10 pages

  4. arXiv:2406.13750  [pdf, other]

    eess.IV cs.CV cs.LG

    Empowering Tuberculosis Screening with Explainable Self-Supervised Deep Neural Networks

    Authors: Neel Patel, Alexander Wong, Ashkan Ebadi

    Abstract: Tuberculosis persists as a global health crisis, especially in resource-limited populations and remote regions, with more than 10 million individuals newly infected annually. It stands as a stark symbol of inequity in public health. Tuberculosis impacts roughly a quarter of the global populace, with the majority of cases concentrated in eight countries, accounting for two-thirds of all tuberculosi…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 9 pages, 3 figures

  5. arXiv:2406.03582  [pdf, other]

    cs.CV cs.AI

    Understanding the Limitations of Diffusion Concept Algebra Through Food

    Authors: E. Zhixuan Zeng, Yuhao Chen, Alexander Wong

    Abstract: Image generation techniques, particularly latent diffusion models, have exploded in popularity in recent years. Many techniques have been developed to manipulate and clarify the semantic concepts these large-scale models learn, offering crucial insights into biases and concept relationships. However, these techniques are often only validated in conventional realms of human or animal faces and arti…

    Submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2405.17315  [pdf, other]

    cs.CV

    All-day Depth Completion

    Authors: Vadim Ezhov, Hyoungseob Park, Zhaoyang Zhang, Rishi Upadhyay, Howard Zhang, Chethan Chinder Chandrappa, Achuta Kadambi, Yunhao Ba, Julie Dorsey, Alex Wong

    Abstract: We propose a method for depth estimation under different illumination conditions, i.e., day and night time. As photometry is uninformative in regions under low-illumination, we tackle the problem through a multi-sensor fusion approach, where we take as input an additional synchronized sparse point cloud (i.e., from a LiDAR) projected onto the image plane as a sparse depth map, along with a camera…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 8 pages, 4 figures

  7. arXiv:2405.08717  [pdf, other]

    cs.CV cs.AI

    How Much You Ate? Food Portion Estimation on Spoons

    Authors: Aaryam Sharma, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong

    Abstract: Monitoring dietary intake is a crucial aspect of promoting healthy living. In recent years, advances in computer vision technology have facilitated dietary intake monitoring through the use of images and depth cameras. However, the current state-of-the-art image-based food portion estimation algorithms assume that users take images of their meals one or two times, which can be inconvenient and fai…

    Submitted 11 May, 2024; originally announced May 2024.

  8. arXiv:2405.08049  [pdf, other]

    eess.IV cs.CV

    Optimizing Synthetic Correlated Diffusion Imaging for Breast Cancer Tumour Delineation

    Authors: Chi-en Amy Tai, Alexander Wong

    Abstract: Breast cancer is a significant cause of death from cancer in women globally, highlighting the need for improved diagnostic imaging to enhance patient outcomes. Accurate tumour identification is essential for diagnosis, treatment, and monitoring, emphasizing the importance of advanced imaging technologies that provide detailed views of tumour characteristics and disease. Synthetic correlated diffus…

    Submitted 13 May, 2024; originally announced May 2024.

  9. arXiv:2405.07869  [pdf, other]

    eess.IV cs.CV

    Enhancing Clinically Significant Prostate Cancer Prediction in T2-weighted Images through Transfer Learning from Breast Cancer

    Authors: Chi-en Amy Tai, Alexander Wong

    Abstract: In 2020, prostate cancer saw a staggering 1.4 million new cases, resulting in over 375,000 deaths. The accurate identification of clinically significant prostate cancer is crucial for delivering effective treatment to patients. Consequently, there has been a surge in research exploring the application of deep neural networks to predict clinical significance based on magnetic resonance images. Howe…

    Submitted 13 May, 2024; originally announced May 2024.

  10. arXiv:2405.07861  [pdf, other]

    eess.IV cs.CV

    Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging

    Authors: Chi-en Amy Tai, Alexander Wong

    Abstract: Breast cancer was diagnosed for over 7.8 million women between 2015 to 2020. Grading plays a vital role in breast cancer treatment planning. However, the current tumor grading method involves extracting tissue from patients, leading to stress, discomfort, and high medical costs. A recent paper leveraging volumetric deep radiomic features from synthetic correlated diffusion imaging (CDI$^s$) for br…

    Submitted 13 May, 2024; originally announced May 2024.

  11. arXiv:2405.07854  [pdf, other]

    eess.IV cs.CV

    Using Multiparametric MRI with Optimized Synthetic Correlated Diffusion Imaging to Enhance Breast Cancer Pathologic Complete Response Prediction

    Authors: Chi-en Amy Tai, Alexander Wong

    Abstract: In 2020, 685,000 deaths across the world were attributed to breast cancer, underscoring the critical need for innovative and effective breast cancer treatment. Neoadjuvant chemotherapy has recently gained popularity as a promising treatment strategy for breast cancer, attributed to its efficacy in shrinking large tumors and leading to pathologic complete response. However, the current process to r…

    Submitted 13 May, 2024; originally announced May 2024.

  12. arXiv:2405.07814  [pdf, other]

    cs.CV

    NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images

    Authors: Matthew Keller, Chi-en Amy Tai, Yuhao Chen, Pengcheng Xi, Alexander Wong

    Abstract: Many aging individuals encounter challenges in effectively tracking their dietary intake, exacerbating their susceptibility to nutrition-related health complications. Self-reporting methods are often inaccurate and suffer from substantial bias; however, leveraging intelligent prediction methods can automate and enhance precision in this process. Recent work has explored using computer vision predi…

    Submitted 13 May, 2024; originally announced May 2024.

  13. arXiv:2405.07121  [pdf, other]

    cs.CV

    In The Wild Ellipse Parameter Estimation for Circular Dining Plates and Bowls

    Authors: Akil Pathiranage, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong

    Abstract: Ellipse estimation is an important topic in food image processing because it can be leveraged to parameterize plates and bowls, which in turn can be used to estimate camera view angles and food portion sizes. Automatically detecting the elliptical rim of plates and bowls and estimating their ellipse parameters for data "in-the-wild" is challenging: diverse camera angles and plate shapes could have…

    Submitted 11 May, 2024; originally announced May 2024.

  14. arXiv:2405.03662  [pdf, other]

    cs.CV

    Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation

    Authors: Dong Lao, Congli Wang, Alex Wong, Stefano Soatto

    Abstract: We describe a method for recovering the irradiance underlying a collection of images corrupted by atmospheric turbulence. Since supervised data is often technically impossible to obtain, assumptions and biases have to be imposed to solve this inverse problem, and we choose to model them explicitly. Rather than initializing a latent irradiance ("template") by heuristics to estimate deformation, we…

    Submitted 24 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  15. arXiv:2404.10295  [pdf, other]

    cs.RO

    ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction

    Authors: Jiawei Sun, Chengran Yuan, Shuo Sun, Shanze Wang, Yuhang Han, Shuailei Ma, Zefan Huang, Anthony Wong, Keng Peng Tee, Marcelo H. Ang Jr

    Abstract: The ability to accurately predict feasible multimodal future trajectories of surrounding traffic participants is crucial for behavior planning in autonomous vehicles. The Motion Transformer (MTR), a state-of-the-art motion prediction method, alleviated mode collapse and instability during training and enhanced overall prediction performance by replacing conventional dense future endpoints with a s…

    Submitted 17 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  16. arXiv:2404.03635  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    WorDepth: Variational Language Prior for Monocular Depth Estimation

    Authors: Ziyao Zeng, Daniel Wang, Fengyu Yang, Hyoungseob Park, Yangchao Wu, Stefano Soatto, Byung-Woo Hong, Dong Lao, Alex Wong

    Abstract: Three-dimensional (3D) reconstruction from a single image is an ill-posed problem with inherent ambiguities, i.e. scale. Predicting a 3D scene from text description(s) is similarly ill-posed, i.e. spatial arrangements of objects described. We investigate the question of whether two inherently ambiguous modalities can be used in conjunction to produce metric-scaled reconstructions. To test this, we…

    Submitted 2 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  17. arXiv:2403.14874  [pdf, other]

    cs.CV cs.LG

    WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

    Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and found that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first…

    Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

  18. arXiv:2403.12327  [pdf, other]

    cs.CV cs.LG

    GT-Rain Single Image Deraining Challenge Report

    Authors: Howard Zhang, Yunhao Ba, Ethan Yang, Rishi Upadhyay, Alex Wong, Achuta Kadambi, Yun Guo, Xueyao Xiao, Xiaoxiong Wang, Yi Li, Yi Chang, Luxin Yan, Chaochao Zheng, Luping Wang, Bin Liu, Sunder Ali Khowaja, Jiseok Yoon, Ik-Hyun Lee, Zhao Zhang, Yanyan Wei, Jiahuan Ren, Suiyi Zhao, Huan Zheng

    Abstract: This report reviews the results of the GT-Rain challenge on single image deraining at the UG2+ workshop at CVPR 2023. The aim of this competition is to study the rainy weather phenomenon in real world scenarios, provide a novel real world rainy image dataset, and to spark innovative ideas that will further the development of single image deraining methods on real images. Submissions were trained o…

    Submitted 18 March, 2024; originally announced March 2024.

  19. arXiv:2403.11328  [pdf, other]

    cs.CV cs.AI

    Domain-Guided Masked Autoencoders for Unique Player Identification

    Authors: Bavesh Balaji, Jerrin Bright, Sirisha Rambhatla, Yuhao Chen, Alexander Wong, John Zelek, David A Clausi

    Abstract: Unique player identification is a fundamental module in vision-driven sports analytics. Identifying players from broadcast videos can aid with various downstream tasks such as player assessment, in-game analysis, and broadcast production. However, automatic detection of jersey numbers using deep features is challenging primarily due to: a) motion blur, b) low resolution video feed, and c) occlusio…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Submitted to 21st International Conference on Robots and Vision (CRV'24), Guelph, Ontario, Canada

  20. arXiv:2403.07715  [pdf, other]

    eess.IV cs.CV

    Intra-video Positive Pairs in Self-Supervised Learning for Ultrasound

    Authors: Blake VanBerlo, Alexander Wong, Jesse Hoey, Robert Arntfield

    Abstract: Self-supervised learning (SSL) is one strategy for addressing the paucity of labelled data in medical imaging by learning representations from unlabelled images. Contrastive and non-contrastive SSL methods produce learned representations that are similar for pairs of related images. Such pairs are commonly constructed by randomly distorting the same image twice. The videographic nature of ultrasou…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 18 pages, 5 figures

    ACM Class: I.2.10; I.4.9; J.3

  21. arXiv:2402.13249  [pdf, other]

    cs.CL cs.AI

    TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

    Authors: Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu'an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown

    Abstract: Single document news summarization has seen substantial progress on faithfulness in recent years, driven by research on the evaluation of factual consistency, or hallucinations. We ask whether these advances carry over to other text summarization domains. We propose a new evaluation benchmark on topic-focused dialogue summarization, generated by LLMs of varying sizes. We provide binary sentence-le…

    Submitted 31 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: NAACL 2024; Linguistic annotations available at https://github.com/amazon-science/tofueval

  22. arXiv:2402.06912  [pdf, other]

    cs.LG cs.AI

    Solving Deep Reinforcement Learning Benchmarks with Linear Policy Networks

    Authors: Annie Wong, Jacob de Nobel, Thomas Bäck, Aske Plaat, Anna V. Kononova

    Abstract: Although Deep Reinforcement Learning (DRL) methods can learn effective policies for challenging problems such as Atari games and robotics tasks, algorithms are complex and training times are often long. This study investigates how evolution strategies (ES) perform compared to gradient-based deep reinforcement learning methods. We use ES to optimize the weights of a neural network via neuroevolutio…

    Submitted 10 February, 2024; originally announced February 2024.

  23. arXiv:2402.03557  [pdf, other]

    cs.CV

    Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement

    Authors: Dayou Mao, Yuhao Chen, Yifan Wu, Maximilian Gilles, Alexander Wong

    Abstract: One of the main motivations of MTL is to develop neural networks capable of inferring multiple tasks simultaneously. While countless methods have been proposed in the past decade investigating robust model architectures and efficient training algorithms, there is still lack of understanding of these methods when applied on smaller feature extraction backbones, the generalizability of the commonly…

    Submitted 16 April, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  24. arXiv:2402.03312  [pdf, other]

    cs.CV cs.LG

    Test-Time Adaptation for Depth Completion

    Authors: Hyoungseob Park, Anjali Gupta, Alex Wong

    Abstract: It is common to observe performance degradation when transferring models trained on some (source) datasets to target testing data due to a domain gap between them. Existing methods for bridging this gap, such as domain adaptation (DA), may require the source data on which the model was trained (often not available), while others, i.e., source-free DA, require many passes through the testing data.…

    Submitted 27 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  25. arXiv:2401.18084  [pdf, other]

    cs.CV cs.RO

    Binding Touch to Everything: Learning Unified Multimodal Tactile Representations

    Authors: Fengyu Yang, Chao Feng, Ziyang Chen, Hyoungseob Park, Daniel Wang, Yiming Dou, Ziyao Zeng, Xien Chen, Rit Gangopadhyay, Andrew Owens, Alex Wong

    Abstract: The ability to associate touch with other modalities has huge implications for humans and computational systems. However, multimodal learning with touch remains challenging due to the expensive data collection process and non-standardized sensor outputs. We introduce UniTouch, a unified tactile model for vision-based touch sensors connected to multiple modalities, including vision, language, and s…

    Submitted 31 January, 2024; originally announced January 2024.

  26. arXiv:2401.08598  [pdf, other]

    cs.CV

    NutritionVerse-Real: An Open Access Manually Collected 2D Food Scene Dataset for Dietary Intake Estimation

    Authors: Chi-en Amy Tai, Saeejith Nair, Olivia Markham, Matthew Keller, Yifan Wu, Yuhao Chen, Alexander Wong

    Abstract: Dietary intake estimation plays a crucial role in understanding the nutritional habits of individuals and populations, aiding in the prevention and management of diet-related health issues. Accurate estimation requires comprehensive datasets of food scenes, including images, segmentation masks, and accompanying dietary intake metadata. In this paper, we introduce NutritionVerse-Real, an open acces…

    Submitted 20 November, 2023; originally announced January 2024.

  27. arXiv:2401.01868  [pdf, other]

    cs.CV cs.AI

    Step length measurement in the wild using FMCW radar

    Authors: Parthipan Siva, Alexander Wong, Patricia Hewston, George Ioannidis, Dr. Jonathan Adachi, Dr. Alexander Rabinovich, Andrea Lee, Alexandra Papaioannou

    Abstract: With an aging population, numerous assistive and monitoring technologies are under development to enable older adults to age in place. To facilitate aging in place predicting risk factors such as falls, and hospitalization and providing early interventions are important. Much of the work on ambient monitoring for risk prediction has centered on gait speed analysis, utilizing privacy-preserving sen…

    Submitted 3 January, 2024; originally announced January 2024.

    ACM Class: I.5.4; C.3; J.7

  28. arXiv:2312.12774  [pdf, other]

    cs.DC

    Data Extraction, Transformation, and Loading Process Automation for Algorithmic Trading Machine Learning Modelling and Performance Optimization

    Authors: Nassi Ebadifard, Ajitesh Parihar, Youry Khmelevsky, Gaetan Hains, Albert Wong, Frank Zhang

    Abstract: A data warehouse efficiently prepares data for effective and fast data analysis and modelling using machine learning algorithms. This paper discusses existing solutions for the Data Extraction, Transformation, and Loading (ETL) process and automation for algorithmic trading algorithms. Integrating the Data Warehouses and, in the future, the Data Lakes with the Machine Learning Algorithms gives eno…

    Submitted 20 December, 2023; originally announced December 2023.

  29. arXiv:2312.12414  [pdf, ps, other]

    cs.DB cs.AI cs.LG

    Translating Natural Language Queries to SQL Using the T5 Model

    Authors: Albert Wong, Lien Pham, Young Lee, Shek Chan, Razel Sadaya, Youry Khmelevsky, Mathias Clement, Florence Wing Yau Cheng, Joe Mahony, Michael Ferri

    Abstract: This paper presents the development process of a natural language to SQL model using the T5 model as the basis. The models, developed in August 2022 for an online transaction processing system and a data warehouse, have a 73% and 84% exact match accuracy respectively. These models, in conjunction with other work completed in the research project, were implemented for several companies and used s…

    Submitted 12 December, 2023; originally announced December 2023.

  30. arXiv:2312.10072  [pdf, other]

    cs.HC cs.AI cs.LG stat.AP

    Assessing the Usability of GutGPT: A Simulation Study of an AI Clinical Decision Support System for Gastrointestinal Bleeding Risk

    Authors: Colleen Chan, Kisung You, Sunny Chung, Mauro Giuffrè, Theo Saarinen, Niroop Rajashekar, Yuan Pu, Yeo Eun Shin, Loren Laine, Ambrose Wong, René Kizilcec, Jasjeet Sekhon, Dennis Shung

    Abstract: Applications of large language models (LLMs) like ChatGPT have potential to enhance clinical decision support through conversational interfaces. However, challenges of human-algorithmic interaction and clinician trust are poorly understood. GutGPT, a LLM for gastrointestinal (GI) bleeding risk prediction and management guidance, was deployed in clinical simulation scenarios alongside the electroni…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10, 2023, New Orleans, United States, 11 pages

  31. arXiv:2312.09534  [pdf, other]

    cs.CV

    WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Matthew Waliman, Yunhao Ba, Alex Wong, Achuta Kadambi

    Abstract: The introduction of large, foundational models to computer vision has led to drastically improved performance on the task of semantic segmentation. However, these existing methods exhibit a large performance drop when testing on images degraded by weather conditions such as rain, fog, or snow. We introduce a general paired-training method that can be applied to all current foundational model archi…

    Submitted 14 December, 2023; originally announced December 2023.

  32. arXiv:2312.09232  [pdf, other]

    cs.CV

    DVQI: A Multi-task, Hardware-integrated Artificial Intelligence System for Automated Visual Inspection in Electronics Manufacturing

    Authors: Audrey Chung, Francis Li, Jeremy Ward, Andrew Hryniowski, Alexander Wong

    Abstract: As electronics manufacturers continue to face pressure to increase production efficiency amid difficulties with supply chains and labour shortages, many printed circuit board assembly (PCBA) manufacturers have begun to invest in automation and technological innovations to remain competitive. One such method is to leverage artificial intelligence (AI) to greatly augment existing manufacturing proce…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: 8 pages

  33. arXiv:2312.06192  [pdf, other]

    cs.CV

    NutritionVerse-Synth: An Open Access Synthetically Generated 2D Food Scene Dataset for Dietary Intake Estimation

    Authors: Saeejith Nair, Chi-en Amy Tai, Yuhao Chen, Alexander Wong

    Abstract: Manually tracking nutritional intake via food diaries is error-prone and burdensome. Automated computer vision techniques show promise for dietary monitoring but require large and diverse food image datasets. To address this need, we introduce NutritionVerse-Synth (NV-Synth), a large-scale synthetic food image dataset. NV-Synth contains 84,984 photorealistic meal images rendered from 7,082 dynamic…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 6 pages

  34. arXiv:2312.05171  [pdf, other]

    cs.AI cs.NE

    DARLEI: Deep Accelerated Reinforcement Learning with Evolutionary Intelligence

    Authors: Saeejith Nair, Mohammad Javad Shafiee, Alexander Wong

    Abstract: We present DARLEI, a framework that combines evolutionary algorithms with parallelized reinforcement learning for efficiently training and evolving populations of UNIMAL agents. Our approach utilizes Proximal Policy Optimization (PPO) for individual agent learning and pairs it with a tournament selection-based generational learning mechanism to foster morphological evolution. By building on Nvidia…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 9 pages

  35. arXiv:2312.03540  [pdf, other]

    cs.CV

    FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation

    Authors: Olivia Markham, Yuhao Chen, Chi-en Amy Tai, Alexander Wong

    Abstract: Current state-of-the-art image generation models such as Latent Diffusion Models (LDMs) have demonstrated the capacity to produce visually striking food-related images. However, these generated images often exhibit an artistic or surreal quality that diverges from the authenticity of real-world food representations. This inadequacy renders them impractical for applications requiring realistic food…

    Submitted 6 December, 2023; originally announced December 2023.

  36. arXiv:2312.00944  [pdf, other]

    cs.CV cs.GR

    Enhancing Diffusion Models with 3D Perspective Geometry Constraints

    Authors: Rishi Upadhyay, Howard Zhang, Yunhao Ba, Ethan Yang, Blake Gella, Sicheng Jiang, Alex Wong, Achuta Kadambi

    Abstract: While perspective is a well-studied topic in art, it is generally taken for granted in images. However, for the recent wave of high-quality image synthesis methods such as latent diffusion models, perspective accuracy is not an explicit requirement. Since these methods are capable of outputting a wide gamut of possible images, it is difficult for these synthesized images to adhere to the principle…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Project Webpage: http://visual.ee.ucla.edu/diffusionperspective.htm/

  37. arXiv:2312.00837  [pdf, other]

    eess.IV cs.CV

    An Adaptive Correspondence Scoring Framework for Unsupervised Image Registration of Medical Images

    Authors: Xiaoran Zhang, John C. Stendahl, Lawrence Staib, Albert J. Sinusas, Alex Wong, James S. Duncan

    Abstract: We propose an adaptive training scheme for unsupervised medical image registration. Existing methods rely on image reconstruction as the primary supervision signal. However, nuisance variables (e.g. noise and covisibility) often cause the loss of correspondence between medical images, violating the Lambertian assumption in physical waves (e.g. ultrasound) and consistent imaging acquisition. As the…

    Submitted 30 November, 2023; originally announced December 2023.

  38. arXiv:2312.00836  [pdf, other]

    eess.IV cs.CV

    Heteroscedastic Uncertainty Estimation for Probabilistic Unsupervised Registration of Noisy Medical Images

    Authors: Xiaoran Zhang, Daniel H. Pak, Shawn S. Ahn, Xiaoxiao Li, Chenyu You, Lawrence Staib, Albert J. Sinusas, Alex Wong, James S. Duncan

    Abstract: This paper proposes a heteroscedastic uncertainty estimation framework for unsupervised medical image registration. Existing methods rely on objectives (e.g. mean-squared error) that assume a uniform noise level across the image, disregarding the heteroscedastic and input-dependent characteristics of noise distribution in real-world medical images. This further introduces noisy gradients due to un…

    Submitted 30 November, 2023; originally announced December 2023.

  39. arXiv:2311.18612  [pdf, other]

    eess.IV cs.CV

    Cancer-Net PCa-Gen: Synthesis of Realistic Prostate Diffusion Weighted Imaging Data via Anatomic-Conditional Controlled Latent Diffusion

    Authors: Aditya Sridhar, Chi-en Amy Tai, Hayden Gunraj, Yuhao Chen, Alexander Wong

    Abstract: In Canada, prostate cancer is the most common form of cancer in men and accounted for 20% of new cancer cases for this demographic in 2022. Due to recent successes in leveraging machine learning for clinical decision support, there has been significant interest in the development of deep neural networks for prostate cancer diagnosis, prognosis, and treatment planning using diffusion weighted imagi…

    Submitted 30 November, 2023; originally announced November 2023.

  40. arXiv:2311.17677  [pdf, other]

    eess.IV cs.CV

    COVIDx CXR-4: An Expanded Multi-Institutional Open-Source Benchmark Dataset for Chest X-ray Image-Based Computer-Aided COVID-19 Diagnostics

    Authors: Yifan Wu, Hayden Gunraj, Chi-en Amy Tai, Alexander Wong

    Abstract: The global ramifications of the COVID-19 pandemic remain significant, exerting persistent pressure on nations even three years after its initial outbreak. Deep learning models have shown promise in improving COVID-19 diagnostics but require diverse and larger-scale datasets to improve performance. In this paper, we introduce COVIDx CXR-4, an expanded multi-institutional open-source benchmark datas…

    Submitted 29 November, 2023; originally announced November 2023.

  41. arXiv:2311.11656  [pdf, other]

    eess.IV cs.CV

    Double-Condensing Attention Condenser: Leveraging Attention in Deep Learning to Detect Skin Cancer from Skin Lesion Images

    Authors: Chi-en Amy Tai, Elizabeth Janes, Chris Czarnecki, Alexander Wong

    Abstract: Skin cancer is the most common type of cancer in the United States and is estimated to affect one in five Americans. Recent advances have demonstrated strong performance on skin cancer detection, as exemplified by state of the art performance in the SIIM-ISIC Melanoma Classification Challenge; however these solutions leverage ensembles of complex deep neural architectures requiring immense storage…

    Submitted 20 November, 2023; originally announced November 2023.

  42. arXiv:2311.11647  [pdf, other]

    cs.CV

    Cancer-Net PCa-Data: An Open-Source Benchmark Dataset for Prostate Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data

    Authors: Hayden Gunraj, Chi-en Amy Tai, Alexander Wong

    Abstract: The recent introduction of synthetic correlated diffusion (CDI$^s$) imaging has demonstrated significant potential in the realm of clinical decision support for prostate cancer (PCa). CDI$^s$ is a new form of magnetic resonance imaging (MRI) designed to characterize tissue characteristics through the joint correlation of diffusion signal attenuation across different Brownian motion sensitivities.…

    Submitted 20 November, 2023; originally announced November 2023.

  43. arXiv:2311.09566  [pdf, other]

    cs.LG

    A Knowledge Distillation Approach for Sepsis Outcome Prediction from Multivariate Clinical Time Series

    Authors: Anna Wong, Shu Ge, Nassim Oufattole, Adam Dejl, Megan Su, Ardavan Saeedi, Li-wei H. Lehman

    Abstract: Sepsis is a life-threatening condition triggered by an extreme infection response. Our objective is to forecast sepsis patient outcomes using their medical history and treatments, while learning interpretable state representations to assess patients' risks in developing various adverse outcomes. While neural networks excel in outcome prediction, their limited interpretability remains a key issue.…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 12 pages

  44. arXiv:2310.09739  [pdf, other]

    cs.CV

    AugUndo: Scaling Up Augmentations for Unsupervised Depth Completion

    Authors: Yangchao Wu, Tian Yu Liu, Hyoungseob Park, Stefano Soatto, Dong Lao, Alex Wong

    Abstract: Unsupervised depth completion methods are trained by minimizing sparse depth and image reconstruction error. Block artifacts from resampling, intensity saturation, and occlusions are amongst the many undesirable by-products of common data augmentation schemes that affect image reconstruction quality, and thus the training signal. Hence, typical augmentations on images viewed as essential to traini…

    Submitted 25 December, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

  45. arXiv:2310.06164  [pdf, other]

    cs.CV cs.LG cs.RO

    DEUX: Active Exploration for Learning Unsupervised Depth Perception

    Authors: Marvin Chancán, Alex Wong, Ian Abraham

    Abstract: Depth perception models are typically trained on non-interactive datasets with predefined camera trajectories. However, this often introduces systematic biases into the learning process correlated to specific camera paths chosen during data acquisition. In this paper, we investigate the role of how data is collected for learning depth completion, from a robot navigation perspective, by leveraging…

    Submitted 16 September, 2023; originally announced October 2023.

  46. arXiv:2310.03967  [pdf, other]

    cs.CV cs.AI

    Sub-token ViT Embedding via Stochastic Resonance Transformers

    Authors: Dong Lao, Yangchao Wu, Tian Yu Liu, Alex Wong, Stefano Soatto

    Abstract: Vision Transformer (ViT) architectures represent images as collections of high-dimensional vectorized tokens, each corresponding to a rectangular non-overlapping patch. This representation trades spatial granularity for embedding dimensionality, and results in semantically rich but spatially coarsely quantized feature maps. In order to retrieve spatial details beneficial to fine-grained inference…

    Submitted 6 May, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  47. arXiv:2309.14293  [pdf, other]

    cs.CV cs.AI cs.LG

    NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields

    Authors: Saeejith Nair, Yuhao Chen, Mohammad Javad Shafiee, Alexander Wong

    Abstract: Neural radiance fields (NeRFs) enable high-quality novel view synthesis, but their high computational complexity limits deployability. While existing neural-based solutions strive for efficiency, they use one-size-fits-all architectures regardless of scene complexity. The same architecture may be unnecessarily large for simple scenes but insufficient for complex ones. Thus, there is a need to dyna…

    Submitted 11 December, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: 8 pages

  48. arXiv:2309.13866  [pdf, other]

    cs.LG cs.CV

    On Calibration of Modern Quantized Efficient Neural Networks

    Authors: Joey Kuang, Alexander Wong

    Abstract: We explore calibration properties at various precisions for three architectures: ShuffleNetv2, GhostNet-VGG, and MobileOne; and two datasets: CIFAR-100 and PathMNIST. The quality of calibration is observed to track the quantization quality; it is well-documented that performance worsens with lower precision, and we observe a similar correlation with poorer calibration. This becomes especially egre…

    Submitted 26 September, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted as an extended abstract at the ICCV 2023 Workshop on Low-Bit Quantized Neural Networks. Corrected some typos

  49. arXiv:2309.13773  [pdf, other]

    cs.LG cs.AI cs.CV

    GHN-QAT: Training Graph Hypernetworks to Predict Quantization-Robust Parameters of Unseen Limited Precision Neural Networks

    Authors: Stone Yun, Alexander Wong

    Abstract: Graph Hypernetworks (GHN) can predict the parameters of varying unseen CNN architectures with surprisingly good accuracy at a fraction of the cost of iterative optimization. Following these successes, preliminary research has explored the use of GHNs to predict quantization-robust parameters for 8-bit and 4-bit quantized CNNs. However, this early work leveraged full-precision float32 training and…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: Poster and extended abstract to be presented at the Workshop for Low Bit Quantized Neural Networks (LQBNN) @ ICCV 2023

  50. arXiv:2309.07704  [pdf, other]

    cs.CV cs.AI

    NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches

    Authors: Chi-en Amy Tai, Matthew Keller, Saeejith Nair, Yuhao Chen, Yifan Wu, Olivia Markham, Krish Parmar, Pengcheng Xi, Heather Keller, Sharon Kirkpatrick, Alexander Wong

    Abstract: Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating, as malnutrition has been directly linked to decreased quality of life. However self-reporting methods such as food diaries suffer from substantial bias. Other conventional dietary assessment techniques and emerging alternative approaches such as mobile applications incur high time costs an…

    Submitted 14 September, 2023; originally announced September 2023.