Showing 1–50 of 241 results for author: Fu, K

  1. arXiv:2408.16921  [pdf, other]

    quant-ph

    Rapid, in-situ neutralization of nitrogen- and silicon-vacancy centers in diamond using above-band-gap optical excitation

    Authors: Christian Pederson, Nicholas S. Yama, Lane Beale, Matthew L. Markham, Kai-Mei C. Fu

    Abstract: The charge state of a quantum point defect in a solid state host strongly determines its optical and spin characteristics. Consequently, techniques for controlling the charge state are required to realize technologies such as quantum networking and sensing. In this work we demonstrate the use of deep-ultraviolet (DUV) radiation to dynamically neutralize nitrogen- (NV) and silicon-vacancy (SiV) cen… ▽ More

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.15511  [pdf, other]

    cs.RO cs.AI

    AeroVerse: UAV-Agent Benchmark Suite for Simulating, Pre-training, Finetuning, and Evaluating Aerospace Embodied World Models

    Authors: Fanglong Yao, Yuanchang Yue, Youzhi Liu, Xian Sun, Kun Fu

    Abstract: Aerospace embodied intelligence aims to empower unmanned aerial vehicles (UAVs) and other aerospace platforms to achieve autonomous perception, cognition, and action, as well as egocentric active interaction with humans and the environment. The aerospace embodied world model serves as an effective means to realize the autonomous intelligence of UAVs and represents a necessary pathway toward aerosp… ▽ More

    Submitted 27 August, 2024; originally announced August 2024.

  3. arXiv:2408.07305  [pdf, ps, other]

    cs.LG

    Learning Decisions Offline from Censored Observations with ε-insensitive Operational Costs

    Authors: Minxia Chen, Ke Fu, Teng Huang, Miao Bai

    Abstract: Many important managerial decisions are made based on censored observations. Making decisions without adequately handling the censoring leads to inferior outcomes. We investigate the data-driven decision-making problem with an offline dataset containing the feature data and the censored historical data of the variable of interest without the censoring indicators. Without assuming the underlying di… ▽ More

    Submitted 14 August, 2024; originally announced August 2024.

  4. arXiv:2408.04213  [pdf, other]

    stat.ME

    Hypothesis testing for general network models

    Authors: Kang Fu, Jianwei Hu, Seydou Keita

    Abstract: The network data has attracted considerable attention in modern statistics. In research on complex network data, one key issue is finding its underlying connection structure given a network sample. The methods that have been proposed in literature usually assume that the underlying structure is a known model. In practice, however, the true model is usually unknown, and network learning procedures… ▽ More

    Submitted 8 August, 2024; originally announced August 2024.

  5. arXiv:2408.00525  [pdf, other]

    cs.HC cs.DM cs.LG

    Identifying the Hierarchical Emotional Areas in the Human Brain Through Information Fusion

    Authors: Zhongyu Huang, Changde Du, Chaozhuo Li, Kaicheng Fu, Huiguang He

    Abstract: The brain basis of emotion has consistently received widespread attention, attracting a large number of studies to explore this cutting-edge topic. However, the methods employed in these studies typically only model the pairwise relationship between two brain regions, while neglecting the interactions and information fusion among multiple brain regions -- one of the key ideas of the p… ▽ More

    Submitted 1 August, 2024; originally announced August 2024.

  6. arXiv:2407.09209  [pdf, other]

    cs.CL eess.AS

    Pronunciation Assessment with Multi-modal Large Language Models

    Authors: Kaiqi Fu, Linkai Peng, Nan Yang, Shuran Zhou

    Abstract: Large language models (LLMs), renowned for their powerful conversational abilities, are widely recognized as exceptional tools in the field of education, particularly in the context of automated intelligent instruction systems for language learning. In this paper, we propose a scoring system based on LLMs, motivated by their positive impact on text-related scoring tasks. Specifically, the speech e… ▽ More

    Submitted 18 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  7. arXiv:2407.02682  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Epitaxial Growth of Rutile GeO$_2$ via MOCVD

    Authors: Imteaz Rahaman, Bobby Duersch, Hunter D. Ellis, Michael A. Scarpulla, Kai Fu

    Abstract: Rutile Germanium Dioxide (r-GeO$_2$) has been identified as an ultrawide bandgap (UWBG) semiconductor recently, featuring a bandgap of 4.68 eV, comparable to Ga$_2$O$_3$ but offering bipolar dopability, higher electron mobility, higher thermal conductivity, and higher Baliga's figure of merit (BFOM). These superior properties position GeO$_2$ as a promising material for various semiconductor applic… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 20 pages, 5 figures, 3 tables

  8. arXiv:2407.01067  [pdf, other]

    cs.AI cs.CL cs.CV cs.HC cs.LG

    Human-like object concept representations emerge naturally in multimodal large language models

    Authors: Changde Du, Kaicheng Fu, Bincheng Wen, Yi Sun, Jie Peng, Wei Wei, Ying Gao, Shengpei Wang, Chuncheng Zhang, Jinpeng Li, Shuang Qiu, Le Chang, Huiguang He

    Abstract: The conceptualization and categorization of natural objects in the human mind have long intrigued cognitive scientists and neuroscientists, offering crucial insights into human perception and cognition. Recently, the rapid development of Large Language Models (LLMs) has raised the attractive question of whether these models can also develop human-like object representations through exposure to vas… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  9. arXiv:2406.15848  [pdf, other]

    cs.CV

    Quality-guided Skin Tone Enhancement for Portrait Photography

    Authors: Shiqi Gao, Huiyu Duan, Xinyue Li, Kang Fu, Yicong Peng, Qihang Xu, Yuanyuan Chang, Jia Wang, Xiongkuo Min, Guangtao Zhai

    Abstract: In recent years, learning-based color and tone enhancement methods for photos have become increasingly popular. However, most learning-based image enhancement methods just learn a mapping from one distribution to another based on one dataset, lacking the ability to adjust images continuously and controllably. It is important to enable the learning-based enhancement models to adjust an image contin… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  10. arXiv:2406.08804  [pdf, other]

    cs.DC cs.AI cs.IR

    DIET: Customized Slimming for Incompatible Networks in Sequential Recommendation

    Authors: Kairui Fu, Shengyu Zhang, Zheqi Lv, Jingyuan Chen, Jiwei Li

    Abstract: Due to the continuously improving capabilities of mobile edges, recommender systems start to deploy models on edges to alleviate network congestion caused by frequent mobile requests. Several studies have leveraged the proximity of edge-side to real-time data, fine-tuning them to create edge-specific models. Despite their significant progress, these methods require substantial on-edge computationa… ▽ More

    Submitted 15 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  11. arXiv:2406.02017  [pdf, other]

    cs.LG stat.ML

    On the Mode-Seeking Properties of Langevin Dynamics

    Authors: Xiwei Cheng, Kexin Fu, Farzan Farnia

    Abstract: The Langevin Dynamics framework, which aims to generate samples from the score function of a probability distribution, is widely used for analyzing and interpreting score-based generative modeling. While the convergence behavior of Langevin Dynamics under unimodal distributions has been extensively studied in the literature, in practice the data distribution could consist of multiple distinct mode… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  12. Variance-reduced sampling importance resampling

    Authors: Yao Xiao, Kang Fu, Kun Li

    Abstract: The sampling importance resampling method is widely utilized in various fields, such as numerical integration and statistical simulation. In this paper, two modified methods are presented by incorporating two variance reduction techniques commonly used in Monte Carlo simulation, namely antithetic sampling and Latin hypercube sampling, into the process of sampling importance resampling method respe… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  13. arXiv:2405.20600  [pdf, other]

    cs.AI

    Multi-label Class Incremental Emotion Decoding with Augmented Emotional Semantics Learning

    Authors: Kaicheng Fu, Changde Du, Xiaoyu Chen, Jie Peng, Huiguang He

    Abstract: Emotion decoding plays an important role in affective human-computer interaction. However, previous studies ignored the dynamic real-world scenario, where humans experience a blend of multiple emotions which are incrementally integrated into the model, leading to the multi-label class incremental learning (MLCIL) problem. Existing methods have difficulty in solving the MLCIL issue due to notorious cata… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  14. arXiv:2405.19735  [pdf, other]

    cs.CV

    Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes

    Authors: Yong-Qiang Mao, Hanbo Bi, Xuexue Li, Kaiqiang Chen, Zhirui Wang, Xian Sun, Kun Fu

    Abstract: Thanks to the application of deep learning technology in point cloud processing of the remote sensing field, point cloud segmentation has become a research hotspot in recent years, which can be applied to real-world 3D, smart cities, and other fields. Although existing solutions have made unprecedented progress, they ignore the inherent characteristics of point clouds in remote sensing fields that… ▽ More

    Submitted 4 August, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  15. arXiv:2405.19689  [pdf, other]

    cs.CV cs.IR

    Uncertainty-aware sign language video retrieval with probability distribution modeling

    Authors: Xuan Wu, Hongxiang Li, Yuanjiang Luo, Xuxin Cheng, Xianwei Zhuang, Meng Cao, Keren Fu

    Abstract: Sign language video retrieval plays a key role in facilitating information access for the deaf community. Despite significant advances in video-text retrieval, the complexity and inherent uncertainty of sign language preclude the direct application of these techniques. Previous methods achieve the mapping between sign language video and text through fine-grained modal alignment. However, due to th… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  16. arXiv:2405.17140  [pdf, other]

    cs.CV

    SDL-MVS: View Space and Depth Deformable Learning Paradigm for Multi-View Stereo Reconstruction in Remote Sensing

    Authors: Yong-Qiang Mao, Hanbo Bi, Liangyu Xu, Kaiqiang Chen, Zhirui Wang, Xian Sun, Kun Fu

    Abstract: Research on multi-view stereo based on remote sensing images has promoted the development of large-scale urban 3D reconstruction. However, remote sensing multi-view image data suffers from the problems of occlusion and uneven brightness between views during acquisition, which leads to the problem of blurred details in depth estimation. To solve the above problem, we re-examine the deformable learn… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  17. arXiv:2405.07564  [pdf]

    cond-mat.mtrl-sci

    Growth of GeO2 on R-plane and C-plane Sapphires by MOCVD

    Authors: Imteaz Rahaman, Hunter D. Ellis, Kathy Anderson, Michael A. Scarpulla, Kai Fu

    Abstract: Rutile Germanium Dioxide (GeO2) has been recently theoretically identified as an ultrawide bandgap (UWBG) semiconductor with bandgap 4.68 eV similar to Ga2O3 but having bipolar dopability and ~2x higher electron mobility, Baliga figure of merit (BFOM) and thermal conductivity than Ga2O3. Bulk crystal growth is rapidly moving towards making large sized native substrates available. These outstanding… ▽ More

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 22 pages, 14 Figures

  18. arXiv:2404.13322  [pdf, other]

    cs.LG cs.AI

    MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities

    Authors: Kunxi Li, Tianyu Zhan, Kairui Fu, Shengyu Zhang, Kun Kuang, Jiwei Li, Zhou Zhao, Fei Wu

    Abstract: In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present Mer… ▽ More

    Submitted 17 June, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  19. arXiv:2404.08980  [pdf, other]

    cs.LG stat.ML

    Stability and Generalization in Free Adversarial Training

    Authors: Xiwei Cheng, Kexin Fu, Farzan Farnia

    Abstract: While adversarial training methods have resulted in significant improvements in the deep neural nets' robustness against norm-bounded adversarial perturbations, their generalization performance from training samples to test data has been shown to be considerably worse than standard empirical risk minimization methods. Several recent studies seek to connect the generalization behavior of adversaria… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

  20. arXiv:2404.08195  [pdf, other]

    cs.CV

    Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation

    Authors: Zhiwei Yang, Yucong Meng, Kexue Fu, Shuo Wang, Zhijian Song

    Abstract: Weakly supervised semantic segmentation (WSSS) with image-level labels intends to achieve dense tasks without laborious annotations. However, due to the ambiguous contexts and fuzzy regions, the performance of WSSS, especially the stages of generating Class Activation Maps (CAMs) and refining pseudo masks, widely suffers from ambiguity while being barely noticed by previous literature. In this wor… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

  21. arXiv:2403.18458  [pdf, other]

    astro-ph.GA astro-ph.IM

    Dust Extinction Measures for $z\sim 8$ Galaxies using Machine Learning on JWST Imaging

    Authors: Kwan Lin Kristy Fu, Christopher J. Conselice, Leonardo Ferreira, Thomas Harvey, Qiao Duan, Nathan Adams, Duncan Austin

    Abstract: We present the results of a machine learning study to measure the dust content of galaxies observed with JWST at z > 6 through the use of trained neural networks based on high-resolution IllustrisTNG simulations. Dust is an important unknown in the evolution and observability of distant galaxies and is degenerate with other stellar population features through spectral energy fitting. As such, we d… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: submitted to MNRAS

  22. arXiv:2403.18238  [pdf, other]

    cs.CV

    TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes

    Authors: Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Yongqiang Mao, Hanbo Bi, Chenglong Liu, Xian Sun, Kun Fu

    Abstract: As drone technology advances, using unmanned aerial vehicles for aerial surveys has become the dominant trend in modern low-altitude remote sensing. The surge in aerial video data necessitates accurate prediction for future scenarios and motion states of the interested target, particularly in applications like traffic management and disaster response. Existing video prediction methods focus solely… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 17 pages, 9 figures

  23. arXiv:2403.09675  [pdf, other]

    cs.CV cs.GR

    Open-Universe Indoor Scene Generation using LLM Program Synthesis and Uncurated Object Databases

    Authors: Rio Aguina-Kang, Maxim Gumin, Do Heon Han, Stewart Morris, Seung Jean Yoo, Aditya Ganeshan, R. Kenny Jones, Qiuhong Anna Wei, Kailiang Fu, Daniel Ritchie

    Abstract: We present a system for generating indoor scenes in response to text prompts. The prompts are not limited to a fixed vocabulary of scene descriptions, and the objects in generated scenes are not restricted to a fixed set of object categories -- we call this setting open-universe indoor scene generation. Unlike most prior work on indoor scene generation, our system does not require a large training dataset of ex… ▽ More

    Submitted 4 February, 2024; originally announced March 2024.

    Comments: See ancillary files for link to supplemental material

  24. arXiv:2403.08973  [pdf, other]

    physics.flu-dyn

    Measurements and modeling of induced flow in collective vertical migration

    Authors: Nina Mohebbi, Joonha Hwang, Matthew K. Fu, John O. Dabiri

    Abstract: Hydrodynamic interactions among swimming or flying organisms can lead to complex flows on the scale of the group. These emergent fluid dynamics are often more complex than a linear superposition of individual organism flows, especially at intermediate Reynolds numbers. This paper presents an approach to estimate the flow induced by multiple swimmer wakes in proximity using an analytical model that… ▽ More

    Submitted 13 March, 2024; originally announced March 2024.

  25. arXiv:2403.06068  [pdf, other]

    math.ST

    Hypothesis testing for homogenous of nodes in $β$-models

    Authors: Kang Fu, Jianwei Hu, Meng Sun

    Abstract: The $β$-model has been extensively utilized to model degree heterogeneity in networks, wherein each node is assigned a unique parameter. In this article, we consider the hypothesis testing problem that two nodes $i$ and $j$ of a $β$-model have the same node parameter. We prove that the null distribution of the proposed statistic converges in distribution to the standard normal distribution. Furthe… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

  26. arXiv:2403.04306  [pdf, other]

    cs.CV cs.AI cs.LG

    Effectiveness Assessment of Recent Large Vision-Language Models

    Authors: Yao Jiang, Xinyu Yan, Ge-Peng Ji, Keren Fu, Meijun Sun, Huan Xiong, Deng-Ping Fan, Fahad Shahbaz Khan

    Abstract: The advent of large vision-language models (LVLMs) represents a remarkable advance in the quest for artificial general intelligence. However, the model's effectiveness in both specialized and general tasks warrants further investigation. This paper endeavors to evaluate the competency of popular LVLMs in specialized and general tasks, respectively, aiming to offer a comprehensive understanding of… ▽ More

    Submitted 11 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted by Visual Intelligence

  27. arXiv:2403.01968  [pdf, other]

    cs.CV

    Explicit Motion Handling and Interactive Prompting for Video Camouflaged Object Detection

    Authors: Xin Zhang, Tao Xiao, Gepeng Ji, Xuan Wu, Keren Fu, Qijun Zhao

    Abstract: Camouflage poses challenges in distinguishing a static target, whereas any movement of the target can break this disguise. Existing video camouflaged object detection (VCOD) approaches take noisy motion estimation as input or model motion implicitly, restricting detection performance in complex dynamic scenes. In this paper, we propose a novel Explicit Motion handling and Interactive Prompting fra… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 9 pages, 6 figures

  28. arXiv:2402.18467  [pdf, other]

    cs.CV

    Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation

    Authors: Zhiwei Yang, Kexue Fu, Minghong Duan, Linhao Qu, Shuo Wang, Zhijian Song

    Abstract: Weakly supervised semantic segmentation (WSSS) with image-level labels aims to achieve segmentation tasks without dense annotations. However, attributed to the frequent coupling of co-occurring objects and the limited supervision from image-level labels, the challenging co-occurrence problem is widely present and leads to false activation of objects in WSSS. In this work, we devise a 'Separate and… ▽ More

    Submitted 21 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR 2024

  29. SFTformer: A Spatial-Frequency-Temporal Correlation-Decoupling Transformer for Radar Echo Extrapolation

    Authors: Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Fanglong Yao, Xian Sun, Kun Fu

    Abstract: Extrapolating future weather radar echoes from past observations is a complex task vital for precipitation nowcasting. The spatial morphology and temporal evolution of radar echoes exhibit a certain degree of correlation, yet they also possess independent characteristics. Existing methods learn unified spatial and temporal representations in a highly coupled feature space, emphasizing the correla… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 16 pages, 11 figures, TGRS

  30. arXiv:2402.17964  [pdf, other]

    physics.bio-ph quant-ph

    Direct measure of DNA bending by quantum magnetic imaging of a nano-mechanical torque-balance

    Authors: Zeeshawn Kazi, Isaac M. Shelby, Ruhee Nirodi, Joseph Turnbull, Hideyuki Watanabe, Kohei M. Itoh, Paul A. Wiggins, Kai-Mei C. Fu

    Abstract: DNA flexibility is a key determinant of biological function, from nucleosome positioning to transcriptional regulation, motivating a direct measurement of the bend-torque response of individual DNA molecules. In this work, DNA bending is detected using a nano-mechanical torque balance formed by tethering a ferromagnetic nanoparticle probe by an individual DNA molecule to a diamond magnetic field i… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

  31. arXiv:2402.11450  [pdf, other]

    cs.RO

    Learning to Learn Faster from Human Feedback with Language Model Predictive Control

    Authors: Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil Joshi, Ben Jyenis, Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore, et al. (25 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for o… ▽ More

    Submitted 31 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  32. arXiv:2402.10435  [pdf, other]

    cs.CV

    Dynamic Patch-aware Enrichment Transformer for Occluded Person Re-Identification

    Authors: Xin Zhang, Keren Fu, Qijun Zhao

    Abstract: Person re-identification (re-ID) continues to pose a significant challenge, particularly in scenarios involving occlusions. Prior approaches aimed at tackling occlusions have predominantly focused on aligning physical body features through the utilization of external semantic cues. However, these methods tend to be intricate and susceptible to noise. To address the aforementioned challenges, we pr… ▽ More

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 12 pages, 6 figures

  33. arXiv:2402.09446  [pdf, other]

    cs.GR physics.comp-ph

    MeshAC: A 3D Mesh Generation and Adaptation Package for Multiscale Coupling Methods

    Authors: Kejie Fu, Mingjie Liao, Yangshuai Wang, Jianjun Chen, Lei Zhang

    Abstract: This paper introduces the MeshAC package, which generates three-dimensional adaptive meshes tailored for the efficient and robust implementation of multiscale coupling methods. While Delaunay triangulation is commonly used for mesh generation across the entire computational domain, generating meshes for multiscale coupling methods is more challenging due to intrinsic discrete structures such as de… ▽ More

    Submitted 31 January, 2024; originally announced February 2024.

  34. arXiv:2401.14579  [pdf]

    cs.CV

    Recognizing Multiple Ingredients in Food Images Using a Single-Ingredient Classification Model

    Authors: Kun Fu, Ying Dai

    Abstract: Recognizing food images presents unique challenges due to the variable spatial layout and shape changes of ingredients with different cooking and cutting methods. This study introduces an advanced approach for recognizing ingredients segmented from food images. The method localizes the candidate regions of the ingredients using the locating and sliding window techniques. Then, these regions are as… ▽ More

    Submitted 18 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: 9 pages, 21 figures, 6 tables

  35. arXiv:2401.13127  [pdf, other]

    cs.RO cs.MA

    Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities

    Authors: Pierce Howell, Max Rudolph, Reza Torbati, Kevin Fu, Harish Ravichandar

    Abstract: Recent advances in multi-agent reinforcement learning (MARL) are enabling impressive coordination in heterogeneous multi-robot teams. However, existing approaches often overlook the challenge of generalizing learned policies to teams of new compositions, sizes, and robots. While such generalization might not be important in teams of virtual agents that can retrain policies on-demand, it is pivotal… ▽ More

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Presented at the 7th Conference on Robot Learning (CoRL 2023), Atlanta, USA

  36. arXiv:2401.03331  [pdf, other]

    cs.CV cs.LG

    Walnut Detection Through Deep Learning Enhanced by Multispectral Synthetic Images

    Authors: Kaiming Fu, Tong Lei, Maryia Halubok, Brian N. Bailey

    Abstract: The accurate identification of walnuts within orchards brings forth a plethora of advantages, profoundly amplifying the efficiency and productivity of walnut orchard management. Nevertheless, the unique characteristics of walnut trees, characterized by their closely resembling shapes, colors, and textures between the walnuts and leaves, present a formidable challenge in precisely distinguishing be… ▽ More

    Submitted 31 October, 2023; originally announced January 2024.

    Comments: This work was presented at IEEE/RSI International Conference on Intelligent Robots and Systems (IROS) Workshop

  37. arXiv:2401.01569  [pdf, other]

    cs.CV

    AttentionLut: Attention Fusion-based Canonical Polyadic LUT for Real-time Image Enhancement

    Authors: Kang Fu, Yicong Peng, Zicheng Zhang, Qihang Xu, Xiaohong Liu, Jia Wang, Guangtao Zhai

    Abstract: Recently, many algorithms have employed image-adaptive lookup tables (LUTs) to achieve real-time image enhancement. Nonetheless, a prevailing trend among existing methods has been the employment of linear combinations of basic LUTs to formulate image-adaptive LUTs, which limits the generalization ability of these methods. To address this limitation, we propose a novel framework named AttentionLut… ▽ More

    Submitted 3 January, 2024; originally announced January 2024.

  38. arXiv:2401.00496  [pdf, other]

    cs.CV cs.AI cs.LG

    SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge

    Authors: Dimitrios Psychogyios, Emanuele Colleoni, Beatrice Van Amsterdam, Chih-Yang Li, Shu-Yu Huang, Yuchong Li, Fucang Jia, Baosheng Zou, Guotai Wang, Yang Liu, Maxence Boels, Jiayu Huo, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin, Mengya Xu, An Wang, Yanan Wu, Long Bai, Hongliang Ren, Atsushi Yamada, Yuriko Harai, Yuto Ishikawa, Kazuyuki Hayashi, et al. (25 additional authors not shown)

    Abstract: Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition and segmentation approaches outperform classical methods, relying, however, on large, annotated datasets. Furthermore, action recognition and tool segme… ▽ More

    Submitted 23 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  39. arXiv:2401.00248  [pdf, other]

    cs.CV cs.AI

    Promoting Segment Anything Model towards Highly Accurate Dichotomous Image Segmentation

    Authors: Xianjie Liu, Keren Fu, Qijun Zhao

    Abstract: The Segment Anything Model (SAM) represents a significant breakthrough into foundation models for computer vision, providing a large-scale image segmentation model. However, despite SAM's zero-shot performance, its segmentation masks lack fine-grained details, particularly in accurately delineating object boundaries. We have high expectations regarding whether SAM, as a foundation model, can be im… ▽ More

    Submitted 22 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

  40. arXiv:2312.04831  [pdf, other]

    cs.CV

    Towards Context-Stable and Visual-Consistent Image Inpainting

    Authors: Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, Yanwei Fu

    Abstract: Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks. However, this enhanced generation often introduces context-instability, leading to arbitrary object generation within masked regions. This paper proposes a balanced solution, emphasizing the importance of unmasked regions in guiding inpaintin… ▽ More

    Submitted 17 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Project page: https://yikai-wang.github.io/asuka/ where full-size PDF with appendix is available. Dataset: https://github.com/Yikai-Wang/asuka-misato. Yikai Wang and Chenjie Cao contribute equally

  41. arXiv:2312.03758  [pdf, other]

    cs.AI cs.CL

    Stock Movement and Volatility Prediction from Tweets, Macroeconomic Factors and Historical Prices

    Authors: Shengkun Wang, YangXiao Bai, Taoran Ji, Kaiqun Fu, Linhan Wang, Chang-Tien Lu

    Abstract: Predicting stock market is vital for investors and policymakers, acting as a barometer of the economic health. We leverage social media data, a potent source of public sentiment, in tandem with macroeconomic indicators as government-compiled statistics, to refine stock market predictions. However, prior research using tweet data for stock market prediction faces three challenges. First, the qualit… ▽ More

    Submitted 4 December, 2023; originally announced December 2023.

  42. arXiv:2311.16323  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Robust Diamond/β-Ga2O3 Hetero-p-n-junction Via Mechanically Integrating Their Building Blocks

    Authors: Imteaz Rahaman, Hunter D. Ellis, Kai Fu

    Abstract: We report a novel approach for crafting robust diamond/β-Ga2O3 hetero-p-n-junctions through the mechanical integration of their bulk materials. This resulting heterojunction, with a turn-on voltage of ~2.7 V at room temperature, exhibits resilient electrical performance across a temperature spectrum up to 125°C, displaying minimal hysteresis, measuring as low as 0.2 V at room temperature and… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 13 pages, 4 figures, 1 table, journal and this draft has been submitted to 'Applied Physics Letters'

  43. arXiv:2311.15606  [pdf, other]

    physics.optics

    Selective active resonance tuning for multi-mode nonlinear photonic cavities

    Authors: Alan D. Logan, Nicholas S. Yama, Kai-Mei C. Fu

    Abstract: Resonant enhancement of nonlinear photonic processes is critical for the scalability of applications such as long-distance entanglement generation. To implement nonlinear resonant enhancement, multiple resonator modes must be individually tuned onto a precise set of process wavelengths, which requires multiple linearly-independent tuning methods. Using coupled auxiliary resonators to indirectly tu…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 16 pages, 7 figures

  44. arXiv:2311.06435  [pdf, other]

    physics.optics quant-ph

    Optomechanical ring resonator for efficient microwave-optical frequency conversion

    Authors: I-Tung Chen, Bingzhao Li, Seokhyeong Lee, Srivatsa Chakravarthi, Kai-Mei Fu, Mo Li

    Abstract: Phonons traveling in solid-state devices are emerging as a universal excitation that can couple to different physical systems through mechanical interaction. At microwave frequencies and in solid-state materials, phonons have a similar wavelength to optical photons, enabling them to interact efficiently with light and produce strong optomechanical effects that are highly desirable for classical an…

    Submitted 16 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: 11 pages, 4 figures

  45. ALERTA-Net: A Temporal Distance-Aware Recurrent Networks for Stock Movement and Volatility Prediction

    Authors: Shengkun Wang, YangXiao Bai, Kaiqun Fu, Linhan Wang, Chang-Tien Lu, Taoran Ji

    Abstract: For both investors and policymakers, forecasting the stock market is essential as it serves as an indicator of economic well-being. To this end, we harness the power of social media data, a rich source of public sentiment, to enhance the accuracy of stock market predictions. Diverging from conventional methods, we pioneer an approach that integrates sentiment analysis, macroeconomic indicators, se…

    Submitted 28 October, 2023; originally announced October 2023.

  46. arXiv:2310.15482  [pdf, other]

    cs.CV

    Salient Object Detection in RGB-D Videos

    Authors: Ao Mou, Yukang Lu, Jiahao He, Dingyao Min, Keren Fu, Qijun Zhao

    Abstract: Given the widespread adoption of depth-sensing acquisition devices, RGB-D videos and related data/media have gained considerable traction in various aspects of daily life. Consequently, conducting salient object detection (SOD) in RGB-D videos presents a highly promising and evolving avenue. Despite the potential of this area, SOD in RGB-D videos remains somewhat under-explored, with RGB-D SOD and…

    Submitted 21 May, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: IEEE TIP (under major revision)

  47. arXiv:2310.15138  [pdf, other]

    cs.RO cs.CV

    Fusion-Driven Tree Reconstruction and Fruit Localization: Advancing Precision in Agriculture

    Authors: Kaiming Fu, Peng Wei, Juan Villacres, Zhaodan Kong, Stavros G. Vougioukas, Brian N. Bailey

    Abstract: Fruit distribution is pivotal in shaping the future of both agriculture and agricultural robotics, paving the way for a streamlined supply chain. This study introduces an innovative methodology that harnesses the synergy of RGB imagery, LiDAR, and IMU data, to achieve intricate tree reconstructions and the pinpoint localization of fruits. Such integration not only offers insights into the fruit di…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: This work was presented at IEEE/RSI International Conference on Intelligent Robots and Systems (IROS) Workshop

  48. arXiv:2310.12484  [pdf, ps, other]

    quant-ph

    Creation of color centers in diamond by recoil implantation through dielectric films

    Authors: Yuyang Han, Christian Pederson, Bethany E. Matthews, Nicholas S. Yama, Maxwell F. Parsons, Kai-Mei C. Fu

    Abstract: The need of near-surface color centers in diamond for quantum technologies motivates the controlled doping of specific extrinsic impurities into the crystal lattice. Recent experiments have shown that this can be achieved by momentum transfer from a surface precursor via ion implantation, an approach known as "recoil implantation." Here, we extend this technique to incorporate dielectric precurs…

    Submitted 28 December, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

  49. arXiv:2310.05806  [pdf, other]

    cond-mat.mes-hall quant-ph

    Isolation of Single Donors in ZnO

    Authors: Ethan R. Hansen, Vasileios Niaouris, Bethany E. Matthews, Christian Zimmermann, Xingyi Wang, Roman Kolodka, Lasse Vines, Steven R. Spurgeon, Kai-Mei C. Fu

    Abstract: The shallow donor in zinc oxide (ZnO) is a promising semiconductor spin qubit with optical access. Single indium donors are isolated in a commercial ZnO substrate using plasma focused ion beam (PFIB) milling. Quantum emitters are identified optically by spatial and frequency filtering. The indium donor assignment is based on the optical bound exciton transition energy and magnetic dependence. The…

    Submitted 17 January, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: E. R. Hansen and V. Niaouris contributed equally to this work. 15 pages, 13 figures

  50. arXiv:2310.03941  [pdf, other]

    cs.LG cs.SI

    LaTeX: Language Pattern-aware Triggering Event Detection for Adverse Experience during Pandemics

    Authors: Kaiqun Fu, Yangxiao Bai, Weiwei Zhang, Deepthi Kolady

    Abstract: The COVID-19 pandemic has accentuated socioeconomic disparities across various racial and ethnic groups in the United States. While previous studies have utilized traditional survey methods like the Household Pulse Survey (HPS) to elucidate these disparities, this paper explores the role of social media platforms in both highlighting and addressing these challenges. Drawing from real-time data sou…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:1911.08684