Search | arXiv e-print repository

arXiv:2408.15177 [pdf, other]

Regaining Trust: Impact of Transparent User Interface Design on Acceptance of Camera-Based In-Car Health Monitoring Systems

Authors: Hauke Sandhaus, Madiha Zahrah Choksi, Wendy Ju

Abstract: Introducing in-car health monitoring systems offers substantial potential to improve driver safety. However, camera-based sensing technologies introduce significant privacy concerns. This study investigates the impact of transparent user interface design on user acceptance of these systems. We conducted an online study with 42 participants using prototypes varying in transparency, choice, and deception levels. The prototypes included three onboarding designs: (1) a traditional Terms and Conditions text, (2) a Business Nudge design that subtly encouraged users to accept default data-sharing options, and (3) a Transparent Walk-Through that provided clear, step-by-step explanations of data use and privacy policies. Our findings indicate that transparent design significantly affects user experience measures, including perceived creepiness, trust in data use, and trustworthiness of content. Transparent onboarding processes enhanced user experience and trust without significantly increasing onboarding time. These findings offer practical guidance for designing user-friendly and privacy-respecting in-car health monitoring systems.

Submitted 27 August, 2024; originally announced August 2024.

Comments: About to be published in the AutoUI '24 WiP proceedings

ACM Class: H.5.2; K.6.5

arXiv:2408.12185 [pdf, other]

doi 10.24963/ijcai.2024/520

Rank and Align: Towards Effective Source-free Graph Domain Adaptation

Authors: Junyu Luo, Zhiping Xiao, Yifan Wang, Xiao Luo, Jingyang Yuan, Wei Ju, Langechuan Liu, Ming Zhang

Abstract: Graph neural networks (GNNs) have achieved impressive performance in graph domain adaptation. However, extensive source graphs could be unavailable in real-world scenarios due to privacy and storage concerns. To this end, we investigate an underexplored yet practical problem of source-free graph domain adaptation, which transfers knowledge from source models instead of source graphs to a target domain. To solve this problem, we introduce a novel GNN-based approach called Rank and Align (RNA), which ranks graph similarities with spectral seriation for robust semantics learning, and aligns inharmonic graphs with harmonic graphs which are close to the source domain for subgraph extraction. In particular, to overcome label scarcity, we employ the spectral seriation algorithm to infer the robust pairwise rankings, which can guide semantic learning using a similarity learning objective. To depict distribution shifts, we utilize spectral clustering and the silhouette coefficient to detect harmonic graphs, which the source model can easily classify. To reduce potential domain discrepancy, we extract domain-invariant subgraphs from inharmonic graphs by an adversarial edge sampling process, which guides the invariant learning of GNNs. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our proposed RNA.

Submitted 22 August, 2024; originally announced August 2024.

Comments: Published in IJCAI2024

arXiv:2408.04144 [pdf, other]

Integrated Dynamic Phenological Feature for Remote Sensing Image Land Cover Change Detection

Authors: Yi Liu, Chenhao Sun, Hao Ye, Xiangying Liu, Weilong Ju

Abstract: Remote sensing image change detection (CD) is essential for analyzing land surface changes over time, with a significant challenge being the differentiation of actual changes from complex scenes while filtering out pseudo-changes. A primary contributor to this challenge is the intra-class dynamic changes due to phenological characteristics in natural areas. To overcome this, we introduce the InPhea model, which integrates phenological features into a remote sensing image CD framework. The model features a detector with a differential attention module for improved feature representation of change information, coupled with high-resolution feature extraction and spatial pyramid blocks to enhance performance. Additionally, a constrainer with four constraint modules and a multi-stage contrastive learning approach is employed to aid in the model's understanding of phenological characteristics. Experiments on the HRSCD, SECD, and PSCD-Wuhan datasets reveal that InPhea outperforms other models, confirming its effectiveness in addressing phenological pseudo-changes and its overall model superiority.

Submitted 7 August, 2024; originally announced August 2024.

arXiv:2407.14081 [pdf, other]

DisenSemi: Semi-supervised Graph Classification via Disentangled Representation Learning

Authors: Yifan Wang, Xiao Luo, Chong Chen, Xian-Sheng Hua, Ming Zhang, Wei Ju

Abstract: Graph classification is a critical task in numerous multimedia applications, where graphs are employed to represent diverse types of multimedia data, including images, videos, and social networks. Nevertheless, in real-world scenarios, labeled graph data can be limited or scarce. To address this issue, we focus on the problem of semi-supervised graph classification, which involves both supervised and unsupervised models learning from labeled and unlabeled data. In contrast to recent approaches that transfer the entire knowledge from the unsupervised model to the supervised one, we argue that an effective transfer should only retain the relevant semantics that align well with the supervised task. In this paper, we propose a novel framework named DisenSemi, which learns disentangled representation for semi-supervised graph classification. Specifically, a disentangled graph encoder is proposed to generate factor-wise graph representations for both supervised and unsupervised models. Then we train two models via supervised objective and mutual information (MI)-based constraints respectively. To ensure the meaningful transfer of knowledge from the unsupervised encoder to the supervised one, we further define an MI-based disentangled consistency regularization between two models and identify the corresponding rationale that aligns well with the current graph classification task. Experimental results on a range of publicly accessible datasets reveal the effectiveness of our DisenSemi.

Submitted 9 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS 2024)

arXiv:2407.06094 [pdf, ps, other]

ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions

Authors: Micol Spitale, Maria Teresa Parreira, Maia Stiber, Minja Axelsson, Neval Kara, Garima Kankariya, Chien-Ming Huang, Malte Jung, Wendy Ju, Hatice Gunes

Abstract: Despite the recent advancements in robotics and machine learning (ML), the deployment of autonomous robots in our everyday lives is still an open challenge. This is due to multiple reasons among which are their frequent mistakes, such as interrupting people or having delayed responses, as well as their limited ability to understand human speech, i.e., failure in tasks like transcribing speech to text. These mistakes may disrupt interactions and negatively influence human perception of these robots. To address this problem, robots need to have the ability to detect human-robot interaction (HRI) failures. The ERR@HRI 2024 challenge tackles this by offering a benchmark multimodal dataset of robot failures during human-robot interactions (HRI), encouraging researchers to develop and benchmark multimodal machine learning models to detect these failures. We created a dataset featuring multimodal non-verbal interaction data, including facial, speech, and pose features from video clips of interactions with a robotic coach, annotated with labels indicating the presence or absence of robot mistakes, user awkwardness, and interaction ruptures, allowing for the training and evaluation of predictive models. Challenge participants have been invited to submit their multimodal ML models for detection of robot errors and to be evaluated against various performance metrics such as accuracy, precision, recall, F1 score, with and without a margin of error reflecting the time-sensitivity of these metrics. The results of this challenge will help the research field in better understanding the robot failures in human-robot interactions and designing autonomous robots that can mitigate their own errors after successfully detecting them.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.00468 [pdf, other]

MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation

Authors: Jinsheng Huang, Liang Chen, Taian Guo, Fu Zeng, Yusheng Zhao, Bohan Wu, Ye Yuan, Haozhe Zhao, Zhihui Guo, Yichi Zhang, Jingyang Yuan, Wei Ju, Luchen Liu, Tianyu Liu, Baobao Chang, Ming Zhang

Abstract: Large Multimodal Models (LMMs) exhibit impressive cross-modal understanding and reasoning abilities, often assessed through multiple-choice questions (MCQs) that include an image, a question, and several options. However, many benchmarks used for such evaluations suffer from systematic biases. Remarkably, Large Language Models (LLMs) without any visual perception capabilities achieve non-trivial performance, undermining the credibility of these evaluations. To address this issue while maintaining the efficiency of MCQ evaluations, we propose MMEvalPro, a benchmark designed to avoid Type-I errors through a trilogy evaluation pipeline and more rigorous metrics. For each original question from existing benchmarks, human annotators augment it by creating one perception question and one knowledge anchor question through a meticulous annotation process. MMEvalPro comprises $2,138$ question triplets, totaling $6,414$ distinct questions. Two-thirds of these questions are manually labeled by human experts, while the rest are sourced from existing benchmarks (MMMU, ScienceQA, and MathVista). Compared with the existing benchmarks, our experiments with the latest LLMs and LMMs demonstrate that MMEvalPro is more challenging (the best LMM lags behind human performance by $31.73\%$, compared to an average gap of $8.03\%$ in previous benchmarks) and more trustworthy (the best LLM trails the best LMM by $23.09\%$, whereas the gap for previous benchmarks is just $14.64\%$). Our in-depth analysis explains the reason for the large performance gap and justifies the trustworthiness of evaluation, underscoring its significant potential for advancing future research.

Submitted 29 June, 2024; originally announced July 2024.

Comments: 21 pages, code released at https://github.com/chenllliang/MMEvalPro, Homepage at https://mmevalpro.github.io/

arXiv:2405.11868 [pdf, other]

Towards Graph Contrastive Learning: A Survey and Beyond

Authors: Wei Ju, Yifan Wang, Yifang Qin, Zhengyang Mao, Zhiping Xiao, Junyu Luo, Junwei Yang, Yiyang Gu, Dongjie Wang, Qingqing Long, Siyu Yi, Xiao Luo, Ming Zhang

Abstract: In recent years, deep learning on graphs has achieved remarkable success in various domains. However, the reliance on annotated graph data remains a significant bottleneck due to its prohibitive cost and time-intensive nature. To address this challenge, self-supervised learning (SSL) on graphs has gained increasing attention and has made significant progress. SSL enables machine learning models to produce informative representations from unlabeled graph data, reducing the reliance on expensive labeled data. While SSL on graphs has witnessed widespread adoption, one critical component, Graph Contrastive Learning (GCL), has not been thoroughly investigated in the existing literature. Thus, this survey aims to fill this gap by offering a dedicated survey on GCL. We provide a comprehensive overview of the fundamental principles of GCL, including data augmentation strategies, contrastive modes, and contrastive optimization objectives. Furthermore, we explore the extensions of GCL to other aspects of data-efficient graph learning, such as weakly supervised learning, transfer learning, and related scenarios. We also discuss practical applications spanning domains such as drug discovery, genomics analysis, recommender systems, and finally outline the challenges and potential future directions in this field.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.04773 [pdf, other]

Hypergraph-enhanced Dual Semi-supervised Graph Classification

Authors: Wei Ju, Zhengyang Mao, Siyu Yi, Yifang Qin, Yiyang Gu, Zhiping Xiao, Yifan Wang, Xiao Luo, Ming Zhang

Abstract: In this paper, we study semi-supervised graph classification, which aims at accurately predicting the categories of graphs in scenarios with limited labeled graphs and abundant unlabeled graphs. Despite the promising capability of graph neural networks (GNNs), they typically require a large number of costly labeled graphs, while a wealth of unlabeled graphs fail to be effectively utilized. Moreover, GNNs are inherently limited to encoding local neighborhood information using message-passing mechanisms, thus lacking the ability to model higher-order dependencies among nodes. To tackle these challenges, we propose a Hypergraph-Enhanced DuAL framework named HEAL for semi-supervised graph classification, which captures graph semantics from the perspective of the hypergraph and the line graph, respectively. Specifically, to better explore the higher-order relationships among nodes, we design a hypergraph structure learning to adaptively learn complex node dependencies beyond pairwise relations. Meanwhile, based on the learned hypergraph, we introduce a line graph to capture the interaction between hyperedges, thereby better mining the underlying semantic structures. Finally, we develop a relational consistency learning to facilitate knowledge transfer between the two branches and provide better mutual guidance. Extensive experiments on real-world graph datasets verify the effectiveness of the proposed method against existing state-of-the-art methods.

Submitted 28 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: Accepted by Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

arXiv:2405.01467 [pdf, other]

Student Reflections on Self-Initiated GenAI Use in HCI Education

Authors: Hauke Sandhaus, Maria Teresa Parreira, Wendy Ju

Abstract: This study explores students' self-initiated use of Generative Artificial Intelligence (GenAI) tools in an interactive systems design class. Through 12 group interviews, students revealed the dual nature of GenAI in (1) stimulating creativity and (2) speeding up design iterations, alongside concerns over its potential to cause shallow learning and reliance. GenAI's benefits were pronounced in the execution phase of design, aiding rapid prototyping and ideation, while its use in initial insight generation posed risks to depth and reflective practice. This reflection highlights the complex role of GenAI in Human-Computer Interaction education, emphasizing the need for balanced integration to leverage its advantages without compromising fundamental learning outcomes.

Submitted 2 May, 2024; originally announced May 2024.

Comments: Published to the CHI '24 Workshop: LLMs as Research Tools: Applications and Evaluations in HCI Data Work (https://sites.google.com/view/llmsindatawork/)

ACM Class: K.3.1; K.3.2

arXiv:2404.18375 [pdf, other]

Field Notes on Deploying Research Robots in Public Spaces

Authors: Fanjun Bu, Alexandra Bremers, Mark Colley, Wendy Ju

Abstract: Human-robot interaction needs to be studied in the wild. In the summers of 2022 and 2023, we deployed two trash barrel service robots through the wizard-of-oz protocol in public spaces to study human-robot interactions in urban settings. We deployed the robots at two different public plazas in downtown Manhattan and Brooklyn for a collective 20 hours of field time. To date, relatively few long-term human-robot interaction studies have been conducted in shared public spaces. To support researchers aiming to fill this gap, we would like to share some of our insights and learned lessons that would benefit both researchers and practitioners on how to deploy robots in public spaces. We share best practices and lessons learned with the HRI research community to encourage more in-the-wild research of robots in public spaces and call for the community to share their lessons learned to a GitHub repository.

Submitted 28 April, 2024; originally announced April 2024.

Comments: CHI LBW 2024

arXiv:2404.00392 [pdf, other]

Designing a User-centric Framework for Information Quality Ranking of Large-scale Street View Images

Authors: Tahiya Chowdhury, Ilan Mandel, Jorge Ortiz, Wendy Ju

Abstract: Street view imagery (SVI), largely captured via outfitted fleets or mounted dashcams in consumer vehicles, is a rapidly growing source of geospatial data used in urban sensing and development. These datasets are often collected opportunistically, are massive in size, and vary in quality, which limits the scope and extent of their use in urban planning. Thus far there has not been much work to identify the obstacles experienced and tools needed by the users of such datasets. This severely limits the opportunities of using emerging street view images in supporting novel research questions that can improve the quality of urban life. This work includes a formative interview study with 5 expert users of large-scale street view datasets from academia, urban planning, and related professions which identifies novel use cases, challenges, and opportunities to increase the utility of these datasets. Based on the user findings, we present a framework to evaluate the quality of information for street images across three attributes (spatial, temporal, and content) that stakeholders can utilize for estimating the value of a dataset, and to improve it over time for their respective use case. We then present a case study using novel street view images where we evaluate our framework and present practical use cases for users. We discuss the implications for designing future systems to support the collection and use of street view data to assist in sensing and planning the urban environment.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2403.10994 [pdf, other]

SSUP-HRI: Social Signaling in Urban Public Human-Robot Interaction dataset

Authors: Fanjun Bu, Wendy Ju

Abstract: This paper introduces our dataset featuring human-robot interactions (HRI) in urban public environments. This dataset is rich with social signals that we believe can be modeled to help understand naturalistic human-robot interaction. Our dataset currently comprises approximately 15 hours of video footage recorded from the robots' perspectives, within which we annotated a total of 274 observable interactions featuring a wide range of naturalistic human-robot interactions. The data was collected by two mobile trash barrel robots deployed in Astor Place, New York City, over the course of a week. We invite the HRI community to access and utilize our dataset. To the best of our knowledge, this is the first dataset showcasing robot deployments in a completely public, non-controlled setting involving urban residents.

Submitted 16 March, 2024; originally announced March 2024.

Comments: Workshop on Social Signal Modelling (SS4HRI '24) at HRI 2024

arXiv:2403.06315 [pdf, other]

A Study on Domain Generalization for Failure Detection through Human Reactions in HRI

Authors: Maria Teresa Parreira, Sukruth Gowdru Lingaraju, Adolfo Ramirez-Aristizabal, Manaswi Saha, Michael Kuniavsky, Wendy Ju

Abstract: Machine learning models are commonly tested in-distribution (same dataset); performance almost always drops in out-of-distribution settings. For HRI research, the goal is often to develop generalized models. This makes domain generalization - retaining performance in different settings - a critical issue. In this study, we present a concise analysis of domain generalization in failure detection models trained on human facial expressions. Using two distinct datasets of humans reacting to videos where errors occur, one from a controlled lab setting and another collected online, we trained deep learning models on each dataset. When testing these models on the alternate dataset, we observed a significant performance drop. We reflect on the causes for the observed model behavior and leave recommendations. This work emphasizes the need for HRI research focusing on improving model robustness and real-life applicability.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2403.04468 [pdf, other]

A Survey of Graph Neural Networks in Real world: Imbalance, Noise, Privacy and OOD Challenges

Authors: Wei Ju, Siyu Yi, Yifan Wang, Zhiping Xiao, Zhengyang Mao, Hourun Li, Yiyang Gu, Yifang Qin, Nan Yin, Senzhang Wang, Xinwang Liu, Xiao Luo, Philip S. Yu, Ming Zhang

Abstract: Graph-structured data exhibits universality and widespread applicability across diverse domains, such as social network analysis, biochemistry, financial fraud detection, and network security. Significant strides have been made in leveraging Graph Neural Networks (GNNs) to achieve remarkable success in these areas. However, in real-world scenarios, the training environment for models is often far from ideal, leading to substantial performance degradation of GNN models due to various unfavorable factors, including imbalance in data distribution, the presence of noise in erroneous data, privacy protection of sensitive information, and generalization capability for out-of-distribution (OOD) scenarios. To tackle these issues, substantial efforts have been devoted to improving the performance of GNN models in practical real-world scenarios, as well as enhancing their reliability and robustness. In this paper, we present a comprehensive survey that systematically reviews existing GNN models, focusing on solutions to the four mentioned real-world challenges including imbalance, noise, privacy, and OOD in practical scenarios that many existing reviews have not considered. Specifically, we first highlight the four key challenges faced by existing GNNs, paving the way for our exploration of real-world GNN models. Subsequently, we provide detailed discussions on these four aspects, dissecting how these solutions contribute to enhancing the reliability and robustness of GNN models. Last but not least, we outline promising directions and offer future perspectives in the field.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.01091 [pdf, other]

COOL: A Conjoint Perspective on Spatio-Temporal Graph Neural Network for Traffic Forecasting

Authors: Wei Ju, Yusheng Zhao, Yifang Qin, Siyu Yi, Jingyang Yuan, Zhiping Xiao, Xiao Luo, Xiting Yan, Ming Zhang

Abstract: This paper investigates traffic forecasting, which attempts to forecast the future state of traffic based on historical situations. This problem has received ever-increasing attention in various scenarios and facilitated the development of numerous downstream applications such as urban planning and transportation management. However, the efficacy of existing methods remains sub-optimal due to their tendency to model temporal and spatial relationships independently, thereby inadequately accounting for complex high-order interactions of both worlds. Moreover, the diversity of transitional patterns in traffic forecasting makes them challenging to capture for existing approaches, warranting a deeper exploration of their diversity. Toward this end, this paper proposes Conjoint Spatio-Temporal graph neural network (abbreviated as COOL), which models heterogeneous graphs from prior and posterior information to conjointly capture high-order spatio-temporal relationships. On the one hand, heterogeneous graphs connecting sequential observation are constructed to extract composite spatio-temporal relationships via prior message passing. On the other hand, we model dynamic relationships using constructed affinity and penalty graphs, which guide posterior message passing to incorporate complementary semantic information into node representations. Moreover, to capture diverse transitional properties to enhance traffic forecasting, we propose a conjoint self-attention decoder that models diverse temporal patterns from both multi-rank and multi-scale views. Experimental results on four popular benchmark datasets demonstrate that our proposed COOL provides state-of-the-art performance compared with the competitive baselines.

Submitted 1 March, 2024; originally announced March 2024.

Comments: Accepted by Information Fusion 2024

arXiv:2402.08061 [pdf, other]

Portobello: Extending Driving Simulation from the Lab to the Road

Authors: Fanjun Bu, Stacey Li, David Goedicke, Mark Colley, Gyanendra Sharma, Hiroshi Yasuda, Wendy Ju

Abstract: In automotive user interface design, testing often starts with lab-based driving simulators and migrates toward on-road studies to mitigate risks. Mixed reality (XR) helps translate virtual study designs to the real road to increase ecological validity. However, researchers rarely run the same study in both in-lab and on-road simulators due to the challenges of replicating studies in both physical and virtual worlds. To provide a common infrastructure to port in-lab study designs on-road, we built a platform-portable infrastructure, Portobello, to enable us to run twinned physical-virtual studies. As a proof-of-concept, we extended the on-road simulator XR-OOM with Portobello. We ran a within-subjects, autonomous-vehicle crosswalk cooperation study (N=32) both in-lab and on-road to investigate study design portability and platform-driven influences on study outcomes. To our knowledge, this is the first system that enables the twinning of studies originally designed for in-lab simulators to be carried out in an on-road platform.

Submitted 12 February, 2024; originally announced February 2024.

Comments: CHI 2024

arXiv:2402.06801 [pdf, other]

Fingerprinting New York City's Scaffolding Problem with Longitudinal Dashcam Data

Authors: Dorin Shapira, Matt Franchi, Wendy Ju

Abstract: Scaffolds, also called sidewalk sheds, are intended to be temporary structures to protect pedestrians from construction and repair hazards. However, some sidewalk sheds are left up for years. Long-term scaffolding becomes eyesores, creates accessibility issues on sidewalks, and gives cover to illicit activity. Today, there are over 8,000 active permits for scaffolds in NYC; the more problematic scaffolds are likely expired or unpermitted. This research uses computer vision on street-level imagery to develop a longitudinal map of scaffolding throughout the city. Using a dataset of 29,156,833 dashcam images taken between August 2023 and January 2024, we develop an algorithm to track the presence of scaffolding over time. We also design and implement methods to match detected scaffolds to reported locations of active scaffolding permits, enabling the identification of sidewalk sheds without corresponding permits. We identify 850,766 images of scaffolding, tagging 5,156 active sidewalk sheds and estimating 529 unpermitted sheds. We discuss the implications of an in-the-wild scaffolding classifier for urban tech, innovations to governmental inspection processes, and out-of-distribution evaluations outside of New York City.

Submitted 9 February, 2024; originally announced February 2024.

arXiv:2402.03691 [pdf, other]

doi 10.1145/3610978.3640560

Adversarial Robots as Creative Collaborators

Authors: Shayla Lee, Wendy Ju

Abstract: This research explores whether the interaction between adversarial robots and creative practitioners can push artists to rethink their initial ideas. It also explores how working with these robots may influence artists' views of machines designed for creative tasks or collaboration. Many existing robots developed for creativity and the arts focus on complementing creative practices, but what if robots challenged ideas instead? To begin investigating this, I designed UnsTable, a robot drawing desk that moves the paper while participants (N=19) draw to interfere with the process. This inquiry invites further research into adversarial robots designed to challenge creative practitioners.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.00447 [pdf, ps, other]

A Survey of Data-Efficient Graph Learning

Authors: Wei Ju, Siyu Yi, Yifan Wang, Qingqing Long, Junyu Luo, Zhiping Xiao, Ming Zhang

Abstract: Graph-structured data, prevalent in domains ranging from social networks to biochemical analysis, serve as the foundation for diverse real-world systems. While graph neural networks demonstrate proficiency in modeling this type of data, their success is often reliant on significant amounts of labeled data, posing a challenge in practical scenarios with limited annotation resources. To tackle this problem, tremendous efforts have been devoted to enhancing graph machine learning performance under low-resource settings by exploring various approaches to minimal supervision. In this paper, we introduce a novel concept of Data-Efficient Graph Learning (DEGL) as a research frontier, and present the first survey that summarizes the current progress of DEGL. We initiate by highlighting the challenges inherent in training models with large labeled data, paving the way for our exploration into DEGL. Next, we systematically review recent advances on this topic from several key aspects, including self-supervised graph learning, semi-supervised graph learning, and few-shot graph learning. Also, we state promising directions for future research, contributing to the evolution of graph machine learning.

Submitted 19 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: Accepted by Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024)

arXiv:2401.16011 [pdf, other]

GPS: Graph Contrastive Learning via Multi-scale Augmented Views from Adversarial Pooling

Authors: Wei Ju, Yiyang Gu, Zhengyang Mao, Ziyue Qiao, Yifang Qin, Xiao Luo, Hui Xiong, Ming Zhang

Abstract: Self-supervised graph representation learning has recently shown considerable promise in a range of fields, including bioinformatics and social networks. A large number of graph contrastive learning approaches have shown promising performance for representation learning on graphs, which train models by maximizing agreement between original graphs and their augmented views (i.e., positive views). Unfortunately, these methods usually involve pre-defined augmentation strategies based on the knowledge of human experts. Moreover, these strategies may fail to generate challenging positive views to provide sufficient supervision signals. In this paper, we present a novel approach named Graph Pooling ContraSt (GPS) to address these issues. Motivated by the fact that graph pooling can adaptively coarsen the graph with the removal of redundancy, we rethink graph pooling and leverage it to automatically generate multi-scale positive views with varying emphasis on providing challenging positives and preserving semantics, i.e., strongly-augmented view and weakly-augmented view. Then, we incorporate both views into a joint contrastive learning framework with similarity learning and consistency learning, where our pooling module is adversarially trained with respect to the encoder for adversarial robustness. Experiments on twelve datasets on both graph classification and transfer learning tasks verify the superiority of the proposed method over its counterparts.

Submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted by SCIENCE CHINA Information Sciences (SCIS 2024)

arXiv:2401.12590 [pdf, other]

PolyCF: Towards the Optimal Spectral Graph Filters for Collaborative Filtering

Authors: Yifang Qin, Wei Ju, Xiao Luo, Yiyang Gu, Zhiping Xiao, Ming Zhang

Abstract: Collaborative Filtering (CF) is a pivotal research area in recommender systems that capitalizes on collaborative similarities between users and items to provide personalized recommendations. With the remarkable achievements of node embedding-based Graph Neural Networks (GNNs), we explore the upper bounds of expressiveness inherent to embedding-based methodologies and tackle the challenges by reframing the CF task as a graph signal processing problem. To this end, we propose PolyCF, a flexible graph signal filter that leverages polynomial graph filters to process interaction signals. PolyCF exhibits the capability to capture spectral features across multiple eigenspaces through a series of Generalized Gram filters and is able to approximate the optimal polynomial response function for recovering missing interactions. A graph optimization objective and a pair-wise ranking objective are jointly used to optimize the parameters of the convolution kernel. Experiments on three widely adopted datasets demonstrate the superiority of PolyCF over current state-of-the-art CF methods. Moreover, comprehensive studies empirically validate each component's efficacy in the proposed PolyCF.

Submitted 28 January, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.00713 [pdf, other]

A Survey on Graph Neural Networks in Intelligent Transportation Systems

Authors: Hourun Li, Yusheng Zhao, Zhengyang Mao, Yifang Qin, Zhiping Xiao, Jiaqi Feng, Yiyang Gu, Wei Ju, Xiao Luo, Ming Zhang

Abstract: Intelligent Transportation System (ITS) is vital in improving traffic congestion, reducing traffic accidents, optimizing urban planning, etc. However, due to the complexity of the traffic network, traditional machine learning and statistical methods are relegated to the background. With the advent of the artificial intelligence era, many deep learning frameworks have made remarkable progress in various fields and are now considered effective methods in many areas. As a deep learning method, Graph Neural Networks (GNNs) have emerged as a highly competitive method in the ITS field since 2019 due to their strong ability to model graph-related problems. As a result, more and more scholars pay attention to the applications of GNNs in transportation domains, which have shown excellent performance. However, most of the research in this area is still concentrated on traffic forecasting, while other ITS domains, such as autonomous vehicles and urban planning, still require more attention. This paper aims to review the applications of GNNs in six representative and emerging ITS domains: traffic forecasting, autonomous vehicles, traffic signal control, transportation safety, demand prediction, and parking management. We have reviewed extensive graph-related studies from 2018 to 2023, summarized their methods, features, and contributions, and presented them in informative tables or lists. Finally, we have identified the challenges of applying GNNs to ITS and suggested potential future directions.

Submitted 2 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

arXiv:2312.14358 [pdf]

A utility belt for an agricultural robot: reflection-in-action for applied design research

Authors: Natalie Friedman, Asmita Mehta, Kari Love, Alexandra Bremers, Awsaf Ahmed, Wendy Ju

Abstract: Clothing for robots can help expand a robot's functionality and also clarify the robot's purpose to bystanders. In studying how to design clothing for robots, we can shed light on the functional role of aesthetics in interactive system design. We present a case study of designing a utility belt for an agricultural robot. We use reflection-in-action to consider the ways that observation, in situ making, and documentation serve to illuminate how pragmatic, aesthetic, and intellectual inquiry are layered in this applied design research project. Themes explored in this pictorial include 1) contextual discovery of materials, tools, and practices, 2) design space exploration of materials in context, 3) improvising spaces for making, and 4) social processes in design. These themes emerged from the qualitative coding of 25 reflection-in-action videos from the researcher. We conclude with feedback on the utility belt prototypes for an agriculture robot and our learnings about context, materials, and people needed to design successful novel clothing forms for robots.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2311.06554 [pdf, other]

PGODE: Towards High-quality System Dynamics Modeling

Authors: Xiao Luo, Yiyang Gu, Huiyu Jiang, Hang Zhou, Jinsheng Huang, Wei Ju, Zhiping Xiao, Ming Zhang, Yizhou Sun

Abstract: This paper studies the problem of modeling multi-agent dynamical systems, where agents could interact mutually to influence their behaviors. Recent research predominantly uses geometric graphs to depict these mutual interactions, which are then captured by powerful graph neural networks (GNNs). However, predicting interacting dynamics in challenging scenarios such as out-of-distribution shift and complicated underlying rules remains unsolved. In this paper, we propose a new approach named Prototypical Graph ODE (PGODE) to address the problem. The core of PGODE is to incorporate prototype decomposition from contextual knowledge into a continuous graph ODE framework. Specifically, PGODE employs representation disentanglement and system parameters to extract both object-level and system-level contexts from historical trajectories, which allows us to explicitly model their independent influence and thus enhances the generalization capability under system changes. Then, we integrate these disentangled latent representations into a graph ODE model, which determines a combination of various interacting prototypes for enhanced model expressivity. The entire model is optimized using an end-to-end variational inference framework to maximize the likelihood. Extensive experiments in both in-distribution and out-of-distribution settings validate the superiority of PGODE compared to various baselines.

Submitted 26 June, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

Comments: Accepted by ICML 2024

arXiv:2311.06360 [pdf, other]

Initiative and Materiality: Exploring Mixed-Initiative Calculators with the Tangible Human-A.I. Interaction Framework

Authors: Lunshi Zhou, Alexandra Bremers, Wendy Ju

Abstract: How can interactions with A.I. systems be designed? This paper explores the design space for A.I. interaction to develop tools for designers to think about tangible and physical A.I. interactions. Our proposed framework consists of two dimensions: initiative (human, mixed, or machine) and materiality (physical, combined, or digital form). A particularly interesting area of interactions we identify is the quadrant of physical, machine-initiated interactions. With our framework, we examine calculator interactions and attempt to expand these to the tangible, mixed-initiative space. We illustrate each area in our proposed framework with one representative example of a calculator -- a common and well-known example of a computing device. We discuss existing examples of calculators and speculative future interactions with mixed-initiative and physical calculator systems. We reflect on the implications of our framework for the larger task of designing human-A.I. collaborative systems. Designers can also apply this framework as a guideline for analogous solutions to problems in the same domain.

Submitted 10 November, 2023; originally announced November 2023.

Comments: 5 pages

arXiv:2311.04456 [pdf, other]

doi 10.1145/3640794.3665580

(Social) Trouble on the Road: Understanding and Addressing Social Discomfort in Shared Car Trips

Authors: Alexandra Bremers, Natalie Friedman, Sam Lee, Tong Wu, Eric Laurier, Malte Jung, Jorge Ortiz, Wendy Ju

Abstract: Unpleasant social interactions on the road can negatively affect driving safety. At the same time, researchers have attempted to address social discomfort by exploring Conversational User Interfaces (CUIs) as social mediators. Before knowing whether CUIs could reduce social discomfort in a car, it is necessary to understand the nature of social discomfort in shared rides. To this end, we recorded nine families going on drives and performed interaction analysis on this data. We define three strategies to address social discomfort: contextual mediation, social mediation, and social support. We discuss considerations for engineering and design, and explore the limitations of current large language models in addressing social discomfort on the road.

Submitted 24 May, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: 13 pages, ACM CUI'24

arXiv:2310.00215 [pdf, other]

Implicit collaboration with a drawing machine through dance movements

Authors: Itay Grinberg, Alexandra Bremers, Louisa Pancoast, Wendy Ju

Abstract: In this demonstration, we exhibit the initial results of an ongoing body of exploratory work, investigating the potential for creative machines to communicate and collaborate with people through movement as a form of implicit interaction. The paper describes a Wizard-of-Oz demo, where a hidden wizard controls an AxiDraw drawing robot while a participant collaborates with it to draw a custom postcard. This demonstration aims to gather perspectives from the computational fabrication community regarding how practitioners of fabrication with machines experience interacting with a mixed-initiative collaborative machine.

Submitted 29 September, 2023; originally announced October 2023.

arXiv:2309.14673 [pdf, other]

ALEX: Towards Effective Graph Transfer Learning with Noisy Labels

Authors: Jingyang Yuan, Xiao Luo, Yifang Qin, Zhengyang Mao, Wei Ju, Ming Zhang

Abstract: Graph Neural Networks (GNNs) have garnered considerable interest due to their exceptional performance in a wide range of graph machine learning tasks. Nevertheless, the majority of GNN-based approaches have been examined using well-annotated benchmark datasets, leading to suboptimal performance in real-world graph learning scenarios. To bridge this gap, the present paper investigates the problem of graph transfer learning in the presence of label noise, which transfers knowledge from a noisy source graph to an unlabeled target graph. We introduce a novel technique termed Balance Alignment and Information-aware Examination (ALEX) to address this challenge. ALEX first employs singular value decomposition to generate different views with crucial structural semantics, which help provide robust node representations using graph contrastive learning. To mitigate both label shift and domain shift, we estimate a prior distribution to build subgraphs with balanced label distributions. Building on this foundation, an adversarial domain discriminator is incorporated for the implicit domain alignment of complex multi-modal distributions. Furthermore, we project node representations into a different space, optimizing the mutual information between the projected features and labels. Subsequently, the inconsistency of similarity structures is evaluated to identify noisy samples with potential overfitting. Comprehensive experiments on various benchmark datasets substantiate the outstanding superiority of the proposed ALEX in different settings.

Submitted 26 September, 2023; originally announced September 2023.

Comments: Accepted by the ACM International Conference on Multimedia (MM) 2023

arXiv:2309.12028 [pdf, other]

Dynamic Hypergraph Structure Learning for Traffic Flow Forecasting

Authors: Yusheng Zhao, Xiao Luo, Wei Ju, Chong Chen, Xian-Sheng Hua, Ming Zhang

Abstract: This paper studies the problem of traffic flow forecasting, which aims to predict future traffic conditions on the basis of road networks and traffic conditions in the past. The problem is typically solved by modeling complex spatio-temporal correlations in traffic data using spatio-temporal graph neural networks (GNNs). However, the performance of these methods is still far from satisfactory since GNNs usually have limited representation capacity when it comes to complex traffic networks. Graphs, by nature, fall short in capturing non-pairwise relations. Even worse, existing methods follow the paradigm of message passing that aggregates neighborhood information linearly, which fails to capture complicated spatio-temporal high-order interactions. To tackle these issues, in this paper, we propose a novel model named Dynamic Hypergraph Structure Learning (DyHSL) for traffic flow prediction. To learn non-pairwise relationships, our DyHSL extracts hypergraph structural information to model dynamics in the traffic networks, and updates each node representation by aggregating messages from its associated hyperedges. Additionally, to capture high-order spatio-temporal relations in the road network, we introduce an interactive graph convolution block, which further models the neighborhood interaction for each node. Finally, we integrate these two views into a holistic multi-scale correlation extraction module, which conducts temporal pooling with different scales to model different temporal patterns. Extensive experiments on four popular traffic benchmark datasets demonstrate the effectiveness of our proposed DyHSL compared with a broad range of competing baselines.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted by 2023 IEEE 39th International Conference on Data Engineering (ICDE 2023)

arXiv:2309.04694 [pdf, other]

Redundancy-Free Self-Supervised Relational Learning for Graph Clustering

Authors: Si-Yu Yi, Wei Ju, Yifang Qin, Xiao Luo, Luchen Liu, Yong-Dao Zhou, Ming Zhang

Abstract: Graph clustering, which learns the node representations for effective cluster assignments, is a fundamental yet challenging task in data analysis and has received considerable attention accompanied by graph neural networks in recent years. However, most existing methods overlook the inherent relational information among the non-independent and non-identically distributed nodes in a graph. Due to the lack of exploration of relational attributes, the semantic information of the graph-structured data fails to be fully exploited which leads to poor clustering performance. In this paper, we propose a novel self-supervised deep graph clustering method named Relational Redundancy-Free Graph Clustering (R$^2$FGC) to tackle the problem. It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder. To obtain effective representations of the semantic information, we preserve the consistent relation among augmented nodes, whereas the redundant relation is further reduced for learning discriminative embeddings. In addition, a simple yet valid strategy is utilized to alleviate the over-smoothing issue. Extensive experiments are performed on widely used benchmark datasets to validate the superiority of our R$^2$FGC over state-of-the-art baselines. Our codes are available at https://github.com/yisiyu95/R2FGC.

Submitted 9 September, 2023; originally announced September 2023.

Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS 2024)

arXiv:2309.04295 [pdf, other]

FIMO: A Challenge Formal Dataset for Automated Theorem Proving

Authors: Chengwu Liu, Jianhao Shen, Huajian Xin, Zhengying Liu, Ye Yuan, Haiming Wang, Wei Ju, Chuanyang Zheng, Yichun Yin, Lin Li, Ming Zhang, Qun Liu

Abstract: We present FIMO, an innovative dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems. Designed to facilitate advanced automated theorem proving at the IMO level, FIMO is currently tailored for the Lean formal language. It comprises 149 formal problem statements, accompanied by both informal problem descriptions and their corresponding LaTeX-based informal proofs. Through initial experiments involving GPT-4, our findings underscore the existing limitations in current methodologies, indicating a substantial journey ahead before achieving satisfactory IMO-level automated theorem proving outcomes.

Submitted 5 December, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: Added a hyperlink to the dataset made accessible on GitHub

arXiv:2308.16609 [pdf, other]

Towards Long-Tailed Recognition for Graph Classification via Collaborative Experts

Authors: Siyu Yi, Zhengyang Mao, Wei Ju, Yongdao Zhou, Luchen Liu, Xiao Luo, Ming Zhang

Abstract: Graph classification, aiming at learning the graph-level representations for effective class assignments, has received outstanding achievements, which heavily relies on high-quality datasets that have balanced class distribution. In fact, most real-world graph data naturally presents a long-tailed form, where the head classes occupy much more samples than the tail classes, it thus is essential to study the graph-level classification over long-tailed data while still remaining largely unexplored. However, most existing long-tailed learning methods in visions fail to jointly optimize the representation learning and classifier training, as well as neglect the mining of the hard-to-classify classes. Directly applying existing methods to graphs may lead to sub-optimal performance, since the model trained on graphs would be more sensitive to the long-tailed distribution due to the complex topological characteristics. Hence, in this paper, we propose a novel long-tailed graph-level classification framework via Collaborative Multi-expert Learning (CoMe) to tackle the problem. To equilibrate the contributions of head and tail classes, we first develop balanced contrastive learning from the view of representation learning, and then design an individual-expert classifier training based on hard class mining. In addition, we execute gated fusion and disentangled knowledge distillation among the multiple experts to promote the collaboration in a multi-expert framework. Comprehensive experiments are performed on seven widely-used benchmark datasets to demonstrate the superiority of our method CoMe over state-of-the-art baselines.

Submitted 5 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

Comments: Accepted by IEEE Transactions on Big Data (TBD 2024)

arXiv:2308.02335 [pdf, other]

RAHNet: Retrieval Augmented Hybrid Network for Long-tailed Graph Classification

Authors: Zhengyang Mao, Wei Ju, Yifang Qin, Xiao Luo, Ming Zhang

Abstract: Graph classification is a crucial task in many real-world multimedia applications, where graphs can represent various multimedia data types such as images, videos, and social networks. Previous efforts have applied graph neural networks (GNNs) in balanced situations where the class distribution is balanced. However, real-world data typically exhibit long-tailed class distributions, resulting in a bias towards the head classes when using GNNs and limited generalization ability over the tail classes. Recent approaches mainly focus on re-balancing different classes during model training, which fails to explicitly introduce new knowledge and sacrifices the performance of the head classes. To address these drawbacks, we propose a novel framework called Retrieval Augmented Hybrid Network (RAHNet) to jointly learn a robust feature extractor and an unbiased classifier in a decoupled manner. In the feature extractor training stage, we develop a graph retrieval module to search for relevant graphs that directly enrich the intra-class diversity for the tail classes. Moreover, we innovatively optimize a category-centered supervised contrastive loss to obtain discriminative representations, which is more suitable for long-tailed scenarios. In the classifier fine-tuning stage, we balance the classifier weights with two weight regularization techniques, i.e., Max-norm and weight decay. Experiments on various popular benchmarks verify the superiority of the proposed method against state-of-the-art approaches.

Submitted 7 September, 2023; v1 submitted 4 August, 2023; originally announced August 2023.

Comments: Accepted by the ACM International Conference on Multimedia (MM) 2023

arXiv:2307.10467 [pdf, other]

Towards Sustainable Research Data Management in Human-Computer Interaction

Authors: David Goedicke, Mark Colley, Sebastian S. Feger, Michael Goedicke, Bastian Pfleging, Wendy Ju

Abstract: We discuss important aspects of HCI research regarding Research Data Management (RDM) to achieve better publication processes and higher reuse of HCI research results. Various context elements of RDM for HCI are discussed, including examples of existing and emerging infrastructures for RDM. We briefly discuss existing approaches and come up with additional aspects which need to be addressed. This is to apply the so-called FAIR principle fully, which -- besides being findable and accessible -- also includes interoperability and reusability. We also discuss briefly the kind of research data types that play a role here and propose to build on existing work and involve the HCI scientific community to improve current practices.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.08194 [pdf, other]

Learning on Graphs under Label Noise

Authors: Jingyang Yuan, Xiao Luo, Yifang Qin, Yusheng Zhao, Wei Ju, Ming Zhang

Abstract: Node classification on graphs is a significant task with a wide range of applications, including social analysis and anomaly detection. Even though graph neural networks (GNNs) have produced promising results on this task, current techniques often presume that label information of nodes is accurate, which may not be the case in real-world applications. To tackle this issue, we investigate the problem of learning on graphs with label noise and develop a novel approach dubbed Consistent Graph Neural Network (CGNN) to solve it. Specifically, we employ graph contrastive learning as a regularization term, which promotes two views of augmented nodes to have consistent representations. Since this regularization term cannot utilize label information, it can enhance the robustness of node representations to label noise. Moreover, to detect noisy labels on the graph, we present a sample selection technique based on the homophily assumption, which identifies noisy nodes by measuring the consistency between the labels with their neighbors. Finally, we purify these confident noisy labels to permit efficient semantic graph learning. Extensive experiments on three well-known benchmark datasets demonstrate the superiority of our CGNN over competing approaches.

Submitted 13 June, 2023; originally announced June 2023.

Comments: Accepted by IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)

arXiv:2305.19598 [pdf, other]

doi 10.1109/TKDE.2023.3280859

Towards Semi-supervised Universal Graph Classification

Authors: Xiao Luo, Yusheng Zhao, Yifang Qin, Wei Ju, Ming Zhang

Abstract: Graph neural networks have recently pushed the state of the art in graph classification. Typically, these methods are studied within the context of supervised end-to-end training, which necessitates copious task-specific labels. However, in real-world circumstances, labeled data could be limited, and there could be a massive corpus of unlabeled data, even from unknown classes, as a complement. Towards this end, we study the problem of semi-supervised universal graph classification, which not only identifies graph samples which do not belong to known classes, but also classifies the remaining samples into their respective classes. This problem is challenging due to a severe lack of labels and potential class shifts. In this paper, we propose a novel graph neural network framework named UGNN, which makes the best of unlabeled data from the subgraph perspective. To tackle class shifts, we estimate the certainty of unlabeled graphs using multiple subgraphs, which facilitates the discovery of unlabeled data from unknown categories. Moreover, we construct semantic prototypes in the embedding space for both known and unknown categories and utilize posterior prototype assignments inferred from the Sinkhorn-Knopp algorithm to learn from abundant unlabeled graphs across different subgraph views. Extensive experiments on six datasets verify the effectiveness of UGNN in different settings.

Submitted 31 May, 2023; originally announced May 2023.

Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE 2023)

arXiv:2305.15210 [pdf, other]

doi 10.1145/3593013.3594020

Detecting disparities in police deployments using dashcam data

Authors: Matt Franchi, J. D. Zamfirescu-Pereira, Wendy Ju, Emma Pierson

Abstract: Large-scale policing data is vital for detecting inequity in police behavior and policing algorithms. However, one important type of policing data remains largely unavailable within the United States: aggregated police deployment data capturing which neighborhoods have the heaviest police presences. Here we show that disparities in police deployment levels can be quantified by detecting police vehicles in dashcam images of public street scenes. Using a dataset of 24,803,854 dashcam images from rideshare drivers in New York City, we find that police vehicles can be detected with high accuracy (average precision 0.82, AUC 0.99) and identify 233,596 images which contain police vehicles. There is substantial inequality across neighborhoods in police vehicle deployment levels. The neighborhood with the highest deployment levels has almost 20 times higher levels than the neighborhood with the lowest. Two strikingly different types of areas experience high police vehicle deployments - 1) dense, higher-income, commercial areas and 2) lower-income neighborhoods with higher proportions of Black and Hispanic residents. We discuss the implications of these disparities for policing equity and for algorithms trained on policing data.

Submitted 24 May, 2023; originally announced May 2023.

Comments: To appear in ACM Conference on Fairness, Accountability, and Transparency (FAccT) '23

arXiv:2304.11688 [pdf, other]

TGNN: A Joint Semi-supervised Framework for Graph-level Classification

Authors: Wei Ju, Xiao Luo, Meng Qu, Yifan Wang, Chong Chen, Minghua Deng, Xian-Sheng Hua, Ming Zhang

Abstract: This paper studies semi-supervised graph classification, a crucial task with a wide range of applications in social network analysis and bioinformatics. Recent works typically adopt graph neural networks to learn graph-level representations for classification, failing to explicitly leverage features derived from graph topology (e.g., paths). Moreover, when labeled data is scarce, these methods are far from satisfactory due to their insufficient topology exploration of unlabeled data. We address the challenge by proposing a novel semi-supervised framework called Twin Graph Neural Network (TGNN). To explore graph structural information from complementary views, our TGNN has a message passing module and a graph kernel module. To fully utilize unlabeled data, for each module, we calculate the similarity of each unlabeled graph to other labeled graphs in the memory bank and our consistency loss encourages consistency between two similarity distributions in different embedding spaces. The two twin modules collaborate with each other by exchanging instance similarity knowledge to fully explore the structure information of both labeled and unlabeled data. We evaluate our TGNN on various public datasets and show that it achieves strong performance.

Submitted 23 April, 2023; originally announced April 2023.

Comments: Accepted by Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI 2022)

arXiv:2304.07042 [pdf, other]

doi 10.1109/TKDE.2024.3349397

Learning Graph ODE for Continuous-Time Sequential Recommendation

Authors: Yifang Qin, Wei Ju, Hongjun Wu, Xiao Luo, Ming Zhang

Abstract: Sequential recommendation aims at understanding user preference by capturing successive behavior correlations, which are usually represented as the item purchasing sequences based on their past interactions. Existing efforts generally predict the next item via modeling the sequential patterns. Despite their effectiveness, there exist two natural deficiencies: (i) user preference is dynamic in nature, and the evolution of collaborative signals is often ignored; and (ii) the observed interactions are often irregularly sampled, while existing methods model item transitions assuming uniform intervals. Thus, how to effectively model and predict the underlying dynamics for user preference becomes a critical research problem. To tackle the above challenges, in this paper, we focus on continuous-time sequential recommendation and propose a principled graph ordinary differential equation framework named GDERec. Technically, GDERec is characterized by an autoregressive graph ordinary differential equation consisting of two components, which are parameterized by two tailored graph neural networks (GNNs) respectively to capture user preference from the perspective of hybrid dynamical systems. The two customized GNNs are trained alternately in an autoregressive manner to track the evolution of the underlying system from irregular observations, and thus learn effective representations of users and items beneficial to the sequential recommendation. Extensive experiments on five benchmark datasets demonstrate the superiority of our model over various state-of-the-art recommendation methods.

Submitted 20 January, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

Comments: Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE 2024)

arXiv:2304.07041 [pdf, other]

doi 10.1145/3624475

A Diffusion model for POI recommendation

Authors: Yifang Qin, Hongjun Wu, Wei Ju, Xiao Luo, Ming Zhang

Abstract: Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aim to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have focused on modeling the user's spatial preference. However, existing works that leverage spatial information are only based on the aggregation of users' previously visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods will harm the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a Diffusion-based model that samples the user's spatial preference for the next POI recommendation. Inspired by the wide application of diffusion algorithms in sampling from distributions, Diff-POI encodes the user's visiting sequence and spatial character with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reversed form to sample from the posterior distribution and optimize the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of our Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies on Diff-POI reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy for addressing the limitations of existing methods.

Submitted 28 October, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

Comments: Accepted by ACM Transactions on Information Systems (TOIS 2023)

arXiv:2304.06639 [pdf, other]

Towards Prototyping Driverless Vehicle Behaviors, City Design, and Policies Simultaneously

Authors: Hauke Sandhaus, Wendy Ju, Qian Yang

Abstract: Autonomous Vehicles (AVs) can potentially improve urban living by reducing accidents, increasing transportation accessibility and equity, and decreasing emissions. Realizing these promises requires the innovations of AV driving behaviors, city plans and infrastructure, and traffic and transportation policies to join forces. However, the complex interdependencies among AV, city, and policy design issues can hinder their innovation. We argue the path towards better AV cities is not a process of matching city designs and policies with AVs' technological innovations, but a process of iterative prototyping of all three simultaneously: Innovations can happen step-wise as the knot of AV, city, and policy design loosens and tightens, unwinds and reties. In this paper, we ask: How can innovators innovate AVs, city environments, and policies simultaneously and productively toward better AV cities? The paper has two parts. First, we map out the interconnections among the many AV, city, and policy design decisions, based on a literature review spanning HCI/HRI, transportation science, urban studies, law and policy, operations research, economy, and philosophy. This map can help innovators identify design constraints and opportunities across the traditional AV/city/policy design disciplinary bounds. Second, we review the respective methods for AV, city, and policy design, and identify key barriers in combining them: (1) Organizational barriers to AV-city-policy design collaboration, (2) computational barriers to multi-granularity AV-city-policy simulation, and (3) different assumptions and goals in joint AV-city-policy optimization. We discuss two broad approaches that can potentially address these challenges, namely, "low-fidelity integrative City-AV-Policy Simulation (iCAPS)" and "participatory design optimization".

Submitted 13 April, 2023; originally announced April 2023.

Comments: Published to the CHI '23 Workshop: Designing Technology and Policy Simultaneously

ACM Class: I.2.9; K.4.1; H.4.m

arXiv:2304.05055 [pdf, other]

doi 10.1016/j.neunet.2024.106207

A Comprehensive Survey on Deep Graph Representation Learning

Authors: Wei Ju, Zheng Fang, Yiyang Gu, Zequn Liu, Qingqing Long, Ziyue Qiao, Yifang Qin, Jianhao Shen, Fang Sun, Zhiping Xiao, Junwei Yang, Jingyang Yuan, Yusheng Zhao, Yifan Wang, Xiao Luo, Ming Zhang

Abstract: Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes in the graph can still maintain a relatively close distance, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits the learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; (iii) representation learning and downstream tasks are dependent on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques have been proposed in the past decade, especially graph neural networks. In this survey, we conduct a comprehensive review of current deep graph representation learning algorithms by proposing a new taxonomy of existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches by graph neural network architecture and by the most recent advanced learning paradigms. Moreover, this survey also provides the practical and promising applications of deep graph representation learning. Last but not least, we state new perspectives and suggest challenging directions which deserve further investigation in the future.

Submitted 27 February, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Accepted by Neural Networks 2024

arXiv:2303.16856 [pdf, other]

Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data

Authors: Bin Feng, Tenglong Ao, Zequn Liu, Wei Ju, Libin Liu, Ming Zhang

Abstract: How to automatically synthesize natural-looking dance movements based on a piece of music is an increasingly popular yet challenging task. Most existing data-driven approaches require hard-to-get paired training data and fail to generate long sequences of motion due to error accumulation of autoregressive structure. We present a novel 3D dance synthesis system that only needs unpaired data for training and could generate realistic long-term motions at the same time. For the unpaired data training, we explore the disentanglement of beat and style, and propose a Transformer-based model free of reliance upon paired data. For the synthesis of long-term motions, we devise a new long-history attention strategy. It first queries the long-history embedding through an attention computation and then explicitly fuses this embedding into the generation pipeline via multimodal adaptation gate (MAG). Objective and subjective evaluations show that our results are comparable to strong baseline methods, despite not requiring paired training data, and are robust when inferring long-term music. To the best of our knowledge, we are the first to achieve unpaired data training - an ability that effectively alleviates data limitations. Our code is released on https://github.com/BFeng14/RobustDancer

Submitted 29 March, 2023; originally announced March 2023.

Comments: Preliminary video demo: https://youtu.be/gJbxG9QlcUU

arXiv:2303.13618 [pdf, ps, other]

Frankenstein's Toolkit: Prototyping Electronics Using Consumer Products

Authors: Ilan Mandel, Wendy Ju

Abstract: In our practice as educators, researchers and designers we have found that centering reverse engineering and reuse has pedagogical, environmental, and economic benefits. Design decisions in the development of new hardware tool-kits should consider how we can use e-waste at hand as integral components of electronics prototyping. Dissection, extraction and modification can give insights into how things are made at scale. Simultaneously, it can enable prototypes that have greater fidelity or functionality than would otherwise be cost-effective to produce.

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2303.04835 [pdf, other]

The Bystander Affect Detection (BAD) Dataset for Failure Detection in HRI

Authors: Alexandra Bremers, Maria Teresa Parreira, Xuanyu Fang, Natalie Friedman, Adolfo Ramirez-Aristizabal, Alexandria Pabst, Mirjana Spasojevic, Michael Kuniavsky, Wendy Ju

Abstract: For a robot to repair its own error, it must first know it has made a mistake. One way that people detect errors is from the implicit reactions from bystanders -- their confusion, smirks, or giggles clue us in that something unexpected occurred. To enable robots to detect and act on bystander responses to task failures, we developed a novel method to elicit bystander responses to human and robot errors. Using 46 different stimulus videos featuring a variety of human and machine task failures, we collected a total of 2452 webcam videos of human reactions from 54 participants. To test the viability of the collected data, we used the bystander reaction dataset as input to a deep-learning model, BADNet, to predict failure occurrence. We tested different data labeling methods and learned how they affect model performance, achieving precisions above 90%. We discuss strategies to model bystander reactions and predict failure and how this approach can be used in real-world robotic deployments to detect errors and improve robot performance. As part of this work, we also contribute the "Bystander Affect Detection" (BAD) dataset of bystander reactions, supporting the development of better prediction models.

Submitted 8 March, 2023; originally announced March 2023.

Comments: 12 pages

arXiv:2301.11972 [pdf, other]

Using Social Cues to Recognize Task Failures for HRI: Overview, State-of-the-Art, and Future Directions

Authors: Alexandra Bremers, Alexandria Pabst, Maria Teresa Parreira, Wendy Ju

Abstract: Robots that carry out tasks and interact in complex environments will inevitably commit errors. Error detection is thus an essential ability for robots to master to work efficiently and productively. People can leverage social feedback to get an indication of whether an action was successful or not. With advances in computing and artificial intelligence (AI), it is increasingly possible for robots to achieve a similar capability of collecting social feedback. In this work, we take this one step further and propose a framework for how social cues can be used as feedback signals to recognize task failures for human-robot interaction (HRI). Our proposed framework sets out a research agenda based on insights from the literature on behavioral science, human-robot interaction, and machine learning to focus on three areas: 1) social cues as feedback (from behavioral science), 2) recognizing task failures in robots (from HRI), and 3) approaches for autonomous detection of HRI task failures based on social cues (from machine learning). We propose a taxonomy of error detection based on self-awareness and social feedback. Finally, we provide recommendations for HRI researchers and practitioners interested in developing robots that detect task errors using human social cues. This article is intended for interdisciplinary HRI researchers and practitioners, where the third theme of our analysis provides more technical details aiming toward the practical implementation of these systems.

Submitted 29 May, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: 25 pages, 3 figures

arXiv:2210.16591 [pdf, other]

doi 10.1145/3539597.3570408

DisenPOI: Disentangling Sequential and Geographical Influence for Point-of-Interest Recommendation

Authors: Yifang Qin, Yifan Wang, Fang Sun, Wei Ju, Xuyang Hou, Zhe Wang, Jia Cheng, Jun Lei, Ming Zhang

Abstract: Point-of-Interest (POI) recommendation plays a vital role in various location-aware services. It has been observed that POI recommendation is driven by both sequential and geographical influences. However, since there is no annotated label of the dominant influence during recommendation, existing methods tend to entangle these two influences, which may lead to sub-optimal recommendation performance and poor interpretability. In this paper, we address the above challenge by proposing DisenPOI, a novel Disentangled dual-graph framework for POI recommendation, which jointly utilizes sequential and geographical relationships on two separate graphs and disentangles the two influences with self-supervision. The key novelty of our model compared with existing approaches is to extract disentangled representations of both sequential and geographical influences with contrastive learning. To be specific, we construct a geographical graph and a sequential graph based on the check-in sequence of a user. We tailor their propagation schemes to become sequence-/geo-aware to better capture the corresponding influences. Preference proxies are extracted from check-in sequence as pseudo labels for the two influences, which supervise the disentanglement via a contrastive loss. Extensive experiments on three datasets demonstrate the superiority of the proposed model.

Submitted 14 September, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

Comments: Accepted by ACM International Conference on Web Search and Data Mining (WSDM'23)

arXiv:2210.11879 [pdf, other]

GLCC: A General Framework for Graph-Level Clustering

Authors: Wei Ju, Yiyang Gu, Binqi Chen, Gongbo Sun, Yifang Qin, Xingyuming Liu, Xiao Luo, Ming Zhang

Abstract: This paper studies the problem of graph-level clustering, which is a novel yet challenging task. This problem is critical in a variety of real-world applications such as protein clustering and genome analysis in bioinformatics. Recent years have witnessed the success of deep clustering coupled with graph neural networks (GNNs). However, existing methods focus on clustering among nodes given a single graph, while clustering on multiple graphs remains under-explored. In this paper, we propose a general graph-level clustering framework named Graph-Level Contrastive Clustering (GLCC) given multiple graphs. Specifically, GLCC first constructs an adaptive affinity graph to explore instance- and cluster-level contrastive learning (CL). Instance-level CL leverages graph Laplacian based contrastive loss to learn clustering-friendly representations while cluster-level CL captures discriminative cluster representations incorporating neighbor information of each sample. Moreover, we utilize neighbor-aware pseudo-labels to reward the optimization of representation learning. The two steps can be trained alternately to collaborate and benefit each other. Experiments on a range of well-known datasets demonstrate the superiority of our proposed GLCC over competitive baselines.

Submitted 8 March, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2023)

arXiv:2210.03969 [pdf, other]

Kernel-based Substructure Exploration for Next POI Recommendation

Authors: Wei Ju, Yifang Qin, Ziyue Qiao, Xiao Luo, Yifan Wang, Yanjie Fu, Ming Zhang

Abstract: Point-of-Interest (POI) recommendation, which benefits from the proliferation of GPS-enabled devices and location-based social networks (LBSNs), plays an increasingly important role in recommender systems. It aims to help users discover places of interest to visit based on previous visits and current status. Most existing methods merely leverage recurrent neural networks (RNNs) to explore sequential influences for recommendation. Despite their effectiveness, these methods not only neglect topological geographical influences among POIs, but also fail to model high-order sequential substructures. To tackle the above issues, we propose a Kernel-Based Graph Neural Network (KBGNN) for next POI recommendation, which combines the characteristics of both geographical and sequential influences in a collaborative way. KBGNN consists of a geographical module and a sequential module. On the one hand, we construct a geographical graph and leverage a message passing neural network to capture the topological geographical influences. On the other hand, we explore high-order sequential substructures in the user-aware sequential graph using a graph kernel neural network to capture user preferences. Finally, a consistency learning framework is introduced to jointly incorporate geographical and sequential information extracted from two separate graphs. In this way, the two modules effectively exchange knowledge to mutually enhance each other. Extensive experiments conducted on two real-world LBSN datasets demonstrate the superior performance of our proposed method over state-of-the-art methods. Our code is available at https://github.com/Fang6ang/KBGNN.

Submitted 8 October, 2022; originally announced October 2022.

Comments: Accepted by the IEEE International Conference on Data Mining (ICDM) 2022

arXiv:2205.10550 [pdf, other]

doi 10.1145/3488560.3498429

KGNN: Harnessing Kernel-based Networks for Semi-supervised Graph Classification

Authors: Wei Ju, Junwei Yang, Meng Qu, Weiping Song, Jianhao Shen, Ming Zhang

Abstract: This paper studies semi-supervised graph classification, which is an important problem with various applications in social network analysis and bioinformatics. This problem is typically solved by using graph neural networks (GNNs), which, however, rely on a large number of labeled graphs for training and are unable to leverage unlabeled graphs. We address these limitations by proposing the Kernel-based Graph Neural Network (KGNN). A KGNN consists of a GNN-based network as well as a kernel-based network parameterized by a memory network. The GNN-based network performs classification through learning graph representations to implicitly capture the similarity between query graphs and labeled graphs, while the kernel-based network uses graph kernels to explicitly compare each query graph with all the labeled graphs stored in a memory for prediction. The two networks are motivated from complementary perspectives, and thus combining them allows KGNN to use labeled graphs more effectively. We jointly train the two networks by maximizing their agreement on unlabeled graphs via posterior regularization, so that the unlabeled graphs serve as a bridge to let both networks mutually enhance each other. Experiments on a range of well-known benchmark datasets demonstrate that KGNN achieves impressive performance over competitive baselines.

Submitted 21 May, 2022; originally announced May 2022.

Comments: Published as a full paper at WSDM 2022

Showing 1–50 of 55 results for author: Ju, W