Towards Analyzing and Mitigating Sycophancy in Large Vision-Language Models

Authors: Yunpu Zhao, Rui Zhang, Junbin Xiao, Changxin Ke, Ruibo Hou, Yifan Hao, Qi Guo, Yunji Chen

Abstract: Large Vision-Language Models (LVLMs) have shown significant capability in vision-language understanding. However, one critical issue that persists in these models is sycophancy, which means models are unduly influenced by leading or deceptive prompts, resulting in biased outputs and hallucinations. Despite the progress in LVLMs, evaluating and mitigating sycophancy remains largely under-explored. In this work, we fill this gap by systematically analyzing sycophancy on various VL benchmarks with curated leading queries and further proposing a text contrastive decoding method for mitigation. While the specific sycophantic behavior varies significantly among models, our analysis reveals the severe deficiency of all LVLMs in resilience to sycophancy across various tasks. For improvement, we propose Leading Query Contrastive Decoding (LQCD), a model-agnostic method focusing on calibrating the LVLMs' over-reliance on leading cues by identifying and suppressing the probabilities of sycophancy tokens at the decoding stage. Extensive experiments show that LQCD effectively mitigates sycophancy, outperforming both prompt engineering methods and common methods for hallucination mitigation. We further demonstrate that LQCD does not hurt but even slightly improves LVLMs' responses to neutral queries, suggesting that it is a more effective strategy for general-purpose decoding and is not limited to sycophancy.

Submitted 20 August, 2024; originally announced August 2024.

arXiv:2306.05672 [pdf, other]

I run as fast as a rabbit, can you? A Multilingual Simile Dialogue Dataset

Authors: Longxuan Ma, Weinan Zhang, Shuhan Zhou, Churui Sun, Changxin Ke, Ting Liu

Abstract: A simile is a figure of speech that compares two different things (called the tenor and the vehicle) via shared properties. The tenor and the vehicle are usually connected with comparator words such as "like" or "as". The simile phenomena are unique and complex in a real-life dialogue scene where the tenor and the vehicle can be verbal phrases or sentences, mentioned by different speakers, exist in different sentences, or occur in reversed order. However, the current simile research usually focuses on similes in a triplet tuple (tenor, property, vehicle) or a single sentence where the tenor and vehicle are usually entities or noun phrases, which could not reflect complex simile phenomena in real scenarios. In this paper, we propose a novel and high-quality multilingual simile dialogue (MSD) dataset to facilitate the study of complex simile phenomena. The MSD is the largest manually annotated simile data ($\sim$20K) and it contains both English and Chinese data. Meanwhile, the MSD data can also be used on dialogue tasks to test the ability of dialogue systems when using similes. We design 3 simile tasks (recognition, interpretation, and generation) and 2 dialogue tasks (retrieval and generation) with MSD. For each task, we provide experimental results from strong pre-trained or state-of-the-art models. The experiments demonstrate the challenge of MSD and we have released the data/code on GitHub.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 13 Pages, 1 Figure, 12 Tables, ACL 2023 findings

ACM Class: I.2.7

arXiv:2306.03949 [pdf, other]

Partial Inference in Structured Prediction

Authors: Chuyang Ke, Jean Honorio

Abstract: In this paper, we examine the problem of partial inference in the context of structured prediction. Using a generative model approach, we consider the task of maximizing a score function with unary and pairwise potentials in the space of labels on graphs. Employing a two-stage convex optimization algorithm for label recovery, we analyze the conditions under which a majority of the labels can be recovered. We introduce a novel perspective on the Karush-Kuhn-Tucker (KKT) conditions and primal and dual construction, and provide statistical and topological requirements for partial recovery with provable guarantees.

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2302.03236 [pdf, other]

Exact Inference in High-order Structured Prediction

Authors: Chuyang Ke, Jean Honorio

Abstract: In this paper, we study the problem of inference in high-order structured prediction tasks. In the context of Markov random fields, the goal of a high-order inference task is to maximize a score function on the space of labels, and the score function can be decomposed into a sum of unary and high-order potentials. We apply a generative model approach to study the problem of high-order inference, and provide a two-stage convex optimization algorithm for exact label recovery. We also provide a new class of hypergraph structural properties related to hyperedge expansion that drives the success in general high-order inference problems. Finally, we connect the performance of our algorithm and the hyperedge expansion property using a novel hypergraph Cheeger-type inequality.

Submitted 6 February, 2023; originally announced February 2023.

Journal ref: International Conference on Machine Learning (ICML), 2023

arXiv:2301.03790 [pdf, other]

A Practical Runtime Security Policy Transformation Framework for Software Defined Networks

Authors: Yunfei Meng, Changbo Ke, Zhiqiu Huang, Guohua Shen, Chunming Liu, Xiaojie Feng

Abstract: Software-defined networking (SDN) has been widely utilized to enforce the security of traditional networks, thereby promoting the process of transforming traditional networks into SDN networks. However, SDN-based security enforcement mechanisms rely heavily on security policies that contain the underlying information of the data plane. As the scale of the underlying network increases, the current security policy management mechanism will confront more and more challenges. Security policy transformation for SDN networks studies how to automatically transform a high-level security policy, which contains no underlying information of the data plane, into the practical flow entries used by OpenFlow switches, thereby automating security policy management. Based on this insight, a practical runtime security policy transformation framework is proposed in this paper. First of all, we specify the security policies used by SDN networks as a system model of security policy (SPM). From the theoretical level, we establish the system model for the SDN network and propose a formal method to transform SPM into the system model of flow entries automatically. From the practical level, we propose a runtime security policy transformation framework to solve the problem of how to find a connected path for each relationship of SPM in the data plane, as well as how to generate the practical flow entries according to the system model of flow entries.
In order to validate the feasibility and effectiveness of the framework, we set up an experimental system and implement the framework with the POX controller and the Mininet emulator.

Submitted 10 January, 2023; originally announced January 2023.

arXiv:2211.12972

doi 10.1109/TRO.2023.3297048

Uniform Passive Fault-Tolerant Control of a Quadcopter with One, Two, or Three Rotor Failure

Authors: Chenxu Ke, Kai-Yuan Cai, Quan Quan

Abstract: This study proposes a uniform passive fault-tolerant control (FTC) method for a quadcopter that does not rely on fault information and handles one, two adjacent, two opposite, or three rotor failures. The uniform control implies that the passive FTC is able to cover conditions from a fault-free quadcopter to rotor failure without controller switching. To achieve the purpose of the passive FTC, the rotors' fault is modeled as a disturbance acting on the virtual control of the quadcopter system. The disturbance estimate is used directly for the passive FTC with rotor failure. To avoid controller switching between normal control and FTC, a dynamic control allocation is used. In addition, the closed-loop stability has been analyzed and a virtual control feedback is adopted to achieve the passive FTC for the quadcopter with two and three rotor failures. To validate the proposed uniform passive FTC method, outdoor experiments are performed for the first time, which have demonstrated that the hovering quadcopter is able to recover from one rotor failure by the proposed controller and continue to fly even if two adjacent, two opposite, or three rotors fail, without any rotor fault information or controller switching.

Submitted 25 December, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

Comments: We found some important errors in the paper that need to be corrected

Journal ref: 2023

arXiv:2206.04893 [pdf, other]

Provable Guarantees for Sparsity Recovery with Deterministic Missing Data Patterns

Authors: Chuyang Ke, Jean Honorio

Abstract: We study the problem of consistently recovering the sparsity pattern of a regression parameter vector from correlated observations governed by deterministic missing data patterns using Lasso. We consider the case in which the observed dataset is censored by a deterministic, non-uniform filter. Recovering the sparsity pattern in datasets with deterministic missing structure can be arguably more challenging than recovering in a uniformly-at-random scenario. In this paper, we propose an efficient algorithm for missing value imputation by utilizing the topological property of the censorship filter. We then provide novel theoretical results for exact recovery of the sparsity pattern using the proposed imputation strategy. Our analysis shows that, under certain statistical and topological conditions, the hidden sparsity pattern can be recovered consistently with high probability in polynomial time and logarithmic sample complexity.

Submitted 10 June, 2022; originally announced June 2022.

arXiv:2205.14056 [pdf, other]

Dual Convexified Convolutional Neural Networks

Authors: Site Bai, Chuyang Ke, Jean Honorio

Abstract: We propose the framework of dual convexified convolutional neural networks (DCCNNs). In this framework, we first introduce a primal learning problem motivated by convexified convolutional neural networks (CCNNs), and then construct the dual convex training program through careful analysis of the Karush-Kuhn-Tucker (KKT) conditions and Fenchel conjugates. Our approach reduces the computational overhead of constructing a large kernel matrix and more importantly, eliminates the ambiguity of factorizing the matrix. Due to the low-rank structure in CCNNs and the related subdifferential of nuclear norms, there is no closed-form expression to recover the primal solution from the dual solution. To overcome this, we propose a highly novel weight recovery algorithm, which takes the dual solution and the kernel information as the input, and recovers the linear weight and the output of convolutional layer, instead of weight parameter. Furthermore, our recovery algorithm exploits the low-rank structure and imposes a small number of filters indirectly, which reduces the parameter size. As a result, DCCNNs inherit all the statistical benefits of CCNNs, while enjoying a more formal and efficient workflow.

Submitted 7 December, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

arXiv:2107.01041 [pdf, other]

doi 10.1109/LWC.2021.3095172

Design and Implementation of MIMO Transmission Based on Dual-Polarized Reconfigurable Intelligent Surface

Authors: Xiangyu Chen, Jun Chen Ke, Wankai Tang, Ming Zheng Chen, Jun Yan Dai, Ertugrul Basar, Shi Jin, Qiang Cheng, Tie Jun Cui

Abstract: Multiple-input multiple-output (MIMO) signaling is one of the key technologies of current mobile communication systems. However, the complex and expensive radio frequency (RF) chains have always limited the increase of MIMO scale. In this paper, we propose a MIMO transmission architecture based on a dual-polarized reconfigurable intelligent surface (RIS), which can directly achieve modulation and transmission of multichannel signals without the need for conventional RF chains. Compared with previous works, the proposed architecture can improve the integration of RIS-based transmission systems. A prototype of the dual-polarized RIS-based MIMO transmission system is built and the experimental results confirm the feasibility of the proposed architecture. The dual-polarized RIS-based MIMO transmission architecture provides a promising solution for realizing low-cost ultra-massive MIMO towards future networks.

Submitted 2 July, 2021; originally announced July 2021.

Comments: This work has been accepted by IEEE WCL

arXiv:2106.07255 [pdf, other]

Federated Myopic Community Detection with One-shot Communication

Authors: Chuyang Ke, Jean Honorio

Abstract: In this paper, we study the problem of recovering the community structure of a network under federated myopic learning. Under this paradigm, we have several clients, each of them having a myopic view, i.e., observing a small subgraph of the network. Each client sends a censored evidence graph to a central server. We provide an efficient algorithm, which computes a consensus signed weighted graph from the clients' evidence, and recovers the underlying network structure in the central server. We analyze the topological structure conditions of the network, as well as the signal and noise levels of the clients that allow for recovery of the network structure. Our analysis shows that exact recovery is possible and can be achieved in polynomial time. We also provide information-theoretic limits for the central server to recover the network structure from any single client's evidence. Finally, as a byproduct of our analysis, we provide a novel Cheeger-type inequality for general signed weighted graphs.

Submitted 14 June, 2021; originally announced June 2021.

Journal ref: Artificial Intelligence and Statistics (AISTATS), 2022

arXiv:2105.12935 [pdf, other]

SDN-based Runtime Security Enforcement Approach for Privacy Preservation of Dynamic Web Service Composition

Authors: Yunfei Meng, Zhiqiu Huang, Guohua Shen, Changbo Ke

Abstract: Aiming at the privacy preservation of dynamic Web service composition, this paper proposes an SDN-based runtime security enforcement approach. The main idea of this approach is that the owner of the service composition leverages the security policy model (SPM) to define the access control relationships that the service composition must comply with in the application plane; the SPM model is then transformed into the low-level security policy model (RSPM) containing the information of the SDN data plane, and the RSPM model is uploaded into the SDN controller. After uploading, the virtual machine access control algorithm integrated in the SDN controller monitors all access requests towards the service composition at runtime. Only the access requests that meet the definition of the RSPM model can be forwarded to the target terminal. Any access requests that do not meet the definition of the RSPM model will be automatically blocked by OpenFlow switches or deleted by the SDN controller. Thus, this approach can effectively solve the problems of network-layer illegal accesses, identity theft attacks and service leakages when the Web service composition is running. In order to verify the feasibility of this approach, this paper implements an experimental system by using the POX controller and the Mininet virtual network simulator, and evaluates the effectiveness and performance of this approach by using this system.
The final experimental results show that the method is completely effective, and that it always obtains the correct calculation results in an acceptable time as the scale of the RSPM model gradually increases.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2102.08019 [pdf, other]

A Thorough View of Exact Inference in Graphs from the Degree-4 Sum-of-Squares Hierarchy

Authors: Kevin Bello, Chuyang Ke, Jean Honorio

Abstract: Performing inference in graphs is a common task within several machine learning problems, e.g., image segmentation, community detection, among others. For a given undirected connected graph, we tackle the statistical problem of exactly recovering an unknown ground-truth binary labeling of the nodes from a single corrupted observation of each edge. Such a problem can be formulated as a quadratic combinatorial optimization problem over the boolean hypercube, where it has been shown before that one can (with high probability and in polynomial time) exactly recover the ground-truth labeling of graphs that have an isoperimetric number that grows with respect to the number of nodes (e.g., complete graphs, regular expanders). In this work, we apply a powerful hierarchy of relaxations, known as the sum-of-squares (SoS) hierarchy, to the combinatorial problem. Motivated by empirical evidence on the improvement in exact recoverability, we center our attention on the degree-4 SoS relaxation and set out to understand the origin of such improvement from a graph theoretical perspective. We show that the solution of the dual of the relaxed problem is related to finding edge weights of the Johnson and Kneser graphs, where the weights fulfill the SoS constraints and intuitively allow the input graph to increase its algebraic connectivity. Finally, as a byproduct of our analysis, we derive a novel Cheeger-type lower bound for the algebraic connectivity of graphs with signed edge weights.

Submitted 1 June, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

Comments: 17 pages, 5 figures

Journal ref: Artificial Intelligence and Statistics (AISTATS), 2022

arXiv:2101.12369 [pdf, ps, other]

Information Theoretic Limits of Exact Recovery in Sub-hypergraph Models for Community Detection

Authors: Jiajun Liang, Chuyang Ke, Jean Honorio

Abstract: In this paper, we study the information theoretic bounds for exact recovery in sub-hypergraph models for community detection. We define a general model called the $m$-uniform sub-hypergraph stochastic block model ($m$-ShSBM). Under the $m$-ShSBM, we use Fano's inequality to identify the region of model parameters where any algorithm fails to exactly recover the planted communities with a large probability. We also identify the region where a Maximum Likelihood Estimation (MLE) algorithm succeeds in exactly recovering the communities with high probability. Our bounds are tight and pertain to the community detection problems in various models such as the planted hypergraph stochastic block model, the planted densest sub-hypergraph model, and the planted multipartite hypergraph model.

Submitted 28 January, 2021; originally announced January 2021.

Journal ref: IEEE International Symposium on Information Theory (ISIT), 2021

arXiv:2006.11666 [pdf, ps, other]

Exact Partitioning of High-order Planted Models with a Tensor Nuclear Norm Constraint

Authors: Chuyang Ke, Jean Honorio

Abstract: We study the problem of efficient exact partitioning of the hypergraphs generated by high-order planted models. A high-order planted model assumes some underlying cluster structures, and simulates high-order interactions by placing hyperedges among nodes. Example models include the disjoint hypercliques, the densest subhypergraphs, and the hypergraph stochastic block models. We show that exact partitioning of high-order planted models (an NP-hard problem in general) is achievable through solving a computationally efficient convex optimization problem with a tensor nuclear norm constraint. Our analysis provides the conditions for our approach to succeed in recovering the true underlying cluster structures, with high probability.

Submitted 20 June, 2020; originally announced June 2020.

Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

arXiv:2005.13206 [pdf, other]

A Security Policy Model Transformation and Verification Approach for Software Defined Networking

Authors: Yunfei Meng, Zhiqiu Huang, Guohua Shen, Changbo Ke

Abstract: Software defined networking (SDN) has been adopted to enforce the security of large-scale and complex networks because of its programmable, abstract, centralized intelligent control and its global, real-time traffic view. However, the current SDN-based security enforcement mechanisms require network managers to fully understand the underlying configurations of the network. Facing increasingly complex and huge SDN networks, we urgently need a novel security policy management mechanism which can be completely transparent to any underlying information. That is, it should permit network managers to define upper-level security policies without any underlying information of the network, and, by means of a model transformation system, these upper-level security policies should be transformed automatically into their corresponding lower-level policies containing the underlying information. Moreover, it should ensure that the system model updated by the generated lower-level policies holds all of the security properties defined in the upper-level policies. Based on these insights, we propose a security policy model transformation and verification approach for SDN in this paper. We first present the formal definition of a security policy model (SPM) which can be used to specify the security policies used in SDN. Then, we propose a model transformation system based on the SDN system model and mapping rules, which enables network managers to convert an SPM model into the corresponding underlying network configuration policies automatically, i.e., a flow table model (FTM).
In order to verify that the SDN system model updated by the generated FTM models holds the security properties defined in SPM models, we design a security policy verification system based on model checking. Finally, we utilize a comprehensive case to illustrate the feasibility of the proposed approach.

Submitted 28 May, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

arXiv:2003.06834 [pdf, other]

SOM-based DDoS Defense Mechanism using SDN for the Internet of Things

Authors: Yunfei Meng, Zhiqiu Huang, Senzhang Wang, Guohua Shen, Changbo Ke

Abstract: To effectively tackle the security threats towards the Internet of things, we propose a SOM-based DDoS defense mechanism using software-defined networking (SDN) in this paper. The main idea of the mechanism is to deploy an SDN-based gateway to protect the device services in the Internet of things. The gateway provides a DDoS defense mechanism based on a SOM neural network. By means of the SOM-based DDoS defense mechanism, the gateway can effectively identify the malicious sensing devices in the IoT, and automatically block those malicious devices after detecting them, so that it can effectively enforce the security and robustness of the system when it is under DDoS attacks. In order to validate the feasibility and effectiveness of the mechanism, we leverage the POX controller and the Mininet emulator to implement an experimental system, and further implement the aforementioned security enforcement mechanisms with Python. The final experimental results illustrate that the mechanism is truly effective under the different test scenarios.

Submitted 17 March, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

arXiv:1911.02161 [pdf, other]

Exact Partitioning of High-order Models with a Novel Convex Tensor Cone Relaxation

Authors: Chuyang Ke, Jean Honorio

Abstract: In this paper we propose an algorithm for exact partitioning of high-order models. We define a general class of $m$-degree Homogeneous Polynomial Models, which subsumes several examples motivated from prior literature. Exact partitioning can be formulated as a tensor optimization problem. We relax this high-order combinatorial problem to a convex conic form problem. To this end, we carefully define the Carathéodory symmetric tensor cone, and show its convexity, and the convexity of its dual cone. This allows us to construct a primal-dual certificate to show that the solution of the convex relaxation is correct (equal to the unobserved true group assignment) and to analyze the statistical upper bound of exact partitioning.

Submitted 15 July, 2021; v1 submitted 5 November, 2019; originally announced November 2019.

Journal ref: Journal of Machine Learning Research (JMLR), 23(284): pp. 1-28, 2022

arXiv:1908.10201 [pdf, ps, other]

Behavior-aware Service Access Control Mechanism using Security Policy Monitoring for SOA Systems

Authors: Yunfei Meng, Zhiqiu Huang, Senzhang Wang, Yu Zhou, Guohua Shen, Changbo Ke

Abstract: Service-oriented architecture (SOA) systems have been widely utilized in many business areas. However, an SOA system is loosely coupled with multiple services and lacks the relevant security protection mechanisms, thus it can easily be attacked through unauthorized access and information theft. Existing access control mechanisms can only prevent unauthorized users from accessing the system, but they cannot prevent authorized users (insiders) from attacking the system. To address this problem, we propose a behavior-aware service access control mechanism using security policy monitoring for SOA systems. In our mechanism, a monitor program supervises the consumer's behaviors at run time. By means of a trustful behavior model (TBM), if the monitor finds that the consumer's behavior constitutes misuse, it will deny the request. If it finds that the consumer's behavior is malicious, the monitor will terminate the consumer's access authorizations early in this session or add the consumer to the blacklist, whereby the consumer will not be able to access the system from then on. In order to evaluate the feasibility of the proposed mechanism, we implement a prototype system. The final results illustrate that our mechanism can effectively monitor the consumer's behaviors and make effective responses when malicious behaviors occur at run time. Moreover, our mechanism still works well as the number of rules in the TBM increases.

Submitted 23 August, 2019; originally announced August 2019.

arXiv:1908.02704 [pdf, other]

Unified Simulation and Test Platform for Control Systems of Unmanned Vehicles

Authors: Xunhua Dai, Chenxu Ke, Quan Quan, Kai-Yuan Cai

Abstract: Control systems on unmanned vehicles are safety-critical systems whose requirements on reliability and safety are ever-increasing. Currently, testing a complex autonomous control system is an expensive and time-consuming process, which requires massive repeated experimental testing during the whole development stage. This paper presents a unified simulation and test platform for vehicle autonomous control systems aiming to significantly improve the development speed and safety level of unmanned vehicles. First, a unified modular modeling framework compatible with different types of vehicles is proposed with methods to ensure modeling credibility. Then, the simulation software system is developed by the model-based design framework, whose modular programming methods and automatic code generation functions ensure the efficiency, credibility, and standardization of the system development process. Finally, an FPGA-based real-time hardware-in-the-loop simulation platform is proposed to ensure the comprehensiveness and credibility of the simulation and test results. In the end, the proposed platform is applied to a multicopter control system. By comparing with experimental results, the accuracy and credibility of the simulation testing results are verified by using the simulation credibility assessment method proposed in our previous work. To verify the practicability of the proposed platform, several successful applications are presented for the multicopter rapid prototyping, estimation algorithm verification, autonomous flight testing, and automatic safety testing with automatic fault injection and result evaluation of unmanned vehicles.

Submitted 7 August, 2019; originally announced August 2019.

arXiv:1904.00717 [pdf, other]

Smart Routing: Towards Proactive Fault-Handling in Software-Defined Networks

Authors: Ali Malik, Benjamin Aziz, Mo Adda, Chih-Heng Ke

Abstract: Software-defined networking offers numerous benefits over legacy networking systems by simplifying the process of network management and reducing the cost of network configuration. Currently, the management of failures in the data plane is limited to two mechanisms: proactive and reactive. Such failure recovery techniques are activated after occurrences of failures. Therefore, packet loss is highly likely to occur as a result of service disruption and unavailability. This issue is not only related to the slow speed of recovery mechanisms, but also to the delay caused by the failure detection process. In this paper, we define a new approach to the management of fault tolerance in software-defined networks where the goal is to eliminate the convergence process altogether, rather than speed up failure detection and recovery. We propose a new framework, called Smart Routing, which works based on the forewarning signs of failures in order to compute alternative paths and isolate the risky links from the routing tables of the data plane devices. We validate our framework through a set of experiments that demonstrate how the underlying model runs.

Submitted 1 April, 2019; originally announced April 2019.

arXiv:1902.03099 [pdf, other]

Exact Inference with Latent Variables in an Arbitrary Domain

Authors: Chuyang Ke, Jean Honorio

Abstract: We analyze the necessary and sufficient conditions for exact inference of a latent model. In latent models, each entity is associated with a latent variable following some probability distribution. The challenging question we try to solve is: can we perform exact inference without observing the latent variables, even without knowing what the domain of the latent variables is? We show that exact inference can be achieved using a semidefinite programming (SDP) approach without knowing either the latent variables or their domain. Our analysis predicts the experimental correctness of SDP with high accuracy, showing the suitability of our focus on the Karush-Kuhn-Tucker (KKT) conditions and the spectrum of a properly defined matrix. As a byproduct of our analysis, we also provide concentration inequalities with dependence on latent variables, both for bounded moment generating functions as well as for the spectra of matrices. To the best of our knowledge, these results are novel and could be useful for many other problems.

Submitted 27 June, 2020; v1 submitted 28 January, 2019; originally announced February 2019.

arXiv:1806.04414 [pdf]

Controlling spectral energies of all harmonics in programmable way using time-domain digital coding metasurface

Authors: Jie Zhao, Xi Yang, Jun Yan Dai, Qiang Cheng, Xiang Li, Ning Hua Qi, Jun Chen Ke, Guo Dong Bai, Shuo Liu, Shi Jin, Tie Jun Cui

Abstract: Modern wireless communication is one of the most important information technologies, but its system architecture has been unchanged for many years. Here, we propose a much simpler architecture for wireless communication systems based on a metasurface. We first propose a time-domain digital coding metasurface as a simple but efficient method to manipulate spectral distributions of harmonics. Under dynamic modulation of the phase of the surface reflectivity, we can accurately control different harmonics in a programmable way to achieve many unusual functions such as frequency cloaking and velocity illusion, owing to the temporal gradient introduced by digital signals encoded by '0' and '1' sequences. A theoretical model is presented and experimentally validated to reveal the nonlinear process. Based on the time-domain digital coding metasurface, we propose and realize a new wireless communication system in a binary frequency-shift keying (BFSK) frame, which has a much simpler architecture than traditional BFSK, with excellent performance for real-time message transmission. The presented work, from new concept to new system, will find important applications in modern information technologies.

Submitted 12 June, 2018; originally announced June 2018.

arXiv:1805.10583 [pdf, other]

Dual Swap Disentangling

Authors: Zunlei Feng, Xinchao Wang, Chenglong Ke, Anxiang Zeng, Dacheng Tao, Mingli Song

Abstract: Learning interpretable disentangled representations is a crucial yet challenging task. In this paper, we propose a weakly semi-supervised method, termed Dual Swap Disentangling (DSD), for disentangling using both labeled and unlabeled data. Unlike conventional weakly supervised methods that rely on full annotations on the group of samples, we require only limited annotations on paired samples that indicate their shared attribute, such as color. Our model takes the form of a dual autoencoder structure. To achieve disentangling using the labeled pairs, we follow an "encoding-swap-decoding" process, where we first swap the parts of their encodings corresponding to the shared attribute and then decode the obtained hybrid codes to reconstruct the original input pairs. For unlabeled pairs, we follow the "encoding-swap-decoding" process twice on designated encoding parts and enforce the final outputs to approximate the input pairs. By isolating parts of the encoding and swapping them back and forth, we impose the dimension-wise modularity and portability of the encodings of the unlabeled samples, which implicitly encourages disentangling under the guidance of labeled pairs. This dual swap mechanism, tailored for the semi-supervised setting, turns out to be very effective. Experiments on image datasets from a wide domain show that our model yields state-of-the-art disentangling performance.

Submitted 1 January, 2020; v1 submitted 27 May, 2018; originally announced May 2018.

Comments: Accepted by NeurIPS 2018; Adding the theoretical proof for the disentanglement of labeled pairs

arXiv:1802.06104 [pdf, ps, other]

Information-theoretic Limits for Community Detection in Network Models

Authors: Chuyang Ke, Jean Honorio

Abstract: We analyze the information-theoretic limits for the recovery of node labels in several network models. This includes the Stochastic Block Model, the Exponential Random Graph Model, the Latent Space Model, the Directed Preferential Attachment Model, and the Directed Small-world Model. For the Stochastic Block Model, the non-recoverability condition depends on the probabilities of having edges inside a community, and between different communities. For the Latent Space Model, the non-recoverability condition depends on the dimension of the latent space, and how far and spread are the communities in the latent space. For the Directed Preferential Attachment Model and the Directed Small-world Model, the non-recoverability condition depends on the ratio between homophily and neighborhood size. We also consider dynamic versions of the Stochastic Block Model and the Latent Space Model.

Submitted 21 May, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

Journal ref: Neural Information Processing Systems (NeurIPS), 2018

arXiv:1703.07345 [pdf, other]

On The Projection Operator to A Three-view Cardinality Constrained Set

Authors: Haichuan Yang, Shupeng Gui, Chuyang Ke, Daniel Stefankovic, Ryohei Fujimaki, Ji Liu

Abstract: The cardinality constraint is an intrinsic way to restrict the solution structure in many domains, for example, sparse learning, feature selection, and compressed sensing. To solve a cardinality constrained problem, the key challenge is to solve the projection onto the cardinality constraint set, which is NP-hard in general when there exist multiple overlapped cardinality constraints. In this paper, we consider the scenario where the overlapped cardinality constraints satisfy a Three-view Cardinality Structure (TVCS), which reflects the natural restriction in many applications, such as identification of gene regulatory networks and the task-worker assignment problem. We cast the projection as a linear program, and show that for TVCS, the vertex solution of this linear program is the solution of the original projection problem. We further prove that such a solution can be found with complexity proportional to the number of variables and constraints. We finally use synthetic experiments and two interesting applications in bioinformatics and crowdsourcing to validate the proposed TVCS model and method.

Submitted 14 June, 2017; v1 submitted 21 March, 2017; originally announced March 2017.

arXiv:1612.09062 [pdf]

Condensedly: comprehending article contents through condensed texts

Authors: Chao-Hsuan Ke, Tsung-Lu Michael Lee, Jung-Hsien Chiang

Abstract: Summary: Abstracts in biomedical articles can provide a quick overview of the articles, but detailed information cannot be obtained without reading the full-text contents. Full-text articles certainly provide more information and content; however, accessing full-text documents is usually time consuming. Condensedly is a web-based application that gives readers an easy and efficient way to access full-text paragraphs, using sentences in abstracts as fishing bait to retrieve the big fish that reside in the full text. Condensedly is based on a paragraph ranking algorithm, which evaluates and ranks full-text paragraphs based on their association scores with sentences in abstracts. Availability: http://140.116.247.185/~research/Condensedly

Submitted 29 December, 2016; originally announced December 2016.

arXiv:1612.08669 [pdf]

A Hybrid Both Filter and Wrapper Feature Selection Method for Microarray Classification

Authors: Li-Yeh Chuang, Chao-Hsuan Ke, Cheng-Hong Yang

Abstract: Gene expression data is widely used in disease analysis and cancer diagnosis. However, since gene expression data could contain thousands of genes simultaneously, successful microarray classification is rather difficult. Feature selection is an important pre-treatment for any classification process. Selecting a useful gene subset as a classifier not only decreases the computational time and cost, but also increases classification accuracy. In this study, we applied the information gain method as a filter approach, and an improved binary particle swarm optimization as a wrapper approach to implement feature selection; selected gene subsets were used to evaluate the performance of classification. Experimental results show that by employing the proposed method fewer gene subsets needed to be selected and better classification accuracy could be obtained.

Submitted 27 December, 2016; originally announced December 2016.

Comments: 5 pages, 2 figures, 4 tables

Journal ref: IMECS2008_pp146-150

arXiv:1611.04049 [pdf, ps, other]

Prognostics of Surgical Site Infections using Dynamic Health Data

Authors: Chuyang Ke, Yan Jin, Heather Evans, Bill Lober, Xiaoning Qian, Ji Liu, Shuai Huang

Abstract: Surgical Site Infection (SSI) is a national priority in healthcare research. Much research attention has been attracted to develop better SSI risk prediction models. However, most of the existing SSI risk prediction models are built on static risk factors such as comorbidities and operative factors. In this paper, we investigate the use of the dynamic wound data for SSI risk prediction. There have been emerging mobile health (mHealth) tools that can closely monitor the patients and generate continuous measurements of many wound-related variables and other evolving clinical variables. Since existing prediction models of SSI have quite limited capacity to utilize the evolving clinical data, we develop the corresponding solution to equip these mHealth tools with decision-making capabilities for SSI prediction with a seamless assembly of several machine learning models to tackle the analytic challenges arising from the spatial-temporal data. The basic idea is to exploit the low-rank property of the spatial-temporal data via the bilinear formulation, and further enhance it with automatic missing data imputation by the matrix completion technique. We derive efficient optimization algorithms to implement these models and demonstrate the superior performances of our new predictive model on a real-world dataset of SSI, compared to a range of state-of-the-art methods.

Submitted 12 November, 2016; originally announced November 2016.

Comments: 23 pages, 8 figures

Showing 1–28 of 28 results for author: Ke, C