
LiD-FL: Towards List-Decodable Federated Learning

Authors: Hong Liu, Liren Shan, Han Bao, Ronghui You, Yuhao Yi, Jiancheng Lv

Abstract: Federated learning is often used in environments with many unverified participants. Therefore, federated learning under adversarial attacks receives significant attention. This paper proposes an algorithmic framework for list-decodable federated learning, where a central server maintains a list of models, with at least one guaranteed to perform well. The framework has no strict restriction on the fraction of honest workers, extending the applicability of Byzantine federated learning to the scenario with more than half adversaries. Under proper assumptions on the loss function, we prove a convergence theorem for our method. Experimental results, including image classification tasks with both convex and non-convex losses, demonstrate that the proposed algorithm can withstand the malicious majority under various attacks.

Submitted 15 August, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

Comments: 26 pages, 5 figures

arXiv:2405.19568 [pdf, other]

Organizing Background to Explore Latent Classes for Incremental Few-shot Semantic Segmentation

Authors: Lianlei Shan, Wenzhang Zhou, Wei Li, Xingyu Ding

Abstract: The goal of incremental Few-shot Semantic Segmentation (iFSS) is to extend pre-trained segmentation models to new classes via few annotated images without access to old training data. When incrementally learning novel classes, the data distribution of old classes is destroyed, leading to catastrophic forgetting. Meanwhile, the novel classes have only a few samples, making it impossible for models to learn satisfactory representations of them. For the iFSS problem, we propose a network called OINet, i.e., the background embedding space Organization and prototype Inherit Network. Specifically, when training base classes, OINet uses multiple classification heads for the background and sets multiple sub-class prototypes to reserve embedding space for the latent novel classes. When incrementally learning novel classes, we propose a strategy to select the sub-class prototypes that best match the novel classes currently being learned and make the novel classes inherit the selected prototypes' embedding space. This operation allows the novel classes to be registered in the embedding space using few samples without affecting the distribution of the base classes. Results on Pascal-VOC and COCO show that OINet achieves a new state of the art.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 10 pages, 5 figures

arXiv:2405.18663 [pdf, other]

Lifelong Learning and Selective Forgetting via Contrastive Strategy

Authors: Lianlei Shan, Wenzhang Zhou, Wei Li, Xingyu Ding

Abstract: Lifelong learning aims to train a model with good performance for new tasks while retaining the capacity of previous tasks. However, some practical scenarios require the system to forget undesirable knowledge due to privacy issues, which is called selective forgetting. The joint task of the two is dubbed Learning with Selective Forgetting (LSF). In this paper, we propose a new framework based on contrastive strategy for LSF. Specifically, for the preserved classes (tasks), we make features extracted from different samples within the same class compact. For the deleted classes, we make the features from different samples of the same class dispersed and irregular, i.e., the network does not have any regular response to samples from a specific deleted class, as if the network had not been trained at all. By maintaining or disturbing the feature distribution, the forgetting and memory of different classes can be made independent of each other. Experiments are conducted on four benchmark datasets, and our method achieves new state-of-the-art results.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 10 pages, 5 figures

arXiv:2405.18078 [pdf, other]

Edge-guided and Class-balanced Active Learning for Semantic Segmentation of Aerial Images

Authors: Lianlei Shan, Weiqiang Wang, Ke Lv, Bin Luo

Abstract: Semantic segmentation requires pixel-level annotation, which is time-consuming. Active Learning (AL) is a promising method for reducing data annotation costs. Due to the gap between aerial and natural images, previous AL methods are not ideal, mainly because of unreasonable labeling units and the neglect of class imbalance. Previous labeling units are based on images or regions, which does not consider the characteristics of segmentation tasks and aerial images, i.e., the segmentation network often makes mistakes in the edge region, and the edges in aerial images are often interlaced and irregular. Therefore, an edge-guided labeling unit is proposed and supplemented as the new unit. On the other hand, the class imbalance is severe, manifested in two aspects: the aerial image is seriously imbalanced, and the AL strategy does not fully consider the class balance. Both seriously affect the performance of AL in aerial images. We comprehensively ensure class balance in all steps where imbalance may occur, including initial labeled data, subsequent labeled data, and pseudo-labels. Through the two improvements, our method achieves more than 11.2% gains compared to state-of-the-art methods on three benchmark datasets, Deepglobe, Potsdam, and Vaihingen, and more than 18.6% gains compared to the baseline. Sufficient ablation studies show that every module is indispensable. Furthermore, we establish a fair and strong benchmark for future research on AL for aerial image segmentation.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.17776 [pdf, other]

The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention

Authors: Xingyu Ding, Lianlei Shan, Guiqin Zhao, Meiqi Wu, Wenzhang Zhou, Wei Li

Abstract: Deep learning-based information processing takes a long time and requires huge computing resources, especially for dense prediction tasks which require an output for each pixel, like semantic segmentation and salient object detection. There are mainly two challenges for quantization of dense prediction tasks. Firstly, directly applying the upsampling operation that dense prediction tasks require is extremely crude and causes unacceptable accuracy reduction. Secondly, the complex structure of dense prediction networks means it is difficult to maintain a fast speed as well as a high accuracy when performing quantization. In this paper, we propose an effective upsampling method and an efficient attention computation strategy to transfer the success of the binary neural networks (BNN) from single prediction tasks to dense prediction tasks. Firstly, we design a simple and robust multi-branch parallel upsampling structure to achieve high accuracy. Then we further optimize the attention method, which plays an important role in segmentation but has huge computational complexity. Our attention method can reduce the computational complexity by a factor of one hundred while retaining the original effect. Experiments on Cityscapes, KITTI road, and ECSSD fully show the effectiveness of our work.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 30 pages, 6 figures

arXiv:2402.14540 [pdf, other]

On Truthful Item-Acquiring Mechanisms for Reward Maximization

Authors: Liang Shan, Shuo Zhang, Jie Zhang, Zihe Wang

Abstract: In this research, we study the problem in which a collector acquires items from the owner based on the item qualities the owner declares and an independent appraiser's assessments. The owner is interested in maximizing the probability that the collector acquires the items and is the only one who knows the items' factual quality. The appraiser performs her duties with impartiality, but her assessment may be subject to random noise, so it may not accurately reflect the factual quality of the items. The main challenge lies in devising mechanisms that prompt the owner to reveal accurate information, thereby optimizing the collector's expected reward. We consider the menu size of mechanisms as a measure of their practicability and study its impact on the attainable expected reward. For the single-item setting, we design optimal mechanisms with a monotone increasing menu size. Although the reward gap between the simplest and optimal mechanisms is bounded, we show that simple mechanisms with a small menu size cannot ensure any positive fraction of the optimal reward of mechanisms with a larger menu size. For the multi-item setting, we show that an ordinal mechanism that only takes the owner's ordering of the items as input is not incentive-compatible. We then propose a set of Union mechanisms that combine single-item mechanisms. Moreover, we run experiments to examine these mechanisms' robustness against the independent appraiser's assessment accuracy and the items' acquiring rate.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.11769 [pdf, other]

Connection-Aware P2P Trading: Simultaneous Trading and Peer Selection

Authors: Cheng Feng, Kedi Zheng, Lanqing Shan, Hani Alers, Lampros Stergioulas, Hongye Guo, Qixin Chen

Abstract: Peer-to-peer (P2P) trading is seen as a viable solution to handle the growing number of distributed energy resources in distribution networks. However, when dealing with large-scale consumers, there are several challenges that must be addressed. One of these challenges is limited communication capabilities. Additionally, prosumers may have specific preferences when it comes to trading. Both can result in serious asynchrony in peer-to-peer trading, potentially impacting the effectiveness of negotiations and hindering convergence before the market closes. This paper introduces a connection-aware P2P trading algorithm designed for extensive prosumer trading. The algorithm facilitates asynchronous trading while respecting prosumers' autonomy in trading peer selection, an often overlooked aspect in traditional models. In addition, to optimize the use of limited connection opportunities, a smart trading peer connection selection strategy is developed to guide consumers to communicate strategically to accelerate convergence. A theoretical convergence guarantee is provided for the connection-aware P2P trading algorithm, which further details how smart selection strategies enhance convergence efficiency. Numerical studies are carried out to validate the effectiveness of the connection-aware algorithm and the performance of smart selection strategies in reducing the overall convergence time.

Submitted 18 February, 2024; originally announced February 2024.

Comments: Submitted to IEEE PES Transactions

arXiv:2401.17952 [pdf, ps, other]

Error-Tolerant E-Discovery Protocols

Authors: Jinshuo Dong, Jason D. Hartline, Liren Shan, Aravindan Vijayaraghavan

Abstract: We consider the multi-party classification problem introduced by Dong, Hartline, and Vijayaraghavan (2022) in the context of electronic discovery (e-discovery). Based on a request for production from the requesting party, the responding party is required to provide documents that are responsive to the request except for those that are legally privileged. Our goal is to find a protocol that verifies that the responding party sends almost all responsive documents while minimizing the disclosure of non-responsive documents. We provide protocols in the challenging non-realizable setting, where the instance may not be perfectly separated by a linear classifier. We demonstrate empirically that our protocol successfully manages to find almost all relevant documents, while incurring only a small disclosure of non-responsive documents. We complement this with a theoretical analysis of our protocol in the single-dimensional setting, and other experiments on simulated data which suggest that the non-responsive disclosure incurred by our protocol may be unavoidable.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 28 pages, 6 figures, CSLAW 2024

arXiv:2401.00973 [pdf, other]

Facebook Report on Privacy of fNIRS data

Authors: Md Imran Hossen, Sai Venkatesh Chilukoti, Liqun Shan, Vijay Srinivas Tida, Xiali Hei

Abstract: The primary goal of this project is to develop privacy-preserving machine learning model training techniques for fNIRS data. This project will build a local model in a centralized setting with both differential privacy (DP) and certified robustness. It will also explore collaborative federated learning to train a shared model between multiple clients without sharing local fNIRS datasets. To prevent unintentional private information leakage of such clients' private datasets, we will also implement DP in the federated learning setting.

Submitted 1 January, 2024; originally announced January 2024.

Comments: 15 pages, 5 figures, 3 tables

MSC Class: I.2.0

arXiv:2312.02400 [pdf, other]

Auto DP-SGD: Dual Improvements of Privacy and Accuracy via Automatic Clipping Threshold and Noise Multiplier Estimation

Authors: Sai Venkatesh Chilukoti, Md Imran Hossen, Liqun Shan, Vijay Srinivas Tida, Xiai Hei

Abstract: DP-SGD has emerged as a popular method to protect personally identifiable information in deep learning applications. Unfortunately, DP-SGD's per-sample gradient clipping and uniform noise addition during training can significantly degrade model utility. To enhance the model's utility, researchers proposed various adaptive DP-SGD methods. However, we examine these techniques and find that they result in greater privacy leakage or lower accuracy than the traditional DP-SGD method, or lack evaluation on a complex data set such as CIFAR100. To address these limitations, we propose Auto DP-SGD. Our method automates clipping threshold estimation based on the DL model's gradient norm and scales the gradients of each training sample without losing gradient information. This helps to improve the algorithm's utility while using a smaller privacy budget. To further improve accuracy, we introduce automatic noise multiplier decay mechanisms to decrease the noise multiplier after every epoch. Finally, we develop closed-form mathematical expressions using the tCDP accountant for automatic noise multiplier and automatic clipping threshold estimation. Through extensive experimentation, we demonstrate that Auto DP-SGD outperforms existing SOTA DP-SGD methods in privacy and accuracy on various benchmark datasets. We also show that privacy can be improved by lowering the scale factor and using learning rate schedulers without significantly reducing accuracy.
Specifically, Auto DP-SGD, when used with a step noise multiplier, improves accuracy by 3.20, 1.57, 6.73, and 1.42 for the MNIST, CIFAR10, CIFAR100, and AG News Corpus datasets, respectively. Furthermore, it obtains a substantial reduction in the privacy budget of 94.9, 79.16, 67.36, and 53.37 for the corresponding data sets.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 25 pages single column, 2 figures

MSC Class: 26; 40
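
For readers unfamiliar with the baseline that the Auto DP-SGD abstract refers to, the following is a minimal sketch of one step of standard DP-SGD (per-sample gradient clipping followed by uniform Gaussian noise). It is not the Auto DP-SGD method itself; the function name `dp_sgd_step` and all parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """One step of standard DP-SGD (illustrative sketch, not Auto DP-SGD).

    params:           1-D array of model parameters
    per_sample_grads: 2-D array, one gradient row per training sample
    clip_norm:        fixed clipping threshold C (assumed given)
    noise_multiplier: sigma; Gaussian noise has std sigma * C
    """
    # L2 norm of each sample's gradient
    norms = np.linalg.norm(per_sample_grads, axis=1)
    # Clip: scale each gradient by min(1, C / ||g||)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale[:, None]
    # Sum clipped gradients and add the same Gaussian noise scale to every coordinate
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_sample_grads)
    # Plain gradient-descent update with the noisy averaged gradient
    return params - lr * noisy_mean

rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.6, 0.8]])
new_params = dp_sgd_step(np.zeros(2), grads, clip_norm=1.0,
                         noise_multiplier=0.0, lr=1.0, rng=rng)
```

With the noise multiplier set to zero, the step reduces to averaging the clipped gradients, which makes the clipping behaviour easy to inspect in isolation.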

arXiv:2311.17450 [pdf, other]

Continual Learning for Image Segmentation with Dynamic Query

Authors: Weijia Wu, Yuzhong Zhao, Zhuang Li, Lianlei Shan, Hong Zhou, Mike Zheng Shou

Abstract: Image segmentation based on continual learning exhibits a critical drop in performance, mainly due to catastrophic forgetting and background shift, as models are required to incorporate new classes continually. In this paper, we propose a simple yet effective Continual Image Segmentation method with incremental Dynamic Query (CISDQ), which decouples the representation learning of both old and new knowledge with lightweight query embedding. CISDQ mainly includes three contributions: 1) We define dynamic queries with adaptive background class to exploit past knowledge and learn future classes naturally. 2) CISDQ proposes a class/instance-aware Query Guided Knowledge Distillation strategy to overcome catastrophic forgetting by capturing the inter-class diversity and intra-class identity. 3) Apart from semantic segmentation, CISDQ introduces continual learning for instance segmentation, in which instance-wise labeling and supervision are considered. Extensive experiments on three datasets for two tasks (i.e., continual semantic and instance segmentation) are conducted to demonstrate that CISDQ achieves state-of-the-art performance, specifically obtaining 4.4% and 2.9% mIoU improvements for the ADE 100-10 (6 steps) setting and ADE 100-5 (11 steps) setting.

Submitted 29 November, 2023; originally announced November 2023.

Comments: Code: https://github.com/weijiawu/CisDQ

Journal ref: TCSVT 2023

arXiv:2309.02765 [pdf, other]

doi 10.4204/EPTCS.386.18

A General Approach to Proving Properties of Fibonacci Representations via Automata Theory

Authors: Jeffrey Shallit, Sonja Linghui Shan

Abstract: We provide a method, based on automata theory, to mechanically prove the correctness of many numeration systems based on Fibonacci numbers. With it, long case-based and induction-based proofs of correctness can be replaced by simply constructing a regular expression (or finite automaton) specifying the rules for valid representations, followed by a short computation. Examples of the systems that can be handled using our technique include Brown's lazy representation (1965), the far-difference representation developed by Alpert (2009), and three representations proposed by Hajnal (2023). We also provide three additional systems and prove their validity.

Submitted 6 September, 2023; originally announced September 2023.

Comments: In Proceedings AFL 2023, arXiv:2309.01126

Journal ref: EPTCS 386, 2023, pp. 228-242

arXiv:2308.10160 [pdf, other]

Higher-Order Cheeger Inequality for Partitioning with Buffers

Authors: Konstantin Makarychev, Yury Makarychev, Liren Shan, Aravindan Vijayaraghavan

Abstract: We prove a new generalization of the higher-order Cheeger inequality for partitioning with buffers. Consider a graph $G=(V,E)$. The buffered expansion of a set $S \subseteq V$ with a buffer $B \subseteq V \setminus S$ is the edge expansion of $S$ after removing all the edges from set $S$ to its buffer $B$. An $\varepsilon$-buffered $k$-partitioning is a partitioning of a graph into disjoint components $P_i$ and buffers $B_i$, in which the size of buffer $B_i$ for $P_i$ is small relative to the size of $P_i$: $|B_i| \le \varepsilon |P_i|$. The buffered expansion of a buffered partition is the maximum of buffered expansions of the $k$ sets $P_i$ with buffers $B_i$. Let $h^{k,\varepsilon}_G$ be the buffered expansion of the optimal $\varepsilon$-buffered $k$-partitioning, then for every $\delta>0$, $$h_G^{k,\varepsilon} \le O_\delta(1) \cdot \Big( \frac{\log k}{\varepsilon}\Big) \cdot \lambda_{\lfloor (1+\delta) k\rfloor},$$ where $\lambda_{\lfloor (1+\delta)k\rfloor}$ is the $\lfloor (1+\delta)k\rfloor$-th smallest eigenvalue of the normalized Laplacian of $G$. Our inequality is constructive and avoids the "square-root loss" that is present in the standard Cheeger inequalities (even for $k=2$). We also provide a complementary lower bound, and a novel generalization to the setting with arbitrary vertex weights and edge costs. Moreover, our result implies and generalizes the standard higher-order Cheeger inequalities and another recent Cheeger-type inequality by Kwok, Lau, and Lee (2017) involving robust vertex expansion.

Submitted 20 August, 2023; originally announced August 2023.

Comments: 45 pages

arXiv:2308.08373 [pdf, ps, other]

Approximation Algorithms for Norm Multiway Cut

Authors: Charlie Carlson, Jafar Jafarov, Konstantin Makarychev, Yury Makarychev, Liren Shan

Abstract: We consider variants of the classic Multiway Cut problem. Multiway Cut asks to partition a graph $G$ into $k$ parts so as to separate $k$ given terminals. Recently, Chandrasekaran and Wang (ESA 2021) introduced $\ell_p$-norm Multiway, a generalization of the problem, in which the goal is to minimize the $\ell_p$ norm of the edge boundaries of $k$ parts. We provide an $O(\log^{1/2} n\log^{1/2+1/p} k)$ approximation algorithm for this problem, improving upon the approximation guarantee of $O(\log^{3/2} n \log^{1/2} k)$ due to Chandrasekaran and Wang. We also introduce and study Norm Multiway Cut, a further generalization of Multiway Cut. We assume that we are given access to an oracle, which answers certain queries about the norm. We present an $O(\log^{1/2} n \log^{7/2} k)$ approximation algorithm with a weaker oracle and an $O(\log^{1/2} n \log^{5/2} k)$ approximation algorithm with a stronger oracle. Additionally, we show that without any oracle access, there is no $n^{1/4-\varepsilon}$ approximation algorithm for every $\varepsilon > 0$ assuming the Hypergraph Dense-vs-Random Conjecture.

Submitted 16 August, 2023; originally announced August 2023.

Comments: 25 pages, ESA 2023

arXiv:2307.15128 [pdf, other]

End-to-end Remote Sensing Change Detection of Unregistered Bi-temporal Images for Natural Disasters

Authors: Guiqin Zhao, Lianlei Shan, Weiqiang Wang

Abstract: Change detection based on remote sensing images has been a prominent area of interest in the field of remote sensing. Deep networks have demonstrated significant success in detecting changes in bi-temporal remote sensing images and have found applications in various fields. Given the degradation of natural environments and the frequent occurrence of natural disasters, accurately and swiftly identifying damaged buildings in disaster-stricken areas through remote sensing images holds immense significance. This paper aims to investigate change detection specifically for natural disasters. Considering that existing public datasets used in change detection research are registered, which does not align with the practical scenario where bi-temporal images are not matched, this paper introduces an unregistered end-to-end change detection synthetic dataset called xBD-E2ECD. Furthermore, we propose an end-to-end change detection network named E2ECDNet, which takes an unregistered bi-temporal image pair as input and simultaneously generates the flow field prediction result and the change detection prediction result. It is worth noting that our E2ECDNet also supports change detection for registered image pairs, as registration can be seen as a special case of non-registration. Additionally, this paper redefines the criteria for correctly predicting a positive case and introduces neighborhood-based change detection evaluation metrics. The experimental results have demonstrated significant improvements.

Submitted 16 August, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

arXiv:2305.15033 [pdf, other]

SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models

Authors: Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Haichao Zhu, Jiafeng Liang, Liping Shan, Ming Liu, Dongliang Xu, Qing Yang, Bing Qin

Abstract: Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications. Moreover, the degree of redundancy in token representations and model parameters, such as attention heads, varies significantly for different inputs. In light of these challenges, we propose SmartTrim, an adaptive acceleration framework for VLMs, which adjusts the computational overhead per instance. Specifically, we integrate lightweight modules into the original backbone to identify and prune redundant token representations and attention heads within each layer. Furthermore, we devise a self-distillation strategy to enhance the consistency between the predictions of the pruned model and its full-capacity counterpart. Experimental results across various vision-language tasks consistently demonstrate that SmartTrim accelerates the original model by 2-3 times with minimal performance degradation, highlighting the effectiveness and efficiency compared to previous approaches. Code will be available at https://github.com/kugwzk/SmartTrim.

Submitted 26 February, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: COLING-LREC 2024

arXiv:2304.12584 [pdf, other]

Learning imaging mechanism directly from optical microscopy observations

Authors: Ze-Hao Wang, Long-Kun Shan, Tong-Tian Weng, Tian-Long Chen, Qi-Yu Wang, Xiang-Dong Chen, Zhang-Yang Wang, Guang-Can Guo, Fang-Wen Sun

Abstract: Optical microscopy imaging plays an important role in scientific research through the direct visualization of the nanoworld, where the imaging mechanism is described as the convolution of the point spread function (PSF) and emitters. Based on a priori knowledge of the PSF or equivalent PSF, it is possible to achieve more precise exploration of the nanoworld. However, it is an outstanding challenge to directly extract the PSF from microscopy images. Here, with the help of self-supervised learning, we propose a physics-informed masked autoencoder (PiMAE) that enables a learnable estimation of the PSF and emitters directly from the raw microscopy images. We demonstrate our method in synthetic data and real-world experiments with significant accuracy and noise robustness. PiMAE outperforms DeepSTORM and the Richardson-Lucy algorithm in synthetic data tasks with an average improvement of 19.6% and 50.7% (35 tasks), respectively, as measured by the normalized root mean square error (NRMSE) metric. This is achieved without prior knowledge of the PSF, in contrast to the supervised approach used by DeepSTORM and the known PSF assumption in the Richardson-Lucy algorithm. Our method, PiMAE, provides a feasible scheme for uncovering the hidden imaging mechanism in optical microscopy and has the potential to learn hidden mechanisms in many more systems.

Submitted 25 April, 2023; originally announced April 2023.

arXiv:2304.09113 [pdf, ps, other]

Random Cuts are Optimal for Explainable k-Medians

Authors: Konstantin Makarychev, Liren Shan

Abstract: We show that the RandomCoordinateCut algorithm gives the optimal competitive ratio for explainable k-medians in l1. The problem of explainable k-medians was introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian in 2020. Several groups of authors independently proposed a simple polynomial-time randomized algorithm for the problem and showed that this algorithm is O(log k loglog k) competitive. We provide a tight analysis of the algorithm and prove that its competitive ratio is upper bounded by 2 ln k + 2. This bound matches the Omega(log k) lower bound by Dasgupta et al. (2020).

Submitted 18 April, 2023; originally announced April 2023.

Comments: 14 pages, 2 figures

arXiv:2304.08053 [pdf, other]

doi 10.1609/aaai.v37i5.25716

Optimal Pricing Schemes for Identical Items with Time-Sensitive Buyers

Authors: Zhengyang Liu, Liang Shan, Zihe Wang

Abstract: Time or money? That is a question! In this paper, we consider this dilemma in the pricing regime, in which we try to find the optimal pricing scheme for identical items with heterogeneous time-sensitive buyers. We characterize the revenue-optimal solution and propose an efficient algorithm to find it in a Bayesian setting. Our results also demonstrate the tight ratio between the value of wasted time and the seller's revenue, as well as that of two commonly used pricing schemes, the k-step function and the fixed pricing. To explore the nature of the optimal scheme in the general setting, we present the closed forms over the product distribution and show by examples that positive correlation between the valuation of the item and the cost per unit time could help increase revenue. To the best of our knowledge, it is the first step towards understanding the impact of the time factor as a part of the buyer cost in pricing problems, in the computational view.

Submitted 17 April, 2023; originally announced April 2023.

Comments: 11 pages, 2 figures

arXiv:2211.06028 [pdf, ps, other]

Dynamic Curing and Network Design in SIS Epidemic Processes

Authors: Yuhao Yi, Liren Shan, Shijie Wang, Philip E. Paré, Karl H. Johansson

Abstract: This paper studies efficient algorithms for dynamic curing policies and the corresponding network design problems to guarantee the fast extinction of epidemic spread in a susceptible-infected-susceptible (SIS) model. We consider a Markov process-based SIS epidemic model. We provide a computationally efficient curing algorithm based on the curing policy proposed by Drakopoulos, Ozdaglar, and Tsitsiklis (2014). Since the corresponding optimization problem is NP-hard, finding optimal policies is intractable for large graphs. We provide approximation guarantees on the curing budget of the proposed dynamic curing algorithm. We also present a curing algorithm fair to demographic groups. When the total infection rate is high, the original curing policy includes a waiting period in which no measure is taken to mitigate the spread until the rate slows down. To avoid the waiting period, we study network design problems to reduce the total infection rate by deleting edges or reducing the weight of edges. Then the curing processes become continuous since the total infection rate is restricted by network design. We provide algorithms with provable guarantees for the considered network design problems. In summary, the proposed curing and network design algorithms together provide an effective and computationally efficient approach that mitigates SIS epidemic spread in networks.

Submitted 14 August, 2024; v1 submitted 11 November, 2022; originally announced November 2022.

Comments: 24 pages, 3 figures

arXiv:2211.03302 [pdf, ps, other]

Optimal Scoring Rules for Multi-dimensional Effort

Authors: Jason D. Hartline, Liren Shan, Yingkai Li, Yifan Wu

Abstract: This paper develops a framework for the design of scoring rules to optimally incentivize an agent to exert a multi-dimensional effort. This framework is a generalization to strategic agents of the classical knapsack problem (cf. Briest, Krysta, and Vöcking, 2005; Singer, 2010) and it is foundational to applying algorithmic mechanism design to the classroom. The paper identifies two simple families of scoring rules that guarantee constant approximations to the optimal scoring rule. The truncated separate scoring rule is the sum of single-dimensional scoring rules that is truncated to the bounded range of feasible scores. The threshold scoring rule gives the maximum score if reports exceed a threshold and zero otherwise. Approximate optimality of one or the other of these rules is similar to the bundling or selling separately result of Babaioff, Immorlica, Lucier, and Weinberg (2014). Finally, we show that the approximate optimality of the best of those two simple scoring rules is robust when the agent's choice of effort is made sequentially.

Submitted 29 June, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

arXiv:2209.02866 [pdf, ps, other]

Algorithmic Learning Foundations for Common Law

Authors: Jason D. Hartline, Daniel W. Linna Jr., Liren Shan, Alex Tang

Abstract: This paper looks at a common law legal system as a learning algorithm, models specific features of legal proceedings, and asks whether this system learns efficiently. A particular feature of our model is explicitly viewing various aspects of court proceedings as learning algorithms. This viewpoint enables directly pointing out that when the costs of going to court are not commensurate with the benefits of going to court, there is a failure of learning and inaccurate outcomes will persist in cases that settle. Specifically, cases are brought to court at an insufficient rate. On the other hand, when individuals can be compelled or incentivized to bring their cases to court, the system can learn and inaccuracy vanishes over time.

Submitted 8 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

arXiv:2208.06025 [pdf, other]

Automatic Sequences in Negative Bases and Proofs of Some Conjectures of Shevelev

Authors: Jeffrey Shallit, Sonja Linghui Shan, Kai Hsiang Yang

Abstract: We discuss the use of negative bases in automatic sequences. Recently the theorem-prover Walnut has been extended to allow the use of base (-k) to express variables, thus permitting quantification over Z instead of N. This enables us to prove results about two-sided (bi-infinite) automatic sequences. We first explain the theory behind negative bases in Walnut. Next, we use this new version of Walnut to give a very simple proof of a strengthened version of a theorem of Shevelev. We use our ideas to resolve two open problems of Shevelev from 2017. We also reprove a 2000 result of Shur involving bi-infinite binary words.

Submitted 11 August, 2022; originally announced August 2022.
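
As an illustration of why base (-k) lets a single digit string range over all of Z, the sketch below converts integers to and from a negative base using digits 0..k-1. It is a generic textbook conversion, not Walnut's implementation; the function names `to_negative_base` and `from_negative_base` are hypothetical and chosen only for this example.

```python
def to_negative_base(n: int, k: int = 2) -> list[int]:
    """Return the digits of n in base -k, least significant digit first.

    Every integer (positive, negative, or zero) has a representation
    with digits in 0..k-1 and no separate sign symbol.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        # Python's divmod with a negative divisor yields a remainder in (-k, 0].
        n, r = divmod(n, -k)
        if r < 0:
            # Shift the remainder into 0..k-1 and compensate in the quotient.
            n += 1
            r += k
        digits.append(r)
    return digits


def from_negative_base(digits: list[int], k: int = 2) -> int:
    """Evaluate a least-significant-first digit list in base -k."""
    return sum(d * (-k) ** i for i, d in enumerate(digits))
```

For instance, `to_negative_base(-1)` yields the digits of 1·(-2)^0 + 1·(-2)^1, so negative integers need no sign symbol.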

arXiv:2207.10171 [pdf, other]

doi 10.46298/dmtcs.9919

Pseudoperiodic Words and a Question of Shevelev

Authors: Joseph Meleshko, Pascal Ochem, Jeffrey Shallit, Sonja Linghui Shan

Abstract: We generalize the familiar notion of periodicity in sequences to a new kind of pseudoperiodicity, and we prove some basic results about it. We revisit the results of a 2012 paper of Shevelev and reprove his results in a simpler and more unified manner, and provide a complete answer to one of his previously unresolved questions. We consider finding words with a specific pseudoperiod and having the smallest possible critical exponent. Finally, we consider the problem of determining whether a finite word is pseudoperiodic of a given size, and show that it is NP-complete.

Submitted 12 October, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

Journal ref: Discrete Mathematics & Theoretical Computer Science, vol. 25:2, Automata, Logic and Semantics (October 16, 2023) dmtcs:9919

arXiv:2111.03193 [pdf, ps, other]

Explainable k-means. Don't be greedy, plant bigger trees!

Authors: Konstantin Makarychev, Liren Shan

Abstract: We provide a new bi-criteria $\tilde{O}(\log^2 k)$ competitive algorithm for explainable $k$-means clustering. Explainable $k$-means was recently introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). It is described by an easy to interpret and understand (threshold) decision tree or diagram. The cost of the explainable $k$-means clustering equals the sum of costs of its clusters; and the cost of each cluster equals the sum of squared distances from the points in the cluster to the center of that cluster. The best non bi-criteria algorithm for explainable clustering is $\tilde{O}(k)$ competitive, and this bound is tight. Our randomized bi-criteria algorithm constructs a threshold decision tree that partitions the data set into $(1+δ)k$ clusters (where $δ\in (0,1)$ is a parameter of the algorithm). The cost of this clustering is at most $\tilde{O}(1/ δ\cdot \log^2 k)$ times the cost of the optimal unconstrained $k$-means clustering. We show that this bound is almost optimal.

Submitted 27 April, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: 29 pages, 4 figures

arXiv:2110.11391 [pdf, other]

DEX: Domain Embedding Expansion for Generalized Person Re-identification

Authors: Eugene P. W. Ang, Lin Shan, Alex C. Kot

Abstract: In recent years, supervised Person Re-identification (Person ReID) approaches have demonstrated excellent performance. However, when these methods are applied to inputs from a different camera network, they typically suffer from significant performance degradation. Different from most domain adaptation (DA) approaches addressing this issue, we focus on developing a domain generalization (DG) Person ReID model that can be deployed without additional fine-tuning or adaptation. In this paper, we propose the Domain Embedding Expansion (DEX) module. DEX dynamically manipulates and augments deep features based on person and domain labels during training, significantly improving the generalization capability and robustness of Person ReID models to unseen domains. We also developed a light version of DEX (DEXLite), applying negative sampling techniques to scale to larger datasets and reduce memory usage for multi-branch networks. Our proposed DEX and DEXLite can be combined with many existing methods, including Bag-of-Tricks (BagTricks), the Multi-Granularity Network (MGN), and Part-Based Convolutional Baseline (PCB), in a plug-and-play manner. With DEX and DEXLite, existing methods can gain significant improvements when tested on other unseen datasets, thereby demonstrating the general applicability of our method. Our solution outperforms the state-of-the-art DG Person ReID methods in all large-scale benchmarks as well as in most of the small-scale benchmarks.

Submitted 21 October, 2021; originally announced October 2021.

Comments: Accepted into BMVC 2021

arXiv:2107.00798 [pdf, ps, other]

Near-optimal Algorithms for Explainable k-Medians and k-Means

Authors: Konstantin Makarychev, Liren Shan

Abstract: We consider the problem of explainable $k$-medians and $k$-means introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). In this problem, our goal is to find a threshold decision tree that partitions data into $k$ clusters and minimizes the $k$-medians or $k$-means objective. The obtained clustering is easy to interpret because every decision node of a threshold tree splits data based on a single feature into two groups. We propose a new algorithm for this problem which is $\tilde O(\log k)$ competitive with $k$-medians with $\ell_1$ norm and $\tilde O(k)$ competitive with $k$-means. This is an improvement over the previous guarantees of $O(k)$ and $O(k^2)$ by Dasgupta et al. (2020). We also provide a new algorithm which is $O(\log^{3/2} k)$ competitive for $k$-medians with $\ell_2$ norm. Our first algorithm is near-optimal: Dasgupta et al. (2020) showed a lower bound of $Ω(\log k)$ for $k$-medians; in this work, we prove a lower bound of $\tilde{Ω}(k)$ for $k$-means. We also provide a lower bound of $Ω(\log k)$ for $k$-medians with $\ell_2$ norm.

Submitted 2 August, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

Comments: 29 pages, 4 figures, ICML 2021

arXiv:2012.03734 [pdf, other]

Sequential Resource Access: Theory and Algorithm

Authors: Lin Chen, Anastasios Giovanidis, Wei Wang, Lin Shan

Abstract: We formulate and analyze a generic sequential resource access problem arising in a variety of engineering fields, where a user has at her disposal a number of heterogeneous computing, communication, or storage resources, each characterized by the probability of successfully executing the user's task and the related access delay and cost, and seeks an optimal access strategy to maximize her utility within a given time horizon, defined as the expected reward minus the access cost. We develop an algorithmic framework on the (near-)optimal sequential resource access strategy. We first prove that the problem of finding an optimal strategy is NP-hard in general. Given the hardness result, we present a greedy strategy implementable in linear time, and establish the closed-form sufficient condition for its optimality. We then develop a series of polynomial-time approximation algorithms achieving $(ε,δ)$-optimality, with the key component being a pruning process eliminating dominated strategies and thus maintaining polynomial time and space overhead.

Submitted 7 December, 2020; originally announced December 2020.

Comments: 10 pages double column, accepted paper at IEEE INFOCOM 2021. This is the author-submitted version

arXiv:2011.11087 [pdf, ps, other]

Edge Deletion Algorithms for Minimizing Spread in SIR Epidemic Models

Authors: Yuhao Yi, Liren Shan, Philip E. Paré, Karl H. Johansson

Abstract: This paper studies algorithmic strategies to effectively reduce the number of infections in susceptible-infected-recovered (SIR) epidemic models. We consider a Markov chain SIR model and its two instantiations in the deterministic SIR (D-SIR) model and the independent cascade SIR (IC-SIR) model. We investigate the problem of minimizing the number of infections by restricting contacts under realistic constraints. Under moderate assumptions on the reproduction number, we prove that the infection numbers are bounded by supermodular functions in the D-SIR model and the IC-SIR model for large classes of random networks. We propose efficient algorithms with approximation guarantees to minimize infections. The theoretical results are illustrated by numerical simulations.

Submitted 22 November, 2020; originally announced November 2020.

Comments: 29 pages, 4 figures

arXiv:2010.14487 [pdf, ps, other]

Improved Guarantees for k-means++ and k-means++ Parallel

Authors: Konstantin Makarychev, Aravind Reddy, Liren Shan

Abstract: In this paper, we study k-means++ and k-means++ parallel, the two most popular algorithms for the classic k-means clustering problem. We provide novel analyses and show improved approximation and bi-criteria approximation guarantees for k-means++ and k-means++ parallel. Our results give a better theoretical justification for why these algorithms perform extremely well in practice. We also propose a new variant of k-means++ parallel algorithm (Exponential Race k-means++) that has the same approximation guarantees as k-means++.

Submitted 27 October, 2020; originally announced October 2020.

Journal ref: NeurIPS 2020
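
For readers unfamiliar with the algorithm being analyzed, the following is a minimal sketch of the standard k-means++ seeding step (D² sampling): the first center is chosen uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. This is the classical seeding procedure only, not the paper's analysis, the k-means++ parallel algorithm, or the Exponential Race variant; the names `sq_dist` and `kmeanspp_seed` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    # First center: uniform over the data set.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample proportionally to squared distance to the nearest center.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return centers
```

The returned centers would normally be passed to Lloyd's iterations; the seeding step alone is what the approximation guarantees for k-means++ refer to.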

arXiv:2007.02905 [pdf, ps, other]

Optimization of Scoring Rules

Authors: Jason D. Hartline, Yingkai Li, Liren Shan, Yifan Wu

Abstract: This paper introduces an objective for optimizing proper scoring rules. The objective is to maximize the increase in payoff of a forecaster who exerts a binary level of effort to refine a posterior belief from a prior belief. In this framework we characterize optimal scoring rules in simple settings, give efficient algorithms for computing optimal scoring rules in complex settings, and identify simple scoring rules that are approximately optimal. In comparison, standard scoring rules in theory and practice -- for example the quadratic rule, scoring rules for the expectation, and scoring rules for multiple tasks that are averages of single-task scoring rules -- can be very far from optimal.

Submitted 17 April, 2022; v1 submitted 6 July, 2020; originally announced July 2020.

arXiv:2004.08732 [pdf, other]

doi 10.46298/dmtcs.6773

On the existence and non-existence of improper homomorphisms of oriented and $2$-edge-coloured graphs to reflexive targets

Authors: Christopher Duffy, Sonja Linghui Shan

Abstract: We consider non-trivial homomorphisms to reflexive oriented graphs in which some pair of adjacent vertices have the same image. Using a notion of convexity for oriented graphs, we study those oriented graphs that do not admit such homomorphisms. We fully classify those oriented graphs with tree-width $2$ that do not admit such homomorphisms and show that it is NP-complete to decide if a graph admits an orientation that does not admit such homomorphisms. We prove analogous results for $2$-edge-coloured graphs. We apply our results on oriented graphs to provide a new tool in the study of chromatic number of orientations of planar graphs -- a long-standing open problem.

Submitted 18 March, 2021; v1 submitted 18 April, 2020; originally announced April 2020.

MSC Class: 05C60

Journal ref: Discrete Mathematics & Theoretical Computer Science, vol. 23 no. 1, Graph Theory (March 29, 2021) dmtcs:6773

arXiv:1909.02109 [pdf, ps, other]

Stochastic Linear Optimization with Adversarial Corruption

Authors: Yingkai Li, Edmund Y. Lou, Liren Shan

Abstract: We extend the model of stochastic bandits with adversarial corruption (Lykouris et al., 2018) to the stochastic linear optimization problem (Dani et al., 2008). Our algorithm is agnostic to the amount of corruption chosen by the adaptive adversary. The regret of the algorithm only increases linearly in the amount of corruption. Our algorithm involves using Löwner-John's ellipsoid for exploration and dividing the time horizon into epochs with exponentially increasing size to limit the influence of corruption.

Submitted 4 September, 2019; originally announced September 2019.

arXiv:1804.06540 [pdf, ps, other]

Improving information centrality of a node in complex networks by adding edges

Authors: Liren Shan, Yuhao Yi, Zhongzhi Zhang

Abstract: The problem of increasing the centrality of a network node arises in many practical applications. In this paper, we study the optimization problem of maximizing the information centrality $I_v$ of a given node $v$ in a network with $n$ nodes and $m$ edges, by creating $k$ new edges incident to $v$. Since $I_v$ is the reciprocal of the sum of resistance distance $\mathcal{R}_v$ between $v$ and all nodes, we alternatively consider the problem of minimizing $\mathcal{R}_v$ by adding $k$ new edges linked to $v$. We show that the objective function is monotone and supermodular. We provide a simple greedy algorithm with an approximation factor $\left(1-\frac{1}{e}\right)$ and $O(n^3)$ running time. To speed up the computation, we also present an algorithm to compute $\left(1-\frac{1}{e}-ε\right)$-approximate resistance distance $\mathcal{R}_v$ after iteratively adding $k$ edges, the running time of which is $\widetilde{O} (mkε^{-2})$ for any $ε>0$, where the $\widetilde{O} (\cdot)$ notation suppresses the ${\rm poly} (\log n)$ factors. We experimentally demonstrate the effectiveness and efficiency of our proposed algorithms.

Submitted 17 April, 2018; originally announced April 2018.

Comments: 7 pages, 2 figures, ijcai-2018

arXiv:1803.00829 [pdf, ps, other]

Independence number and the number of maximum independent sets in pseudofractal scale-free web and Sierpiński gasket

Authors: Liren Shan, Huan Li, Zhongzhi Zhang

Abstract: As a fundamental subject of theoretical computer science, the maximum independent set (MIS) problem not only is of purely theoretical interest, but also has found wide applications in various fields. However, for a general graph determining the size of a MIS is NP-hard, and exact computation of the number of all MISs is even more difficult. It is thus of significant interest to seek special graphs for which the MIS problem can be exactly solved. In this paper, we address the MIS problem in the pseudofractal scale-free web and the Sierpiński gasket, which have the same number of vertices and edges. For both graphs, we determine exactly the independence number and the number of all possible MISs. The independence number of the pseudofractal scale-free web is twice that of the Sierpiński gasket. Moreover, the pseudofractal scale-free web has a unique MIS, while the number of MISs in the Sierpiński gasket grows exponentially with the number of vertices.

Submitted 2 March, 2018; originally announced March 2018.

Comments: 15 pages, 10 figures

arXiv:1802.02556 [pdf, ps, other]

doi 10.1145/3308558.3313490

Current Flow Group Closeness Centrality for Complex Networks

Authors: Huan Li, Richard Peng, Liren Shan, Yuhao Yi, Zhongzhi Zhang

Abstract: Current flow closeness centrality (CFCC) has a better discriminating ability than the ordinary closeness centrality based on shortest paths. In this paper, we extend this notion to a group of vertices in a weighted graph, and then study the problem of finding a subset $S$ of $k$ vertices to maximize its CFCC $C(S)$, both theoretically and experimentally. We show that the problem is NP-hard, but propose two greedy algorithms for minimizing the reciprocal of $C(S)$ with provable guarantees using the monotonicity and supermodularity. The first is a deterministic algorithm with an approximation factor $(1-\frac{k}{k-1}\cdot\frac{1}{e})$ and cubic running time; while the second is a randomized algorithm with a $(1-\frac{k}{k-1}\cdot\frac{1}{e}-ε)$-approximation and nearly-linear running time for any $ε> 0$. Extensive experiments on model and real networks demonstrate that our algorithms are effective and efficient, with the second algorithm being scalable to massive networks with more than a million vertices.

Submitted 11 February, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

Comments: 31 pages, 4 figures

Journal ref: WWW'2019

arXiv:1703.09023 [pdf, ps, other]

doi 10.1016/j.tcs.2017.03.009

Domination number and minimum dominating sets in pseudofractal scale-free web and Sierpiński graph

Authors: Liren Shan, Huan Li, Zhongzhi Zhang

Abstract: The minimum dominating set (MDS) problem is a fundamental subject of theoretical computer science, and has found vast applications in different areas, including sensor networks, protein interaction networks, and structural controllability. However, the determination of the size of a MDS and the number of all MDSs in a general network is NP-hard, and it thus makes sense to seek particular graphs for which the MDS problem can be solved analytically. In this paper, we study the MDS problem in the pseudofractal scale-free web and the Sierpiński graph, which have the same number of vertices and edges. For both networks, we determine explicitly the domination number, as well as the number of distinct MDSs. We show that the pseudofractal scale-free web has a unique MDS, and its domination number is only half of that for the Sierpiński graph, which has many MDSs. We argue that the scale-free topology is responsible for the difference of the size and number of MDSs between the two studied graphs, which in turn indicates that power-law degree distribution plays an important role in the MDS problem and its applications in scale-free networks.

Submitted 27 March, 2017; originally announced March 2017.

arXiv:1407.6085 [pdf]

Block Bayesian Sparse Learning Algorithms With Application to Estimating Channels in OFDM Systems

Authors: Guan Gui, Li Xu, Lin Shan

Abstract: Cluster-sparse channels often exist in frequency-selective fading broadband communication systems. The main reason is that the received scattered waveform exhibits a cluster structure caused by a few reflectors near the receiver. Conventional sparse channel estimation methods have been proposed for the general sparse channel model without considering the potential cluster-sparse structure information. In this paper, we investigate the cluster-sparse channel estimation (CS-CE) problem in state-of-the-art orthogonal frequency-division multiplexing (OFDM) systems. Novel Bayesian cluster-sparse channel estimation (BCS-CE) methods are proposed to exploit the cluster-sparse structure by using the block sparse Bayesian learning (BSBL) algorithm. The proposed methods take advantage of the cluster correlation in the training matrix so that they can improve estimation performance. In addition, unlike our previous method, which uses uniform block partition information, the proposed methods work well when the prior block partition information of the channels is unknown. Computer simulations show that the proposed methods achieve superior performance compared with the previous methods.

Submitted 22 July, 2014; originally announced July 2014.

Comments: 5 pages, 6 figures, will be presented in WPMC2014@Sydney, Australia

arXiv:1407.6081 [pdf]

Adaptive MIMO Channel Estimation using Sparse Variable Step-Size NLMS Algorithms

Authors: Guan Gui, Li Xu, Lin Shan, Fumiyuki Adachi

Abstract: To estimate multiple-input multiple-output (MIMO) channels, the invariable step-size normalized least mean square (ISS-NLMS) algorithm was applied to adaptive channel estimation (ACE). Since the MIMO channel is often described by a sparse channel model due to broadband signal transmission, such sparsity can be exploited by adaptive sparse channel estimation (ASCE) methods using sparse ISS-NLMS algorithms. It is well known that the step-size is a critical parameter which controls three aspects: algorithm stability, estimation performance and computational cost. The previous approaches can exploit channel sparsity, but their step-sizes are kept invariant, which cannot balance the three aspects well and easily causes either estimation performance loss or instability. In this paper, we propose two stable sparse variable step-size NLMS (VSS-NLMS) algorithms to improve the accuracy of MIMO channel estimators. First, ASCE for estimating MIMO channels is formulated in MIMO systems. Second, different sparse penalties are introduced to the VSS-NLMS algorithm for ASCE. In addition, the differences between sparse ISS-NLMS algorithms and sparse VSS-NLMS ones are explained. Finally, to verify the effectiveness of the proposed algorithms for ASCE, several selected simulation results show that the proposed sparse VSS-NLMS algorithms can achieve better estimation performance than the conventional methods in terms of mean square error (MSE) and bit error rate (BER) metrics.

Submitted 22 July, 2014; originally announced July 2014.

Comments: 5 pages, 6 figures, submitted for ICCS2014@Macau. arXiv admin note: substantial text overlap with arXiv:1311.1314

arXiv:1403.0192 [pdf]

Compressive sensing based Bayesian sparse channel estimation for OFDM communication systems: high performance and low complexity

Authors: Guan Gui, Li Xu, Lin Shan, Fumiyuki Adachi

Abstract: In orthogonal frequency-division multiplexing (OFDM) communication systems, channel state information (CSI) is required at the receiver because the frequency-selective fading channel leads to severe inter-symbol interference (ISI) over data transmission. The broadband channel model is often described by very few dominant channel taps, and they can be probed by compressive sensing based sparse channel estimation (SCE) methods, e.g., the orthogonal matching pursuit algorithm, which can effectively take advantage of the sparse structure in the channel as prior information. However, these developed methods are vulnerable to both noise interference and column coherence of the training signal matrix. In other words, the primary objective of these conventional methods is to catch the dominant channel taps without a report of posterior channel uncertainty. To improve the estimation performance, we propose a compressive sensing based Bayesian sparse channel estimation (BSCE) method which can not only exploit the channel sparsity but also mitigate the unexpected channel uncertainty without sacrificing computational complexity. The proposed method can reveal potential ambiguity among multiple channel estimators that are ambiguous due to observation noise or correlation interference among columns in the training matrix. Computer simulations show that the proposed method can improve the estimation performance compared with conventional SCE methods.

Submitted 20 April, 2015; v1 submitted 2 March, 2014; originally announced March 2014.

Comments: 24 pages, 16 figures, submitted for a journal

Showing 1–40 of 40 results for author: Shan, L