-
CHECKWHY: Causal Fact Verification via Argument Structure
Authors:
Jiasheng Si,
Yibo Zhao,
Yingjie Zhu,
Haiyang Zhu,
Wenpeng Lu,
Deyu Zhou
Abstract:
With the growing complexity of fact verification tasks, the concern with "thoughtful" reasoning capabilities is increasing. However, recent fact verification benchmarks mainly focus on checking a narrow scope of semantic factoids within claims and lack an explicit logical reasoning process. In this paper, we introduce CheckWhy, a challenging dataset tailored to a novel causal fact verification task: checking the truthfulness of the causal relation within claims through rigorous reasoning steps. CheckWhy consists of over 19K "why" claim-evidence-argument structure triplets with supports, refutes, and not enough info labels. Each argument structure is composed of connected evidence, representing the reasoning process that begins with foundational evidence and progresses toward claim establishment. Through extensive experiments on state-of-the-art models, we validate the importance of incorporating the argument structure for causal fact verification. Moreover, the automated and human evaluation of argument structure generation reveals the difficulty in producing satisfying argument structure by fine-tuned models or Chain-of-Thought prompted LLMs, leaving considerable room for future improvements.
Submitted 20 August, 2024;
originally announced August 2024.
-
Formation of WNL stars for the MW and LMC based on the k-omega model
Authors:
Jijuan Si,
Zhi Li,
Yan Li
Abstract:
We adopt a set of second-order differential equations ($k-ω$ model) to handle core convective overshooting in massive stars, simulate the evolution of WNL stars with different metallicities and initial masses, both rotating and non-rotating models, and compare the results with the classical overshooting model. The results indicate that under the same initial conditions, the $k-ω$ model generally produces larger convective cores and wider overshooting regions, thereby increasing the mass ranges and extending the lifetimes of WNL stars, as well as the likelihood of forming WNL stars. The masses and lifetimes of WNL stars both increase with higher metallicities and initial masses. Under higher-metallicity conditions, the two overshooting schemes differ significantly in their impact on the lifetimes of WNL stars, but not on their mass ranges. Rotation may drive the formation of WNL stars in low-mass, metal-poor counterparts, with this effect being more pronounced in the OV model. The surface nitrogen of metal-rich WNL stars formed during the MS phase is likely primarily from the CN-cycle, while it may come from both the CN- and NO-cycles for relatively metal-poor counterparts. Our model can effectively explain the distribution of WNL stars in the Milky Way, but appears to have inadequacies in explaining the WNL stars in the LMC.
Submitted 7 August, 2024;
originally announced August 2024.
-
Movable Frequency Diverse Array for Wireless Communication Security
Authors:
Zihao Cheng,
Jiangbo Si,
Zan Li,
Pengpeng Liu,
Yangchao Huang,
Naofal Al-Dhahir
Abstract:
Frequency diverse array (FDA) is a promising antenna technology to achieve physical layer security by varying the frequency of each antenna at the transmitter. However, when the channels of the legitimate user and eavesdropper are highly correlated, FDA is limited by the frequency constraint and cannot provide satisfactory security performance. In this paper, we propose a novel movable FDA (MFDA) antenna technology where the positions of antennas can be dynamically adjusted in a given finite region. Specifically, we aim to maximize the secrecy capacity by jointly optimizing the antenna beamforming vector, antenna frequency vector and antenna position vector. To solve this non-convex optimization problem with coupled variables, we develop a two-stage alternating optimization (AO) algorithm based on block successive upper-bound minimization (BSUM) method. Moreover, to evaluate the security performance provided by MFDA, we introduce two benchmark schemes, i.e., phased array (PA) and FDA. Simulation results demonstrate that MFDA can significantly enhance security performance compared to PA and FDA. In particular, when the frequency constraint is strict, MFDA can further increase the secrecy capacity by adjusting the positions of antennas instead of the frequencies.
Submitted 25 July, 2024;
originally announced July 2024.
-
Movable Frequency Diverse Array-Assisted Covert Communication With Multiple Wardens
Authors:
Zihao Cheng,
Jiangbo Si,
Zan Li,
Pengpeng Liu,
Xiaoting Wang,
Naofal Al-Dhahir
Abstract:
The frequency diverse array (FDA) is highly promising for improving covert communication performance by adjusting the frequency of each antenna at the transmitter. However, when faced with the cases of multiple wardens and highly correlated channels, FDA is limited by the frequency constraint and cannot provide satisfactory covert performance. In this paper, we propose a novel movable FDA (MFDA) antenna technology where positions of the antennas can be dynamically adjusted in a given finite region. Specifically, we aim to maximize the covert rate by jointly optimizing the antenna beamforming vector, antenna frequency vector and antenna position vector. To solve this non-convex optimization problem with coupled variables, we develop a two-stage alternating optimization (AO) algorithm based on the block successive upper-bound minimization (BSUM) method. Moreover, considering the challenge of obtaining perfect channel state information (CSI) at multiple wardens, we study the case of imperfect CSI. Simulation results demonstrate that MFDA can significantly enhance covert performance compared to the conventional FDA. In particular, when the frequency constraint is strict, MFDA can further increase the covert rate by adjusting the positions of antennas instead of the frequencies.
Submitted 25 July, 2024;
originally announced July 2024.
-
Two-Timescale Design for Movable Antenna Array-Enabled Multiuser Uplink Communications
Authors:
Guojie Hu,
Qingqing Wu,
Donghui Xu,
Kui Xu,
Jiangbo Si,
Yunlong Cai,
Naofal Al-Dhahir
Abstract:
Movable antenna (MA) technology can flexibly reconfigure wireless channels by adjusting antenna positions in a local region, thus showing great potential for enhancing communication performance. This letter investigates MA technology-enabled multiuser uplink communications over general Rician fading channels, which consist of a base station (BS) equipped with the MA array and multiple single-antenna users. Since it is practically challenging to collect all instantaneous channel state information (CSI) by traversing all possible antenna positions at the BS, we instead propose a two-timescale scheme for maximizing the ergodic sum rate. Specifically, antenna positions at the BS are first optimized using only the statistical CSI. Subsequently, the receiving beamforming at the BS (for which we consider the three typical zero-forcing (ZF), minimum mean-square error (MMSE) and MMSE with successive interference cancellation (MMSE-SIC) receivers) is designed based on the instantaneous CSI with optimized antenna positions, thus significantly reducing practical implementation complexities. The formulated problems are highly non-convex and we develop projected gradient ascent (PGA) algorithms to effectively handle them. Simulation results illustrate that compared to a conventional fixed-position antenna (FPA) array, the MA array can achieve significant performance gains by reaping an additional spatial degree of freedom.
Submitted 25 July, 2024;
originally announced July 2024.
-
Chiral emission of vortex microlasers enabled by collective modes of guided resonances
Authors:
Ye Chen,
Mingjin Wang,
Jiahao Si,
Zixuan Zhang,
Xuefan Yin,
Jingxuan Chen,
NianYuan Lv,
Chenyan Tang,
Wanhua Zheng,
Yuri Kivshar,
Chao Peng
Abstract:
Vortex lasers have attracted substantial attention in recent years owing to their wide array of applications such as micromanipulation, optical multiplexing, and quantum cryptography. In this work, we propose and demonstrate chiral emission of vortex microlaser leveraging the collective modes from omnidirectionally hybridizing the guided mode resonances (GMRs) within photonic crystal (PhC) slabs. Specifically, we encircle a central uniform PhC with a heterogeneous PhC that features a circular lateral boundary. Consequently, the bulk GMRs hybridize into a series of collective modes due to boundary scatterings, resulting in a vortex pattern in real space with a spiral phase front in its radiation. Benefiting from the long lifetime of GMRs as quasi-bound state in the continuum and using asymmetric pumping to lift the chiral symmetry, we demonstrate stable single-mode lasing oscillation with a low optical pumping threshold of $18~\mathrm{kW/cm^2}$ at room temperature. We identify the real-space vortex through polarization-resolved imaging and self-interference patterns, showing a vivid example of applying collective modes to realize compact and energy-efficient vortex microlasers.
Submitted 23 July, 2024;
originally announced July 2024.
-
Doping-tunable Fermi surface with persistent topological Hall effect in axion candidate EuIn$_2$As$_2$
Authors:
Jian Yan,
Jianguo Si,
Zhongzhu Jiang,
Hanming Ma,
Yoshiya Uwatoko,
Bao-Tian Wang,
Xuan Luo,
Yuping Sun,
Minoru Yamashita
Abstract:
Rare-earth Zintl compound EuIn$_2$As$_2$ has been theoretically recognized as a candidate for realizing an intrinsic antiferromagnetic (AFM) bulk axion insulator and a higher-order topological state, which provides a fertile platform to explore novel topological transport phenomena. However, the axion state has yet to be realized because EuIn$_2$As$_2$ is highly hole-doped. Here, we synthesized a series of high-quality Ca-doped EuIn$_2$As$_2$ (Ca$_x$Eu$_{1-x}$In$_2$As$_2$, x = 0 ~ 0.25) single crystals to tune the Fermi energy above the hole pocket. Our Hall measurements reveal that the isovalent Ca substitution decreases the hole carrier density by shrinking the lattice spacing, which is also confirmed by our first-principles calculations. We further find that both the temperature dependence of the magnetic susceptibility with a local maximum at the Néel temperature and the topological Hall effect originating from the finite real-space spin chirality persist in the Ca-doped samples as observed in the pristine EuIn$_2$As$_2$, even though the nonmagnetic Ca substitution decreases the effective moment and the Néel temperature. These results show that the Ca substitution tunes the Fermi energy while keeping the AFM magnetic structure, suggesting that the axion insulating state may be realized by further Ca substitution.
Submitted 17 June, 2024;
originally announced June 2024.
-
Self-Supervised Vision Transformer for Enhanced Virtual Clothes Try-On
Authors:
Lingxiao Lu,
Shengyi Wu,
Haoxuan Sun,
Junhong Gou,
Jianlou Si,
Chen Qian,
Jianfu Zhang,
Liqing Zhang
Abstract:
Virtual clothes try-on has emerged as a vital feature in online shopping, offering consumers a critical tool to visualize how clothing fits. In our research, we introduce an innovative approach for virtual clothes try-on, utilizing a self-supervised Vision Transformer (ViT) coupled with a diffusion model. Our method emphasizes detail enhancement by contrasting local clothing image embeddings, generated by ViT, with their global counterparts. Techniques such as conditional guidance and focus on key regions have been integrated into our approach. These combined strategies empower the diffusion model to reproduce clothing details with increased clarity and realism. The experimental results showcase substantial advancements in the realism and precision of details in virtual try-on experiences, significantly surpassing the capabilities of existing technologies.
Submitted 15 June, 2024;
originally announced June 2024.
-
CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding
Authors:
Libo Qin,
Fuxuan Wei,
Qiguang Chen,
Jingxuan Zhou,
Shijue Huang,
Jiasheng Si,
Wenpeng Lu,
Wanxiang Che
Abstract:
Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this problem, we present the pioneering work of Cross-task Interactive Prompting (CroPrompt) for SLU, which enables the model to interactively leverage the information exchange across the correlated tasks in SLU. Additionally, we further introduce a multi-task self-consistency mechanism to mitigate the error propagation caused by the intent information injection. We conduct extensive experiments on the standard SLU benchmark and the results reveal that CroPrompt consistently outperforms the existing prompting approaches. In addition, the multi-task self-consistency mechanism can effectively ease the error propagation issue, thereby enhancing the performance. We hope this work can inspire more research on cross-task prompting for SLU.
Submitted 15 June, 2024;
originally announced June 2024.
-
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Authors:
Jacob Si,
Wendy Yusi Cheng,
Michael Cooper,
Rahul G. Krishnan
Abstract:
Tabular data are omnipresent in various sectors of industries. Neural networks for tabular data such as TabNet have been proposed to make predictions while leveraging the attention mechanism for interpretability. However, the inferred attention masks are often dense, making it challenging to come up with rationales about the predictive signal. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL Divergence regularizer. It prevents overlapping feature selection by promoting sparsity which maximizes the model's efficacy and improves interpretability to determine the important features when predicting the outcome. To assist in the interpretation of feature interdependencies from our model, we employ a large language model (GPT-4) and use prompt engineering to map from the learned feature mask onto natural language text describing the learned signal. Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy.
Submitted 11 June, 2024; v1 submitted 1 June, 2024;
originally announced June 2024.
-
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models
Authors:
Wenqi Ouyang,
Yi Dong,
Lei Yang,
Jianlou Si,
Xingang Pan
Abstract:
The remarkable generative capabilities of diffusion models have motivated extensive research in both image and video editing. Compared to video editing which faces additional challenges in the time dimension, image editing has witnessed the development of more diverse, high-quality approaches and more capable software like Photoshop. In light of this gap, we introduce a novel and generic solution that extends the applicability of image editing tools to videos by propagating edits from a single frame to the entire video using a pre-trained image-to-video model. Our method, dubbed I2VEdit, adaptively preserves the visual and motion integrity of the source video depending on the extent of the edits, effectively handling global edits, local edits, and moderate shape changes, which existing methods cannot fully achieve. At the core of our method are two main processes: Coarse Motion Extraction to align basic motion patterns with the original video, and Appearance Refinement for precise adjustments using fine-grained attention matching. We also incorporate a skip-interval strategy to mitigate quality degradation from auto-regressive generation across multiple video clips. Experimental results demonstrate our framework's superior performance in fine-grained video editing, proving its capability to produce high-quality, temporally consistent outputs.
Submitted 26 May, 2024;
originally announced May 2024.
-
Movable Antennas-Enabled Two-User Multicasting: Do We Really Need Alternating Optimization for Minimum Rate Maximization?
Authors:
Guojie Hu,
Qingqing Wu,
Donghui Xu,
Kui Xu,
Jiangbo Si,
Yunlong Cai,
Naofal Al-Dhahir
Abstract:
Movable antenna (MA) technology, which can reconfigure wireless channels by flexibly moving antenna positions in a specified region, has great potential for improving communication performance. In this paper, we consider a new setup of MAs-enabled multicasting, where we adopt a simple setting in which a linear MA array-enabled source (${\rm{S}}$) transmits a common message to two single-antenna users ${\rm{U}}_1$ and ${\rm{U}}_2$. We aim to maximize the minimum rate among these two users, by jointly optimizing the transmit beamforming and antenna positions at ${\rm{S}}$. Instead of utilizing the widely-used alternating optimization (AO) approach, we reveal, with rigorous proof, that the above two variables can be optimized separately: i) the optimal antenna positions can be firstly determined via the successive convex approximation technique, based on the rule of maximizing the correlation between ${\rm{S}}$-${\rm{U}}_1$ and ${\rm{S}}$-${\rm{U}}_2$ channels; ii) afterwards, the optimal closed-form transmit beamforming can be derived via simple arguments. Compared to AO, this new approach yields the same performance but reduces the computational complexities significantly. Moreover, it can provide insightful conclusions which are not possible with AO.
Submitted 7 May, 2024;
originally announced May 2024.
-
Actor-Critic Reinforcement Learning with Phased Actor
Authors:
Ruofan Wu,
Junmin Zhong,
Jennie Si
Abstract:
Policy gradient methods in actor-critic reinforcement learning (RL) have become perhaps the most promising approaches to solving continuous optimal control problems. However, the trial-and-error nature of RL and the inherent randomness associated with solution approximations cause variations in the learned optimal values and policies. This has significantly hindered their successful deployment in real life applications where control responses need to meet dynamic performance criteria deterministically. Here we propose a novel phased actor in actor-critic (PAAC) method, aiming at improving policy gradient estimation and thus the quality of the control policy. Specifically, PAAC accounts for both $Q$ value and TD error in its actor update. We prove qualitative properties of PAAC for learning convergence of the value and policy, solution optimality, and stability of system dynamics. Additionally, we show variance reduction in policy gradient estimation. PAAC performance is systematically and quantitatively evaluated in this study using DeepMind Control Suite (DMC). Results show that PAAC leads to significant performance improvement measured by total cost, learning variance, robustness, learning speed and success rate. As PAAC can be piggybacked onto general policy gradient learning frameworks, we select well-known methods such as direct heuristic dynamic programming (dHDP), deep deterministic policy gradient (DDPG) and their variants to demonstrate the effectiveness of PAAC. Consequently we provide a unified view on these related policy gradient algorithms.
Submitted 17 April, 2024;
originally announced April 2024.
-
Towards Effective Next POI Prediction: Spatial and Semantic Augmentation with Remote Sensing Data
Authors:
Nan Jiang,
Haitao Yuan,
Jianing Si,
Minxiao Chen,
Shangguang Wang
Abstract:
The next point-of-interest (POI) prediction is a significant task in location-based services, yet its complexity arises from the consolidation of spatial and semantic intent. This fusion is subject to the influences of historical preferences, prevailing location, and environmental factors, thereby posing significant challenges. In addition, the uneven POI distribution further complicates the next POI prediction procedure. To address these challenges, we enrich input features and propose an effective deep-learning method within a two-step prediction framework. Our method first incorporates remote sensing data, capturing pivotal environmental context to enhance input features regarding both location and semantics. Subsequently, we employ a region quad-tree structure to integrate urban remote sensing, road network, and POI distribution spaces, aiming to devise a more coherent graph representation method for urban space. Leveraging this method, we construct the QR-P graph for the user's historical trajectories to encapsulate historical travel knowledge, thereby augmenting input features with comprehensive spatial and semantic insights. We devise distinct embedding modules to encode these features and employ an attention mechanism to fuse diverse encodings. In the two-step prediction procedure, we initially identify potential spatial zones by predicting user-preferred tiles, followed by pinpointing specific POIs of a designated type within the projected tiles. Empirical findings from four real-world location-based social network datasets underscore the remarkable superiority of our proposed approach over competitive baseline methods.
Submitted 22 March, 2024;
originally announced April 2024.
-
Movable Antennas-Assisted Secure Transmission Without Eavesdroppers' Instantaneous CSI
Authors:
Guojie Hu,
Qingqing Wu,
Donghui Xu,
Kui Xu,
Jiangbo Si,
Yunlong Cai,
Naofal Al-Dhahir
Abstract:
Movable antenna (MA) technology is highly promising for improving communication performance, due to its advantage of flexibly adjusting positions of antennas to reconfigure channel conditions. In this paper, we investigate MAs-assisted secure transmission under a legitimate transmitter Alice, a legitimate receiver Bob and multiple eavesdroppers. Specifically, we consider a practical scenario where Alice has no knowledge of the instantaneous non-line-of-sight component of the wiretap channel. Under this setup, we evaluate the secrecy performance by adopting the secrecy outage probability metric, the tight approximation of which is first derived by interpreting the Rician fading as a special case of Nakagami fading and concurrently exploiting the Laguerre series approximation. Then, we minimize the secrecy outage probability by jointly optimizing the transmit beamforming and positions of antennas at Alice. However, the problem is highly non-convex because the objective includes the complex incomplete gamma function. To tackle this challenge, we, for the first time, effectively approximate the inverse of the incomplete gamma function as a simple linear model. Based on this approximation, we arrive at a simplified problem with a clear structure, which can be solved via the developed alternating projected gradient ascent (APGA) algorithm. Considering the high complexity of the APGA, we further design another scheme where the zero-forcing based beamforming is adopted by Alice, and then we transform the problem into minimizing a simple function which is only related to positions of antennas at Alice. As demonstrated by simulations, our proposed schemes achieve significant performance gains compared to conventional schemes based on fixed-position antennas.
Submitted 4 April, 2024;
originally announced April 2024.
-
Reliable Conflictive Multi-View Learning
Authors:
Cai Xu,
Jiajun Si,
Ziyu Guan,
Wei Zhao,
Yue Wu,
Xiyue Gao
Abstract:
Multi-view learning aims to combine multiple features to achieve more comprehensive descriptions of data. Most previous works assume that multiple views are strictly aligned. However, real-world multi-view data may contain low-quality conflictive instances, which show conflictive information in different views. Previous methods for this problem mainly focus on eliminating the conflictive data instances by removing them or replacing conflictive views. Nevertheless, real-world applications usually require making decisions for conflictive instances rather than only eliminating them. To solve this, we point out a new Reliable Conflictive Multi-view Learning (RCML) problem, which requires the model to provide decision results and attached reliabilities for conflictive multi-view data. We develop an Evidential Conflictive Multi-view Learning (ECML) method for this problem. ECML first learns view-specific evidence, which could be termed as the amount of support to each category collected from data. Then, we can construct view-specific opinions consisting of decision results and reliability. In the multi-view fusion stage, we propose a conflictive opinion aggregation strategy and theoretically prove this strategy can exactly model the relation of multi-view common and view-specific reliabilities. Experiments performed on 6 datasets verify the effectiveness of ECML.
Submitted 28 February, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Fluid Antennas-Enabled Multiuser Uplink: A Low-Complexity Gradient Descent for Total Transmit Power Minimization
Authors:
Guojie Hu,
Qingqing Wu,
Kui Xu,
Jian Ouyang,
Jiangbo Si,
Yunlong Cai,
Naofal Al-Dhahir
Abstract:
We investigate multiuser uplink communication from multiple single-antenna users to a base station (BS), which is equipped with a movable-antenna (MA) array and adopts zero-forcing receivers to decode multiple signals. We aim to optimize the MAs' positions at the BS, to minimize the total transmit power of all users subject to the minimum rate requirement. After applying transformations, we show that the problem is equivalent to minimizing the sum of each eigenvalue's reciprocal of a matrix, which is a function of all MAs' positions. Subsequently, the projected gradient descent (PGD) method is utilized to find a locally optimal solution. In particular, different from the latest related work, we exploit the eigenvalue decomposition to successfully derive a closed-form gradient for the PGD, which facilitates the practical implementation greatly. We demonstrate by simulations that via careful optimization for all MAs' positions in our proposed design, the total transmit power of all users can be decreased significantly as compared to competitive benchmarks.
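For intuition on why an eigenvalue decomposition yields a closed-form gradient, consider the following sketch. The symbols are assumptions introduced here, not the paper's notation: $\mathbf{A}(\mathbf{x})$ is a Hermitian positive-definite matrix depending on the antenna positions $\mathbf{x}$, with eigenpairs $(λ_i, \mathbf{u}_i)$, $\mathcal{C}$ is the feasible region, and $δ$ is a step size.

```latex
f(\mathbf{x}) = \sum_{i} \frac{1}{\lambda_i(\mathbf{x})} = \operatorname{tr}\!\left(\mathbf{A}(\mathbf{x})^{-1}\right),
\qquad
\frac{\partial f}{\partial x_n}
  = -\operatorname{tr}\!\left(\mathbf{A}^{-1}\,\frac{\partial \mathbf{A}}{\partial x_n}\,\mathbf{A}^{-1}\right)
  = -\sum_{i} \frac{1}{\lambda_i^{2}}\,\mathbf{u}_i^{H}\,\frac{\partial \mathbf{A}}{\partial x_n}\,\mathbf{u}_i ,
\qquad
\mathbf{x}^{(t+1)} = \Pi_{\mathcal{C}}\!\left(\mathbf{x}^{(t)} - \delta\,\nabla f\!\left(\mathbf{x}^{(t)}\right)\right).
```

The second equality follows from substituting $\mathbf{A}^{-1} = \sum_i λ_i^{-1}\mathbf{u}_i\mathbf{u}_i^{H}$ and using the orthonormality of the eigenvectors; the last expression is the generic projected gradient descent update.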
Submitted 8 January, 2024; v1 submitted 9 December, 2023;
originally announced December 2023.
-
Secure Wireless Communication via Movable-Antenna Array
Authors:
Guojie Hu,
Qingqing Wu,
Kui Xu,
Jiangbo Si,
Naofal Al-Dhahir
Abstract:
The movable antenna (MA) array is a recently developed technology in which the positions of transmit/receive antennas can be flexibly adjusted within a specified region to reconfigure the wireless channel and achieve a higher capacity. In this letter, we, for the first time, investigate MA array-assisted physical-layer security where the confidential information is transmitted from an MA array-enabled Alice to a single-antenna Bob, in the presence of multiple single-antenna and colluding eavesdroppers. We aim to maximize the achievable secrecy rate by jointly designing the transmit beamforming and positions of all antennas at Alice subject to the transmit power budget and specified regions for positions of all transmit antennas. The resulting problem is highly non-convex, for which the projected gradient ascent (PGA) and the alternating optimization methods are utilized to obtain a high-quality suboptimal solution. Simulation results demonstrate that since the additional spatial degree of freedom (DoF) can be fully exploited, the MA array significantly enhances the secrecy rate compared to the conventional fixed-position antenna (FPA) array.
Submitted 13 November, 2023;
originally announced November 2023.
-
Mitigating Estimation Errors by Twin TD-Regularized Actor and Critic for Deep Reinforcement Learning
Authors:
Junmin Zhong,
Ruofan Wu,
Jennie Si
Abstract:
We address the issue of estimation bias in deep reinforcement learning (DRL) by introducing solution mechanisms that include a new, twin TD-regularized actor-critic (TDR) method. It aims at reducing both over- and under-estimation errors. With TDR and by combining good DRL improvements, such as distributional learning and the long N-step surrogate stage reward (LNSS) method, we show that our new TDR-based actor-critic learning has enabled DRL methods to outperform their respective baselines in challenging environments in DeepMind Control Suite. Furthermore, they elevate TD3 and SAC respectively to a level of performance comparable to that of D4PG (the current SOTA), and they also improve the performance of D4PG to a new SOTA level measured by mean reward, convergence speed, learning success rate, and learning variance.
Submitted 6 November, 2023;
originally announced November 2023.
-
Strong electron-phonon coupling in Ba$_{1-x}$Sr$_x$Ni$_2$As$_2$
Authors:
Linxing Song,
Jianguo Si,
Tom Fennell,
Uwe Stuhr,
Guochu Deng,
Jinchen Wang,
Juanjuan Liu,
Lijie Hao,
Huiqian Luo,
Miao Liu,
Sheng Meng,
Shiliang Li
Abstract:
The charge density wave (CDW) or nematicity has been found to coexist with superconductivity in many systems. It is thus interesting that the superconducting transition temperature $T_c$ in the doped BaNi$_2$As$_2$ system can be enhanced up to six times as the CDW or nematicity in the undoped compound is suppressed. Here we show that the transverse acoustic (TA) phonons of Ba$_{1-x}$Sr$_x$Ni$_2$As$_2$ are strongly damped in a wide doping range and over the whole $Q$ range, which excludes its origin from either CDW or nematicity. The damping of TA phonons can be understood as large electron-phonon coupling and possible strong hybridization between acoustic and optical phonons as shown by the first-principles calculations. The superconductivity can be quantitatively reproduced by the change of electron-phonon coupling constant calculated by the McMillan equation in the BCS framework, which suggests that no quantum fluctuations of any order are needed to promote the superconductivity. On the contrary, the change of $T_c$ in this system should be understood as the six-fold suppression of superconductivity in undoped compounds.
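For reference, the McMillan equation mentioned above is usually written in the form below. The symbols follow the common textbook convention and are assumptions here rather than the paper's notation: $Θ_D$ is the Debye temperature, $λ$ the electron-phonon coupling constant, and $μ^*$ the Coulomb pseudopotential.

```latex
T_c = \frac{\Theta_D}{1.45}\,
      \exp\!\left[-\,\frac{1.04\,(1+\lambda)}{\lambda-\mu^{*}\,(1+0.62\,\lambda)}\right]
```

Because $λ$ enters through the exponent, a moderate change in the coupling constant can shift $T_c$ by a large factor, which is the sense in which a change of $λ$ alone could account for a several-fold change in $T_c$.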
Submitted 24 March, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Intelligent Reflecting Surface-Aided Wireless Communication with Movable Elements
Authors:
Guojie Hu,
Qingqing Wu,
Dognhui Xu,
Kui Xu,
Jiangbo Si,
Yunlong Cai,
Naofal Al-Dhahir
Abstract:
Intelligent reflecting surface (IRS) has been recognized as a powerful technology for boosting communication performance. To reduce manufacturing and control costs, it is preferable to consider discrete phase shifts (DPSs) for IRS, which are set by default as uniformly distributed in the range of $[ - π,π)$ in the literature. Such a setting, however, cannot achieve a desirable performance over the general Rician fading where the channel phase concentrates in a narrow range with a higher probability. Motivated by this drawback, in this paper we design optimal non-uniform DPSs for IRS to achieve a desirable performance level. The fundamental challenge is the \textit{possible offset in phase distribution across different cascaded source-element-destination channels}, if adopting conventional IRS where the position of each element is fixed. Such a phenomenon leads to different patterns of optimal non-uniform DPSs for each IRS element and thus causes huge manufacturing costs, especially when the number of IRS elements is large. Driven by the recently emerging fluid antenna system (or movable antenna technology), we demonstrate that if the position of each IRS element can be flexibly adjusted, the above phase distribution offset can be surprisingly eliminated, leading to the same pattern of DPSs for each IRS element. Armed with this, we then determine the form of unified non-uniform DPSs based on a low-complexity iterative algorithm. Simulations show that our proposed design significantly improves the system performance compared to competitive benchmarks.
Submitted 4 November, 2023;
originally announced November 2023.
-
Investigation of low gain avalanche detectors exposed to proton fluences beyond 10$^{15}$ n$_\mathrm{eq}$cm$^{-2}$
Authors:
Josef Sorenson,
Martin Hoeferkamp,
Gregor Kramberger,
Sally Seidel,
Jiahe Si
Abstract:
Low gain avalanche detectors (LGADs) deliver excellent timing resolution, which can mitigate mis-assignment of vertices associated with pileup at the High Luminosity LHC and other future hadron colliders. The most highly irradiated LGADs will be subject to $2.5 \times10^{15} \mathrm{n}_\mathrm{eq} \mathrm{cm}^{-2}$ of hadronic fluence during HL-LHC operation; their performance must tolerate this. Hamamatsu Photonics K.K. and Fondazione Bruno Kessler LGADs have been irradiated with 400 and 500 MeV protons respectively in several steps up to $1.5 \times10^{15} \mathrm{n}_\mathrm{eq} \mathrm{cm}^{-2}$. Measurements of the acceptor removal constants of the gain layers, evolution of the timing resolution and charge collection with damage, and inter-channel isolation characteristics, for a variety of design options, are presented here.
Submitted 28 December, 2023; v1 submitted 3 November, 2023;
originally announced November 2023.
-
Virtual Accessory Try-On via Keypoint Hallucination
Authors:
Junhong Gou,
Bo Zhang,
Li Niu,
Jianfu Zhang,
Jianlou Si,
Chen Qian,
Liqing Zhang
Abstract:
The virtual try-on task refers to fitting the clothes from one image onto another portrait image. In this paper, we focus on virtual accessory try-on, which fits accessories (e.g., glasses, ties) onto a face or portrait image. Unlike clothing try-on, which relies on the human silhouette as guidance, accessory try-on warps the accessory into an appropriate location and shape to generate a plausible composite image. In contrast to previous try-on methods that treat foreground (i.e., accessories) and background (i.e., human faces or bodies) equally, we propose a background-oriented network to utilize the prior knowledge of human bodies and accessories. Specifically, our approach learns the human body priors and hallucinates the target locations of specified foreground keypoints in the background. Our approach then injects foreground information with accessory priors into the background UNet. Based on the hallucinated target locations, the warping parameters are calculated to warp the foreground. Moreover, this background-oriented network can also easily incorporate auxiliary human face/body semantic segmentation supervision to further boost performance. Experiments conducted on the STRAT dataset validate the effectiveness of our proposed method.
Submitted 26 October, 2023;
originally announced October 2023.
-
EXPLAIN, EDIT, GENERATE: Rationale-Sensitive Counterfactual Data Augmentation for Multi-hop Fact Verification
Authors:
Yingjie Zhu,
Jiasheng Si,
Yibo Zhao,
Haiyang Zhu,
Deyu Zhou,
Yulan He
Abstract:
The automatic multi-hop fact verification task has gained significant attention in recent years. Despite impressive results, these well-designed models perform poorly on out-of-domain data. One possible solution is to augment the training data with counterfactuals, which are generated by minimally altering the causal features of the original data. However, current counterfactual data augmentation techniques fail to handle multi-hop fact verification because they cannot preserve the complex logical relationships within multiple correlated texts. In this paper, we overcome this limitation by developing a rationale-sensitive method to generate linguistically diverse and label-flipping counterfactuals while preserving logical relationships. Specifically, the diverse and fluent counterfactuals are generated via an Explain-Edit-Generate architecture. Moreover, the checking and filtering modules are proposed to regularize the counterfactual data with logical relations and flipped labels. Experimental results show that the proposed approach outperforms the SOTA baselines and can generate linguistically diverse counterfactual data without disrupting their logical relationships.
Submitted 22 October, 2023;
originally announced October 2023.
-
Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow
Authors:
Junhong Gou,
Siyu Sun,
Jianfu Zhang,
Jianlou Si,
Chen Qian,
Liqing Zhang
Abstract:
Virtual try-on is a critical image synthesis task that aims to transfer clothes from one image to another while preserving the details of both humans and clothes. While many existing methods rely on Generative Adversarial Networks (GANs) to achieve this, flaws can still occur, particularly at high resolutions. Recently, the diffusion model has emerged as a promising alternative for generating high-quality images in various applications. However, simply using clothes as a condition for guiding the diffusion model to inpaint is insufficient to maintain the details of the clothes. To overcome this challenge, we propose an exemplar-based inpainting approach that leverages a warping module to guide the diffusion model's generation effectively. The warping module performs initial processing on the clothes, which helps to preserve the local details of the clothes. We then combine the warped clothes with a clothes-agnostic person image and add noise as the input of the diffusion model. Additionally, the warped clothes are used as local conditions for each denoising process to ensure that the resulting output retains as much detail as possible. Our approach, namely Diffusion-based Conditional Inpainting for Virtual Try-ON (DCI-VTON), effectively utilizes the power of the diffusion model, and the incorporation of the warping module helps to produce high-quality and realistic virtual try-on results. Experimental results on VITON-HD demonstrate the effectiveness and superiority of our method.
Submitted 11 August, 2023;
originally announced August 2023.
-
Modulation-Enhanced Excitation for Continuous-Time Reinforcement Learning via Symmetric Kronecker Products
Authors:
Brent A. Wallace,
Jennie Si
Abstract:
This work introduces new results in continuous-time reinforcement learning (CT-RL) control of affine nonlinear systems to address a major algorithmic challenge due to a lack of persistence of excitation (PE). This PE design limitation has previously stifled CT-RL numerical performance and prevented these algorithms from achieving control synthesis goals. Our new theoretical developments in symmetric Kronecker products enable a proposed modulation-enhanced excitation (MEE) framework to make PE significantly more systematic and intuitive to achieve for real-world designers. MEE is applied to the suite of recently-developed excitable integral reinforcement learning (EIRL) algorithms, yielding a class of enhanced high-performance CT-RL control design methods which, due to the symmetric Kronecker product algebra, retain EIRL's convergence and closed-loop stability guarantees. Through numerical evaluation studies, we demonstrate how our new MEE framework achieves substantial improvements in conditioning when approximately solving the Hamilton-Jacobi-Bellman equation to obtain optimal controls. We use an intuitive example to provide insights on the central excitation issue under discussion, and we demonstrate the effectiveness of the proposed procedure on a real-world hypersonic vehicle (HSV) application.
Submitted 31 July, 2023;
originally announced July 2023.
-
Explainable Topic-Enhanced Argument Mining from Heterogeneous Sources
Authors:
Jiasheng Si,
Yingjie Zhu,
Xingyu Shi,
Deyu Zhou,
Yulan He
Abstract:
Given a controversial target such as "nuclear energy", argument mining aims to identify the argumentative text from heterogeneous sources. Current approaches focus on exploring better ways of integrating the target-associated semantic information with the argumentative text. Despite their empirical successes, two issues remain unsolved: (i) a target is represented by a word or a phrase, which is insufficient to cover a diverse set of target-related subtopics; (ii) the sentence-level topic information within an argument, which we believe is crucial for argument mining, is ignored. To tackle the above issues, we propose a novel explainable topic-enhanced argument mining approach. Specifically, with the use of the neural topic model and the language model, the target information is augmented by explainable topic representations. Moreover, the sentence-level topic information within the argument is captured by minimizing the distance between its latent topic distribution and its semantic representation through mutual learning. Experiments have been conducted on the benchmark dataset in both the in-target setting and the cross-target setting. Results demonstrate the superiority of the proposed model against the state-of-the-art baselines.
Submitted 22 July, 2023;
originally announced July 2023.
-
CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility
Authors:
Guohai Xu,
Jiayi Liu,
Ming Yan,
Haotian Xu,
Jinghui Si,
Zhuoran Zhou,
Peng Yi,
Xing Gao,
Jitao Sang,
Rong Zhang,
Ji Zhang,
Chao Peng,
Fei Huang,
Jingren Zhou
Abstract:
With the rapid evolution of large language models (LLMs), there is a growing concern that they may pose risks or have negative social impacts. Therefore, evaluation of human values alignment is becoming increasingly important. Previous work mainly focuses on assessing the performance of LLMs on certain knowledge and reasoning abilities, while neglecting the alignment to human values, especially in a Chinese context. In this paper, we present CValues, the first Chinese human values evaluation benchmark to measure the alignment ability of LLMs in terms of both safety and responsibility criteria. As a result, we have manually collected adversarial safety prompts across 10 scenarios and induced responsibility prompts from 8 domains by professional experts. To provide a comprehensive values evaluation of Chinese LLMs, we not only conduct human evaluation for reliable comparison, but also construct multi-choice prompts for automatic evaluation. Our findings suggest that while most Chinese LLMs perform well in terms of safety, there is considerable room for improvement in terms of responsibility. Moreover, both the automatic and human evaluation are important for assessing the human values alignment in different aspects. The benchmark and code are available on ModelScope and GitHub.
Submitted 18 July, 2023;
originally announced July 2023.
-
Continuous-Time Reinforcement Learning: New Design Algorithms with Theoretical Insights and Performance Guarantees
Authors:
Brent A. Wallace,
Jennie Si
Abstract:
Continuous-time nonlinear optimal control problems hold great promise in real-world applications. After decades of development, reinforcement learning (RL) has achieved some of the greatest successes as a general nonlinear control design method. However, a recent comprehensive analysis of state-of-the-art continuous-time RL (CT-RL) methods, namely, adaptive dynamic programming (ADP)-based CT-RL algorithms, reveals they face significant design challenges due to their complexity, numerical conditioning, and dimensional scaling issues. Despite advanced theoretical results, existing ADP CT-RL synthesis methods are inadequate in solving even small, academic problems. The goal of this work is thus to introduce a suite of new CT-RL algorithms for control of affine nonlinear systems. Our design approach relies on two important factors. First, our methods are applicable to physical systems that can be partitioned into smaller subproblems. This constructive consideration results in reduced dimensionality and greatly improved intuitiveness of design. Second, we introduce a new excitation framework to improve persistence of excitation (PE) and numerical conditioning performance via classical input/output insights. Such a design-centric approach is the first of its kind in the ADP CT-RL community. In this paper, we progressively introduce a suite of (decentralized) excitable integral reinforcement learning (EIRL) algorithms. We provide convergence and closed-loop stability guarantees, and we demonstrate these guarantees on a significant application problem of controlling an unstable, nonminimum phase hypersonic vehicle (HSV).
Submitted 17 July, 2023;
originally announced July 2023.
-
Vehicle Dispatching and Routing of On-Demand Intercity Ride-Pooling Services: A Multi-Agent Hierarchical Reinforcement Learning Approach
Authors:
Jinhua Si,
Fang He,
Xi Lin,
Xindi Tang
Abstract:
The integrated development of city clusters has given rise to an increasing demand for intercity travel. Intercity ride-pooling service exhibits considerable potential in upgrading traditional intercity bus services by implementing demand-responsive enhancements. Nevertheless, its online operations suffer from inherent complexities due to the coupling of vehicle resource allocation among cities and pooled-ride vehicle routing. To tackle these challenges, this study proposes a two-level framework designed to facilitate online fleet management. Specifically, a novel multi-agent feudal reinforcement learning model is proposed at the upper level of the framework to cooperatively assign idle vehicles to different intercity lines, while the lower level updates the routes of vehicles using an adaptive large neighborhood search heuristic. Numerical studies based on a realistic dataset of Xiamen and its surrounding cities in China show that the proposed framework effectively mitigates the supply and demand imbalances, and achieves significant improvement in both the average daily system profit and order fulfillment ratio.
Submitted 20 March, 2024; v1 submitted 13 July, 2023;
originally announced July 2023.
-
TFR: Texture Defect Detection with Fourier Transform using Normal Reconstructed Template of Simple Autoencoder
Authors:
Jongwook Si,
Sungyoung Kim
Abstract:
Texture is essential information in image representation, capturing patterns and structures. As a result, texture plays a crucial role in the manufacturing industry and is extensively studied in the fields of computer vision and pattern recognition. However, real-world textures are susceptible to defects, which can degrade image quality and cause various issues. Therefore, there is a need for accurate and effective methods to detect texture defects. In this study, a simple autoencoder and Fourier transform are employed for texture defect detection. The proposed method combines Fourier transform analysis with the reconstructed template obtained from the simple autoencoder. The Fourier transform is a powerful tool for analyzing the frequency domain of images and signals. Moreover, since texture defects often exhibit characteristic changes in specific frequency ranges, analyzing the frequency domain enables effective defect detection. The proposed method demonstrates effectiveness and accuracy in detecting texture defects. Experimental results are presented to evaluate its performance and compare it with existing approaches.
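The idea of comparing an input against a defect-free template in the frequency domain can be sketched as follows. This is an illustrative assumption, not the paper's pipeline: the template here is a synthetic stripe pattern standing in for an autoencoder reconstruction, and the function names and the log-magnitude difference score are hypothetical choices.

```python
import numpy as np

def log_spectrum(image):
    """Centered log-magnitude spectrum of a 2-D grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log1p(np.abs(spectrum))

def frequency_defect_score(image, template):
    """Mean absolute difference between the two log spectra (assumed score)."""
    return float(np.mean(np.abs(log_spectrum(image) - log_spectrum(template))))

# Synthetic vertical-stripe texture standing in for a reconstructed template.
_, cols = np.mgrid[0:64, 0:64]
template = np.sin(2 * np.pi * cols / 8.0)

# Simulate a defect by flattening a small patch of the texture.
defective = template.copy()
defective[20:30, 20:30] = 0.0

clean_score = frequency_defect_score(template, template)
defect_score = frequency_defect_score(defective, template)
```

A regular texture concentrates its energy in a few frequency bins, so a local defect spreads energy across the spectrum and raises the score relative to a defect-free input.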
Submitted 10 July, 2023;
originally announced July 2023.
-
PP-GAN : Style Transfer from Korean Portraits to ID Photos Using Landmark Extractor with GAN
Authors:
Jongwook Si,
Sungyoung Kim
Abstract:
The objective of a style transfer is to maintain the content of an image while transferring the style of another image. However, conventional research on style transfer has a significant limitation in preserving facial landmarks, such as the eyes, nose, and mouth, which are crucial for maintaining the identity of the image. In Korean portraits, the majority of individuals wear "Gat", a type of headdress exclusively worn by men. Owing to its distinct characteristics from the hair in ID photos, transferring the "Gat" is challenging. To address this issue, this study proposes a deep learning network that can perform style transfer, including the "Gat", while preserving the identity of the face. Unlike existing style transfer approaches, the proposed method aims to preserve texture, costume, and the "Gat" on the style image. The Generative Adversarial Network forms the backbone of the proposed network. The color, texture, and intensity were extracted differently based on the characteristics of each block and layer of the pre-trained VGG-16, and only the necessary elements during training were preserved using a facial landmark mask. The head area was presented using the eyebrow area to transfer the "Gat". Furthermore, the identity of the face was retained, and style correlation was considered based on the Gram matrix. The proposed approach demonstrated superior transfer and preservation performance compared to previous studies.
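The Gram-matrix style correlation mentioned above can be illustrated with a short sketch. The random arrays stand in for feature maps from a pre-trained network such as VGG-16, and the normalization and loss form are common conventions assumed here, not details confirmed by the abstract.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlation of a (channels, height, width) feature map."""
    channels, height, width = features.shape
    flat = features.reshape(channels, height * width)
    return flat @ flat.T / (channels * height * width)

def style_loss(output_features, style_features):
    """Mean squared difference between the two Gram matrices (assumed form)."""
    diff = gram_matrix(output_features) - gram_matrix(style_features)
    return float(np.mean(diff ** 2))

# Random placeholders for feature maps of the generated and style images.
rng = np.random.default_rng(0)
style_features = rng.standard_normal((4, 8, 8))
output_features = rng.standard_normal((4, 8, 8))

loss_same = style_loss(style_features, style_features)
loss_diff = style_loss(output_features, style_features)
```

Because the Gram matrix discards spatial positions and keeps only channel co-activation statistics, matching it encourages similar texture and color statistics without copying the layout of the style image.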
Submitted 23 June, 2023;
originally announced June 2023.
-
Restoration of the JPEG Maximum Lossy Compressed Face Images with Hourglass Block based on Early Stopping Discriminator
Authors:
Jongwook Si,
Sungyoung Kim
Abstract:
When a JPEG image is compressed using the lossy compression method with a high compression rate, a blocking phenomenon can occur in the image, making it necessary to restore the image to its original quality. In particular, restoring compressed images that are unrecognizable presents an innovative challenge. Therefore, this paper aims to address the restoration of JPEG images that have suffered significant loss due to maximum compression using a GAN-based network method. The generator in this network is based on the U-Net architecture and features a newly presented hourglass structure that can preserve the characteristics of deep layers. Additionally, the network incorporates two loss functions, LF Loss and HF Loss, to generate natural and high-performance images. HF Loss uses a pretrained VGG-16 network and is configured using a specific layer that best represents features, which can enhance performance for the high-frequency region. LF Loss, on the other hand, is used to handle the low-frequency region. These two loss functions facilitate the generation of images by the generator that can deceive the discriminator while accurately generating both high and low-frequency regions. The results show that the blocking phenomenon in lossy compressed images was removed, and recognizable identities were generated. This study represents a significant improvement over previous research in terms of image restoration performance.
Submitted 22 June, 2023;
originally announced June 2023.
-
Chili Pepper Disease Diagnosis via Image Reconstruction Using GrabCut and Generative Adversarial Serial Autoencoder
Authors:
Jongwook Si,
Sungyoung Kim
Abstract:
With the recent development of smart farms, researchers are very interested in such fields. In particular, the field of disease diagnosis is the most important factor. Disease diagnosis belongs to the field of anomaly detection and aims to distinguish whether plants or fruits are normal or abnormal. The problem can be solved by binary or multi-class classification based on CNNs, but it can also be solved by image reconstruction. However, due to the limited performance of image generation, state-of-the-art methods propose a score calculation method using a latent vector error. In this paper, we propose a network that focuses on chili peppers and performs background removal through GrabCut. It shows high performance through an image-based score calculation method. Due to the difficulty of reconstructing the input image, the difference between the input and output images is large. However, the serial autoencoder proposed in this paper uses the difference between the two fake images, excluding the actual input, as a score. We propose a method of generating meaningful images using the GAN structure and classifying three results simultaneously with one discriminator. The proposed method showed higher performance than previous research, and image-based scores showed the best performance.
Submitted 21 June, 2023;
originally announced June 2023.
-
Energy-efficient superparamagnetic Ising machine and its application to traveling salesman problems
Authors:
Jia Si,
Shuhan Yang,
Yunuo Cen,
Jiaer Chen,
Zhaoyang Yao,
Dong-Jun Kim,
Kaiming Cai,
Jerald Yoo,
Xuanyao Fong,
Hyunsoo Yang
Abstract:
The growth of artificial intelligence and IoT has created a significant computational load for solving non-deterministic polynomial-time (NP)-hard problems, which are difficult to solve using conventional computers. The Ising computer, based on the Ising model and annealing process, has been highly sought for finding approximate solutions to NP-hard problems by observing the convergence of dynamic spin states. However, it faces several challenges, including high power consumption due to artificial spins and randomness emulated by complex circuits, as well as low scalability caused by the rapidly growing connectivity when considering large-scale problems. Here, we present an experimental Ising annealing computer based on superparamagnetic tunnel junctions (SMTJs) with all-to-all connections, which successfully solves a 70-city travelling salesman problem (4761-node Ising problem). By taking advantage of the intrinsic randomness of SMTJs, implementing a proper global annealing scheme, and using an efficient algorithm, our SMTJ-based Ising annealer shows superior performance in terms of power consumption and energy efficiency compared to other Ising schemes. Additionally, our approach provides a promising way to solve complex problems with limited hardware resources. Moreover, we propose a crossbar array architecture for scalable integration using conventional magnetic random access memories. Our results demonstrate that the SMTJ-based Ising annealing computer with high energy efficiency, speed, and scalability is a strong candidate for future unconventional computing schemes.
Submitted 20 June, 2023;
originally announced June 2023.
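Note on the entry above: the abstract equates a 70-city travelling salesman problem with a 4761-node Ising problem. The small sketch below shows one mapping that reproduces that figure. It assumes the common permutation-matrix encoding (one spin per city/tour-position pair) with one city pinned to the first tour position, which gives (70 - 1)^2 = 4761 spins. The abstract does not state which encoding the authors use, so the function name `tsp_ising_spins` and the `fix_start` flag are illustrative assumptions only.

```python
def tsp_ising_spins(num_cities: int, fix_start: bool = True) -> int:
    """Count spins in a permutation-matrix Ising encoding of the TSP.

    Assumption (not taken from the abstract): one binary spin per
    (city, tour position) pair. When fix_start is True, one city is
    pinned to the first position, so it needs no spins of its own.
    """
    if num_cities < 2:
        raise ValueError("need at least two cities")
    n = num_cities - 1 if fix_start else num_cities
    return n * n


print(tsp_ising_spins(70))         # 4761, matching the abstract's node count
print(tsp_ising_spins(70, False))  # 4900 if no city is pinned
```

The quadratic growth in spin count (and the all-to-all couplings between those spins) is what the abstract refers to as rapidly growing connectivity for large-scale problems.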
-
Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification
Authors:
Jiasheng Si,
Yingjie Zhu,
Deyu Zhou
Abstract:
The success of deep learning models on multi-hop fact verification has prompted researchers to understand the behavior behind their veracity. One possible way is erasure search: obtaining the rationale by entirely removing a subset of input without compromising the veracity prediction. Although extensively explored, existing approaches fall within the scope of the single-granular (tokens or sentences) explanation, which inevitably leads to explanation redundancy and inconsistency. To address such issues, this paper explores the viability of multi-granular rationale extraction with consistency and faithfulness for explainable multi-hop fact verification. In particular, given a pretrained veracity prediction model, both the token-level explainer and sentence-level explainer are trained simultaneously to obtain multi-granular rationales via differentiable masking. Meanwhile, three diagnostic properties (fidelity, consistency, salience) are introduced and applied to the training process, to ensure that the extracted rationales satisfy faithfulness and consistency. Experimental results on three multi-hop fact verification datasets show that the proposed approach outperforms some state-of-the-art baselines.
Submitted 16 May, 2023;
originally announced May 2023.
-
The properties of small magnetic flux ropes inside the solar wind come from coronal holes, active regions, and quiet Sun
Authors:
Changhao Zhai,
Hui Fu,
Jiachen Si,
Zhenghua Huang,
Lidong Xia
Abstract:
The origination and generation mechanisms of small magnetic flux ropes (SFRs), which are important structures in solar wind, are not clearly known. In the present study, 1993 SFRs immersed in coronal holes, active regions, and quiet Sun solar wind are analyzed and compared. We find that the properties of SFRs immersed in three types of solar wind are significantly different. The SFRs are further classified into hot-SFRs, cold-SFRs, and normal-SFRs, according to whether the O7+/O6+ is 30% elevated or dropped inside SFRs as compared with background solar wind. Our studies show that the parameters of normal-SFRs are similar to background in all three types of solar wind. The properties of hot-SFRs and cold-SFRs seem to be lying in two extremes. Statistically, the hot-SFRs (cold-SFRs) are associated with longer (shorter) duration, lower (higher) speeds and proton temperatures, higher (lower) charge states, helium abundance, and FIP bias as compared with normal-SFRs and background solar wind. The anti-correlations between speed and O7+/O6+ inside hot-SFRs (normal-SFRs) are different from (similar to) those in background solar wind. Most of hot-SFRs and cold-SFRs should come from the Sun. Hot-SFRs may come from streamers associated with plasma blobs and/or small-scale activities on the Sun. Cold-SFRs may be accompanied by small-scale eruptions with lower-temperature materials. Both hot-SFRs and cold-SFRs could also be formed by magnetic erosions of ICMEs that do not contain or contain cold-filament materials. The characteristics of normal-SFRs can be explained reasonably by the two originations, from the Sun and generated in the heliosphere both.
Submitted 23 April, 2023;
originally announced April 2023.
-
BPJDet: Extended Object Representation for Generic Body-Part Joint Detection
Authors:
Huayi Zhou,
Fei Jiang,
Jiaxin Si,
Yue Ding,
Hongtao Lu
Abstract:
Detection of the human body and its parts has been intensively studied. However, most CNN-based detectors are trained independently, making it difficult to associate detected parts with the body. In this paper, we focus on the joint detection of the human body and its parts. Specifically, we propose a novel extended object representation integrating center-offsets of body parts, and construct an end-to-end generic Body-Part Joint Detector (BPJDet). In this way, body-part associations are neatly embedded in a unified representation containing both semantic and geometric contents. Therefore, we can optimize multi-loss to tackle multi-tasks synergistically. Moreover, this representation is suitable for anchor-based and anchor-free detectors. BPJDet does not suffer from error-prone post matching, and keeps a better trade-off between speed and accuracy. Furthermore, BPJDet can be generalized to detect body-part or body-parts of either human or quadruped animals. To verify the superiority of BPJDet, we conduct experiments on datasets of body-part (CityPersons, CrowdHuman and BodyHands) and body-parts (COCOHumanParts and Animals5C). While keeping high detection accuracy, BPJDet achieves state-of-the-art association performance on all datasets. Besides, we show benefits of advanced body-part association capability by improving performance of two representative downstream applications: accurate crowd head detection and hand contact estimation. The project is available at https://hnuzhy.github.io/projects/BPJDet.
Submitted 13 January, 2024; v1 submitted 21 April, 2023;
originally announced April 2023.
-
Temperature dependent anisotropy and two-band superconductivity revealed by lower critical field in organic superconductor $κ$-(BEDT-TTF)$_{2}$Cu[N(CN)$_{2}$]Br
Authors:
Huijing Mu,
Jin Si,
Qingui Yang,
Ying Xiang,
Haipeng Yang,
Hai-Hu Wen
Abstract:
Resistivity and magnetization have been measured at different temperatures and magnetic fields in the organic superconductor $κ$-(BEDT-TTF)$_{2}$Cu[N(CN)$_{2}$]Br. The lower critical field and upper critical field are determined, which allows a complete phase diagram to be drawn. Through the comparison between the upper critical fields with magnetic field perpendicular and parallel to the conducting ac-planes, and the scaling of the in-plane resistivity with field along different directions, we found that the anisotropy $Γ$ is strongly temperature dependent. It is found that $Γ$ is quite large (above 20) near $T_{c}$, which satisfies the 2D model, but approaches a small value in the low-temperature region. The 2D-Tinkham model can also be used to fit the data at high temperatures. This is explained as a crossover from the orbital depairing mechanism in the high-temperature and low-field region to the paramagnetic depairing mechanism in the high-field and low-temperature region. The temperature dependence of the lower critical field $H_{c1} (T)$ shows a concave shape in a wide temperature region. It is found that neither a single $d$-wave nor a single $s$-wave gap can fit the $H_{c1} (T)$; however, a two-gap model containing an $s$-wave and a $d$-wave can fit the data rather well, suggesting two-band superconductivity and an unconventional pairing mechanism in this organic superconductor.
Submitted 6 February, 2023;
originally announced February 2023.
-
Multi-head Uncertainty Inference for Adversarial Attack Detection
Authors:
Yuqi Yang,
Songyun Yang,
Jiyang Xie,
Zhongwei Si,
Kai Guo,
Ke Zhang,
Kongming Liang
Abstract:
Deep neural networks (DNNs) are sensitive and susceptible to tiny perturbations introduced by adversarial attacks, which cause erroneous predictions. Various methods, including adversarial defense and uncertainty inference (UI), have been developed in recent years to overcome adversarial attacks. In this paper, we propose a multi-head uncertainty inference (MH-UI) framework for detecting adversarial attack examples. We adopt a multi-head architecture with multiple prediction heads (i.e., classifiers) to obtain predictions from different depths in the DNNs and introduce shallow information for the UI. Using independent heads at different depths, the normalized predictions are assumed to follow the same Dirichlet distribution, and we estimate its distribution parameters by moment matching. Cognitive uncertainty brought by the adversarial attacks will be reflected and amplified on the distribution. Experimental results show that the proposed MH-UI framework can outperform all the referred UI methods in the adversarial attack detection task with different settings.
Submitted 20 December, 2022;
originally announced December 2022.
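Note on the entry above: the abstract says the heads' normalized predictions are assumed to follow a single Dirichlet distribution whose parameters are estimated by moment matching. The sketch below illustrates one textbook moment-matching estimator and is not necessarily the authors' exact procedure. It relies on the Dirichlet identities E[p_k] = α_k/α_0 and Var[p_k] = m_k(1 - m_k)/(α_0 + 1), where m_k is the mean of component k and α_0 is the sum of all α_k. The function name `dirichlet_moment_match` and the toy inputs are assumptions made for illustration.

```python
from typing import List, Sequence


def dirichlet_moment_match(samples: Sequence[Sequence[float]]) -> List[float]:
    """Estimate Dirichlet parameters from probability vectors by moment matching.

    Each row of `samples` is one head's normalized prediction (sums to 1).
    Assumed estimator: match the per-class means, and match the total
    variance summed over classes to obtain the concentration alpha_0.
    """
    n = len(samples)
    k = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(k)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in samples) / n for j in range(k)
    ]
    total_var = sum(variances)
    if total_var <= 0.0:
        raise ValueError("heads agree exactly; concentration is unbounded")
    # sum_k Var[p_k] = sum_k m_k (1 - m_k) / (alpha_0 + 1)
    alpha0 = sum(m * (1.0 - m) for m in means) / total_var - 1.0
    return [m * alpha0 for m in means]


# Toy example: three heads, three classes (values are made up).
heads = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.6, 0.25, 0.15]]
alpha = dirichlet_moment_match(heads)
print(alpha, sum(alpha))
```

Under this sketch, a small total concentration `sum(alpha)` means the heads disagree strongly, which is one plausible way the uncertainty described in the abstract could surface.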
-
Exploring Faithful Rationale for Multi-hop Fact Verification via Salience-Aware Graph Learning
Authors:
Jiasheng Si,
Yingjie Zhu,
Deyu Zhou
Abstract:
The opaqueness of the multi-hop fact verification model imposes imperative requirements for explainability. One feasible way is to extract rationales, a subset of inputs, where the performance of prediction drops dramatically when being removed. Though being explainable, most rationale extraction methods for multi-hop fact verification explore the semantic information within each piece of evidence individually, while ignoring the topological information interaction among different pieces of evidence. Intuitively, a faithful rationale bears complementary information being able to extract other rationales through the multi-hop reasoning process. To tackle such disadvantages, we cast explainable multi-hop fact verification as subgraph extraction, which can be solved based on graph convolutional network (GCN) with salience-aware graph learning. Specifically, GCN is utilized to incorporate the topological interaction information among multiple pieces of evidence for learning evidence representation. Meanwhile, to alleviate the influence of noisy evidence, the salience-aware graph perturbation is introduced into the message passing of GCN. Moreover, the multi-task model with three diagnostic properties of rationale is elaborately designed to improve the quality of an explanation without any explicit annotations. Experimental results on the FEVEROUS benchmark show significant gains over previous state-of-the-art methods for both rationale extraction and fact verification.
Submitted 2 December, 2022;
originally announced December 2022.
-
StuArt: Individualized Classroom Observation of Students with Automatic Behavior Recognition and Tracking
Authors:
Huayi Zhou,
Fei Jiang,
Jiaxin Si,
Lili Xiong,
Hongtao Lu
Abstract:
Each student matters, but it is hard for instructors to observe all the students during a course and provide immediate help to those who need it. In this paper, we present StuArt, a novel automatic system designed for individualized classroom observation, which empowers instructors to attend to the learning status of each student. StuArt can recognize five representative student behaviors (hand-raising, standing, sleeping, yawning, and smiling) that are highly related to engagement and track their variation trends during the course. To protect the privacy of students, all the variation trends are indexed by seat numbers without any personal identification information. Furthermore, StuArt adopts various user-friendly visualization designs to help instructors quickly understand the individual and overall learning status. Experimental results on real classroom videos have demonstrated the superiority and robustness of the embedded algorithms. We expect our system to promote the development of large-scale individualized guidance of students. More information is available at https://github.com/hnuzhy/StuArt.
Submitted 13 March, 2023; v1 submitted 6 November, 2022;
originally announced November 2022.
-
Joint Multi-Person Body Detection and Orientation Estimation via One Unified Embedding
Authors:
Huayi Zhou,
Fei Jiang,
Jiaxin Si,
Hongtao Lu
Abstract:
Human body orientation estimation (HBOE) is widely used in various applications, including robotics, surveillance, pedestrian analysis and autonomous driving. Although many approaches have addressed the HBOE problem, from specific under-controlled scenes to challenging in-the-wild environments, they assume human instances are already detected and take a well-cropped sub-image as the input. This setting is less efficient and prone to errors in real applications, such as crowds of people. In this paper, we propose a single-stage end-to-end trainable framework for tackling the HBOE problem with multiple persons. By integrating the prediction of bounding boxes and direction angles in one embedding, our method can jointly estimate the location and orientation of all bodies in one image directly. Our key idea is to integrate the HBOE task into the multi-scale anchor channel predictions of persons for concurrently benefiting from engaged intermediate features. Therefore, our approach can naturally adapt to difficult instances involving low resolution and occlusion as in object detection. We validated the efficiency and effectiveness of our method on the recently presented benchmark MEBOW with extensive experiments. Besides, we completed ambiguous instances ignored by the MEBOW dataset, and provided corresponding weak body-orientation labels to keep its integrity and consistency for supporting studies on multiple persons. Our work is available at https://github.com/hnuzhy/JointBDOE.
Submitted 16 March, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Long N-step Surrogate Stage Reward to Reduce Variances of Deep Reinforcement Learning in Complex Problems
Authors:
Junmin Zhong,
Ruofan Wu,
Jennie Si
Abstract:
High variances in reinforcement learning have been shown to impede successful convergence and hurt task performance. As the reward signal plays an important role in learning behavior, multi-step methods have been considered to mitigate the problem, and are believed to be more effective than single-step methods. However, there is a lack of comprehensive and systematic study on this important aspect to demonstrate the effectiveness of multi-step methods in solving highly complex continuous control problems. In this study, we introduce a new long $N$-step surrogate stage (LNSS) reward approach to effectively account for complex environment dynamics, while previous methods are usually feasible only for a limited number of steps. The LNSS method is simple, has low computational cost, and is applicable to value-based or policy gradient reinforcement learning. We systematically evaluate LNSS in OpenAI Gym and DeepMind Control Suite to address some complex benchmark environments in which it has been challenging to obtain good results by DRL in general. We demonstrate performance improvement in terms of total reward, convergence speed, and coefficient of variation (CV) by LNSS. We also provide analytical insights on how LNSS exponentially reduces the upper bound on the variances of the Q value from a respective single-step method.
Submitted 10 October, 2022;
originally announced October 2022.
-
Active beam steering enabled by photonic crystal surface emitting laser
Authors:
Mingjin Wang,
Zihao Chen,
Yuanbo Xu,
Jingxuan Chen,
Jiahao Si,
Zheng Zhang,
Chao Peng,
Wanhua Zheng
Abstract:
Emitting light towards on-demand directions is important for various optoelectronic applications, such as optical communication, displaying, and ranging. However, almost all existing directional emitters are assemblies of passive optical antennae and external light sources, which are usually bulky, fragile, and with unendurable loss of light power. Here we theoretically propose and experimentally demonstrate a new conceptual design of directional emitter, by using a single surface-emitting laser source itself to achieve dynamically controlled beam steering. The laser is built on photonic crystals that operate near the band edges in the continuum. By shrinking laser sizes into tens-of-wavelength, the optical modes quantize in three-dimensional momentum space, and each of them directionally radiates towards the far-field. Further utilizing the luminescence spectrum shifting effect under current injection, we consecutively select a sequence of modes into lasing action and show the laser maintaining single-mode operation with linewidths at a minimum of $1.8$ MHz and emitting power of $\sim$ ten milliwatts, and we demonstrate fast beam steering across a range of $3.2^\circ \times 4^\circ$ in a time scale of $500$ nanoseconds. Our work proposes a novel method for on-chip active beam steering, which could pave the way for the development of automotive, industrial, and robotic applications.
Submitted 7 October, 2022;
originally announced October 2022.
-
Weak-shot Semantic Segmentation via Dual Similarity Transfer
Authors:
Junjie Chen,
Li Niu,
Siyuan Zhou,
Jianlou Si,
Chen Qian,
Liqing Zhang
Abstract:
Semantic segmentation is an important and prevalent task, but severely suffers from the high cost of pixel-level annotations when extending to more classes in wider applications. To this end, we focus on the problem named weak-shot semantic segmentation, where the novel classes are learnt from cheaper image-level labels with the support of base classes having off-the-shelf pixel-level labels. To tackle this problem, we propose SimFormer, which performs dual similarity transfer upon MaskFormer. Specifically, MaskFormer disentangles the semantic segmentation task into two sub-tasks: proposal classification and proposal segmentation for each proposal. Proposal segmentation allows proposal-pixel similarity transfer from base classes to novel classes, which enables the mask learning of novel classes. We also learn pixel-pixel similarity from base classes and distill such class-agnostic semantic similarity to the semantic masks of novel classes, which regularizes the segmentation model with pixel-level semantic relationship across images. In addition, we propose a complementary loss to facilitate the learning of novel classes. Comprehensive experiments on the challenging COCO-Stuff-10K and ADE20K datasets demonstrate the effectiveness of our method. Codes are available at https://github.com/bcmi/SimFormer-Weak-Shot-Semantic-Segmentation.
Submitted 5 October, 2022;
originally announced October 2022.
-
A Cooperative Deception Strategy for Covert Communication in Presence of a Multi-antenna Adversary
Authors:
Jiangbo Si,
Zizhen Liu,
Zan Li,
Hang Hu,
Lei Guan,
Chao Wang,
Naofal Al-Dhahir
Abstract:
Covert transmission is investigated for a cooperative deception strategy, where a cooperative jammer (Jammer) tries to attract a multi-antenna adversary (Willie) and degrade the adversary's reception ability for the signal from a transmitter (Alice). For this strategy, we formulate an optimization problem to maximize the covert rate when three different types of channel state information (CSI) are available. The total power is optimally allocated between Alice and Jammer subject to a Kullback-Leibler (KL) divergence constraint. Different from the existing literature, in our proposed strategy, we also determine the optimal transmission power at the jammer when Alice is silent, while existing works always assume that the jammer's power is fixed. Specifically, we apply the S-procedure to convert infinite constraints into linear-matrix-inequality (LMI) constraints. When statistical CSI at Willie is available, we convert double integration to single integration using asymptotic approximation and the substitution method. In addition, the transmission strategy without jammer deception is studied as a benchmark. Finally, our simulation results show that for the proposed strategy, the covert rate increases with the number of antennas at Willie. Moreover, compared to the benchmark, our proposed strategy is more robust in the face of imperfect CSI.
Submitted 7 September, 2022;
originally announced September 2022.
-
A uniqueness property for Bergman functions on the Siegel upper half-space
Authors:
Congwen Liu,
Jiajia Si,
Heng Xu
Abstract:
In this paper, we show that the Bergman functions on the Siegel upper half-space enjoy the following uniqueness property: if $f\in A_t^p(\mathcal{U})$ and $\mathbf{L}^α f\equiv 0$ for some nonnegative multi-index $α$, then $f\equiv 0$, where $\mathbf{L}^α:=(\mathbf{L}_1)^{α_1} \cdots (\mathbf{L}_n)^{α_n}$ with $\mathbf{L}_j = \frac{\partial }{\partial z_j} + 2i \bar{z}_j \frac{\partial }{\partial z_n}$ for $j=1,\ldots, n-1$ and $\mathbf{L}_n = \frac{\partial }{\partial z_n}$. As a consequence, we obtain a new integral representation for the Bergman functions on the Siegel upper half-space. In the end, as an application, we derive a result that relates the Bergman norm to a "derivative norm", which suggests an alternative definition of the Bloch space and a notion of the Besov spaces over the Siegel upper half-space.
Submitted 27 August, 2022;
originally announced August 2022.
-
Harvesting the triplet excitons of quasi-two-dimensional perovskite toward highly efficient white light-emitting diodes
Authors:
Yue Yu,
Chenjing Zhao,
Lin Ma,
Lihe Yan,
Bo Jiao,
Jingrui Li,
Jun Xi,
Jinhai Si,
Yuren Li,
Yanmin Xu,
Hua Dong,
Jingfei Dai,
Fang Yuan,
Peichao Zhu,
Alex K. -Y. Jen,
Zhaoxin Wu
Abstract:
Utilization of triplet excitons, which generally emit poorly, is always fundamental to realize highly efficient organic light-emitting diodes (LEDs). While triplet harvest and energy transfer via electron exchange between triplet donor and acceptor are fully understood in doped organic phosphorescence and delayed fluorescence systems, the utilization and energy transfer of triplet excitons in quasi-two-dimensional (quasi-2D) perovskite are still ambiguous. Here, we use an orange-phosphorescence-emitting ultrathin organic layer to probe triplet behavior in the sky-blue-emitting quasi-2D perovskite. The delicate white LEDs architecture enables a carefully tailored Dexter-like energy-transfer mode that largely rescues the triplet excitons in quasi-2D perovskite. Our white organic-inorganic LEDs achieve maximum forward-viewing external quantum efficiency of 8.6% and luminance over 15000 cd m$^{-2}$, exhibiting a significant efficiency enhancement versus the corresponding sky-blue perovskite LED (4.6%). The efficient management of energy transfer between excitons in quasi-2D perovskite and Frenkel excitons in organic layer opens the door to fully utilizing excitons for white organic-inorganic LEDs.
Submitted 1 December, 2021;
originally announced December 2021.
-
Characterization of the (Cu,C)Ba$_2$Ca$_3$Cu$_4$O$_{11+δ}$ single crystals grown under high pressure
Authors:
Chengping He,
Xue Ming,
Jin Si,
Xiyu Zhu,
Jinhua Wang,
Hai-Hu Wen
Abstract:
By using high pressure and high temperature (3.7 GPa, 1120 $^{\circ}$C) synthesis technique, we have grown (Cu,C)Ba$_2$Ca$_3$Cu$_4$O$_{11+δ}$ single crystals. X-ray diffraction, scanning electron microscopy, resistivity and magnetization measurements are carried out and all show that the samples have good quality. The single crystal has onset and zero-resistance transition temperatures of about 111 K and 109.6 K, indicating a very narrow transition width, which is consistent with a rather sharp magnetization transition. Magnetization hysteresis loops (MHLs) are also measured, showing a pronounced second peak effect in the intermediate temperature region. The magnetic critical current density calculated from the MHLs at 77 K and 1.5 T is about 6.4$\times$10$^4$ A/cm$^2$. By using a criterion of 1$\%$ normal state resistivity, we have determined the irreversibility line which exhibits an irreversibility field of about 8 T at 77 K. Compared with other layered systems, it is easy to find that the irreversibility line is rather high and could be further improved with the optimized transition temperature of about 118 K as previously discovered in polycrystalline samples.
Submitted 22 November, 2021;
originally announced November 2021.