Showing 201–250 of 343 results for author: Hsu, W

  1. arXiv:1910.07800  [pdf, other]

    eess.IV cs.CV

    Organ At Risk Segmentation with Multiple Modality

    Authors: Kuan-Lun Tseng, Winston Hsu, Chun-ting Wu, Ya-Fang Shih, Fan-Yun Sun

    Abstract: With the development of image segmentation in computer vision, biomedical image segmentation has achieved remarkable progress on brain tumor segmentation and Organ At Risk (OAR) segmentation. However, most of the research uses only a single modality such as Computed Tomography (CT) scans, while in real-world scenarios doctors often use multiple modalities to get more accurate results. To better levera…

    Submitted 17 October, 2019; originally announced October 2019.

  2. arXiv:1910.01712  [pdf, other]

    cs.CV

    360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images

    Authors: Shih-Han Chou, Cheng Sun, Wen-Yen Chang, Wan-Ting Hsu, Min Sun, Jianlong Fu

    Abstract: While there are several widely used object detection datasets, current computer vision algorithms are still limited to conventional images. Such images narrow our vision to a restricted region. On the other hand, 360° images provide a thorough sight. In this paper, our goal is to provide a standard dataset to facilitate the vision and machine learning communities in the 360° domain. To facilitate the…

    Submitted 3 October, 2019; originally announced October 2019.

  3. arXiv:1910.00144  [pdf, other]

    astro-ph.SR astro-ph.IM

    Real-time solar image classification: assessing spectral, pixel-based approaches

    Authors: J. Marcus Hughes, Vicki W. Hsu, Daniel B. Seaton, Hazel M. Bain, Jonathan M. Darnel, Larisza Krista

    Abstract: In order to utilize solar imagery for real-time feature identification and large-scale data science investigations of solar structures, we need maps of the Sun where phenomena, or themes, are labeled. Since solar imagers produce observations every few minutes, it is not feasible to label all images by hand. Here, we compare three machine learning algorithms performing solar image classification us…

    Submitted 30 September, 2019; originally announced October 2019.

  4. arXiv:1909.04495  [pdf, other]

    cs.IR cs.CL cs.CR cs.LG

    Natural Adversarial Sentence Generation with Gradient-based Perturbation

    Authors: Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh

    Abstract: This work proposes a novel algorithm to generate natural language adversarial input for text classification models, in order to investigate the robustness of these models. It involves applying gradient-based perturbation on the sentence embeddings that are used as the features for the classifier, and learning a decoder for generation. We apply this method to a sentiment analysis model and verify…

    Submitted 6 September, 2019; originally announced September 2019.

  5. Theory of reflectionless scattering modes

    Authors: William R. Sweeney, Chia Wei Hsu, A. Douglas Stone

    Abstract: We develop the theory of a special type of scattering state in which a set of asymptotic channels are chosen as inputs and the complementary set as outputs, and there is zero reflection back into the input channels. In general an infinite number of such solutions exist at discrete complex frequencies. Our results apply to linear electromagnetic and acoustic wave scattering and also to quantum scat…

    Submitted 9 September, 2019; originally announced September 2019.

    Journal ref: Phys. Rev. A 102, 063511 (2020)

  6. arXiv:1908.08990  [pdf, other]

    cs.CV cs.LG stat.ML

    Deep Multi-Kernel Convolutional LSTM Networks and an Attention-Based Mechanism for Videos

    Authors: Sebastian Agethen, Winston H. Hsu

    Abstract: Action recognition greatly benefits motion understanding in video analysis. Recurrent networks such as long short-term memory (LSTM) networks are a popular choice for motion-aware sequence learning tasks. Recently, a convolutional extension of LSTM was proposed, in which input-to-hidden and hidden-to-hidden transitions are modeled through convolution with a single kernel. This implies an unavoidab…

    Submitted 30 July, 2019; originally announced August 2019.

  7. arXiv:1908.08344  [pdf, other]

    cs.CV

    Indoor Depth Completion with Boundary Consistency and Self-Attention

    Authors: Yu-Kai Huang, Tsung-Han Wu, Yueh-Cheng Liu, Winston H. Hsu

    Abstract: Depth estimation features are helpful for 3D recognition. Commodity-grade depth cameras are able to capture depth and color images in real time. However, glossy, transparent or distant surfaces cannot be scanned properly by the sensor. As a result, enhancement and restoration of sensed depth is an important task. Depth completion aims at filling the holes that sensors fail to detect, which is sti…

    Submitted 8 June, 2022; v1 submitted 22 August, 2019; originally announced August 2019.

    Comments: Accepted by ICCVW (RLQ) 2019. The code is available at https://github.com/tsunghan-wu/Depth-Completion

  8. arXiv:1908.00478  [pdf, other]

    cs.CV

    A Unified Point-Based Framework for 3D Segmentation

    Authors: Hung-Yueh Chiang, Yen-Liang Lin, Yueh-Cheng Liu, Winston H. Hsu

    Abstract: 3D point cloud segmentation remains challenging for structureless and textureless regions. We present a new unified point-based framework for 3D point cloud segmentation that effectively optimizes pixel-level features, geometrical structures and global context priors of an entire scene. By back-projecting 2D image features into 3D coordinates, our network learns 2D textural appearance and 3D struc…

    Submitted 18 August, 2019; v1 submitted 1 August, 2019; originally announced August 2019.

  9. arXiv:1907.12289  [pdf, other]

    econ.GN

    Cities and space: Common power laws and spatial fractal structures

    Authors: Tomoya Mori, Tony E. Smith, Wen-Tai Hsu

    Abstract: City size distributions are known to be well approximated by power laws across a wide range of countries. But such distributions are also meaningful at other spatial scales, such as within certain regions of a country. Using data from China, France, Germany, India, Japan, and the US, we first document that large cities are significantly more spaced out than would be expected by chance alone. We ne…

    Submitted 29 July, 2019; originally announced July 2019.

    Comments: 7 pages, 6 figures

  10. arXiv:1907.07768  [pdf, other]

    cs.IR cs.CR cs.LG cs.SI stat.ML

    A Novel Approach for Detection and Ranking of Trendy and Emerging Cyber Threat Events in Twitter Streams

    Authors: Avishek Bose, Vahid Behzadan, Carlos Aguirre, William H. Hsu

    Abstract: We present a new machine learning and text information extraction approach to detection of cyber threat events in Twitter that are novel (previously non-extant) and developing (marked by significance with respect to similarity with a previously detected event). While some existing approaches to event detection measure novelty and trendiness, typically as independent criteria and occasionally as a…

    Submitted 12 July, 2019; originally announced July 2019.

    Comments: 9 pages, 3 figures, and 5 tables

  11. arXiv:1907.04355  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Transfer Learning from Audio-Visual Grounding to Speech Recognition

    Authors: Wei-Ning Hsu, David Harwath, James Glass

    Abstract: Transfer learning aims to reduce the amount of data required to excel at a new task by re-using the knowledge acquired from learning other related tasks. This paper proposes a novel transfer learning scenario, which distills robust phonetic features from grounding models that are trained to tell whether a pair of image and speech are semantically correlated, without using any textual transcripts.…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: Accepted to Interspeech 2019. 4 pages, 2 figures

  12. arXiv:1907.03049  [pdf, other]

    cs.CV cs.CL

    Video Question Generation via Cross-Modal Self-Attention Networks Learning

    Authors: Yu-Siang Wang, Hung-Ting Su, Chen-Hsi Chang, Zhe-Yu Liu, Winston H. Hsu

    Abstract: We introduce a novel task, Video Question Generation (Video QG). A Video QG model automatically generates questions given a video clip and its corresponding dialogues. Video QG requires a range of skills -- sentence comprehension, temporal relation, the interplay between vision and language, and the ability to ask meaningful questions. To address this, we propose a novel semantic-rich cross-modal…

    Submitted 16 February, 2020; v1 submitted 5 July, 2019; originally announced July 2019.

    Comments: Accepted by ICASSP 2020

  13. arXiv:1907.02907  [pdf, other]

    stat.ML cs.LG

    Hybridized Threshold Clustering for Massive Data

    Authors: Jianmei Luo, ChandraVyas Annakula, Aruna Sai Kannamareddy, Jasjeet S. Sekhon, William Henry Hsu, Michael Higgins

    Abstract: As the size $n$ of datasets becomes massive, many commonly-used clustering algorithms (for example, $k$-means or hierarchical agglomerative clustering (HAC)) require prohibitive computational cost and memory. In this paper, we propose a solution to these clustering problems by extending threshold clustering (TC) to problems of instance selection. TC is a recently developed clustering algorithm desig…

    Submitted 5 July, 2019; originally announced July 2019.

  14. arXiv:1907.01131  [pdf, other]

    cs.CV

    Learnable Gated Temporal Shift Module for Deep Video Inpainting

    Authors: Ya-Liang Chang, Zhe Yu Liu, Kuan-Ying Lee, Winston Hsu

    Abstract: How to efficiently utilize temporal information to recover videos in a consistent way is the main issue for video inpainting problems. Conventional 2D CNNs have achieved good performance on image inpainting but often lead to temporally inconsistent results where frames will flicker when applied to videos (see https://www.youtube.com/watch?v=87Vh1HDBjD0&list=PLPoVtv-xp_dL5uckIzz1PKwNjg1yI0I94&index…

    Submitted 9 July, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: Accepted to BMVC 2019

  15. arXiv:1906.06460  [pdf, other]

    physics.optics cond-mat.dis-nn

    Angular memory effect of transmission eigenchannels

    Authors: Hasan Yılmaz, Chia Wei Hsu, Arthur Goetschy, Stefan Bittner, Stefan Rotter, Alexey Yamilov, Hui Cao

    Abstract: The optical memory effect has emerged as a powerful tool for imaging through multiple-scattering media; however, the finite angular range of the memory effect limits the field of view. Here, we demonstrate experimentally that selective coupling of incident light into a high-transmission channel increases the angular memory-effect range. This enhancement is attributed to the robustness of the high-…

    Submitted 18 October, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

    Journal ref: Phys. Rev. Lett. 123, 203901 (2019)

  16. arXiv:1906.01126  [pdf, ps, other]

    cs.LG cs.AI cs.CR stat.ML

    Sequential Triggers for Watermarking of Deep Reinforcement Learning Policies

    Authors: Vahid Behzadan, William Hsu

    Abstract: This paper proposes a novel scheme for the watermarking of Deep Reinforcement Learning (DRL) policies. This scheme provides a mechanism for the integration of a unique identifier within the policy in the form of its response to a designated sequence of state transitions, while incurring minimal impact on the nominal performance of the policy. The applications of this watermarking scheme include de…

    Submitted 3 June, 2019; originally announced June 2019.

  17. arXiv:1906.01121  [pdf, ps, other]

    cs.LG cs.AI cs.CR stat.ML

    Adversarial Exploitation of Policy Imitation

    Authors: Vahid Behzadan, William Hsu

    Abstract: This paper investigates a class of attacks targeting the confidentiality aspect of security in Deep Reinforcement Learning (DRL) policies. Recent research has established the vulnerability of supervised machine learning models (e.g., classifiers) to model extraction attacks. Such attacks leverage the loosely-restricted ability of the attacker to iteratively query the model for labels, thereby all…

    Submitted 3 June, 2019; originally announced June 2019.

  18. arXiv:1906.01119  [pdf, ps, other]

    cs.LG cs.AI cs.CR stat.ML

    Analysis and Improvement of Adversarial Training in DQN Agents With Adversarially-Guided Exploration (AGE)

    Authors: Vahid Behzadan, William Hsu

    Abstract: This paper investigates the effectiveness of adversarial training in enhancing the robustness of Deep Q-Network (DQN) policies to state-space perturbations. We first present a formal analysis of adversarial training in DQN agents and its performance with respect to the proportion of adversarial perturbations to nominal observations used for training. Next, we consider the sample-inefficiency of cu…

    Submitted 3 June, 2019; originally announced June 2019.

  19. arXiv:1906.01110  [pdf, ps, other]

    cs.LG cs.AI cs.CR eess.SY stat.ML

    RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies

    Authors: Vahid Behzadan, William Hsu

    Abstract: This paper investigates the resilience and robustness of Deep Reinforcement Learning (DRL) policies to adversarial perturbations in the state space. We first present an approach for the disentanglement of vulnerabilities caused by representation learning of DRL agents from those that stem from the sensitivity of the DRL policies to distributional shifts in state transitions. Building on this appro…

    Submitted 3 June, 2019; originally announced June 2019.

  20. arXiv:1904.10247  [pdf, other]

    cs.CV

    Free-form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN

    Authors: Ya-Liang Chang, Zhe Yu Liu, Kuan-Ying Lee, Winston Hsu

    Abstract: Free-form video inpainting is a very challenging task that could be widely used for video editing such as text removal. Existing patch-based methods could not handle non-repetitive structures such as faces, while directly applying image-based inpainting models to videos will result in temporal inconsistency (see http://bit.ly/2Fu1n6b). In this paper, we introduce a deep learning based free-form…

    Submitted 23 July, 2019; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: Accepted to ICCV 2019

  21. arXiv:1904.09722  [pdf, other]

    cs.CV

    FishNet: A Camera Localizer using Deep Recurrent Networks

    Authors: Hsin-I Chen, Sebastian Agethen, Chiamin Wu, Winston Hsu, Bing-Yu Chen

    Abstract: This paper proposes a robust localization system that employs deep learning for better scene representation, and enhances the accuracy of 6-DOF camera pose estimation. Inspired by the fact that global scene structure can be revealed by wide field-of-view, we leverage the large overlap of a fisheye camera between adjacent frames, and the powerful high-level feature representations of deep learning.…

    Submitted 22 April, 2019; originally announced April 2019.

  22. arXiv:1904.06726  [pdf, other]

    cs.CV

    VORNet: Spatio-temporally Consistent Video Inpainting for Object Removal

    Authors: Ya-Liang Chang, Zhe Yu Liu, Winston Hsu

    Abstract: Video object removal is a challenging task in video processing that often requires massive human effort. Given the mask of the foreground object in each frame, the goal is to complete (inpaint) the object region and generate a video without the target object. While deep learning based methods have recently achieved great success on the image inpainting task, they often lead to inconsistent result…

    Submitted 14 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPRW 2019

  23. arXiv:1904.03240  [pdf, ps, other]

    cs.CL cs.LG cs.SD eess.AS

    An Unsupervised Autoregressive Model for Speech Representation Learning

    Authors: Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass

    Abstract: This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is designed to preserve information for a wide range of downstream tasks. In addition, the proposed model does not require any phonetic or word boundary labels, allowing…

    Submitted 18 June, 2019; v1 submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted to Interspeech 2019. Code available at: https://github.com/iamyuanchung/Autoregressive-Predictive-Coding

  24. arXiv:1904.02032  [pdf, other]

    cs.IR cs.CL

    OpBerg: Discovering causal sentences using optimal alignments

    Authors: Justin Wood, Nicholas J. Matiasz, Alcino J. Silva, William Hsu, Alexej Abyzov, Wei Wang

    Abstract: The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning…

    Submitted 3 April, 2019; originally announced April 2019.

  25. arXiv:1903.02157  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Tailoring excitonic states of van der Waals bilayers through stacking configuration, band alignment and valley-spin

    Authors: Wei-Ting Hsu, Bo-Han Lin, Li-Syuan Lu, Ming-Hao Lee, Ming-Wen Chu, Lain-Jong Li, Wang Yao, Wen-Hao Chang, Chih-Kang Shih

    Abstract: Excitons in monolayer semiconductors have a large optical transition dipole for strong coupling with the light field. Interlayer excitons in heterobilayers, with layer separation of electron and hole components, feature a large electric dipole that enables strong coupling with an electric field and exciton-exciton interaction, at the cost that the optical dipole is substantially quenched (by several orders o…

    Submitted 5 March, 2019; originally announced March 2019.

  26. arXiv:1903.01214  [pdf]

    cs.CV

    Understanding the Mechanism of Deep Learning Framework for Lesion Detection in Pathological Images with Breast Cancer

    Authors: Wei-Wen Hsu, Chung-Hao Chen, Chang Hoa, Yu-Ling Hou, Xiang Gao, Yun Shao, Xueli Zhang, Jingjing Wang, Tao He, Yanghong Tai

    Abstract: The computer-aided detection (CADe) systems are developed to assist pathologists in slide assessment, increasing diagnosis efficiency and reducing missing inspections. Many studies have shown such a CADe system with deep learning approaches outperforms the one using conventional methods that rely on hand-crafted features based on field-knowledge. However, most developers who adopted deep learning…

    Submitted 4 March, 2019; originally announced March 2019.

    Comments: v1

  27. arXiv:1902.08295  [pdf, other]

    cs.LG stat.ML

    Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

    Authors: Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, et al. (66 additional authors not shown)

    Abstract: Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly w…

    Submitted 21 February, 2019; originally announced February 2019.

  28. Thermodynamics of $f(R)$ Gravity with Disformal Transformation

    Authors: Chao-Qiang Geng, Wei-Cheng Hsu, Jhih-Rong Lu, Ling-Wei Luo

    Abstract: We study thermodynamics in $f(R)$ gravity with the disformal transformation. The transformation applied to the matter Lagrangian has the form of $\gamma_{\mu\nu} = A(\phi,X)g_{\mu\nu} + B(\phi,X)\partial_\mu\phi\partial_\nu\phi$ with the assumption of the Minkowski matter metric $\gamma_{\mu\nu} = \eta_{\mu\nu}$, where $\phi$ is the disformal scalar and $X$ is the corresponding kinetic term of $\phi$. We verify the generalized first and secon…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: 23 pages, no figure, published version in Entropy 21, 172 (2019)

    Journal ref: Entropy 21, 172 (2019)

  29. Revised JNLPBA Corpus: A Revised Version of Biomedical NER Corpus for Relation Extraction Task

    Authors: Ming-Siang Huang, Po-Ting Lai, Richard Tzong-Han Tsai, Wen-Lian Hsu

    Abstract: The advancement of biomedical named entity recognition (BNER) and biomedical relation extraction (BRE) research promotes the development of text mining in biological domains. As a cornerstone of BRE, a robust BNER system is required to identify the mentioned NEs in plain texts for the subsequent relation extraction stage. However, the current BNER corpora, which play important roles in these tasks, paid…

    Submitted 29 January, 2019; originally announced January 2019.

    Comments: 17 pages

    Journal ref: Briefings in Bioinformatics, 2020, bbaa054

  30. Bound states in the continuum through environmental design

    Authors: Alexander Cerjan, Chia Wei Hsu, Mikael C. Rechtsman

    Abstract: We propose a new paradigm for realizing bound states in the continuum (BICs) by engineering the environment of a system to control the number of available radiation channels. Using this method, we demonstrate that a photonic crystal slab embedded in a photonic crystal environment can exhibit both isolated points and lines of BICs in different regions of its Brillouin zone. Finally, we demonstrate…

    Submitted 21 January, 2019; originally announced January 2019.

    Comments: 7 pages, 3 figures

    Journal ref: Phys. Rev. Lett. 123, 023902 (2019)

  31. arXiv:1811.02629  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

    Authors: Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, et al. (402 additional authors not shown)

    Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles dissem…

    Submitted 23 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: The International Multimodal Brain Tumor Segmentation (BraTS) Challenge

  32. Spatio-temporal correlations in multimode fibers for pulse delivery

    Authors: Wen Xiong, Chia Wei Hsu, Hui Cao

    Abstract: Long-range speckle correlations play an essential role in wave transport through disordered media, but have rarely been studied in other complex systems. Here we discover spatio-temporal intensity correlations for an optical pulse propagating through a multimode fiber with strong random mode coupling. Positive long-range correlations arise from multiple scattering in fiber mode space and depend on…

    Submitted 6 November, 2018; originally announced November 2018.

    Journal ref: Nature Communications 10, 2973 (2019)

  33. arXiv:1811.02328  [pdf, other]

    cs.CV

    Super-Identity Convolutional Neural Network for Face Hallucination

    Authors: Kaipeng Zhang, Zhanpeng Zhang, Chia-Wen Cheng, Winston H. Hsu, Yu Qiao, Wei Liu, Tong Zhang

    Abstract: Face hallucination is a generative task to super-resolve the facial image with low resolution while human perception of face heavily relies on identity information. However, previous face hallucination approaches largely ignore facial identity recovery. This paper proposes Super-Identity Convolutional Neural Network (SICNN) to recover identity information for generating faces close to the real id…

    Submitted 6 November, 2018; originally announced November 2018.

    Comments: Published in ECCV 2018

  34. arXiv:1810.11166  [pdf, other]

    cond-mat.mtrl-sci

    Essential properties of Li/Li$^+$ graphite intercalation compounds

    Authors: Shih-Yang Lin, Wei-Bang Li, Ngoc Thanh Thuy Tran, Wen-Dung Hsu, Hsin-Yi Liu, Ming Fa-Lin

    Abstract: The essential properties of graphite-based 3D systems are thoroughly investigated by the first-principles method. Such materials cover a simple hexagonal graphite, a Bernal graphite, and the stage-1 to stage-4 Li/Li$^+$ graphite intercalation compounds. The delicate calculations and the detailed analyses are done for their optimal stacking configurations, bond lengths, interlayer distances, free e…

    Submitted 25 October, 2018; originally announced October 2018.

  35. arXiv:1810.07217  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Hierarchical Generative Modeling for Controllable Speech Synthesis

    Authors: Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang

    Abstract: This paper proposes a neural sequence-to-sequence text-to-speech (TTS) model which can control latent attributes in the generated speech that are rarely annotated in the training data, such as speaking style, accent, background noise, and recording conditions. The model is formulated as a conditional generative model based on the variational autoencoder (VAE) framework, with two levels of hierarch…

    Submitted 27 December, 2018; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: 27 pages, accepted to ICLR 2019

  36. Scattering concentration bounds: Brightness theorems for waves

    Authors: Hanwen Zhang, Chia Wei Hsu, Owen D. Miller

    Abstract: The brightness theorem---brightness is nonincreasing in passive systems---is a foundational conservation law, with applications ranging from photovoltaics to displays, yet it is restricted to the field of ray optics. For general linear wave scattering, we show that power per scattering channel generalizes brightness, and we derive power-concentration bounds for systems of arbitrary coherence. The…

    Submitted 2 September, 2019; v1 submitted 5 October, 2018; originally announced October 2018.

    Journal ref: Optica 6, 1321 (2019)

  37. arXiv:1809.04458  [pdf, other]

    eess.AS cs.CL cs.LG

    Unsupervised Representation Learning of Speech for Dialect Identification

    Authors: Suwon Shon, Wei-Ning Hsu, James Glass

    Abstract: In this paper, we explore the use of a factorized hierarchical variational autoencoder (FHVAE) model to learn an unsupervised latent representation for dialect identification (DID). An FHVAE can learn a latent space that separates the more static attributes within an utterance from the more dynamic attributes by encoding them into two different sets of latent variables. Useful factors for dialect…

    Submitted 12 September, 2018; originally announced September 2018.

    Comments: Accepted at SLT 2018

  38. arXiv:1809.01540  [pdf]

    cs.CR

    Fail-Stop Group Signature Scheme

    Authors: Yi-Yuan Chiang, Wang-Hsin Hsu, Wen-Yen Lin, Jonathan Jen-Rong Chen

    Abstract: In this paper, we propose a Fail-Stop Group Signature Scheme (FSGSS). FSGSS combines the features of the Group Signature and the Fail-Stop Signature to enhance the security level of the original Group Signature. Assuming that the FSGSS encounters an attack by a hacker armed with a supercomputer, this scheme can prove that the digital signature is indeed forged. Based on the above objectives, this…

    Submitted 5 September, 2018; originally announced September 2018.

    Comments: 11 pages, 2 figures, 1 table

  39. arXiv:1808.10128  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis

    Authors: Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, RJ Skerry-Ryan

    Abstract: Although end-to-end text-to-speech (TTS) models such as Tacotron have shown excellent results, they typically require a sizable set of high-quality <text, audio> pairs for training, which are expensive to collect. In this paper, we propose a semi-supervised training framework to improve the data efficiency of Tacotron. The idea is to allow Tacotron to utilize textual and acoustic knowledge contain…

    Submitted 30 August, 2018; originally announced August 2018.

  40. arXiv:1807.11037  [pdf, other]

    cs.CV

    Efficient Uncertainty Estimation for Semantic Segmentation in Videos

    Authors: Po-Yu Huang, Wan-Ting Hsu, Chun-Yueh Chiu, Ting-Fan Wu, Min Sun

    Abstract: Uncertainty estimation in deep learning has become more important recently. A deep learning model can't be applied in real applications if we don't know whether the model is certain about the decision or not. Some literature proposes the Bayesian neural network which can estimate the uncertainty by Monte Carlo Dropout (MC dropout). However, MC dropout needs to forward the model $N$ times which result…

    Submitted 29 July, 2018; originally announced July 2018.

    Comments: 16 pages. ECCV 2018

  41. arXiv:1807.08805  [pdf, other]

    physics.optics cond-mat.mes-hall

    Perfectly absorbing exceptional points and chiral absorbers

    Authors: William R. Sweeney, Chia Wei Hsu, Stefan Rotter, A. Douglas Stone

    Abstract: We identify a new kind of physically realizable exceptional point (EP) corresponding to degenerate coherent perfect absorption, in which two purely incoming solutions of the wave operator for electromagnetic or acoustic waves coalesce to a single state. Such non-Hermitian degeneracies can occur at a real-valued frequency without any associated noise or non-linearity, in contrast to EPs in lasers.…

    Submitted 25 July, 2018; v1 submitted 23 July, 2018; originally announced July 2018.

    Comments: 6 pages main text, 4 supplemental; 3 figures

    Journal ref: Phys. Rev. Lett. 122, 093901 (2019)

  42. arXiv:1807.06821  [pdf, other]

    cs.CV

    Computed Tomography Image Enhancement using 3D Convolutional Neural Network

    Authors: Meng Li, Shiwen Shen, Wen Gao, William Hsu, Jason Cong

    Abstract: Computed tomography (CT) is increasingly being used for cancer screening, such as early detection of lung cancer. However, CT studies have varying pixel spacing due to differences in acquisition parameters. Thick slice CTs have lower resolution, hindering tasks such as nodule characterization during computer-aided detection due to partial volume effect. In this study, we propose a novel 3D enhance…

    Submitted 18 July, 2018; originally announced July 2018.

  43. arXiv:1806.11149  [pdf, other]

    physics.optics cond-mat.dis-nn

    Statistical Description of Transport in Multimode Fibers with Mode-Dependent Loss

    Authors: P. Chiarawongse, H. Li, W. Xiong, C. W. Hsu, H. Cao, T. Kottos

    Abstract: We analyze coherent wave transport in a new physical setting associated with multimode wave systems where reflection is completely suppressed and mode-dependent losses together with mode-mixing are dictating the wave propagation. An additional physical constraint is the fact that in realistic circumstances the access to the scattering (or transmission) matrix is incomplete. We have addressed all t…

    Submitted 28 June, 2018; originally announced June 2018.

    Journal ref: New Journal of Physics 20, 113028 (2018)

  44. arXiv:1806.04872  [pdf, other]

    cs.CL cs.LG cs.NE cs.SD eess.AS

    Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition

    Authors: Wei-Ning Hsu, Hao Tang, James Glass

    Abstract: The current trend in automatic speech recognition is to leverage large amounts of labeled data to train supervised neural network models. Unfortunately, obtaining data for a wide range of domains to train robust models can be costly. However, it is relatively inexpensive to collect large amounts of unlabeled data from domains that we want the models to generalize to. In this paper, we propose a no…

    Submitted 13 June, 2018; originally announced June 2018.

    Comments: to appear in Interspeech 2018

  45. arXiv:1806.04841  [pdf, ps, other]

    cs.CL cs.LG cs.SD eess.AS

    A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition

    Authors: Hao Tang, Wei-Ning Hsu, Francois Grondin, James Glass

    Abstract: Speech recognizers trained on close-talking speech do not generalize to distant speech and the word error rate degradation can be as large as 40% absolute. Most studies focus on tackling distant speech recognition as a separate problem, leaving little effort to adapting close-talking speech recognizers to distant speech. In this work, we review several approaches from a domain adaptation perspecti…

    Submitted 13 June, 2018; originally announced June 2018.

    Comments: Interspeech, 2018

  46. Transverse localization of transmission eigenchannels

    Authors: Hasan Yılmaz, Chia Wei Hsu, Alexey Yamilov, Hui Cao

    Abstract: Transmission eigenchannels are building blocks of coherent wave transport in diffusive media, and selective excitation of individual eigenchannels can lead to diverse transport behavior. An essential yet poorly understood property is the transverse spatial profile of each eigenchannel, which is critical for coupling into and out of it. Here, we discover that the transmission eigenchannels of a dis…

    Submitted 7 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

    Journal ref: Nat. Photonics Vol. 13, pp. 352-358 (2019)

  47. arXiv:1806.01116  [pdf, other]

    cs.DC cs.LG

    Machine Learning for Predictive Analytics of Compute Cluster Jobs

    Authors: Dan Andresen, William Hsu, Huichen Yang, Adedolapo Okanlawon

    Abstract: We address the problem of predicting whether sufficient memory and CPU resources have been requested for jobs at submission time. For this purpose, we examine the task of training a supervised machine learning system to predict the outcome - whether the job will fail specifically due to insufficient resources - as a classification task. Sufficiently high accuracy, precision, and recall at this tas…

    Submitted 19 May, 2018; originally announced June 2018.

    Comments: 7 pages, CSC'18 - The 16th Int'l Conf on Scientific Computing

  48. arXiv:1806.00712  [pdf, ps, other]

    cs.CV cs.AI

    An Interpretable Deep Hierarchical Semantic Convolutional Neural Network for Lung Nodule Malignancy Classification

    Authors: Shiwen Shen, Simon X. Han, Denise R. Aberle, Alex A. T. Bui, William Hsu

    Abstract: While deep learning methods are increasingly being applied to tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered as a "black-box." The lack of model interpretability hinders them from being fully understood by target users such as radiologists. In this paper, we present a novel interpretable deep hier…

    Submitted 2 June, 2018; originally announced June 2018.

  49. Quantum Noise Theory of Exceptional Point Sensors

    Authors: Mengzhen Zhang, William Sweeney, Chia Wei Hsu, Lan Yang, A. D. Stone, Liang Jiang

    Abstract: Distinct from closed quantum systems, non-Hermitian systems can have exceptional points (EPs) where both eigenvalues and eigenvectors coalesce. Recently, it has been proposed and demonstrated that EPs can enhance the performance of sensors in terms of amplification of the detected signal. Meanwhile, the noise might also be amplified at EPs and it is not obvious whether exceptional points will still imp…

    Submitted 25 January, 2019; v1 submitted 30 May, 2018; originally announced May 2018.

    Comments: 5 pages, 2 figures

    Journal ref: Phys. Rev. Lett. 123, 180501 (2019)

  50. arXiv:1805.11264  [pdf, other]

    stat.ML cs.CL cs.LG cs.SD eess.AS

    Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data

    Authors: Wei-Ning Hsu, James Glass

    Abstract: Multimodal sensory data resembles the form of information perceived by humans for learning, and is easy to obtain in large quantities. Compared to unimodal data, synchronization of concepts between modalities in such data provides supervision for disentangling the underlying explanatory factors of each modality. Previous work leveraging multimodal data has mainly focused on retaining only the mod…

    Submitted 29 May, 2018; originally announced May 2018.