
Showing 151–200 of 270 results for author: Gan, Z

  1. arXiv:2009.14167  [pdf, other]

    cs.CL cs.LG

    Contrastive Distillation on Intermediate Representations for Language Model Compression

    Authors: Siqi Sun, Zhe Gan, Yu Cheng, Yuwei Fang, Shuohang Wang, Jingjing Liu

    Abstract: Existing language model compression methods mostly use a simple L2 loss to distill knowledge in the intermediate representations of a large BERT model to a smaller one. Although widely used, this objective by design assumes that all the dimensions of hidden representations are independent, failing to capture important structural knowledge in the intermediate layers of the teacher network. To achie…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: Accepted by EMNLP 2020

  2. arXiv:2009.06097  [pdf, other]

    cs.CL

    Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding

    Authors: Shuohang Wang, Luowei Zhou, Zhe Gan, Yen-Chun Chen, Yuwei Fang, Siqi Sun, Yu Cheng, Jingjing Liu

    Abstract: Transformer has become ubiquitous in the deep learning field. One of the key ingredients that destined its success is the self-attention mechanism, which allows fully-connected contextual encoding over input tokens. However, despite its effectiveness in modeling short sequences, self-attention suffers when handling inputs with extreme long-range dependencies, as its complexity grows quadratically…

    Submitted 7 June, 2021; v1 submitted 13 September, 2020; originally announced September 2020.

    Comments: ACL Findings 2021, 11 pages

  3. arXiv:2009.05167  [pdf, other]

    cs.CL

    Accelerating Real-Time Question Answering via Question Generation

    Authors: Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu, Chenguang Zhu

    Abstract: Although deep neural networks have achieved tremendous success for question answering (QA), they are still suffering from heavy computational and energy cost for real product deployment. Further, existing QA systems are bottlenecked by the encoding time of real-time questions with neural networks, thus suffering from detectable latency in deployment for large-volume traffic. To reduce the computat…

    Submitted 1 September, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

  4. arXiv:2009.05166  [pdf, other]

    cs.CL

    FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

    Authors: Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu

    Abstract: Large-scale cross-lingual language models (LM), such as mBERT, Unicoder and XLM, have achieved great success in cross-lingual representation learning. However, when applied to zero-shot cross-lingual transfer tasks, most existing methods use only single-language input for LM finetuning, without leveraging the intrinsic cross-lingual alignment between different languages that proves essential for m…

    Submitted 15 December, 2020; v1 submitted 10 September, 2020; originally announced September 2020.

    Comments: Accepted to AAAI 2021; Top-1 Performance on XTREME (https://sites.research.google/xtreme, September 8, 2020) and XGLUE (https://microsoft.github.io/XGLUE, September 14, 2020) benchmark

  5. arXiv:2006.14744  [pdf, other]

    cs.CL cs.CV cs.LG

    Graph Optimal Transport for Cross-Domain Alignment

    Authors: Liqun Chen, Zhe Gan, Yu Cheng, Linjie Li, Lawrence Carin, Jingjing Liu

    Abstract: Cross-domain alignment between two sets of entities (e.g., objects in an image, words in a sentence) is fundamental to both computer vision and natural language processing. Existing methods mainly focus on designing advanced attention mechanisms to simulate soft alignment, with no training signals to explicitly encourage alignment. The learned attention matrices are also dense and lack interpreta…

    Submitted 24 July, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

    Journal ref: ICML 2020

  6. arXiv:2006.12013  [pdf, other]

    cs.LG stat.ML

    CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information

    Authors: Pengyu Cheng, Weituo Hao, Shuyang Dai, Jiachang Liu, Zhe Gan, Lawrence Carin

    Abstract: Mutual information (MI) minimization has gained considerable interest in various machine learning tasks. However, estimating and minimizing MI in high-dimensional spaces remains a challenging problem, especially when only samples, rather than distribution forms, are accessible. Previous works mainly focus on MI lower bound approximation, which is not applicable to MI minimization problems. In thi…

    Submitted 23 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: Accepted by the 37th International Conference on Machine Learning (ICML 2020)

  7. arXiv:2006.11918  [pdf, ps, other]

    cs.LG stat.ML

    MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients

    Authors: Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein

    Abstract: Adaptive gradient methods such as RMSProp and Adam use an exponential moving estimate of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in the face of noisy objectives. However, Adam can have undesirable convergence behaviors due to unstable or extreme adaptive learning rates. Methods such as AMSGrad and AdaBound have been proposed to stabilize the adaptive lea…

    Submitted 4 July, 2021; v1 submitted 21 June, 2020; originally announced June 2020.

    Comments: ECML PKDD 2021

  8. arXiv:2006.06195  [pdf, other]

    cs.CV cs.CL cs.LG

    Large-Scale Adversarial Training for Vision-and-Language Representation Learning

    Authors: Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, Jingjing Liu

    Abstract: We present VILLA, the first known effort on large-scale adversarial training for vision-and-language (V+L) representation learning. VILLA consists of two training stages: (i) task-agnostic adversarial pre-training; followed by (ii) task-specific adversarial finetuning. Instead of adding adversarial perturbations on image pixels and textual tokens, we propose to perform adversarial training in the…

    Submitted 22 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020 Spotlight paper

  9. arXiv:2006.03315  [pdf, other]

    cs.CV cs.LG eess.IV

    Multi-modal Feature Fusion with Feature Attention for VATEX Captioning Challenge 2020

    Authors: Ke Lin, Zhuoxin Gan, Liwei Wang

    Abstract: This report describes our model for the VATEX Captioning Challenge 2020. First, to gather information from multiple domains, we extract motion, appearance, semantic and audio features. Then we design a feature attention module to attend to different features when decoding. We apply two types of decoders, top-down and X-LAN, and ensemble these models to get the final result. The proposed method outperfor…

    Submitted 5 June, 2020; originally announced June 2020.

  10. arXiv:2005.07310  [pdf, other]

    cs.CV cs.CL

    Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

    Authors: Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu

    Abstract: Recent Transformer-based large-scale pre-trained models have revolutionized vision-and-language (V+L) research. Models such as ViLBERT, LXMERT and UNITER have significantly lifted state of the art across a wide range of V+L benchmarks with joint image-text pre-training. However, little is known about the inner mechanisms that destine their impressive success. To reveal the secrets behind the scene…

    Submitted 18 July, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: Accepted by ECCV 2020 as Spotlight

  11. arXiv:2005.04201  [pdf]

    physics.app-ph cond-mat.mtrl-sci physics.optics

    Scalable functionalization of optical fibers using atomically thin semiconductors

    Authors: Gia Quyet Ngo, Antony George, Robin Tristan Klaus Schock, Alessandro Tuniz, Emad Najafidehaghani, Ziyang Gan, Nils C. Geib, Tobias Bucher, Heiko Knopf, Christof Neumann, Tilman Lühder, Stephen Warren-Smith, Heike Ebendorff-Heidepriem, Thomas Pertsch, Markus A. Schmidt, Andrey Turchanin, Falk Eilenberger

    Abstract: Atomically thin transition metal dichalcogenides are highly promising for integrated optoelectronic and photonic systems due to their exciton-driven linear and nonlinear interaction with light. Integrating them into optical fibers yields novel opportunities in optical communication, remote sensing, and all-fiber optoelectronics. However, scalable and reproducible deposition of high quality monolay…

    Submitted 2 September, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Journal ref: Adv. Mater. 2020, 2003826

  12. arXiv:2005.01279  [pdf, other]

    cs.CL cs.LG

    Improving Adversarial Text Generation by Modeling the Distant Future

    Authors: Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin

    Abstract: Auto-regressive text generation models usually focus on local fluency, and may cause inconsistent semantic meaning in long text generation. Further, automatically generating words with similar semantics is challenging, and hand-crafted linguistic rules are difficult to apply. We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned is…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020. arXiv admin note: substantial text overlap with arXiv:1811.00696

  13. arXiv:2005.00558  [pdf, other]

    cs.CL cs.AI cs.LG

    POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training

    Authors: Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan

    Abstract: Large-scale pre-trained language models, such as BERT and GPT-2, have achieved excellent performance in language representation learning and free-form text generation. However, these models cannot be directly employed to generate text under specified lexical constraints. To address this challenge, we present POINTER (PrOgressive INsertion-based TransformER), a simple yet novel insertion-based appr…

    Submitted 26 September, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: EMNLP 2020 long paper

  14. arXiv:2005.00200  [pdf, other]

    cs.CV cs.CL cs.LG

    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training

    Authors: Linjie Li, Yen-Chun Chen, Yu Cheng, Zhe Gan, Licheng Yu, Jingjing Liu

    Abstract: We present HERO, a novel framework for large-scale video+language omni-representation learning. HERO encodes multimodal inputs in a hierarchical structure, where local context of a video frame is captured by a Cross-modal Transformer via multimodal fusion, and global video context is captured by a Temporal Transformer. In addition to standard Masked Language Modeling (MLM) and Masked Frame Modelin…

    Submitted 29 September, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: Accepted by EMNLP 2020

  15. arXiv:2005.00136  [pdf, other]

    cs.CL cs.LG

    Contextual Text Style Transfer

    Authors: Yu Cheng, Zhe Gan, Yizhe Zhang, Oussama Elachqar, Dianqi Li, Jingjing Liu

    Abstract: We introduce a new task, Contextual Text Style Transfer - translating a sentence into a desired style with its surrounding context taken into account. This brings two key challenges to existing style transfer approaches: ($i$) how to preserve the semantic meaning of the target sentence and its consistency with the surrounding context during transfer; ($ii$) how to train a robust model with limited labeled…

    Submitted 30 April, 2020; originally announced May 2020.

  16. arXiv:2005.00117  [pdf]

    physics.app-ph physics.comp-ph

    Discovering universal scaling laws in 3D printing of metals with genetic programming and dimensional analysis

    Authors: Zhengtao Gan, Orion L. Kafka, Niranjan Parab, Cang Zhao, Olle Heinonen, Tao Sun, Wing Liu

    Abstract: We leverage dimensional analysis and genetic programming (a type of machine learning) to discover two strikingly simple but universal scaling laws, which remain accurate for different materials, processing conditions, and machines in metal three-dimensional (3D) printing. The first one is extracted from high-fidelity high-speed synchrotron X-ray imaging, and defines a new dimensionless number, Key…

    Submitted 27 May, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

  17. arXiv:2005.00054  [pdf, other]

    cs.LG stat.ML

    APo-VAE: Text Generation in Hyperbolic Space

    Authors: Shuyang Dai, Zhe Gan, Yu Cheng, Chenyang Tao, Lawrence Carin, Jingjing Liu

    Abstract: Natural language often exhibits inherent hierarchical structure ingrained with complex syntax and semantics. However, most state-of-the-art deep generative models learn embeddings only in Euclidean vector space, without accounting for this structural property of language. In this paper, we investigate text generation in a hyperbolic latent space to learn continuous hierarchical representations. An…

    Submitted 14 July, 2021; v1 submitted 30 April, 2020; originally announced May 2020.

  18. arXiv:2003.11690  [pdf, other]

    cs.CV

    BachGAN: High-Resolution Image Synthesis from Salient Object Layout

    Authors: Yandong Li, Yu Cheng, Zhe Gan, Licheng Yu, Liqiang Wang, Jingjing Liu

    Abstract: We propose a new task towards more practical application for image generation - high-quality image synthesis from salient object layout. This new setting allows users to provide the layout of salient objects only (i.e., foreground bounding boxes and categories), and lets the model complete the drawing with an invented background and a matching foreground. Two main challenges spring from this new t…

    Submitted 27 March, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020

  19. arXiv:2003.11618  [pdf, other]

    cs.CV cs.AI cs.CL

    VIOLIN: A Large-Scale Dataset for Video-and-Language Inference

    Authors: Jingzhou Liu, Wenhu Chen, Yu Cheng, Zhe Gan, Licheng Yu, Yiming Yang, Jingjing Liu

    Abstract: We introduce a new task, Video-and-Language Inference, for joint multimodal understanding of video and text. Given a video clip with aligned subtitles as premise, paired with a natural language hypothesis based on the video content, a model needs to infer whether the hypothesis is entailed or contradicted by the given video clip. A new large-scale dataset, named Violin (VIdeO-and-Language INferenc…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020

  20. arXiv:2001.06944  [pdf, other]

    cs.CL cs.LG

    Nested-Wasserstein Self-Imitation Learning for Sequence Generation

    Authors: Ruiyi Zhang, Changyou Chen, Zhe Gan, Zheng Wen, Wenlin Wang, Lawrence Carin

    Abstract: Reinforcement learning (RL) has been widely studied for improving sequence-generation models. However, the conventional rewards used for RL training typically cannot capture sufficient semantic information and therefore render model bias. Further, the sparse and delayed rewards make RL exploration inefficient. To alleviate these issues, we propose the concept of nested-Wasserstein distance for dis…

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: Accepted by AISTATS 2020

  21. arXiv:1912.06147  [pdf, other]

    astro-ph.GA

    Infra-Red Emission from Cold Gas Dusty Disks in Massive Ellipticals

    Authors: Zhaoming Gan, Brandon S. Hensley, Jeremiah P. Ostriker, Luca Ciotti, David Schiminovich, Silvia Pellegrini

    Abstract: What is the expected infrared output of elliptical galaxies? Here we report the latest findings obtained in this high time resolution (~10 years) and high spatial resolution (2.5 parsec at center) study. We add a set of grain physics to the MACER code, including (a) dust grains made in passive stellar evolution; (b) dust grain growth due to collision and sticking; (c) grain destruction due to ther…

    Submitted 17 August, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

    Comments: 20 pages, 15 figures; accepted by ApJ

  22. arXiv:1912.05857  [pdf, other]

    astro-ph.GA astro-ph.HE

    Metal abundances in the MACER simulations of the hot interstellar medium

    Authors: S. Pellegrini, Z. Gan, J. P. Ostriker, L. Ciotti

    Abstract: A hot plasma is the dominant phase of the interstellar medium of early-type galaxies. Its origin can reside in stellar mass losses, residual gas from the formation epoch, and accretion from outside of the galaxies. Its evolution is linked to the dynamical structure of the host galaxy, to the supernova and AGN feedback, and to (late-epoch) star formation, in a way that has yet to be fully understoo…

    Submitted 12 December, 2019; originally announced December 2019.

    Comments: 6 pages, 5 figures, Proceedings of the XMM-Newton 2019 science workshop "Astrophysics of hot plasma in extended X-ray sources", to be published in Astron. Nachr

  23. Hot Gas Flows on Parsec Scale in the Low-Luminosity Active Galactic Nucleus NGC 3115

    Authors: Zhiyuan Yao, Zhaoming Gan

    Abstract: NGC 3115 is known as the low-luminosity active galactic nucleus which hosts the nearest ($z\sim0.002$) billion solar mass supermassive black hole ($\sim1.5\times10^9~M_\odot$). Its Bondi radius $r_\mathrm{B}$ ($\sim3\farcs6$) can be readily resolved with Chandra, which offers us an excellent opportunity to investigate the accretion flow onto a supermassive black hole. In this paper, we perform two…

    Submitted 6 December, 2019; originally announced December 2019.

    Comments: 13 pages, 8 figures. Accepted for publication by MNRAS

  24. arXiv:1911.08709  [pdf, other]

    cs.LG stat.ML

    Graph-Driven Generative Models for Heterogeneous Multi-Task Learning

    Authors: Wenlin Wang, Hongteng Xu, Zhe Gan, Bai Li, Guoyin Wang, Liqun Chen, Qian Yang, Wenqi Wang, Lawrence Carin

    Abstract: We propose a novel graph-driven generative model that unifies multiple heterogeneous learning tasks into the same framework. The proposed model is based on the fact that heterogeneous learning tasks, which correspond to different generative processes, often rely on data with a shared graph structure. Accordingly, our model combines a graph convolutional network (GCN) with multiple variational aut…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: Accepted by AAAI-2020

  25. arXiv:1911.03829  [pdf, other]

    cs.CL cs.LG

    Distilling Knowledge Learned in BERT for Text Generation

    Authors: Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu

    Abstract: Large-scale pre-trained language models such as BERT have achieved great success in language understanding tasks. However, it remains an open question how to utilize BERT for language generation. In this paper, we present a novel approach, Conditional Masked Language Modeling (C-MLM), to enable the finetuning of BERT on target generation tasks. The finetuned BERT (teacher) is exploited as extra supe…

    Submitted 17 July, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: ACL 2020

  26. arXiv:1911.03631  [pdf, other]

    cs.CL

    Hierarchical Graph Network for Multi-hop Question Answering

    Authors: Yuwei Fang, Siqi Sun, Zhe Gan, Rohit Pillai, Shuohang Wang, Jingjing Liu

    Abstract: In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question answering. To aggregate clues from scattered texts across multiple paragraphs, a hierarchical graph is created by constructing nodes on different levels of granularity (questions, paragraphs, sentences, entities), the representations of which are initialized with pre-trained contextual encoders. Given this hierarchic…

    Submitted 6 October, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: Accepted to EMNLP 2020

  27. arXiv:1910.14142  [pdf, other]

    cs.CL

    Discourse-Aware Neural Extractive Text Summarization

    Authors: Jiacheng Xu, Zhe Gan, Yu Cheng, Jingjing Liu

    Abstract: Recently BERT has been adopted for document encoding in state-of-the-art text summarization models. However, sentence-based extractive models often result in redundant or uninformative phrases in the extracted summaries. Also, long-range dependencies throughout a document are not well captured by BERT, which is pre-trained on sentence pairs instead of documents. To address these issues, we present…

    Submitted 24 April, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

    Comments: To appear at ACL 2020; Code available at https://github.com/jiacheng-xu/DiscoBERT

  28. arXiv:1910.03230  [pdf, other]

    cs.CV cs.LG

    Meta Module Network for Compositional Visual Reasoning

    Authors: Wenhu Chen, Zhe Gan, Linjie Li, Yu Cheng, William Wang, Jingjing Liu

    Abstract: Neural Module Network (NMN) exhibits strong interpretability and compositionality thanks to its handcrafted neural modules with explicit multi-hop reasoning capability. However, most NMNs suffer from two critical drawbacks: 1) scalability: customized module for specific function renders it impractical when scaling up to a larger set of functions in complex tasks; 2) generalizability: rigid pre-def…

    Submitted 7 November, 2020; v1 submitted 8 October, 2019; originally announced October 2019.

    Comments: Accepted to WACV 21 (Oral)

  29. arXiv:1909.13456  [pdf, other]

    cs.LG cs.CL stat.ML

    Improving Textual Network Learning with Variational Homophilic Embeddings

    Authors: Wenlin Wang, Chenyang Tao, Zhe Gan, Guoyin Wang, Liqun Chen, Xinyuan Zhang, Ruiyi Zhang, Qian Yang, Ricardo Henao, Lawrence Carin

    Abstract: The performance of many network learning applications crucially hinges on the success of network embedding algorithms, which aim to encode rich network information into low-dimensional vertex-based vector representations. This paper considers a novel variational formulation of network embeddings, with special focus on textual networks. Different from most existing methods that optimize a discrimin…

    Submitted 30 September, 2019; originally announced September 2019.

    Comments: Accepted to NeurIPS 2019

  30. arXiv:1909.11764  [pdf, ps, other]

    cs.CL cs.LG

    FreeLB: Enhanced Adversarial Training for Natural Language Understanding

    Authors: Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu

    Abstract: Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models. In this work, we propose a novel adversarial training algorithm, FreeLB, that promotes higher invariance in the embedding space, by adding adversarial perturbations to word embeddings and minimizing the resultant adversaria…

    Submitted 23 April, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: Adding results with ALBERT

  31. arXiv:1909.11740  [pdf, other]

    cs.CV cs.CL cs.LG

    UNITER: UNiversal Image-TExt Representation Learning

    Authors: Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu

    Abstract: Joint image-text embedding is the bedrock for most Vision-and-Language (V+L) tasks, where multimodality inputs are simultaneously processed for joint visual and textual understanding. In this paper, we introduce UNITER, a UNiversal Image-TExt Representation, learned through large-scale pre-training over four image-text datasets (COCO, Visual Genome, Conceptual Captions, and SBU Captions), which ca…

    Submitted 17 July, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: ECCV 2020

  32. arXiv:1909.11125  [pdf, other]

    cs.RO

    Leveraging the Template and Anchor Framework for Safe, Online Robotic Gait Design

    Authors: Jinsun Liu, Pengcheng Zhao, Zhenyu Gan, Matthew Johnson-Roberson, Ram Vasudevan

    Abstract: Online control design using a high-fidelity, full-order model for a bipedal robot can be challenging due to the size of the state space of the model. A commonly adopted solution to overcome this challenge is to approximate the full-order model (anchor) with a simplified, reduced-order model (template), while performing control synthesis. Unfortunately it is challenging to make formal guarantees ab…

    Submitted 24 September, 2019; originally announced September 2019.

  33. arXiv:1909.05316  [pdf, other]

    cs.CL

    What Makes A Good Story? Designing Composite Rewards for Visual Storytelling

    Authors: Junjie Hu, Yu Cheng, Zhe Gan, Jingjing Liu, Jianfeng Gao, Graham Neubig

    Abstract: Previous storytelling approaches mostly focused on optimizing traditional metrics such as BLEU, ROUGE and CIDEr. In this paper, we re-examine this problem from a different angle, by looking deep into what defines a realistically-natural and topically-coherent story. To this end, we propose three assessment criteria: relevance, coherence and expressiveness, which we observe through empirical analys…

    Submitted 25 February, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

    Comments: Accepted paper in Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI) 2020

  34. arXiv:1909.05288  [pdf, other]

    cs.LG stat.ML

    Contrastively Smoothed Class Alignment for Unsupervised Domain Adaptation

    Authors: Shuyang Dai, Yu Cheng, Yizhe Zhang, Zhe Gan, Jingjing Liu, Lawrence Carin

    Abstract: Recent unsupervised approaches to domain adaptation primarily focus on minimizing the gap between the source and the target domains through refining the feature generator, in order to learn a better alignment between the two domains. This minimization can be achieved via a domain classifier to detect target-domain features that are divergent from source-domain features. However, by optimizing via…

    Submitted 6 October, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

  35. arXiv:1909.02050  [pdf, other]

    cs.CL cs.CV

    TIGEr: Text-to-Image Grounding for Image Caption Evaluation

    Authors: Ming Jiang, Qiuyuan Huang, Lei Zhang, Xin Wang, Pengchuan Zhang, Zhe Gan, Jana Diesner, Jianfeng Gao

    Abstract: This paper presents a new metric called TIGEr for the automatic evaluation of image captioning systems. Popular metrics, such as BLEU and CIDEr, are based solely on text matching between reference captions and machine-generated captions, potentially leading to biased evaluations because references may not fully cover the image content and natural language is inherently ambiguous. Building upon a m…

    Submitted 4 September, 2019; originally announced September 2019.

  36. arXiv:1908.09395  [pdf, other]

    cs.CL

    Domain Adaptive Text Style Transfer

    Authors: Dianqi Li, Yizhe Zhang, Zhe Gan, Yu Cheng, Chris Brockett, Ming-Ting Sun, Bill Dolan

    Abstract: Text style transfer without parallel data has achieved some practical success. However, in the scenario where less data is available, these methods may yield poor performance. In this paper, we examine domain adaptation for text style transfer to leverage massively available data from other domains. These data may demonstrate domain shift, which impedes the benefits of utilizing such data for trai…

    Submitted 25 August, 2019; originally announced August 2019.

    Comments: EMNLP 2019, long paper

  37. arXiv:1908.09355  [pdf, other]

    cs.CL cs.LG

    Patient Knowledge Distillation for BERT Model Compression

    Authors: Siqi Sun, Yu Cheng, Zhe Gan, Jingjing Liu

    Abstract: Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks. However, the high demand for computing resources in training such models hinders their application in practice. In order to alleviate this resource hunger in large-scale model training, we propose a Patient Knowledge Distillation approach to compress an original large model (tea…

    Submitted 25 August, 2019; originally announced August 2019.

    Comments: Accepted to EMNLP 2019

  38. arXiv:1908.09209  [pdf, other]

    cs.CL

    Adversarial Domain Adaptation for Machine Reading Comprehension

    Authors: Huazheng Wang, Zhe Gan, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Hongning Wang

    Abstract: In this paper, we focus on unsupervised domain adaptation for Machine Reading Comprehension (MRC), where the source domain has a large amount of labeled data, while only unlabeled passages are available in the target domain. To this end, we propose an Adversarial Domain Adaptation framework (AdaMRC), where ($i$) pseudo questions are first generated for unlabeled passages in the target domain, and…

    Submitted 24 August, 2019; originally announced August 2019.

    Comments: Accepted to EMNLP 2019

  39. Harmonic surface mapping algorithm for electrostatic potentials in an atomistic/continuum hybrid model for electrolyte solutions

    Authors: Jing Fu, Zecheng Gan

    Abstract: Simulating charged many-body systems has been a computationally demanding task due to the long-range nature of electrostatic interaction. For the multi-scale model of electrolytes which combines the strengths of atomistic/continuum electrolyte representations, a harmonic surface mapping algorithm is developed for fast and accurate evaluation of the electrostatic reaction potentials. Our method refor…

    Submitted 6 January, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: 17 pages, 5 figures

    Journal ref: Commun. Comput. Phys., Vol. 29, No. 2, pp. 571-587, 2021

  40. Fine structure in the $α$ decay of $^{223}$U

    Authors: M. D. Sun, Z. Liu, T. H. Huang, W. Q. Zhang, A. N. Andreyev, B. Ding, J. G. Wang, X. Y. Liu, H. Y. Lu, D. S. Hou, Z. G. Gan, L. Ma, H. B. Yang, Z. Y. Zhang, L. Yu, J. Jiang, K. L. Wang, Y. S. Wang, M. L. Liu, Z. H. Li, J. Li, X. Wang, A. H. Feng, C. J. Lin, L. J. Sun , et al. (7 additional authors not shown)

    Abstract: Fine structure in the $α$ decay of $^{223}$U was observed in the fusion-evaporation reaction $^{187}$Re($^{40}$Ar, p3n) using a fast digital pulse processing technique. Two $α$-decay branches of $^{223}$U feeding the ground state and 244 keV excited state of $^{219}$Th were identified by establishing the decay chain $^{223}$U $\xrightarrow{α_{1}}$ $^{219}$Th $\xrightarrow{α_{2}}$ $^{215}$Ra…

    Submitted 22 October, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

    Comments: 6 pages, 6 figures

    Journal ref: Physics Letters B 800 (2020) 135096

  41. Synthesis of model predictive control based on data-driven learning

    Authors: Yuanqiang Zhou, Dewei Li, Yugeng Xi, Zhongxue Gan

    Abstract: For the application of MPC design in on-line regulation or tracking control problems, several studies have attempted to develop an accurate model, and realize adequate uncertainty description of linear or non-linear plants of the processes. In this study, we employ the data-driven learning technique to iteratively approximate the dynamical parameters, without requiring a priori knowledge of system…

    Submitted 29 March, 2019; originally announced April 2019.

    Comments: 4 pages

    Journal ref: SCIENCE CHINA Information Sciences, 2019

  42. arXiv:1903.12314  [pdf, other]

    cs.CV cs.AI

    Relation-Aware Graph Attention Network for Visual Question Answering

    Authors: Linjie Li, Zhe Gan, Yu Cheng, Jingjing Liu

    Abstract: In order to answer semantically-complicated questions about an image, a Visual Question Answering (VQA) model needs to fully understand the visual scene in the image, especially the interactive dynamics between different objects. We propose a Relation-aware Graph Attention Network (ReGAT), which encodes each image into a graph and models multi-type inter-object relations via a graph attention mech…

    Submitted 9 October, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Comments: To appear in ICCV 2019

  43. On-node lattices construction using $\textit{partial}$ Gauss-Hermite quadrature for the lattice Boltzmann method

    Authors: Huanfeng Ye, Zecheng Gan, Bo Kuang, Yanhua Yang

    Abstract: A concise theoretical framework, the $\textit{partial}$ Gauss-Hermite quadrature (pGHQ), is established for constructing on-node lattices of the lattice Boltzmann (LB) method under a Cartesian coordinate system. Compared with existing approaches, the pGHQ scheme has the following advantages: $\textbf{a).}$ extremely concise algorithm, $\textbf{b).}$ unifying the constructing procedure of symmetri…

    Submitted 25 March, 2019; originally announced March 2019.

    Journal ref: Chin. Phys. B, Vol. 28, No. 5 (2019) 054702

  44. arXiv:1903.07137  [pdf, other]

    cs.CL

    Topic-Guided Variational Autoencoders for Text Generation

    Authors: Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, Lawrence Carin

    Abstract: We propose a topic-guided variational autoencoder (TGVAE) model for text generation. Distinct from existing variational autoencoder (VAE) based approaches, which assume a simple Gaussian prior for the latent code, our model specifies the prior as a Gaussian mixture model (GMM) parametrized by a neural topic module. Each mixture component corresponds to a latent topic, which provides guidance to ge…

    Submitted 17 March, 2019; originally announced March 2019.

  45. arXiv:1903.04504  [pdf, other]

    astro-ph.HE

    Multi-Physics of AGN Jets in the Multi-Messenger Era

    Authors: B. Rani, M. Petropoulou, H. Zhang, F. D'Ammando, J. Finke, M. Baring, M. Böttcher, S. Dimitrakoudis, Z. Gan, D. Giannios, D. H. Hartmann, T. P. Krichbaum, A. P. Marscher, A. Mastichiadis, K. Nalewajko, R. Ojha, D. Paneque, C. Shrader, L. Sironi, A. Tchekhovskoy, D. J. Thompson, N. Vlahakis, T. M. Venters

    Abstract: Active galactic nuclei (AGN) with relativistic jets, powered by gas accretion onto their central supermassive black hole (SMBH), are unique laboratories for studying the physics of matter and elementary particles in extreme conditions that cannot be realized on Earth. For a long time since the discovery of AGN, photons were the only way to probe the underlying physical processes. The recent discov…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: submitted to Astro2020 (Astronomy and Astrophysics Decadal Survey)

  46. arXiv:1903.02547  [pdf, other]

    cs.CL cs.CV cs.LG cs.NE cs.RO

    Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

    Authors: Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

    Abstract: We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding, that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et al. (2018). Given a natural language instruction and photo-realistic image views of a previously unseen environment, the agent was tasked with navigating from sourc…

    Submitted 2 April, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: CVPR 2019 Oral, video demo: https://youtu.be/AD9TNohXoPA

  47. arXiv:1902.00579  [pdf, other]

    cs.CV cs.CL

    Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog

    Authors: Zhe Gan, Yu Cheng, Ahmed El Kholy, Linjie Li, Jingjing Liu, Jianfeng Gao

    Abstract: This paper presents a new model for visual dialog, Recurrent Dual Attention Network (ReDAN), using multi-step reasoning to answer a series of questions about an image. In each question-answering turn of a dialog, ReDAN infers the answer progressively through multiple reasoning steps. In each step of the reasoning process, the semantic representation of the question is updated based on the image an…

    Submitted 4 June, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Accepted to ACL 2019

  48. arXiv:1901.06283  [pdf, other]

    cs.CL

    Improving Sequence-to-Sequence Learning via Optimal Transport

    Authors: Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin

    Abstract: Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE). However, standard MLE training considers a word-level objective, predicting the next word given the previous ground-truth partial sentence. This procedure focuses on modeling local syntactic patterns, and may fail to capture long-range semantic structure. We present a novel solution to alleviate these issues.…

    Submitted 18 January, 2019; originally announced January 2019.

  49. Adding a Suite of Chemical Abundances to the MACER Code for the Evolution of Massive Elliptical Galaxies

    Authors: Zhaoming Gan, Ena Choi, Jeremiah P. Ostriker, Luca Ciotti, Silvia Pellegrini

    Abstract: We add a suite of chemical abundances to the MACER (Massive AGN Controlled Ellipticals Resolved) 2D code, by solving 12 additional continuity equations for H, He, C, N, O, Ne, Mg, Si, S, Ca, Fe and Ni respectively with sources from AGB stars and supernovae of type Ia and II with metal yields based on standard stellar physics. New stars, formed in Toomre unstable circumnuclear disks (of a size…

    Submitted 8 April, 2019; v1 submitted 25 December, 2018; originally announced December 2018.

    Comments: 10 pages, 10 figures; accepted by ApJ

  50. arXiv:1812.08352  [pdf, other]

    cs.CV cs.AI stat.ML

    Sequential Attention GAN for Interactive Image Editing

    Authors: Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, Jianfeng Gao

    Abstract: Most existing text-to-image synthesis tasks are static single-turn generation, based on pre-defined textual descriptions of images. To explore more practical and interactive real-life applications, we introduce a new task - Interactive Image Editing, where users can guide an agent to edit images via multi-turn textual commands on-the-fly. In each session, the agent takes a natural language descrip…

    Submitted 5 August, 2020; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: ACM MM 2020