Showing 1–17 of 17 results for author: Pi, J

  1. arXiv:2406.11633  [pdf, other]

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page (22 pages, 11 figures)

  2. arXiv:2405.10750  [pdf, other]

    eess.SY cs.LG

    Parameter Identification for Electrochemical Models of Lithium-Ion Batteries Using Bayesian Optimization

    Authors: Jianzong Pi, Samuel Filgueira da Silva, Mehmet Fatih Ozkan, Abhishek Gupta, Marcello Canova

    Abstract: Efficient parameter identification of electrochemical models is crucial for accurate monitoring and control of lithium-ion cells. This process becomes challenging when applied to complex models that rely on a considerable number of interdependent parameters that affect the output response. Gradient-based and metaheuristic optimization techniques, although previously employed for this task, are lim…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 6 pages

  3. arXiv:2402.03830  [pdf, other]

    cs.CV

    OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving

    Authors: Guohang Yan, Jiahao Pi, Jianfei Guo, Zhaotong Luo, Min Dou, Nianchen Deng, Qiusheng Huang, Daocheng Fu, Licheng Wen, Pinlong Cai, Xing Gao, Xinyu Cai, Bo Zhang, Xuemeng Yang, Yeqi Bai, Hongbin Zhou, Botian Shi

    Abstract: With deep learning and computer vision technology development, autonomous driving provides new solutions to improve traffic safety and efficiency. The importance of building high-quality datasets is self-evident, especially with the rise of end-to-end autonomous driving algorithms in recent years. Data plays a core role in the algorithm closed-loop system. However, collecting real-world data is ex…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 10 pages, 9 figures

  4. arXiv:2305.16840  [pdf, other]

    cs.RO

    Automatic Surround Camera Calibration Method in Road Scene for Self-driving Car

    Authors: Jixiang Li, Jiahao Pi, Guohang Yan, Yikang Li

    Abstract: With the development of autonomous driving technology, sensor calibration has become a key technology to achieve accurate perception fusion and localization. Accurate calibration of the sensors ensures that each sensor can function properly and accurate information aggregation can be achieved. Among them, camera calibration based on surround view has received extensive attention. In autonomous dri…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 6 pages, 7 figures

  5. arXiv:2302.02057  [pdf, other]

    cs.CV

    Semantic Diffusion Network for Semantic Segmentation

    Authors: Haoru Tan, Sitong Wu, Jimin Pi

    Abstract: Precise and accurate predictions over boundary areas are essential for semantic segmentation. However, the commonly-used convolutional operators tend to smooth and blur local detail cues, making it difficult for deep models to generate accurate boundary predictions. In this paper, we introduce an operator-level approach to enhance semantic boundary awareness, so as to improve the prediction of the…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: Accepted by NeurIPS2022

  6. arXiv:2212.04976  [pdf, other]

    cs.CV

    Augmentation Matters: A Simple-yet-Effective Approach to Semi-supervised Semantic Segmentation

    Authors: Zhen Zhao, Lihe Yang, Sifan Long, Jimin Pi, Luping Zhou, Jingdong Wang

    Abstract: Recent studies on semi-supervised semantic segmentation (SSS) have seen fast progress. Despite their promising performance, current state-of-the-art methods tend toward increasingly complex designs at the cost of introducing more network components and additional training procedures. In contrast, in this work, we follow a standard teacher-student framework and propose AugSeg, a simple and clean approa…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: 10 pages, 8 tables

  7. arXiv:2211.11335  [pdf, other]

    cs.CV

    Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation

    Authors: Zhen Zhao, Sifan Long, Jimin Pi, Jingdong Wang, Luping Zhou

    Abstract: Recently, semi-supervised semantic segmentation has achieved promising performance with a small fraction of labeled data. However, most existing studies treat all unlabeled data equally and barely consider the differences and training difficulties among unlabeled instances. Differentiating unlabeled instances can promote instance-specific supervision to adapt to the model's evolution dynamically.…

    Submitted 21 November, 2022; originally announced November 2022.

  8. arXiv:2211.11315  [pdf, other]

    cs.CV

    Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers

    Authors: Sifan Long, Zhen Zhao, Jimin Pi, Shengsheng Wang, Jingdong Wang

    Abstract: Vision transformers have achieved significant improvements on various vision tasks but their quadratic interactions between tokens significantly reduce computational efficiency. Many pruning methods have been proposed to remove redundant tokens for efficient vision transformers recently. However, existing studies mainly focus on the token importance to preserve local attentive tokens but completel…

    Submitted 21 November, 2022; originally announced November 2022.

  9. arXiv:2211.09799  [pdf, other]

    cs.CV

    CAE v2: Context Autoencoder with CLIP Target

    Authors: Xinyu Zhang, Jiahui Chen, Junkun Yuan, Qiang Chen, Jian Wang, Xiaodi Wang, Shumin Han, Xiaokang Chen, Jimin Pi, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

    Abstract: Masked image modeling (MIM) learns visual representation by masking and reconstructing image patches. Applying the reconstruction supervision on the CLIP representation has been proven effective for MIM. However, it is still under-explored how CLIP supervision in MIM influences performance. To investigate strategies for refining the CLIP-targeted MIM, we study two critical elements in MIM, i.e., t…

    Submitted 17 November, 2022; originally announced November 2022.

  10. arXiv:2209.09422  [pdf, other]

    cs.CV

    Bit Allocation using Optimization

    Authors: Tongda Xu, Han Gao, Chenjian Gao, Yuanyuan Wang, Dailan He, Jinyong Pi, Jixiang Luo, Ziyu Zhu, Mao Ye, Hongwei Qin, Yan Wang, Jingjing Liu, Ya-Qin Zhang

    Abstract: In this paper, we consider the problem of bit allocation in Neural Video Compression (NVC). First, we reveal a fundamental relationship between bit allocation in NVC and Semi-Amortized Variational Inference (SAVI). Specifically, we show that SAVI with GoP (Group-of-Picture)-level likelihood is equivalent to pixel-level bit allocation with precise rate & quality dependency model. Based on this equ…

    Submitted 8 May, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: ICML 2023

  11. arXiv:2209.07694  [pdf, other]

    cs.RO

    An Extrinsic Calibration Method between LiDAR and GNSS/INS for Autonomous Driving

    Authors: Guohang Yan, Jiahao Pi, Chengjie Wang, Xinyu Cai, Yikang Li

    Abstract: Accurate and reliable sensor calibration is critical for fusing LiDAR and inertial measurements in autonomous driving. This paper proposes a novel three-stage extrinsic calibration method between LiDAR and GNSS/INS for autonomous driving. The first stage can quickly calibrate the extrinsic parameters between the sensors through point cloud surface features so that the extrinsic can be narrowed fro…

    Submitted 28 February, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: 7 pages, 12 figures, submitted to IROS 2023

  12. arXiv:2011.11619  [pdf, other]

    cs.LG

    Neural collapse with unconstrained features

    Authors: Dustin G. Mixon, Hans Parshall, Jianzong Pi

    Abstract: Neural collapse is an emergent phenomenon in deep learning that was recently discovered by Papyan, Han and Donoho. We propose a simple "unconstrained features model" in which neural collapse also emerges empirically. By studying this model, we provide some explanation for the emergence of neural collapse in terms of the landscape of empirical risk.

    Submitted 23 November, 2020; originally announced November 2020.

  13. arXiv:2007.06801  [pdf, other]

    eess.SY cs.SI

    Multi-Objective Vehicle Rebalancing for Ridehailing System using a Reinforcement Learning Approach

    Authors: Yuntian Deng, Hao Chen, Shiping Shao, Jiacheng Tang, Jianzong Pi, Abhishek Gupta

    Abstract: The problem of designing a rebalancing algorithm for a large-scale ridehailing system with asymmetric demand is considered here. We pose the rebalancing problem within a semi Markov decision problem (SMDP) framework with closed queues of vehicles serving stationary, but asymmetric demand, over a large city with multiple nodes (representing neighborhoods). We assume that the passengers queue up at…

    Submitted 14 July, 2020; originally announced July 2020.

  14. arXiv:2003.02228  [pdf, other]

    cs.LG stat.ML

    PushNet: Efficient and Adaptive Neural Message Passing

    Authors: Julian Busch, Jiaxing Pi, Thomas Seidl

    Abstract: Message passing neural networks have recently evolved into a state-of-the-art approach to representation learning on graphs. Existing methods perform synchronous message passing along all edges in multiple subsequent rounds and consequently suffer from various shortcomings: Propagation schemes are inflexible since they are restricted to $k$-hop neighborhoods and insensitive to actual demands of in…

    Submitted 17 December, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

    Journal ref: 24th European Conference on Artificial Intelligence (ECAI 2020)

  15. arXiv:1904.10778  [pdf, other]

    cs.LG eess.SY math.OC math.PR stat.ML

    Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms

    Authors: Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar

    Abstract: Recursive stochastic algorithms have gained significant attention in the recent past due to data driven applications. Examples include stochastic gradient descent for solving large-scale optimization problems and empirical dynamic programming algorithms for solving Markov decision problems. These recursive stochastic algorithms approximate certain contraction operators and can be viewed within the…

    Submitted 23 July, 2020; v1 submitted 24 April, 2019; originally announced April 2019.

    Comments: Accepted in SIMODS, 37 pages

  16. arXiv:1904.00450  [pdf, other]

    cs.GT

    Two Algorithms for Computing Exact and Approximate Nash Equilibria in Bimatrix Games

    Authors: Jianzong Pi, Joseph L. Heyman, Abhishek Gupta

    Abstract: In this paper, we first devise two algorithms to determine whether or not a bimatrix game has a strategically equivalent zero-sum game. If so, we propose an algorithm that computes the strategically equivalent zero-sum game. If a given bimatrix game is not strategically equivalent to a zero-sum game, we then propose an approach to compute a zero-sum game whose saddle-point equilibrium can be mappe…

    Submitted 11 August, 2021; v1 submitted 31 March, 2019; originally announced April 2019.

    Comments: 20 pages, 3 figures. Replaces "On the Computation of Strategically Equivalent Rank-0 Games" by condensing the main results of that paper and extending the results with an algorithm for well-supported approximate Nash equilibrium. Submitted to 2021 Conference on Decision and Game Theory for Security (GameSec 2021)

  17. arXiv:1805.00625  [pdf, other]

    eess.IV cs.CL cs.CV

    Multimodal Utterance-level Affect Analysis using Visual, Audio and Text Features

    Authors: Didan Deng, Yuqian Zhou, Jimin Pi, Bertram E. Shi

    Abstract: The integration of information across multiple modalities and across time is a promising way to enhance the emotion recognition performance of affective systems. Much previous work has focused on instantaneous emotion recognition. The 2018 One-Minute Gradual-Emotion Recognition (OMG-Emotion) challenge, which was held in conjunction with the IEEE World Congress on Computational Intelligence, encour…

    Submitted 4 May, 2018; v1 submitted 2 May, 2018; originally announced May 2018.

    Comments: 5 pages, 1 figure, subject to the 2018 IJCNN challenge on One-Minute Gradual-Emotion Recognition