
Showing 1–50 of 71 results for author: Takeuchi, I

Searching in archive cs.
  1. arXiv:2408.03144  [pdf, other]

    stat.ML cs.LG

    Active Learning for Level Set Estimation Using Randomized Straddle Algorithms

    Authors: Yu Inatsu, Shion Takeno, Kentaro Kutsukake, Ichiro Takeuchi

    Abstract: Level set estimation (LSE), the problem of identifying the set of input points where a function takes a value above (or below) a given threshold, is important in practical applications. When the function is expensive to evaluate and black-box, the straddle algorithm, which is a representative heuristic for LSE based on Gaussian process models, and its extensions having theoretical guarantee…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: 21 pages, 4 figures

  2. arXiv:2406.18902  [pdf, other]

    stat.ML cs.LG

    Statistical Test for Data Analysis Pipeline by Selective Inference

    Authors: Tomohiro Shiraishi, Tatsuya Matsukawa, Shuichi Nishino, Ichiro Takeuchi

    Abstract: A data analysis pipeline is a structured sequence of processing steps that transforms raw data into meaningful insights by effectively integrating various analysis algorithms. In this paper, we propose a novel statistical test designed to assess the statistical significance of data analysis pipelines. Our approach allows for the systematic development of valid statistical tests applicable to any d…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.05964  [pdf, other]

    stat.ML cs.LG

    Distributionally Robust Safe Sample Screening

    Authors: Hiroyuki Hanada, Aoyama Tatsuya, Akahane Satoshi, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Shion Takeno, Taro Murayama, Hanju Lee, Shinya Kojima, Ichiro Takeuchi

    Abstract: In this study, we propose a machine learning method called Distributionally Robust Safe Sample Screening (DRSSS). DRSSS aims to identify unnecessary training samples, even when the distribution of the training samples changes in the future. To achieve this, we effectively combine the distributionally robust (DR) paradigm, which aims to enhance model robustness against variations in data distributi…

    Submitted 9 June, 2024; originally announced June 2024.

  4. arXiv:2405.17881  [pdf, other]

    cs.LG

    Crystal-LSBO: Automated Design of De Novo Crystals with Latent Space Bayesian Optimization

    Authors: Onur Boyar, Yanheng Gu, Yuji Tanaka, Shunsuke Tonogai, Tomoya Itakura, Ichiro Takeuchi

    Abstract: Generative modeling of crystal structures is significantly challenged by the complexity of input data, which constrains the ability of these models to explore and discover novel crystals. This complexity often confines de novo design methodologies to merely small perturbations of known crystals and hampers the effective application of advanced optimization techniques. One such optimization techniq…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 10 pages, 5 figures

  5. arXiv:2404.16328  [pdf, other]

    stat.ML cs.LG

    Distributionally Robust Safe Screening

    Authors: Hiroyuki Hanada, Satoshi Akahane, Tatsuya Aoyama, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Taro Murayama, Lee Hanju, Shinya Kojima, Ichiro Takeuchi

    Abstract: In this study, we propose a method, Distributionally Robust Safe Screening (DRSS), for identifying unnecessary samples and features within a DR covariate shift setting. This method effectively combines DR learning, a paradigm aimed at enhancing model robustness against variations in data distribution, with safe screening (SS), a sparse optimization technique designed to identify irrelevant samples…

    Submitted 25 April, 2024; originally announced April 2024.

  6. arXiv:2402.11789  [pdf, other]

    stat.ML cs.CV cs.LG

    Statistical Test on Diffusion Model-based Generated Images by Selective Inference

    Authors: Teruyuki Katsuoka, Tomohiro Shiraishi, Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: AI technology for generating images, such as diffusion models, has advanced rapidly. However, there is no established framework for quantifying the reliability of AI-generated images, which hinders their use in critical decision-making tasks, such as medical image diagnosis. In this study, we propose a method to quantify the reliability of decision-making tasks that rely on images produced by diff…

    Submitted 29 July, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 31 pages, 7 figures

  7. arXiv:2402.03724  [pdf, other]

    stat.ML cs.LG

    Statistical Test for Anomaly Detections by Variational Auto-Encoders

    Authors: Daiki Miwa, Tomohiro Shiraishi, Vo Nguyen Le Duy, Teruyuki Katsuoka, Ichiro Takeuchi

    Abstract: In this study, we consider the reliability assessment of anomaly detection (AD) using a Variational Autoencoder (VAE). Over the last decade, VAE-based AD has been actively studied from various perspectives, from method development to applied research. However, when the results of ADs are used in high-stakes decision-making, such as in medical diagnosis, it is necessary to ensure the reliability of the…

    Submitted 2 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  8. arXiv:2402.02198  [pdf]

    cond-mat.mtrl-sci cs.LG

    Co-orchestration of Multiple Instruments to Uncover Structure-Property Relationships in Combinatorial Libraries

    Authors: Boris N. Slautin, Utkarsh Pratiush, Ilia N. Ivanov, Yongtao Liu, Rohit Pant, Xiaohang Zhang, Ichiro Takeuchi, Maxim A. Ziatdinov, Sergei V. Kalinin

    Abstract: The rapid growth of automated and autonomous instrumentations brings forth an opportunity for the co-orchestration of multimodal tools, equipped with multiple sequential detection methods, or several characterization tools to explore identical samples. This can be exemplified by the combinatorial libraries that can be explored in multiple locations by multiple tools simultaneously, or downstream c…

    Submitted 17 March, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: 22 pages, 9 figures

  9. arXiv:2401.08169  [pdf, other]

    stat.ML cs.LG

    Statistical Test for Attention Map in Vision Transformer

    Authors: Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka, Vo Nguyen Le Duy, Kouichi Taji, Ichiro Takeuchi

    Abstract: The Vision Transformer (ViT) demonstrates exceptional performance in various computer vision tasks. Attention is crucial for ViT to capture complex wide-ranging relationships among image patches, allowing the model to weigh the importance of image patches and aiding our understanding of the decision-making process. However, when utilizing the attention of ViT as evidence in high-stakes decision-ma…

    Submitted 19 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 42 pages, 17 figures

  10. arXiv:2311.16564  [pdf, other]

    cs.MA

    Multi-agent statistical discriminative sub-trajectory mining and an application to NBA basketball

    Authors: Rory Bunker, Vo Nguyen Le Duy, Yasuo Tabei, Ichiro Takeuchi, Keisuke Fujii

    Abstract: Improvements in tracking technology through optical and computer vision systems have enabled a greater understanding of the movement-based behaviour of multiple agents, including in team sports. In this study, a Multi-Agent Statistically Discriminative Sub-Trajectory Mining (MA-Stat-DSM) method is proposed that takes a set of binary-labelled agent trajectory matrices as input and incorporates Haus…

    Submitted 28 November, 2023; originally announced November 2023.

  11. arXiv:2311.14964  [pdf, other]

    stat.ML cs.LG

    Selective Inference for Changepoint detection by Recurrent Neural Network

    Authors: Tomohiro Shiraishi, Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: In this study, we investigate the quantification of the statistical reliability of detected change points (CPs) in time series using a Recurrent Neural Network (RNN). Thanks to its flexibility, RNN holds the potential to effectively identify CPs in time series characterized by complex dynamics. However, there is an increased risk of erroneously detecting random noise fluctuations as CPs. The prima…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: 41 pages, 16 figures

  12. arXiv:2311.13460  [pdf, other]

    cs.LG stat.ML

    Multi-Objective Bayesian Optimization with Active Preference Learning

    Authors: Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki, Shinya Suzuki, Shion Takeno, Ichiro Takeuchi, Masayuki Karasuyama

    Abstract: There are a lot of real-world black-box optimization problems that need to optimize multiple criteria simultaneously. However, in a multi-objective optimization (MOO) problem, identifying the whole Pareto front requires a prohibitive search cost, while in many practical scenarios, the decision maker (DM) only needs a specific solution among the set of the Pareto optimal solutions. We propose a B…

    Submitted 22 November, 2023; originally announced November 2023.

  13. arXiv:2311.03760  [pdf, other]

    cs.LG stat.ML

    Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds

    Authors: Shion Takeno, Yu Inatsu, Masayuki Karasuyama, Ichiro Takeuchi

    Abstract: Among various acquisition functions (AFs) in Bayesian optimization (BO), Gaussian process upper confidence bound (GP-UCB) and Thompson sampling (TS) are well-known options with established theoretical properties regarding Bayesian cumulative regret (BCR). Recently, it has been shown that a randomized variant of GP-UCB achieves a tighter BCR bound compared with GP-UCB, which we call the tighter BCR…

    Submitted 4 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: 28 pages, 3 figures, 2 tables, Accepted to ICML2024

  14. arXiv:2310.14608  [pdf, other]

    stat.ML cs.LG

    CAD-DA: Controllable Anomaly Detection after Domain Adaptation by Statistical Inference

    Authors: Vo Nguyen Le Duy, Hsuan-Tien Lin, Ichiro Takeuchi

    Abstract: We propose a novel statistical method for testing the results of anomaly detection (AD) under domain adaptation (DA), which we call CAD-DA -- controllable AD under DA. The distinct advantage of the CAD-DA lies in its ability to control the probability of misidentifying anomalies under a pre-specified level $α$ (e.g., 0.05). The challenge within this DA setting is the necessity to account for the i…

    Submitted 23 October, 2023; originally announced October 2023.

  15. Mixing Histopathology Prototypes into Robust Slide-Level Representations for Cancer Subtyping

    Authors: Joshua Butke, Noriaki Hashimoto, Ichiro Takeuchi, Hiroaki Miyoshi, Koichi Ohshima, Jun Sakuma

    Abstract: Whole-slide image analysis via the means of computational pathology often relies on processing tessellated gigapixel images with only slide-level labels available. Applying multiple instance learning-based methods or transformer models is computationally expensive as, for each image, all instances have to be processed simultaneously. The MLP-Mixer is an under-explored alternative model to common v…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: The final authenticated publication is available online at https://doi.org/10.1007/978-3-031-45676-3_12

    Journal ref: Machine Learning in Medical Imaging. MLMI 2023. Lecture Notes in Computer Science, vol 14349, pp. 114-123. Cham: Springer Nature Switzerland

  16. arXiv:2307.11351  [pdf, other]

    stat.ML cs.LG

    Bounded P-values in Parametric Programming-based Selective Inference

    Authors: Tomohiro Shiraishi, Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: Selective inference (SI) has been actively studied as a promising framework for statistical hypothesis testing for data-driven hypotheses. The basic idea of SI is to make inferences conditional on an event that a hypothesis is selected. In order to perform SI, this event must be characterized in a traceable form. When the selection event is too difficult to characterize, additional conditions are intr…

    Submitted 28 December, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: 48 pages, 14 figures

  17. arXiv:2306.13561  [pdf, other]

    stat.ML cs.LG

    Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning

    Authors: Takumi Yoshida, Hiroyuki Hanada, Kazuya Nakagawa, Kouichi Taji, Koji Tsuda, Ichiro Takeuchi

    Abstract: Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering substructures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model.…

    Submitted 23 June, 2023; originally announced June 2023.

  18. arXiv:2306.12670  [pdf, other]

    stat.ML cs.LG

    Generalized Low-Rank Update: Model Parameter Bounds for Low-Rank Training Data Modifications

    Authors: Hiroyuki Hanada, Noriaki Hashimoto, Kouichi Taji, Ichiro Takeuchi

    Abstract: In this study, we have developed an incremental machine learning (ML) method that efficiently obtains the optimal model when a small number of instances or features are added or removed. This problem holds practical importance in model selection, such as cross-validation (CV) and feature selection. Among the class of ML methods known as linear estimators, there exists an efficient model update fra…

    Submitted 22 June, 2023; originally announced June 2023.

  19. arXiv:2306.10406  [pdf]

    cond-mat.mtrl-sci cs.HC cs.LG

    Human-In-the-Loop for Bayesian Autonomous Materials Phase Mapping

    Authors: Felix Adams, Austin McDannald, Ichiro Takeuchi, A. Gilad Kusne

    Abstract: Autonomous experimentation (AE) combines machine learning and research hardware automation in a closed loop, guiding subsequent experiments toward user goals. As applied to materials research, AE can accelerate materials exploration, reducing time and cost compared to traditional Edisonian studies. Additionally, integrating knowledge from diverse sources including theory, simulations, literature,…

    Submitted 17 June, 2023; originally announced June 2023.

  20. arXiv:2304.01404  [pdf, other]

    cs.LG cs.CE

    Adaptive Defective Area Identification in Material Surface Using Active Transfer Learning-based Level Set Estimation

    Authors: Shota Hozumi, Kentaro Kutsukake, Kota Matsui, Syunya Kusakawa, Toru Ujihara, Ichiro Takeuchi

    Abstract: In material characterization, identifying defective areas on a material surface is fundamental. The conventional approach involves measuring the relevant physical properties point-by-point at the predetermined mesh grid points on the surface and determining the area at which the property does not reach the desired level. To identify defective areas more efficiently, we propose adaptive mapping met…

    Submitted 3 April, 2023; originally announced April 2023.

  21. arXiv:2302.02399  [pdf]

    cs.LG

    Latent Space Bayesian Optimization with Latent Data Augmentation for Enhanced Exploration

    Authors: Onur Boyar, Ichiro Takeuchi

    Abstract: Latent Space Bayesian Optimization (LSBO) combines generative models, typically Variational Autoencoders (VAE), with Bayesian Optimization (BO) to generate de-novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and VAE, resulting in poor exploration capabilities. In this paper, we propose novel contributions to enhance LSBO efficiency and overc…

    Submitted 28 April, 2024; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: 10 figures, 6 tables

  22. arXiv:2301.11588  [pdf, other]

    stat.ML cs.LG

    Bounding Box-based Multi-objective Bayesian Optimization of Risk Measures under Input Uncertainty

    Authors: Yu Inatsu, Shion Takeno, Hiroyuki Hanada, Kazuki Iwata, Ichiro Takeuchi

    Abstract: In this study, we propose a novel multi-objective Bayesian optimization (MOBO) method to efficiently identify the Pareto front (PF) defined by risk measures for black-box functions under the presence of input uncertainty (IU). Existing BO methods for Pareto optimization in the presence of IU are risk-specific or without theoretical guarantees, whereas our proposed method addresses general risk mea…

    Submitted 24 November, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: 39 pages, 5 figures

  23. arXiv:2301.02437  [pdf, other]

    stat.ML cs.CV cs.LG

    Valid P-Value for Deep Learning-Driven Salient Region

    Authors: Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: Various saliency map methods have been proposed to interpret and explain predictions of deep learning models. Saliency maps allow us to interpret which parts of the input signals have a strong influence on the prediction results. However, since a saliency map is obtained by complex computations in deep learning models, it is often difficult to know how reliable the saliency map itself is. In this…

    Submitted 6 January, 2023; originally announced January 2023.

  24. arXiv:2206.03003  [pdf, other]

    eess.IV cs.CV

    Transformer-based Personalized Attention Mechanism for Medical Images with Clinical Records

    Authors: Yusuke Takagi, Noriaki Hashimoto, Hiroki Masuda, Hiroaki Miyoshi, Koichi Ohshima, Hidekata Hontani, Ichiro Takeuchi

    Abstract: In medical image diagnosis, identifying the attention region, i.e., the region of interest for which the diagnosis is made, is an important task. Various methods have been developed to automatically identify target regions from given medical images. However, in actual medical practice, the diagnosis is made based not only on the images but also on a variety of clinical records. This means that pat…

    Submitted 27 January, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

    ACM Class: I.2.1; J.3

    Journal ref: Takagi, Yusuke, et al. "Transformer-based personalized attention mechanism for medical images with clinical records." Journal of Pathology Informatics (2023): 100185

  25. arXiv:2205.14317  [pdf, other]

    stat.ML cs.LG

    A Confidence Machine for Sparse High-Order Interaction Model

    Authors: Diptesh Das, Eugene Ndiaye, Ichiro Takeuchi

    Abstract: In predictive modeling for high-stake decision-making, predictors must be not only accurate but also reliable. Conformal prediction (CP) is a promising approach for obtaining the confidence of prediction results with fewer theoretical assumptions. To obtain the confidence set by so-called full-CP, we need to refit the predictor for all possible values of prediction results, which is only possible…

    Submitted 1 November, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

  26. arXiv:2204.05838  [pdf]

    cond-mat.mtrl-sci cs.LG

    Benchmarking Active Learning Strategies for Materials Optimization and Discovery

    Authors: Alex Wang, Haotong Liang, Austin McDannald, Ichiro Takeuchi, A. Gilad Kusne

    Abstract: Autonomous physical science is revolutionizing materials science. In these systems, machine learning controls experiment design, execution, and analysis in a closed loop. Active learning, the machine learning field of optimal experiment design, selects each subsequent experiment to maximize knowledge toward the user goal. Autonomous system performance can be further improved with implementation of…

    Submitted 12 April, 2022; originally announced April 2022.

  27. arXiv:2204.04187  [pdf]

    cond-mat.mtrl-sci cs.LG cs.RO

    A Low-Cost Robot Science Kit for Education with Symbolic Regression for Hypothesis Discovery and Validation

    Authors: Logan Saar, Haotong Liang, Alex Wang, Austin McDannald, Efrain Rodriguez, Ichiro Takeuchi, A. Gilad Kusne

    Abstract: The next generation of physical science involves robot scientists - autonomous physical science systems capable of experimental design, execution, and analysis in a closed loop. Such systems have shown real-world success for scientific exploration and discovery, including the first discovery of a best-in-class material. To build and use these systems, the next generation workforce requires experti…

    Submitted 13 June, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

  28. arXiv:2202.06593  [pdf, other]

    stat.ML cs.LG

    Statistical Inference for the Dynamic Time Warping Distance, with Application to Abnormal Time-Series Detection

    Authors: Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: We study statistical inference on the similarity/distance between two time-series under an uncertain environment by considering a statistical hypothesis test on the distance obtained from the Dynamic Time Warping (DTW) algorithm. The sampling distribution of the DTW distance is too difficult to derive because it is obtained based on the solution of the DTW algorithm, which is complicated. To circumvent t…

    Submitted 23 October, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

  29. arXiv:2201.13112  [pdf, other]

    stat.ML cs.LG

    Bayesian Optimization for Distributionally Robust Chance-constrained Problem

    Authors: Yu Inatsu, Shion Takeno, Masayuki Karasuyama, Ichiro Takeuchi

    Abstract: In black-box function optimization, we need to consider not only controllable design variables but also uncontrollable stochastic environment variables. In such cases, it is necessary to solve the optimization problem by taking into account the uncertainty of the environmental variables. The chance-constrained (CC) problem, the problem of maximizing the expected value under a certain level of constrai…

    Submitted 2 February, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: 18 pages, 2 figures

  30. arXiv:2201.00687  [pdf]

    cs.DL cs.LG

    Topic Analysis of Superconductivity Literature by Semantic Non-negative Matrix Factorization

    Authors: Valentin Stanev, Erik Skau, Ichiro Takeuchi, Boian S. Alexandrov

    Abstract: We utilize a recently developed topic modeling method called SeNMFk, extending the standard Non-negative Matrix Factorization (NMF) methods by incorporating the semantic structure of the text, and adding a robust system for determining the number of topics. With SeNMFk, we were able to extract coherent topics validated by human experts. From these topics, a few are relatively general and cover bro…

    Submitted 1 December, 2021; originally announced January 2022.

    Report number: LA-UR-21-3134

  31. arXiv:2112.05104  [pdf, other]

    math.OC cs.LG

    Continuation Path with Linear Convergence Rate

    Authors: Eugene Ndiaye, Ichiro Takeuchi

    Abstract: Path-following algorithms are frequently used in composite optimization problems where a series of subproblems, with varying regularization hyperparameters, are solved sequentially. By reusing the previous solutions as initialization, better convergence speeds have been observed numerically. This makes it a rather useful heuristic to speed up the execution of optimization algorithms in machine lea…

    Submitted 9 December, 2021; originally announced December 2021.

  32. arXiv:2111.08330  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Bayesian Optimization for Cascade-type Multi-stage Processes

    Authors: Shunya Kusakawa, Shion Takeno, Yu Inatsu, Kentaro Kutsukake, Shogo Iwazaki, Takashi Nakano, Toru Ujihara, Masayuki Karasuyama, Ichiro Takeuchi

    Abstract: Complex processes in science and engineering are often formulated as multistage decision-making problems. In this paper, we consider a type of multistage decision-making process called a cascade process. A cascade process is a multistage process in which the output of one stage is used as an input for the subsequent stage. When the cost of each stage is expensive, it is difficult to search for the…

    Submitted 7 March, 2023; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: 70 pages, 7 figures

    Journal ref: Neural Computation (2022) 34 (12): 2408-2431

  33. arXiv:2111.07478  [pdf]

    cond-mat.mtrl-sci cs.LG

    Physics in the Machine: Integrating Physical Knowledge in Autonomous Phase-Mapping

    Authors: A. Gilad Kusne, Austin McDannald, Brian DeCost, Corey Oses, Cormac Toher, Stefano Curtarolo, Apurva Mehta, Ichiro Takeuchi

    Abstract: Application of artificial intelligence (AI), and more specifically machine learning, to the physical sciences has expanded significantly over the past decades. In particular, science-informed AI, also known as scientific AI or inductive bias AI, has grown from a focus on data analysis to now controlling experiment design, simulation, execution and analysis in closed-loop autonomous systems. The CA…

    Submitted 16 February, 2022; v1 submitted 14 November, 2021; originally announced November 2021.

    Journal ref: Front. Phys. 10:815863 (2022)

  34. arXiv:2110.08989  [pdf, other]

    stat.ML cs.LG

    Valid and Exact Statistical Inference for Multi-dimensional Multiple Change-Points by Selective Inference

    Authors: Ryota Sugiyama, Hiroki Toda, Vo Nguyen Le Duy, Yu Inatsu, Ichiro Takeuchi

    Abstract: In this paper, we study statistical inference of change-points (CPs) in a multi-dimensional sequence. In CP detection from a multi-dimensional sequence, it is often desirable not only to detect the location, but also to identify the subset of the components in which the change occurs. Several algorithms have been proposed for such problems, but no valid exact inference method has been established to…

    Submitted 17 October, 2021; originally announced October 2021.

  35. arXiv:2109.14206  [pdf, other]

    stat.ML cs.LG

    Exact Statistical Inference for the Wasserstein Distance by Selective Inference

    Authors: Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: In this paper, we study statistical inference for the Wasserstein distance, which has attracted much attention and has been applied to various machine learning tasks. Several studies have been proposed in the literature, but almost all of them are based on asymptotic approximation and do not have finite-sample validity. In this study, we propose an exact (non-asymptotic) inference method for the W…

    Submitted 20 January, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

  36. arXiv:2109.08622  [pdf]

    cs.ET physics.app-ph physics.optics

    Harnessing Optoelectronic Noises in a Photonic Generative Network

    Authors: Changming Wu, Xiaoxuan Yang, Heshan Yu, Ruoming Peng, Ichiro Takeuchi, Yiran Chen, Mo Li

    Abstract: Integrated optoelectronics is emerging as a promising platform for neural network accelerators, which affords efficient in-memory computing and high bandwidth interconnectivity. The inherent optoelectronic noises, however, make the photonic systems error-prone in practice. It is thus imperative to devise strategies to mitigate and, if possible, harness noises in photonic computing systems. Here, we…

    Submitted 21 November, 2021; v1 submitted 17 September, 2021; originally announced September 2021.

    Comments: 19 pages, 4 figures

  37. arXiv:2107.03602  [pdf, other]

    cs.CV

    Case-based Similar Image Retrieval for Weakly Annotated Large Histopathological Images of Malignant Lymphoma Using Deep Metric Learning

    Authors: Noriaki Hashimoto, Yusuke Takagi, Hiroki Masuda, Hiroaki Miyoshi, Kei Kohno, Miharu Nagaishi, Kensaku Sato, Mai Takeuchi, Takuya Furuta, Keisuke Kawamoto, Kyohei Yamada, Mayuko Moritsubo, Kanako Inoue, Yasumasa Shimasaki, Yusuke Ogura, Teppei Imamoto, Tatsuzo Mishina, Ken Tanaka, Yoshino Kawaguchi, Shigeo Nakamura, Koichi Ohshima, Hidekata Hontani, Ichiro Takeuchi

    Abstract: In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma. When a whole slide image (WSI) is used as an input query, it is desirable to be able to retrieve similar cases by focusing on image patches in pathologically important regions such as tumor cells. To address this problem, w…

    Submitted 27 January, 2023; v1 submitted 8 July, 2021; originally announced July 2021.

    ACM Class: H.3.3; I.2.1; J.3

  38. arXiv:2106.04929  [pdf, other]

    stat.ML cs.LG

    Fast and More Powerful Selective Inference for Sparse High-order Interaction Model

    Authors: Diptesh Das, Vo Nguyen Le Duy, Hiroyuki Hanada, Koji Tsuda, Ichiro Takeuchi

    Abstract: Automated high-stake decision-making such as medical diagnosis requires models with high interpretability and reliability. As one of the interpretable and reliable models with good prediction ability, we consider the Sparse High-order Interaction Model (SHIM) in this study. However, finding statistically significant high-order interactions is challenging due to the intrinsic high dimensionality of the…

    Submitted 9 June, 2021; originally announced June 2021.

  39. arXiv:2105.04920  [pdf, other]

    stat.ML cs.LG

    More Powerful Conditional Selective Inference for Generalized Lasso by Parametric Programming

    Authors: Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: Conditional selective inference (SI) has been studied intensively as a new statistical inference framework for data-driven hypotheses. The basic concept of conditional SI is to make the inference conditional on the selection event, which enables an exact and valid statistical inference to be conducted even when the hypothesis is selected based on the data. Conditional SI has mainly been studied in…

    Submitted 11 May, 2021; originally announced May 2021.

    Journal ref: Journal of Machine Learning Research 23.300 (2022): 1-37

  40. arXiv:2104.10840  [pdf, other]

    stat.ML cs.LG

    Conditional Selective Inference for Robust Regression and Outlier Detection using Piecewise-Linear Homotopy Continuation

    Authors: Toshiaki Tsukurimichi, Yu Inatsu, Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: In practical data analysis under a noisy environment, it is common to first use robust methods to identify outliers, and then to conduct further analysis after removing the outliers. In this paper, we consider statistical inference of the model estimated after outliers are removed, which can be interpreted as a selective inference (SI) problem. To use the conditional SI framework, it is necessary to cha…

    Submitted 2 January, 2022; v1 submitted 21 April, 2021; originally announced April 2021.

  41. arXiv:2104.06648  [pdf, other]

    stat.ML cs.LG stat.CO

    Root-finding Approaches for Computing Conformal Prediction Set

    Authors: Eugene Ndiaye, Ichiro Takeuchi

    Abstract: Conformal prediction constructs a confidence set for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations of responses and features. It has a coverage guarantee at any nominal level without additional assumptions on their distribution. Its computation deplorably requires a refitting procedure for all replacement candidates of the target…

    Submitted 6 December, 2022; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: Published in Machine Learning Journal https://rdcu.be/c0nh9

    Journal ref: Machine Learning Journal, 2022

  42. arXiv:2102.04000  [pdf, other]

    stat.ML cs.LG

    Active learning for distributionally robust level-set estimation

    Authors: Yu Inatsu, Shogo Iwazaki, Ichiro Takeuchi

    Abstract: Many cases exist in which a black-box function $f$ with high evaluation cost depends on two types of variables $\bm x$ and $\bm w$, where $\bm x$ is a controllable \emph{design} variable and $\bm w$ are uncontrollable \emph{environmental} variables that have random variation following a certain distribution $P$. In such cases, an important task is to find the range of design variables $\bm x$ such…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 23 pages, 7 figures

  43. arXiv:2012.13545  [pdf, other]

    stat.ML cs.LG

    More Powerful and General Selective Inference for Stepwise Feature Selection using the Homotopy Continuation Approach

    Authors: Kazuya Sugiyama, Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: Conditional selective inference (SI) has been actively studied as a new statistical inference framework for data-driven hypotheses. The basic idea of conditional SI is to make inferences conditional on the selection event characterized by a set of linear and/or quadratic inequalities. Conditional SI has been mainly studied in the context of feature selection such as stepwise feature selection (SFS…

    Submitted 21 April, 2021; v1 submitted 25 December, 2020; originally announced December 2020.

  44. Supervised sequential pattern mining of event sequences in sport to identify important patterns of play: an application to rugby union

    Authors: Rory Bunker, Keisuke Fujii, Hiroyuki Hanada, Ichiro Takeuchi

    Abstract: Given a set of sequences comprised of time-ordered events, sequential pattern mining is useful to identify frequent subsequences from different sequences or within the same sequence. However, in sport, these techniques cannot determine the importance of particular patterns of play to good or bad outcomes, which is often of greater interest to coaches and performance analysts. In this study, we app…

    Submitted 18 May, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

  45. arXiv:2010.01823  [pdf, other]

    stat.ML cs.CV cs.LG

    Quantifying Statistical Significance of Neural Network-based Image Segmentation by Selective Inference

    Authors: Vo Nguyen Le Duy, Shogo Iwazaki, Ichiro Takeuchi

    Abstract: Although a vast body of literature relates to image segmentation methods that use deep neural networks (DNNs), less attention has been paid to assessing the statistical reliability of segmentation results. In this study, we interpret the segmentation results as hypotheses driven by DNN (called DNN-driven hypotheses) and propose a method by which to quantify the reliability of these hypotheses with…

    Submitted 14 December, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted at NeurIPS 2022

  46. arXiv:2009.08166  [pdf, other]

    stat.ML cs.LG

    Mean-Variance Analysis in Bayesian Optimization under Uncertainty

    Authors: Shogo Iwazaki, Yu Inatsu, Ichiro Takeuchi

    Abstract: We consider active learning (AL) in an uncertain environment in which trade-off between multiple risk measures need to be considered. As an AL problem in such an uncertain environment, we study Mean-Variance Analysis in Bayesian Optimization (MVA-BO) setting. Mean-variance analysis was developed in the field of financial engineering and has been used to make decisions that take into account the tr…

    Submitted 17 September, 2020; originally announced September 2020.

    Comments: 26 pages, 3 figures

  47. arXiv:2006.11986  [pdf, other]

    stat.ML cs.LG

    Bayesian Quadrature Optimization for Probability Threshold Robustness Measure

    Authors: Shogo Iwazaki, Yu Inatsu, Ichiro Takeuchi

    Abstract: In many product development problems, the performance of the product is governed by two types of parameters called design parameter and environmental parameter. While the former is fully controllable, the latter varies depending on the environment in which the product is used. The challenge of such a problem is to find the design parameter that maximizes the probability that the performance of the…

    Submitted 21 June, 2020; originally announced June 2020.

    Comments: 34 pages, 14 figures

  48. arXiv:2004.09749  [pdf, other]

    stat.ML cs.LG

    Parametric Programming Approach for More Powerful and General Lasso Selective Inference

    Authors: Vo Nguyen Le Duy, Ichiro Takeuchi

    Abstract: Selective Inference (SI) has been actively studied in the past few years for conducting inference on the features of linear models that are adaptively selected by feature selection methods such as Lasso. The basic idea of SI is to make inference conditional on the selection event. Unfortunately, the main limitation of the original SI approach for Lasso is that the inference is conducted not only c…

    Submitted 22 February, 2021; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: International Conference on Artificial Intelligence and Statistics (AISTATS) 2021

  49. arXiv:2002.09132  [pdf, other]

    stat.ML cs.LG stat.ME

    Computing Valid p-value for Optimal Changepoint by Selective Inference using Dynamic Programming

    Authors: Vo Nguyen Le Duy, Hiroki Toda, Ryota Sugiyama, Ichiro Takeuchi

    Abstract: There is a vast body of literature related to methods for detecting changepoints (CP). However, less attention has been paid to assessing the statistical reliability of the detected CPs. In this paper, we introduce a novel method to perform statistical inference on the significance of the CPs, estimated by a Dynamic Programming (DP)-based optimal CP detection algorithm. Based on the selective infe…

    Submitted 22 February, 2021; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: Spotlight Presentation at NeurIPS 2020

  50. Distance Metric Learning for Graph Structured Data

    Authors: Tomoki Yoshida, Ichiro Takeuchi, Masayuki Karasuyama

    Abstract: Graphs are versatile tools for representing structured data. As a result, a variety of machine learning methods have been studied for graph data analysis. Although many such learning methods depend on the measurement of differences between input graphs, defining an appropriate distance metric for graphs remains a controversial issue. Hence, we propose a supervised distance metric learning method f…

    Submitted 17 June, 2021; v1 submitted 3 February, 2020; originally announced February 2020.

    Comments: 38 pages, 11 figures. This is a pre-print of an article published in Machine Learning Journal. The final authenticated version is available online at: https://doi.org/10.1007/s10994-021-06009-3