-
Adaptive Deep Iris Feature Extractor at Arbitrary Resolutions
Authors:
Yuho Shoji,
Yuka Ogino,
Takahiro Toizumi,
Atsushi Ito
Abstract:
This paper proposes a deep feature extractor for iris recognition at arbitrary resolutions. Resolution degradation reduces the recognition performance of deep learning models trained on high-resolution images. Training on images of various resolutions can improve the model's robustness, but at the cost of recognition performance on high-resolution images. To achieve higher recognition performance at various resolutions, we propose a method of resolution-adaptive feature extraction with automatically switching networks. Our framework includes resolution expert modules specialized for different resolution degradations, including down-sampling and out-of-focus blurring. The framework automatically switches between them depending on the degradation condition of an input image. Lower-resolution experts are trained by knowledge distillation from the high-resolution expert so that both experts extract common identity features. We applied our framework to three conventional neural network models. The experimental results show that our method enhances the recognition performance of the conventional methods at low resolution while maintaining their performance at high resolution.
Submitted 12 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Embedding Digital Signature into CSV Files Using Data Hiding
Authors:
Akinori Ito
Abstract:
Open data is an important basis for open science and evidence-based policymaking. Governments of many countries disclose government-related statistics as open data, and some of these data are provided as CSV files. However, since CSV files are plain text, we cannot ensure the integrity of a downloaded CSV file. A popular way to prove data integrity is a digital signature; however, it is difficult to embed a signature into a CSV file. This paper proposes a method for embedding a digital signature into a CSV file using a data hiding technique. The proposed method exploits a redundancy of the CSV format related to the use of double quotes. The experiment showed that we could embed a 512-bit signature into actual open data CSV files.
Submitted 6 July, 2024;
originally announced July 2024.
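The double-quote redundancy mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: it simply assumes that a field containing no comma, quote, or line break may be written either quoted or unquoted without changing its value, and uses that choice to carry one bit per field. The function names `embed_bits` and `extract_bits` are hypothetical, and the parser assumes well-formed CSV text with CRLF line endings and no empty rows.

```python
import csv
import io

def _needs_quotes(field):
    # Fields with these characters must be quoted, so they cannot carry a bit.
    return any(c in field for c in ',"\r\n')

def embed_bits(rows, bits):
    """Serialize rows to CSV, quoting each 'free' field iff its payload bit is 1."""
    bit_iter = iter(bits)
    lines = []
    for row in rows:
        out = []
        for field in row:
            if _needs_quotes(field):
                out.append('"' + field.replace('"', '""') + '"')
            else:
                bit = next(bit_iter, 0)  # pad with 0 once the payload is used up
                out.append('"' + field + '"' if bit else field)
        lines.append(",".join(out))
    return "\r\n".join(lines) + "\r\n"

def _parse_fields(text):
    """Yield (value, was_quoted) for every field of well-formed CSV text."""
    i, n = 0, len(text)
    while i < n:
        if text[i] == '"':
            i += 1
            buf = []
            while True:
                if text[i] == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        buf.append('"')  # escaped quote inside a quoted field
                        i += 2
                    else:
                        i += 1  # closing quote
                        break
                else:
                    buf.append(text[i])
                    i += 1
            yield "".join(buf), True
        else:
            j = i
            while j < n and text[j] not in ",\r\n":
                j += 1
            yield text[i:j], False
            i = j
        # Skip the separator that follows the field (comma or CRLF).
        if i < n and text[i] == ",":
            i += 1
        elif text.startswith("\r\n", i):
            i += 2

def extract_bits(text, n_bits):
    """Recover the first n_bits payload bits from the quoting pattern."""
    bits = [1 if quoted else 0
            for value, quoted in _parse_fields(text)
            if not _needs_quotes(value)]
    return bits[:n_bits]

if __name__ == "__main__":
    rows = [["id", "name", "note"], ["1", "Sendai", "a,b"], ["2", "Tokyo", ""]]
    payload = [1, 0, 1, 1, 0, 0, 1, 1]  # stand-in for signature bits
    text = embed_bits(rows, payload)
    print(text)
    print(extract_bits(text, len(payload)))
    # A standard CSV reader still sees the original values.
    print(list(csv.reader(io.StringIO(text, newline=""))))
```

In this sketch the capacity equals the number of fields that do not require quoting, so a 512-bit payload would need at least 512 such fields.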
-
Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching
Authors:
Akira Ito,
Masanori Yamada,
Atsutoshi Kumagai
Abstract:
Recently, Ainsworth et al. showed that using weight matching (WM) to minimize the $L_2$ distance in a permutation search of model parameters effectively identifies permutations that satisfy linear mode connectivity (LMC), in which the loss along a linear path between two independently trained models with different seeds remains nearly constant. This paper provides a theoretical analysis of LMC using WM, which is crucial for understanding stochastic gradient descent's effectiveness and its application in areas like model merging. We first experimentally and theoretically show that permutations found by WM do not significantly reduce the $L_2$ distance between two models and the occurrence of LMC is not merely due to distance reduction by WM in itself. We then provide theoretical insights showing that permutations can change the directions of the singular vectors, but not the singular values, of the weight matrices in each layer. This finding shows that permutations found by WM mainly align the directions of singular vectors associated with large singular values across models. This alignment brings the singular vectors with large singular values, which determine the model functionality, closer between pre-merged and post-merged models, so that the post-merged model retains functionality similar to the pre-merged models, making it easy to satisfy LMC. Finally, we analyze the difference between WM and straight-through estimator (STE), a dataset-dependent permutation search method, and show that WM outperforms STE, especially when merging three or more models.
Submitted 15 April, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
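The statement in the abstract above that permutations change the directions of singular vectors but not the singular values follows from a standard linear-algebra fact, sketched here under assumed notation (the symbols $W$, $U$, $\Sigma$, $V$, and $P$ are not taken from the paper). Let $W = U \Sigma V^{\top}$ be a singular value decomposition of a layer's weight matrix and let $P$ be a permutation matrix acting on its rows:

```latex
\begin{align}
W   &= U \Sigma V^{\top}, \\
P W &= (P U)\, \Sigma V^{\top}, \qquad
(P U)^{\top} (P U) = U^{\top} P^{\top} P\, U = U^{\top} U = I .
\end{align}
```

Since $P U$ still has orthonormal columns, $(P U)\, \Sigma V^{\top}$ is a valid singular value decomposition of $P W$: the singular values in $\Sigma$ are unchanged, while the left singular vectors are permuted. The same argument applies to a permutation of the columns, $W P^{\top} = U \Sigma (P V)^{\top}$.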
-
Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning
Authors:
Xuecheng Niu,
Akinori Ito,
Takashi Nose
Abstract:
Training task-oriented dialog agents based on reinforcement learning is time-consuming and requires a large number of interactions with real users. Learning a dialog policy from limited dialog experience remains an obstacle that makes the agent training process less efficient. In addition, most previous frameworks start training by randomly choosing training samples, which differs from the way humans learn and hurts the efficiency and stability of training. Therefore, we propose Scheduled Curiosity-Deep Dyna-Q (SC-DDQ), a curiosity-driven curriculum learning framework based on a state-of-the-art model-based reinforcement learning dialog model, Deep Dyna-Q (DDQ). Furthermore, we designed learning schedules for SC-DDQ and DDQ, respectively, following two opposite training strategies: classic curriculum learning and its reverse version. Our results show that by introducing scheduled learning and curiosity, the new framework leads to a significant improvement over DDQ and Deep Q-learning (DQN). Surprisingly, we found that traditional curriculum learning was not always effective. Specifically, according to the experimental results, the easy-first and difficult-first strategies are more suitable for SC-DDQ and DDQ. To analyze our results, we adopted the entropy of sampled actions to depict action exploration and found that training strategies with high entropy in the first stage and low entropy in the last stage lead to better performance.
Submitted 20 May, 2024; v1 submitted 31 January, 2024;
originally announced February 2024.
-
Improving Low-Light Image Recognition Performance Based on Image-adaptive Learnable Module
Authors:
Seitaro Ono,
Yuka Ogino,
Takahiro Toizumi,
Atsushi Ito,
Masato Tsukada
Abstract:
In recent years, significant progress has been made in image recognition technology based on deep neural networks. However, improving recognition performance under low-light conditions remains a significant challenge. This study addresses the enhancement of recognition model performance in low-light conditions. We propose an image-adaptive learnable module that applies appropriate image processing to input images, together with a hyperparameter predictor that forecasts the optimal parameters used in the module. The proposed approach enhances recognition performance under low-light conditions and can be easily integrated as a front-end filter without the need to retrain existing recognition models designed for low-light conditions. Experiments demonstrate that the proposed method contributes to enhancing image recognition performance under low-light conditions.
Submitted 12 January, 2024;
originally announced January 2024.
-
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Authors:
Aoi Ito,
Shota Horiguchi
Abstract:
Large-scale pretrained models using self-supervised learning have reportedly improved the performance of speech anti-spoofing. However, the attacker side may also make use of such models. Moreover, since it is very expensive to train such models from scratch, pretrained models available on the Internet are often used, so the attacker and defender may use the same pretrained model. This paper investigates whether the improvement in anti-spoofing with pretrained models holds under the condition that the models are available to attackers. As the attacker, we train a model that enhances spoofed utterances so that the speaker embedding extractor based on the pretrained models cannot distinguish between bona fide and spoofed utterances. Experimental results show that the gains the anti-spoofing models obtained by using the pretrained models almost disappear if the attacker also makes use of the pretrained models.
Submitted 24 May, 2023;
originally announced May 2023.
-
Computer-assisted proofs of "Kariya's theorem" with computer algebra
Authors:
Ayane Ito,
Takefumi Kasai,
Akira Terui
Abstract:
We demonstrate computer-assisted proofs of "Kariya's theorem," a theorem in elementary geometry, with computer algebra. When proving a geometry theorem with computer algebra, the vertices of the geometric figures that are the subject of the proof are expressed as variables. The variables are classified into two classes: arbitrarily given points and points defined from the former by constraints. We show proofs of Kariya's theorem with two formulations, according to two ways of giving the arbitrary points: one is called the "vertex formulation," and the other is called the "incenter formulation." Each is proved with two methods: Gröbner basis computation and Wu's method. Furthermore, we show computer-assisted proofs of the property that the so-called "Kariya point" is located on the so-called "Feuerbach's hyperbola," with the same two formulations and two methods.
Submitted 15 April, 2023;
originally announced April 2023.
-
MPC Builder for Autonomous Drive: Automatic Generation of MPCs for Motion Planning and Control
Authors:
Kohei Honda,
Hiroyuki Okuda,
Tatsuya Suzuki,
Akira Ito
Abstract:
This study presents MPC Builder, a new framework for vehicle motion planning and control based on the automatic generation of model predictive controllers (MPCs). In this framework, several components necessary for MPC, such as prediction models, constraints, and cost functions, are prepared in advance. The MPC Builder then generates various MPCs online in a unified manner according to traffic situations. This scheme enabled us to represent various driving tasks with less design effort than typical switched MPC systems. The proposed framework was implemented with the continuation/generalized minimum residual (C/GMRES) method as the optimization solver, which can reduce computational costs. Finally, numerical experiments on multiple driving scenarios are presented.
Submitted 22 April, 2023; v1 submitted 29 October, 2022;
originally announced October 2022.
-
Image quality enhancement of embedded holograms in holographic information hiding using deep neural networks
Authors:
Tomoyoshi Shimobaba,
Sota Oshima,
Takashi Kakue,
Tomoyoshi Ito
Abstract:
Holographic information hiding is a technique for embedding holograms or images into another hologram, used for copyright protection and steganography of holograms. Using deep neural networks, we offer a way to improve the visual quality of embedded holograms. The brightness of an embedded hologram is set to a fraction of that of the host hologram, so that the reconstructed image of the host hologram is barely damaged. However, the embedded hologram's reconstructed image is difficult to perceive because it is darker than the reconstructed host image. In this study, we use deep neural networks to restore the darkened image.
Submitted 19 December, 2021;
originally announced December 2021.
-
Projective reconstruction in algebraic vision
Authors:
Atsushi Ito,
Makoto Miura,
Kazushi Ueda
Abstract:
We discuss the geometry of rational maps from a projective space of an arbitrary dimension to the product of projective spaces of lower dimensions induced by linear projections. In particular, we give an algebro-geometric variant of the projective reconstruction theorem by Hartley and Schaffalitzky [HS09].
Submitted 11 November, 2019; v1 submitted 17 October, 2017;
originally announced October 2017.
-
Context-Sensitive Measurement of Word Distance by Adaptive Scaling of a Semantic Space
Authors:
Hideki Kozima,
Akira Ito
Abstract:
The paper proposes a computationally feasible method for measuring context-sensitive semantic distance between words. The distance is computed by adaptive scaling of a semantic space. In the semantic space, each word in the vocabulary V is represented by a multi-dimensional vector which is obtained from an English dictionary through a principal component analysis. Given a word set C which specifies a context for measuring word distance, each dimension of the semantic space is scaled up or down according to the distribution of C in the semantic space. In the space thus transformed, distance between words in V becomes dependent on the context C. An evaluation through a word prediction task shows that the proposed measurement successfully extracts the context of a text.
Submitted 25 June, 1996; v1 submitted 23 January, 1996;
originally announced January 1996.