-
Encoding and Decoding Algorithms of ANS Variants and Evaluation of Their Average Code Lengths
Authors:
Hirosuke Yamamoto,
Ken-ichi Iwata
Abstract:
Asymmetric Numeral Systems (ANS), proposed by Jarek Duda, are high-performance distortionless data compression schemes that can achieve almost the same compression performance as arithmetic codes with fewer arithmetic operations than arithmetic coding. Because of this high performance, the ANS is widely used in practical systems by companies such as Facebook, Apple, Google, Dropbox, Microsoft, and Pixar, but many researchers still lack much knowledge about the ANS. This paper thoroughly explains the encoding and decoding algorithms of the ANS, and theoretically analyzes the average code length achievable by the ANS.
Submitted 14 August, 2024;
originally announced August 2024.
-
Estimation of Human Condition at Disaster Site Using Aerial Drone Images
Authors:
Tomoki Arai,
Kenji Iwata,
Kensho Hara,
Yutaka Satoh
Abstract:
Drones are being used to assess the situation in various disasters. In this study, we investigate a method to automatically estimate the damage status of people based on their actions in aerial drone images in order to understand disaster sites faster and save labor. We constructed a new dataset of aerial images of human actions in a hypothetical disaster that occurred in an urban area, and classified the human damage status using 3D ResNet. The results showed that the status with characteristic human actions could be classified with a recall rate of more than 80%, while other statuses with similar human actions could only be classified with a recall rate of about 50%. In addition, a cloud-based VR presentation application suggested the effectiveness of using drones to understand the disaster site and estimate the human condition.
Submitted 8 August, 2023;
originally announced August 2023.
-
The Optimality of AIFV Codes in the Class of $2$-bit Delay Decodable Codes
Authors:
Kengo Hashimoto,
Ken-ichi Iwata
Abstract:
AIFV (almost instantaneous fixed-to-variable length) codes are noiseless source codes that can attain a shorter average codeword length than Huffman codes by allowing a time-variant encoder with two code tables and a decoding delay of at most 2 bits. First, we consider a general class of noiseless source codes, called k-bit delay decodable codes, in which one allows a finite number of code tables and a decoding delay of at most k bits for k >= 0. Then we prove that AIFV codes achieve the optimal average codeword length in the 2-bit delay decodable codes class.
Submitted 16 June, 2023;
originally announced June 2023.
-
Properties of k-bit Delay Decodable Codes
Authors:
Kengo Hashimoto,
Ken-ichi Iwata
Abstract:
The class of k-bit delay decodable codes, source codes allowing decoding delay of at most k bits for k >= 0, can attain a shorter average codeword length than Huffman codes. This paper discusses the general properties of the class of k-bit delay decodable codes with a finite number of code tables and proves two theorems which enable us to limit the scope of code-tuples to be considered when discussing optimal k-bit delay decodable code-tuples.
Submitted 13 June, 2023;
originally announced June 2023.
-
Information Entropy-based Camera Path Estimation for In-Situ Visualization
Authors:
Ken Iwata,
Naohisa Sakamoto,
Jorji Nonaka,
Chongke Bi
Abstract:
In-situ processing has been widely recognized as an effective approach for the visualization and analysis of large-scale simulation outputs from modern HPC systems. One of the most common approaches for batch-based in-situ visualization is the image- or video-based approach. In this kind of approach, a large number of rendered images are generated from different viewpoints at each time step, which has proven useful for detailed analysis of the main simulation results. However, during test runs and model calibration runs before the main simulation run, a quick overview might be sufficient and useful. In this work, we focused on selecting the viewpoints that provide as much information as possible, using information entropy, to support the subsequent visual analysis task. However, simply following the selected viewpoints at each visualization time step will probably lead to a rapidly changing video, which can hinder understanding. Therefore, we have also worked on an efficient camera path estimation approach for connecting selected viewpoints, at regular intervals, to generate a smooth video. The resulting video is expected to assist in rapid understanding of the underlying simulation phenomena and can help narrow down the temporal region of interest, minimizing the turnaround time during detailed visual exploration via image- or video-based visual analysis of the main simulation run. We implemented and evaluated the proposed approach using the OpenFOAM CFD application, on an x86-based server and an ARM A64FX-based supercomputer (Fugaku), and we obtained positive evaluations from domain scientists.
Submitted 30 January, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Bounding Box-based Multi-objective Bayesian Optimization of Risk Measures under Input Uncertainty
Authors:
Yu Inatsu,
Shion Takeno,
Hiroyuki Hanada,
Kazuki Iwata,
Ichiro Takeuchi
Abstract:
In this study, we propose a novel multi-objective Bayesian optimization (MOBO) method to efficiently identify the Pareto front (PF) defined by risk measures for black-box functions in the presence of input uncertainty (IU). Existing BO methods for Pareto optimization in the presence of IU are risk-specific or lack theoretical guarantees, whereas our proposed method addresses general risk measures and has theoretical guarantees. The basic idea of the proposed method is to assume a Gaussian process (GP) model for the black-box function and to construct high-probability bounding boxes for the risk measures using the GP model. Furthermore, in order to reduce the uncertainty of non-dominated bounding boxes, we propose a method of selecting the next evaluation point using a maximin distance defined by the maximum value of a quasi distance based on bounding boxes. As a theoretical analysis, we prove that the algorithm can return an arbitrarily accurate solution in a finite number of iterations with high probability, for various risk measures such as Bayes risk, worst-case risk, and value-at-risk. We also give a theoretical analysis that takes into account approximation errors, because non-negligible approximation errors (e.g., finite approximation of PFs and sampling-based approximation of bounding boxes) exist in practice. Through numerical experiments, we confirm that the proposed method outperforms existing methods not only in the setting with IU but also in the setting of ordinary MOBO.
Submitted 24 November, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Optimality of Huffman Code in the Class of 1-bit Delay Decodable Codes
Authors:
Kengo Hashimoto,
Ken-ichi Iwata
Abstract:
For a given independent and identically distributed (i.i.d.) source, a Huffman code achieves the optimal average codeword length in the class of instantaneous codes with a single code table. However, it is known that there exist time-variant encoders that achieve a shorter average codeword length than the Huffman code by using multiple code tables and allowing at most k-bit decoding delay for k = 2, 3, 4, .... On the other hand, it is not known whether there exists a 1-bit delay decodable code that achieves a shorter average length than the Huffman code. This paper proves that for a given i.i.d. source, a Huffman code achieves the optimal average codeword length in the class of 1-bit delay decodable codes with a finite number of code tables.
Submitted 19 September, 2022;
originally announced September 2022.
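The quantity compared in the abstract above is the average codeword length of a Huffman code for an i.i.d. source. As a minimal illustrative sketch (not taken from the paper; the function name `huffman_average_length` is hypothetical), the following Python computes that length using the standard fact that it equals the sum of the probabilities of all internal nodes created while merging.

```python
import heapq


def huffman_average_length(probs):
    """Return the average codeword length (bits/symbol) of a binary Huffman code.

    Assumes `probs` is a list of positive symbol probabilities summing to 1.
    """
    if len(probs) < 2:
        return 0.0  # a single-symbol source needs no bits in this convention
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        # Merge the two least probable nodes, as in Huffman's algorithm.
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        # Each merge adds one bit to every leaf below the new node.
        total += a + b
        heapq.heappush(heap, a + b)
    return total
```

For example, the source (0.4, 0.3, 0.2, 0.1) yields codeword lengths 1, 2, 3, 3, so the sketch returns 1.9 bits per symbol up to floating-point rounding.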
-
Describing and Localizing Multiple Changes with Transformers
Authors:
Yue Qiu,
Shintaro Yamamoto,
Kodai Nakashima,
Ryota Suzuki,
Kenji Iwata,
Hirokatsu Kataoka,
Yutaka Satoh
Abstract:
Change captioning tasks aim to detect changes in image pairs observed before and after a scene change and generate a natural language description of the changes. Existing change captioning studies have mainly focused on a single change. However, detecting and describing multiple changed parts in image pairs is essential for enhancing adaptability to complex scenarios. We solve the above issues from three aspects: (i) We propose a simulation-based multi-change captioning dataset; (ii) We benchmark existing state-of-the-art methods of single change captioning on multi-change captioning; (iii) We further propose Multi-Change Captioning transformers (MCCFormers) that identify change regions by densely correlating different regions in image pairs and dynamically determine the related change regions with words in sentences. The proposed method obtained the highest scores on four conventional change captioning evaluation metrics for multi-change captioning. Additionally, our proposed method can separate attention maps for each change and performs well with respect to change localization. Moreover, the proposed framework outperformed the previous state-of-the-art methods on an existing change captioning benchmark, CLEVR-Change, by a large margin (+6.1 on BLEU-4 and +9.7 on CIDEr scores), indicating its general ability in change captioning tasks.
Submitted 14 September, 2021; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Can Vision Transformers Learn without Natural Images?
Authors:
Kodai Nakashima,
Hirokatsu Kataoka,
Asato Matsumoto,
Kenji Iwata,
Nakamasa Inoue
Abstract:
Can we complete pre-training of Vision Transformers (ViT) without natural images and human-annotated labels? Although a pre-trained ViT seems to heavily rely on a large-scale dataset and human-annotated labels, recent large-scale datasets contain several problems in terms of privacy violations, inadequate fairness protection, and labor-intensive annotation. In the present paper, we pre-train ViT without any image collections and annotation labor. We experimentally verify that our proposed framework partially outperforms sophisticated Self-Supervised Learning (SSL) methods like SimCLRv2 and MoCov2 without using any natural images in the pre-training phase. Moreover, although the ViT pre-trained without natural images produces some different visualizations from ImageNet pre-trained ViT, it can interpret natural image datasets to a large extent. For example, the performance rates on the CIFAR-10 dataset are as follows: our proposal 97.6 vs. SimCLRv2 97.4 vs. ImageNet 98.0.
Submitted 24 March, 2021;
originally announced March 2021.
-
Graph Blind Deconvolution with Sparseness Constraint
Authors:
Kazuma Iwata,
Koki Yamada,
Yuichi Tanaka
Abstract:
We propose a blind deconvolution method for signals on graphs, with the exact sparseness constraint for the original signal. Graph blind deconvolution is an algorithm for estimating the original signal on a graph from a set of blurred and noisy measurements. Imposing a constraint on the number of nonzero elements is desirable for many different applications. This paper deals with the problem with constraints placed on the exact number of original sources, which is given by an optimization problem with an $\ell_0$ norm constraint. We solve this non-convex optimization problem using the ADMM iterative solver. Numerical experiments using synthetic signals demonstrate the effectiveness of the proposed method.
Submitted 26 October, 2020;
originally announced October 2020.
-
Countably Infinite Multilevel Source Polarization for Non-Stationary Erasure Distributions
Authors:
Yuta Sakai,
Ken-ichi Iwata,
Hiroshi Fujisaki
Abstract:
Polar transforms are central operations in the study of polar codes. This paper examines polar transforms for non-stationary memoryless sources on possibly infinite source alphabets. This is the first attempt of source polarization analysis over infinite alphabets. The source alphabet is defined to be a Polish group, and we handle the Arıkan-style two-by-two polar transform based on the group. Defining erasure distributions based on the normal subgroup structure, we give recursive formulas of the polar transform for our proposed erasure distributions. As a result, the recursive formulas lead to concrete examples of multilevel source polarization with countably infinite levels when the group is locally cyclic. We derive this result via elementary techniques in lattice theory.
Submitted 26 April, 2019;
originally announced April 2019.
-
Modular Arithmetic Erasure Channels and Their Multilevel Channel Polarization
Authors:
Yuta Sakai,
Ken-ichi Iwata,
Hiroshi Fujisaki
Abstract:
This study proposes \emph{modular arithmetic erasure channels} (MAECs), a novel class of erasure-like channels with an input alphabet that need not be binary. This class contains the binary erasure channel (BEC) and some other known erasure-like channels as special cases. For MAECs, we provide recursive formulas of Arıkan-like polar transform to simulate channel polarization. In other words, we show that the synthetic channels of MAECs are equivalent to other MAECs. This is a generalization of well-known recursive formulas of the polar transform for BECs. Using our recursive formulas, we also show that a recursive application of the polar transform for MAECs results in \emph{multilevel channel polarization,} which is an asymptotic phenomenon that is characteristic of non-binary polar codes. Specifically, we establish a method to calculate the limiting proportions of the partially noiseless and noisy channels that are generated as a result of multilevel channel polarization for MAECs. In the particular case of MAECs, this calculation method solves an open problem posed by Nasser (2017) in the study of non-binary polar codes.
Submitted 4 May, 2020; v1 submitted 23 April, 2018;
originally announced April 2018.
-
Asymptotic Distribution of Multilevel Channel Polarization for a Certain Class of Erasure Channels
Authors:
Yuta Sakai,
Ken-ichi Iwata,
Hiroshi Fujisaki
Abstract:
This study examines multilevel channel polarization for a certain class of erasure channels whose input alphabet size is an arbitrary composite number. We derive limiting proportions of partially noiseless channels for such a class. The results of this study are proved by an argument of convergent sequences, inspired by Alsan and Telatar's simple proof of polarization, and without martingale convergence theorems for the polarization process.
Submitted 13 January, 2018;
originally announced January 2018.
-
Proceedings of Workshop AEW10: Concepts in Information Theory and Communications
Authors:
Kees A. Schouhamer Immink,
Stan Baggen,
Ferdaous Chaabane,
Yanling Chen,
Peter H. N. de With,
Hela Gassara,
Hamed Gharbi,
Adel Ghazel,
Khaled Grati,
Naira M. Grigoryan,
Ashot Harutyunyan,
Masayuki Imanishi,
Mitsugu Iwamoto,
Ken-ichi Iwata,
Hiroshi Kamabe,
Brian M. Kurkoski,
Shigeaki Kuzuoka,
Patrick Langenhuizen,
Jan Lewandowsky,
Akiko Manada,
Shigeki Miyake,
Hiroyoshi Morita,
Jun Muramatsu,
Safa Najjar,
Arnak V. Poghosyan
, et al. (9 additional authors not shown)
Abstract:
The 10th Asia-Europe workshop in "Concepts in Information Theory and Communications" AEW10 was held in Boppard, Germany on June 21-23, 2017. It is based on a longstanding cooperation between Asian and European scientists. The first workshop was held in Eindhoven, the Netherlands in 1989. The idea of the workshop is threefold: 1) to improve the communication between scientists in different parts of the world; 2) to exchange knowledge and ideas; and 3) to pay tribute to a well-respected and special scientist.
Submitted 27 July, 2017;
originally announced July 2017.
-
Sharp Bounds on Arimoto's Conditional Rényi Entropies Between Two Distinct Orders
Authors:
Yuta Sakai,
Ken-ichi Iwata
Abstract:
This study examines sharp bounds on Arimoto's conditional Rényi entropy of order $β$ when the conditional Rényi entropy of another, distinct order $α\neq β$ is fixed. Arimoto inspired the relation between the Rényi entropy and the $\ell_{r}$-norm of probability distributions, and he introduced a conditional version of the Rényi entropy. From this perspective, we analyze the $\ell_{r}$-norms of particular distributions. As a result, we identify specific probability distributions that achieve our sharp bounds on the conditional Rényi entropy. The sharp bounds derived in this study are applicable to other information measures, e.g., the minimum average probability of error, the Bhattacharyya parameter, Gallager's reliability function $E_{0}$, and Sibson's $α$-mutual information, which are strictly monotone functions of the conditional Rényi entropy.
Submitted 2 February, 2017; v1 submitted 31 January, 2017;
originally announced February 2017.
-
Dominant Codewords Selection with Topic Model for Action Recognition
Authors:
Hirokatsu Kataoka,
Masaki Hayashi,
Kenji Iwata,
Yutaka Satoh,
Yoshimitsu Aoki,
Slobodan Ilic
Abstract:
In this paper, we propose a framework for recognizing human activities that uses only in-topic dominant codewords and a mixture of intertopic vectors. Latent Dirichlet allocation (LDA) is used to develop approximations of human motion primitives; these are mid-level representations, and they adaptively integrate dominant vectors when classifying human activities. In LDA topic modeling, action videos (documents) are represented by a bag-of-words (input from a dictionary), and these are based on improved dense trajectories. The output topics correspond to human motion primitives, such as finger moving or subtle leg motion. We eliminate the impurities, such as missed tracking or changing light conditions, in each motion primitive. The assembled vector of motion primitives is an improved representation of the action. We demonstrate our method on four different datasets.
Submitted 1 May, 2016;
originally announced May 2016.
-
Sharp Bounds Between Two Rényi Entropies of Distinct Positive Orders
Authors:
Yuta Sakai,
Ken-ichi Iwata
Abstract:
Many axiomatic definitions of entropy, such as the Rényi entropy, of a random variable are closely related to the $\ell_α$-norm of its probability distribution. This study considers probability distributions on finite sets, and examines the sharp bounds of the $\ell_β$-norm with a fixed $\ell_α$-norm, $α\neq β$, for $n$-dimensional probability vectors with an integer $n \ge 2$. From the results, we derive the sharp bounds of the Rényi entropy of positive order $β$ with a fixed Rényi entropy of another positive order $α$. As applications, we investigate sharp bounds of Arimoto's mutual information of order $α$ and Gallager's random coding exponents for uniformly focusing channels under the uniform input distribution.
Submitted 5 May, 2016; v1 submitted 29 April, 2016;
originally announced May 2016.
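The abstract above relies on the close relation between the Rényi entropy and the $\ell_α$-norm of a probability vector. As an illustrative sketch of the standard identity $H_α(p) = \frac{α}{1-α} \log \|p\|_α$ for $α > 0$, $α \neq 1$ (the helper names `l_alpha_norm` and `renyi_entropy` are hypothetical, and entropy is measured in nats here by assumption):

```python
import math


def l_alpha_norm(p, alpha):
    """l_alpha-norm of a probability vector p, for alpha > 0."""
    return sum(x ** alpha for x in p) ** (1.0 / alpha)


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (in nats), computed via the l_alpha-norm."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("this sketch covers alpha > 0 with alpha != 1 only")
    return alpha / (1.0 - alpha) * math.log(l_alpha_norm(p, alpha))
```

For a uniform distribution on $n$ symbols the sketch returns $\log n$ for every admissible order, which is a quick sanity check on the identity.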
-
A Generalized Erasure Channel in the Sense of Polarization for Binary Erasure Channels
Authors:
Yuta Sakai,
Ken-ichi Iwata
Abstract:
The polar transformation of a binary erasure channel (BEC) can be exactly approximated by other BECs. Arıkan proposed that polar codes for a BEC can be efficiently constructed by using this useful property. This study proposes a new class of arbitrary-input generalized erasure channels whose polar transformation can be exactly approximated by other channels of the same model, as with the BEC. One of the main results is the recursive formulas of the polar transformation of the proposed channel. In the study, we evaluate the polar transformation by using the $α$-mutual information. In particular, when the input alphabet size is a prime power, we examine the following: (i) inequalities for the average of the $α$-mutual information of the proposed channel after the one-step polar transformation, and (ii) the exact proportion of polarizations of the $α$-mutual information of the proposed channels in an infinite number of polar transformations.
Submitted 15 April, 2016;
originally announced April 2016.
-
Relations Between Conditional Shannon Entropy and Expectation of $\ell_α$-Norm
Authors:
Yuta Sakai,
Ken-ichi Iwata
Abstract:
The paper examines relationships between the conditional Shannon entropy and the expectation of $\ell_α$-norm for joint probability distributions. More precisely, we investigate the tight bounds of the expectation of $\ell_α$-norm with a fixed conditional Shannon entropy, and vice versa. As applications of the results, we derive the tight bounds between the conditional Shannon entropy and several information measures which are determined by the expectation of $\ell_α$-norm, e.g., the conditional Rényi entropy and the conditional $R$-norm information. Moreover, we apply these results to discrete memoryless channels under a uniform input distribution. Then, we show the tight bounds of Gallager's $E_{0}$ functions with a fixed mutual information under a uniform input distribution.
Submitted 15 February, 2016;
originally announced February 2016.
-
Extremal Relations Between Shannon Entropy and $\ell_α$-Norm
Authors:
Yuta Sakai,
Ken-ichi Iwata
Abstract:
The paper examines relationships between the Shannon entropy and the $\ell_α$-norm for $n$-ary probability vectors, $n \ge 2$. More precisely, we investigate the tight bounds of the $\ell_α$-norm with a fixed Shannon entropy, and vice versa. As applications of the results, we derive the tight bounds between the Shannon entropy and several information measures which are determined by the $\ell_α$-norm, e.g., Rényi entropy, Tsallis entropy, the $R$-norm information, and some diversity indices. Moreover, we apply these results to uniformly focusing channels. Then, we show the tight bounds of Gallager's $E_{0}$ functions with a fixed mutual information under a uniform input distribution.
Submitted 28 January, 2016;
originally announced January 2016.
-
Feature Evaluation of Deep Convolutional Neural Networks for Object Recognition and Detection
Authors:
Hirokatsu Kataoka,
Kenji Iwata,
Yutaka Satoh
Abstract:
In this paper, we evaluate convolutional neural network (CNN) features using the AlexNet architecture and very deep convolutional network (VGGNet) architecture. To date, most CNN researchers have employed the last layers before output, which were extracted from the fully connected feature layers. However, since it is unlikely that feature representation effectiveness is dependent on the problem, this study evaluates additional convolutional layers that are adjacent to fully connected layers, in addition to executing simple tuning for feature concatenation (e.g., layer 3 + layer 5 + layer 7) and transformation, using tools such as principal component analysis. In our experiments, we carried out detection and classification tasks using the Caltech 101 and Daimler Pedestrian Benchmark Datasets.
Submitted 25 September, 2015;
originally announced September 2015.