-
Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport
Authors:
Kotaro Ikeda,
Tomoya Uda,
Daisuke Okanohara,
Sosuke Ito
Abstract:
We discuss a connection between a generative model, called the diffusion model, and nonequilibrium thermodynamics for the Fokker-Planck equation, called stochastic thermodynamics. Based on the techniques of stochastic thermodynamics, we derive the speed-accuracy trade-off for the diffusion models, which is a trade-off relationship between the speed and accuracy of data generation in diffusion models. Our result implies that the entropy production rate in the forward process affects the errors in data generation. From a stochastic thermodynamic perspective, our results provide quantitative insight into how best to generate data in diffusion models. The optimal learning protocol is introduced by the conservative force in stochastic thermodynamics and the geodesic of space by the 2-Wasserstein distance in optimal transport theory. We numerically illustrate the validity of the speed-accuracy trade-off for the diffusion models with different noise schedules such as the cosine schedule, the conditional optimal transport, and the optimal transport.
Submitted 22 July, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
PlaNet-S: Automatic Semantic Segmentation of Placenta
Authors:
Shinnosuke Yamamoto,
Isso Saito,
Eichi Takaya,
Ayaka Harigai,
Tomomi Sato,
Tomoya Kobayashi,
Kei Takase,
Takuya Ueda
Abstract:
[Purpose] To develop a fully automated semantic placenta segmentation model that integrates the U-Net and SegNeXt architectures through ensemble learning. [Methods] A total of 218 pregnant women with suspected placental anomalies who underwent magnetic resonance imaging (MRI) were enrolled, yielding 1090 annotated images for developing a deep learning model for placental segmentation. The images were standardized and divided into training and test sets. The performance of PlaNet-S, which integrates U-Net and SegNeXt within an ensemble framework, was assessed using Intersection over Union (IoU) and counting connected components (CCC) against the U-Net model. [Results] PlaNet-S had significantly higher IoU (0.73 +/- 0.13) than that of U-Net (0.78 +/- 0.010) (p<0.01). The CCC for PlaNet-S was significantly higher than that for U-Net (p<0.01), matching the ground truth in 86.0% and 56.7% of the cases, respectively. [Conclusion] PlaNet-S performed better than the traditional U-Net in placental segmentation tasks. This model addresses the challenges of time-consuming physician-assisted manual segmentation and offers the potential for diverse applications in placental imaging analyses.
Submitted 26 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
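The two evaluation measures named in the abstract above, Intersection over Union and the number of connected components, can be illustrated with a minimal sketch. It assumes binary masks stored as lists of rows of 0/1 and 4-connectivity. The function names `iou` and `count_components` are hypothetical and are not taken from the PlaNet-S code, and the abstract does not say which connectivity the authors used.

```python
def iou(pred, truth):
    """Intersection over Union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly (a convention assumed here).
    return inter / union if union else 1.0


def count_components(mask):
    """Count 4-connected foreground regions using an iterative flood fill."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# Toy masks (illustrative only, not data from the study).
pred = [[1, 1, 0],
        [0, 0, 0],
        [0, 0, 1]]
truth = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 1]]
print(iou(pred, truth), count_components(pred), count_components(truth))
```

A segmentation "matches the ground truth" on the component count when `count_components(pred) == count_components(truth)`, which is the sense in which the abstract reports a matching percentage.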
-
A computationally efficient semi-blind source separation based approach for nonlinear echo cancellation based on an element-wise iterative source steering
Authors:
Kunxing Lu,
Xianrui Wang,
Tetsuya Ueda,
Shoji Makino,
Jingdong Chen
Abstract:
While the semi-blind source separation-based acoustic echo cancellation (SBSS-AEC) has received much research attention due to its promising performance during double-talk compared to the traditional adaptive algorithms, it suffers from system latency and nonlinear distortions. To circumvent these drawbacks, the recently developed ideas on convolutive transfer function (CTF) approximation and nonlinear expansion have been used in the iterative projection (IP)-based semi-blind source separation (SBSS) algorithm. However, because of the introduction of CTF approximation and nonlinear expansion, this algorithm becomes computationally very expensive, which makes it difficult to implement in embedded systems. Thus, we attempt in this paper to improve this IP-based algorithm, thereby developing an element-wise iterative source steering (EISS) algorithm. In comparison with the IP-based SBSS algorithm, the proposed algorithm is computationally much more efficient, especially when the nonlinear expansion order is high and the length of the CTF filter is long. Meanwhile, its AEC performance is as good as that of IP-based SBSS.
Submitted 13 December, 2023;
originally announced December 2023.
-
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis
Authors:
Shashank Kotyan,
Tatsuya Ueda,
Danilo Vasconcellos Vargas
Abstract:
Most examinations of neural networks' learned latent spaces employ dimensionality reduction techniques such as t-SNE or UMAP. These methods distort the local neighborhood in the visualization, making it hard to distinguish the structure of a subset of samples in the latent space. In response to this challenge, we introduce the k* distribution and its corresponding visualization technique. This method uses local neighborhood analysis to guarantee the preservation of the structure of sample distributions for individual classes within the subset of the learned latent space. This facilitates easy comparison of different k* distributions, enabling analysis of how various classes are processed by the same neural network. Our study reveals three distinct distributions of samples within the learned latent space subset: a) Fractured, b) Overlapped, and c) Clustered, providing a more profound understanding of existing contemporary visualizations. Experiments show that the distribution of samples within the network's learned latent space varies significantly depending on the class. Furthermore, we illustrate that our analysis can be applied to explore the latent space of diverse neural network architectures, various layers within neural networks, transformations applied to input samples, and the distribution of training and testing data for neural networks. Thus, the k* distribution should aid in visualizing the structure inside neural networks and further foster their understanding. The project website is available online at https://shashankkotyan.github.io/k-Distribution/.
Submitted 16 August, 2024; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Marine Snow Removal Benchmarking Dataset
Authors:
Reina Kaneko,
Yuya Sato,
Takumi Ueda,
Hiroshi Higashi,
Yuichi Tanaka
Abstract:
This paper introduces a new benchmarking dataset for marine snow removal of underwater images. Marine snow is one of the main degradation sources of underwater images and is caused by small particles, e.g., organic matter and sand, between the underwater scene and photosensors. We mathematically model two typical types of marine snow from the observations of real underwater images. The modeled artifacts are synthesized with underwater images to construct large-scale pairs of ground truth and degraded images to calculate objective qualities for marine snow removal and to train a deep neural network. We propose two marine snow removal tasks using the dataset and show the first benchmarking results of marine snow removal. The Marine Snow Removal Benchmarking Dataset is publicly available online.
Submitted 12 January, 2024; v1 submitted 25 March, 2021;
originally announced March 2021.
-
IlluminatedFocus: Vision Augmentation using Spatial Defocusing via Focal Sweep Eyeglasses and High-Speed Projector
Authors:
Tatsuyuki Ueda,
Daisuke Iwai,
Takefumi Hiraki,
Kosuke Sato
Abstract:
Aiming at realizing novel vision augmentation experiences, this paper proposes the IlluminatedFocus technique, which spatially defocuses real-world appearances regardless of the distance from the user's eyes to observed real objects. With the proposed technique, a part of a real object in an image appears blurred, while the fine details of the other part at the same distance remain visible. We apply Electrically Focus-Tunable Lenses (ETL) as eyeglasses and a synchronized high-speed projector as illumination for a real scene. We periodically modulate the focal lengths of the glasses (focal sweep) at more than 60 Hz so that a wearer cannot perceive the modulation. A part of the scene to appear focused is illuminated by the projector when it is in focus of the user's eyes, while another part to appear blurred is illuminated when it is out of the focus. As the basis of our spatial focus control, we build mathematical models to predict the range of distance from the ETL within which real objects become blurred on the retina of a user. Based on the blur range, we discuss a design guideline for effective illumination timing and focal sweep range. We also model the apparent size of a real scene altered by the focal length modulation. This leads to an undesirable visible seam between focused and blurred areas. We solve this unique problem by gradually blending the two areas. Finally, we demonstrate the feasibility of our proposal by implementing various vision augmentation applications.
Submitted 6 February, 2020;
originally announced February 2020.
-
FORM version 4.2
Authors:
Ben Ruijl,
Takahiro Ueda,
Jos Vermaseren
Abstract:
We introduce FORM 4.2, a new minor release of the symbolic manipulation toolkit. We demonstrate several new features, such as a new pattern matching option, new output optimization, and automatic expansion of rational functions.
Submitted 20 July, 2017;
originally announced July 2017.
-
How to solve the cake-cutting problem in sublinear time
Authors:
Hiro Ito,
Takahiro Ueda
Abstract:
In this paper, we show algorithms for solving the cake-cutting problem in sublinear time. More specifically, we preassign (simple) fair portions to o(n) players in o(n) time, and minimize the damage to the rest of the players. All currently known algorithms require Omega(n) time, even when assigning a portion to just one player, and it is nontrivial to revise these algorithms to run in o(n) time since many of the remaining players, who have not been asked any queries, may not be satisfied with the remaining cake. To tackle this problem, we begin by providing a framework for solving the cake-cutting problem in sublinear time. Generally speaking, solving a problem in sublinear time requires the use of approximations. However, in our framework, we introduce the concept of "eps n-victims," which means that eps n players (victims) may not get fair portions, where 0 < eps <= 1 is an arbitrary constant. In our framework, an algorithm consists of the following two parts: in the first (Preassigning) part, it distributes fair portions to r < n players in o(n) time; in the second (Completion) part, it distributes fair portions to the remaining n-r players, except for the eps n victims, in poly(n) time. There are two variations on the r players in the first part: they either can or cannot be designated. We then present algorithms in this framework. In particular, we present an O(r/eps)-time algorithm for r <= eps n/127 undesignated players with eps n-victims, and an O~(r^2/eps)-time algorithm for r <= eps e^{sqrt{ln n}/7} designated players and eps <= 1/e with eps n-victims.
Submitted 23 July, 2015; v1 submitted 3 April, 2015;
originally announced April 2015.
-
Code Optimization in FORM
Authors:
J. Kuipers,
T. Ueda,
J. A. M. Vermaseren
Abstract:
We describe the implementation of output code optimization in the open source computer algebra system FORM. This implementation is based on recently discovered techniques of Monte Carlo tree search to find efficient multivariate Horner schemes, in combination with other optimization algorithms, such as common subexpression elimination. For systems for which no specific knowledge is provided it performs significantly better than other methods we could compare with. Because the method has a number of free parameters, we also show some methods by which to tune them to different types of problems.
Submitted 25 October, 2013;
originally announced October 2013.
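For readers unfamiliar with Horner schemes, the univariate case can be sketched as follows. This is only an illustration of the basic idea: the abstract above concerns multivariate schemes, where the order in which variables are factored out changes the operation count, and it is that ordering which the authors search for. The function name `horner` is hypothetical and nothing here reflects FORM's actual implementation.

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x using Horner's rule.

    coeffs are ordered from the highest-degree term down to the constant,
    so [2, -6, 2, -1] means 2*x**3 - 6*x**2 + 2*x - 1.
    A degree-n polynomial needs n multiplications and n additions.
    """
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


# 2*x**3 - 6*x**2 + 2*x - 1 rewritten as ((2*x - 6)*x + 2)*x - 1
print(horner([2, -6, 2, -1], 3))  # prints 5
```

In the multivariate setting the same nesting is applied one variable at a time, so different variable orders yield expressions with different numbers of multiplications.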
-
FORM version 4.0
Authors:
J. Kuipers,
T. Ueda,
J. A. M. Vermaseren,
J. Vollinga
Abstract:
We present version 4.0 of the symbolic manipulation system FORM. The most important new features are manipulation of rational polynomials and the factorization of expressions. Many other new functions and commands are also added; some of them are very general, while others are designed for building specific high level packages, such as one for Groebner bases. Also new is the checkpoint facility, which allows for periodic backups during long calculations. Lastly, FORM 4.0 has become available as open source under the GNU General Public License version 3.
Submitted 29 March, 2012;
originally announced March 2012.