Showing 1–16 of 16 results for author: Niwa, K

Searching in archive cs.
  1. arXiv:2405.15010

    cs.LG math.OC

    Polyak Meets Parameter-free Clipped Gradient Descent

    Authors: Yuki Takezawa, Han Bao, Ryoma Sato, Kenta Niwa, Makoto Yamada

    Abstract: Gradient descent and its variants are de facto standard algorithms for training machine learning models. As gradient descent is sensitive to its hyperparameters, we need to tune the hyperparameters carefully using a grid search, but it is time-consuming, especially when multiple hyperparameters exist. Recently, parameter-free methods that adjust the hyperparameters on the fly have been studied. Ho…

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2311.13147

    cs.LG

    Optimal Transport with Cyclic Symmetry

    Authors: Shoichiro Takeda, Yasunori Akagi, Naoki Marumo, Kenta Niwa

    Abstract: We propose novel fast algorithms for optimal transport (OT) utilizing a cyclic symmetry structure of input data. Such OT with cyclic symmetry appears universally in various real-world examples: image processing, urban planning, and graph processing. Our main idea is to reduce OT to a small optimization problem that has significantly fewer variables by utilizing cyclic symmetry and various optimiza…

    Submitted 21 November, 2023; originally announced November 2023.

  3. arXiv:2310.08920

    cs.LG cs.AI cs.CR

    Embarrassingly Simple Text Watermarks

    Authors: Ryoma Sato, Yuki Takezawa, Han Bao, Kenta Niwa, Makoto Yamada

    Abstract: We propose Easymark, a family of embarrassingly simple yet effective watermarks. Text watermarking is becoming increasingly important with the advent of Large Language Models (LLM). LLMs can generate texts that cannot be distinguished from human-written texts. This is a serious problem for the credibility of the text. Easymark is a simple yet effective solution to this problem. Easymark can inject…

    Submitted 13 October, 2023; originally announced October 2023.

  4. arXiv:2310.00833

    cs.CL cs.LG

    Necessary and Sufficient Watermark for Large Language Models

    Authors: Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada

    Abstract: In recent years, large language models (LLMs) have achieved remarkable performances in various NLP tasks. They can generate texts that are indistinguishable from those written by humans. Such remarkable performance of LLMs increases their risk of being used for malicious purposes, such as generating fake news articles. Therefore, it is necessary to develop methods for distinguishing texts written…

    Submitted 1 October, 2023; originally announced October 2023.

  5. arXiv:2305.11420

    cs.LG cs.DC stat.ML

    Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence

    Authors: Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada

    Abstract: Decentralized learning has recently been attracting increasing attention for its applications in parallel computation and privacy preservation. Many recent studies stated that the underlying network topology with a faster consensus rate (a.k.a. spectral gap) leads to a better convergence rate and accuracy for decentralized learning. However, a topology with a fast consensus rate, e.g., the exponen…

    Submitted 15 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  6. arXiv:2209.15505

    cs.LG

    Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data

    Authors: Yuki Takezawa, Han Bao, Kenta Niwa, Ryoma Sato, Makoto Yamada

    Abstract: SGD with momentum is one of the key components for improving the performance of neural networks. For decentralized learning, a straightforward approach using momentum is Distributed SGD (DSGD) with momentum (DSGDm). However, DSGDm performs worse than DSGD when the data distributions are statistically heterogeneous. Recently, several studies have addressed this issue and proposed methods with momen…

    Submitted 24 September, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: Transactions on Machine Learning Research 2023

  7. arXiv:2205.11979

    math.OC cs.LG

    Theoretical Analysis of Primal-Dual Algorithm for Non-Convex Stochastic Decentralized Optimization

    Authors: Yuki Takezawa, Kenta Niwa, Makoto Yamada

    Abstract: In recent years, decentralized learning has emerged as a powerful tool not only for large-scale machine learning, but also for preserving privacy. One of the key challenges in decentralized learning is that the data distribution held by each node is statistically heterogeneous. To address this challenge, the primal-dual algorithm called the Edge-Consensus Learning (ECL) was proposed and was experi…

    Submitted 22 September, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

  8. arXiv:2205.03779

    cs.LG

    Communication Compression for Decentralized Learning with Operator Splitting Methods

    Authors: Yuki Takezawa, Kenta Niwa, Makoto Yamada

    Abstract: In decentralized learning, operator splitting methods using a primal-dual formulation (e.g., the Edge-Consensus Learning (ECL)) have been shown to be robust to heterogeneous data and have attracted significant attention in recent years. However, in the ECL, a node needs to exchange dual variables with its neighbors. These exchanges incur significant communication costs. For the Gossip-based algorith…

    Submitted 8 May, 2022; originally announced May 2022.

  9. arXiv:2203.13273

    cs.LG

    A DNN Optimizer that Improves over AdaBelief by Suppression of the Adaptive Stepsize Range

    Authors: Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn

    Abstract: We make contributions towards improving adaptive-optimizer performance. Our improvements are based on suppression of the range of adaptive stepsizes in the AdaBelief optimizer. Firstly, we show that the particular placement of the parameter epsilon within the update expressions of AdaBelief reduces the range of the adaptive stepsizes, making AdaBelief closer to SGD with momentum. Secondly, we exte…

    Submitted 24 January, 2023; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: 10 pages

  10. arXiv:2107.08809

    cs.DC math.OC

    Revisiting the Primal-Dual Method of Multipliers for Optimisation over Centralised Networks

    Authors: Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn

    Abstract: The primal-dual method of multipliers (PDMM) was originally designed for solving a decomposable optimisation problem over a general network. In this paper, we revisit PDMM for optimisation over a centralized network. We first note that the recently proposed method FedSplit [1] implements PDMM for a centralized network. In [1], Inexact FedSplit (i.e., gradient based FedSplit) was also studied both…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: 13 pages

  11. SSFG: Stochastically Scaling Features and Gradients for Regularizing Graph Convolutional Networks

    Authors: Haimin Zhang, Min Xu, Guoqiang Zhang, Kenta Niwa

    Abstract: Graph convolutional networks have been successfully applied in various graph-based tasks. In a typical graph convolutional layer, node features are updated by aggregating neighborhood information. Repeatedly applying graph convolutions can cause the oversmoothing issue, i.e., node features at deep layers converge to similar values. Previous studies have suggested that oversmoothing is one of the m…

    Submitted 30 March, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2022

  12. arXiv:1911.09445

    cs.LG stat.ML

    Approximated Orthonormal Normalisation in Training Neural Networks

    Authors: Guoqiang Zhang, Kenta Niwa, W. B. Kleijn

    Abstract: Generalisation of a deep neural network (DNN) is one major concern when employing the deep learning approach for solving practical problems. In this paper we propose a new technique, named approximated orthonormal normalisation (AON), to improve the generalisation capacity of a DNN model. Considering a weight matrix W from a particular neural layer in the model, our objective is to design a functi…

    Submitted 14 January, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

  13. arXiv:1902.09030

    cs.LG stat.ML

    Rapidly Adapting Moment Estimation

    Authors: Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn

    Abstract: Adaptive gradient methods such as Adam have been shown to be very effective for training deep neural networks (DNNs) by tracking the second moment of gradients to compute the individual learning rates. Differently from existing methods, we make use of the most recent first moment of gradients to compute the individual learning rates per iteration. The motivation behind it is that the dynamic varia…

    Submitted 24 February, 2019; originally announced February 2019.

    Comments: 11 pages

  14. arXiv:1810.09137

    stat.ML cs.LG cs.SD eess.AS

    DNN-based Source Enhancement to Increase Objective Sound Quality Assessment Score

    Authors: Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Yoichi Haneda

    Abstract: We propose a training method for deep neural network (DNN)-based source enhancement to increase objective sound quality assessment (OSQA) scores such as the perceptual evaluation of speech quality (PESQ). In many conventional studies, DNNs have been used as a mapping function to estimate time-frequency masks and trained to minimize an analytically tractable objective function such as the mean squa…

    Submitted 22 October, 2018; originally announced October 2018.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, Issue 10, 2018

  15. Software Defined Media: Virtualization of Audio-Visual Services

    Authors: Manabu Tsukada, Keiko Ogawa, Masahiro Ikeda, Takuro Sone, Kenta Niwa, Shoichiro Saito, Takashi Kasuya, Hideki Sunahara, Hiroshi Esaki

    Abstract: Internet-native audio-visual services are witnessing rapid development. Among these services, object-based audio-visual services are gaining importance. In 2014, we established the Software Defined Media (SDM) consortium to target new research areas and markets involving object-based digital media and Internet-by-design audio-visual environments. In this paper, we introduce the SDM architecture th…

    Submitted 23 February, 2017; originally announced February 2017.

    Comments: IEEE International Conference on Communications (ICC2017), Paris, France, 21-25 May 2017

  16. arXiv:1510.08963

    cs.SD

    PSD estimation in Beamspace for Estimating Direct-to-Reverberant Ratio from A Reverberant Speech Signal

    Authors: Yusuke Hioka, Kenta Niwa

    Abstract: A method for estimation of direct-to-reverberant ratio (DRR) using a microphone array is proposed. The proposed method estimates the power spectral density (PSD) of the direct sound and the reverberation using the algorithm "PSD estimation in beamspace" with a microphone array and calculates the DRR of the observed signal. The speech corpus of the ACE (Acoustic Characterisation of Environme…

    Submitted 29 October, 2015; originally announced October 2015.

    Comments: In Proceedings of the ACE Challenge Workshop - a satellite event of IEEE-WASPAA2015 (arXiv:1510.00383)

    Report number: ACEChallenge/2015/04