A fractional gradient descent algorithm robust to the initial weights of multilayer perceptron

Neural Netw. 2023 Jan:158:154-170. doi: 10.1016/j.neunet.2022.11.018. Epub 2022 Nov 17.

Abstract

For a multilayer perceptron (MLP), the initial weights significantly influence its performance. Based on an enhanced fractional derivative extended from convex optimization, this paper proposes a robust fractional gradient descent (RFGD) algorithm whose training performance is insensitive to the initial weights of the MLP. We analyze the effectiveness and the convergence of the RFGD algorithm. The computational complexity of the RFGD algorithm is generally larger than that of the gradient descent (GD) algorithm but smaller than that of the Adam, Padam, AdaBelief, and AdaDiff algorithms. Numerical experiments show that the RFGD algorithm is robust to the order of the fractional calculus, which is the only parameter it adds relative to the GD algorithm. More importantly, the experimental results show that, compared to the GD, Adam, Padam, AdaBelief, and AdaDiff algorithms, the RFGD algorithm is the most robust to the initial weights of the MLP. The experiments also verify the correctness of the theoretical analysis.
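
To illustrate the fractional-order update that this family of algorithms builds on, the following is a minimal sketch of a generic Caputo-type fractional gradient descent step. It is not the RFGD update proposed in the paper; the reference point w_ref, the fractional order alpha, and the learning rate are illustrative assumptions.

```python
import numpy as np
from scipy.special import gamma


def fractional_gd_step(w, grad, w_ref, lr=0.01, alpha=0.9, eps=1e-8):
    """One generic Caputo-type fractional gradient descent update.

    Illustrative sketch only -- the RFGD update in the paper differs.
    w_ref (reference weights), alpha (fractional order, 0 < alpha < 1),
    and lr are assumed hyperparameters, not taken from the source.
    """
    # Fractional scaling factor |w - w_ref|^(1 - alpha) / Gamma(2 - alpha),
    # applied elementwise to the ordinary first-order gradient.
    scale = np.abs(w - w_ref) ** (1.0 - alpha) / gamma(2.0 - alpha)
    return w - lr * grad * (scale + eps)


# Example usage on a single weight vector with a quadratic loss gradient.
w0 = np.array([0.5, -0.3, 1.2])          # initial weights (reference point)
w = w0.copy()
for _ in range(100):
    grad = 2.0 * (w - np.array([1.0, 0.0, -1.0]))  # gradient of a toy loss
    w = fractional_gd_step(w, grad, w_ref=w0, lr=0.05, alpha=0.9)
```

When alpha approaches 1, the scaling factor tends to 1 and the step reduces to ordinary gradient descent, which is why the fractional order is the only parameter added on top of GD.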

Keywords: Convergence; First derivative; Fractional calculus; Initial weights; Multilayer perceptron (MLP); Robust.

MeSH terms

  • Algorithms*
  • Neural Networks, Computer*