Optimization method AMSGrad in multilayer neural networks
Abstract
The most common method for optimizing neural networks is gradient descent, an optimization algorithm that follows the negative gradient of the objective function to find the minimum of the error function. Technically, gradient descent is called a first-order optimization algorithm because it explicitly uses the first-order derivative of the objective function.
A limitation of gradient descent is that it applies a single learning rate to all input variables. Extensions of gradient descent, such as the Adaptive Moment Estimation (Adam) algorithm, maintain a separate learning rate for each input variable, but these adaptive rates can quickly decay to very small values.
The AMSGrad method is an enhanced version of Adam that aims to improve the convergence properties of the algorithm by preventing large, abrupt changes in the learning rate for each input variable.
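The update rule described above can be sketched as follows. This is a minimal, illustrative implementation of a single AMSGrad step on scalar parameters, not the authors' code; the function name and hyperparameter defaults (the usual lr = 0.001, beta1 = 0.9, beta2 = 0.999) are assumptions chosen for the example.

```python
import math

def amsgrad_step(theta, grad, m, v, v_hat,
                 lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update for a scalar parameter.

    Unlike Adam, the denominator uses the running maximum v_hat of the
    second-moment estimate, so the effective per-parameter learning rate
    can only shrink or stay the same, never jump back up abruptly.
    """
    m = beta1 * m + (1 - beta1) * grad          # first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (uncentered variance)
    v_hat = max(v_hat, v)                       # the key AMSGrad change vs. Adam
    theta = theta - lr * m / (math.sqrt(v_hat) + eps)
    return theta, m, v, v_hat

# Usage: minimize f(x) = x^2 starting from x = 3.
theta, m, v, v_hat = 3.0, 0.0, 0.0, 0.0
for _ in range(2000):
    grad = 2.0 * theta                          # gradient of x^2
    theta, m, v, v_hat = amsgrad_step(theta, grad, m, v, v_hat, lr=0.05)
print(theta)  # converges toward the minimum at 0
```

Because `v_hat` is monotonically non-decreasing, the step size for each variable is non-increasing, which is exactly the convergence safeguard that distinguishes AMSGrad from plain Adam.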
Downloads
License
Copyright (c) 2024 Challenges and Issues of Modern Science
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in the journal Challenges and Issues of Modern Science are licensed under the Creative Commons Attribution 4.0 International (CC BY) license. This means that you are free to:
- Share, copy, and redistribute the article in any medium or format
- Adapt, remix, transform, and build upon the article
as long as you provide appropriate credit to the original work by including the authors' names, the article title, and the journal name, and indicate that the work is licensed under CC BY. Any use of the material must not imply endorsement by the authors or the journal.