Human age estimation from a photo using neural networks
Keywords:
age estimation, neural networks, regression, UTKFaces
Abstract
This work compares neural network architectures for estimating a person's age from a face image. Because age is a continuous variable, the task is treated as a regression problem. The UTKFaces dataset was used; it contains 24,000 images annotated with age, gender, and race. Four convolutional neural network architectures were chosen for training: AlexNet, VGG-19, ResNet-50, and Inception-v4. Each of them marked a significant advance in image classification on the ImageNet dataset. AlexNet combined ReLU activations, dropout, and max-pooling, while VGG-19 emphasized deeper architectures with small filters. ResNet-50 addressed the vanishing gradient problem with residual connections, and Inception-v4 improved efficiency and gradient flow with optimized blocks and residual connections. In every network, the last layer was replaced with a fully connected layer of one neuron with a linear activation function. Mean squared error (MSE) served as the loss function during training, and mean absolute error (MAE) as the quality metric. The data were split into training and test sets in a 90% to 10% ratio. Before training, the images were normalized and resized to the input size each network requires. AlexNet and VGG-19 were trained with the SGD optimizer at a learning rate of 0.2, ResNet-50 with the Adam optimizer at a learning rate of 0.02, and Inception-v4 with the Adadelta optimizer at a learning rate of 0.02. These optimizers and their parameters were selected as the best in computational experiments. Each network was trained for as many epochs as it needed to converge. After training, VGG-19 and ResNet-50 achieved MAE values of 2.7 and 3.5, respectively, and Inception-v4 an MAE of 3.87, while AlexNet exhibited significant overfitting. ResNet-50 processed images the fastest.
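The evaluation protocol described in the abstract (a 90% to 10% split, MSE as the training loss, MAE as the reported metric) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the fixed random seed, and the sample ages are hypothetical and chosen only for illustration, and the abstract does not say whether the split was shuffled or stratified.

```python
import random


def split_train_test(items, test_fraction=0.1, seed=0):
    """Shuffle a copy of `items` and return (train, test) lists.

    Assumption: a simple shuffled split; the abstract only gives the ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(y_true, y_pred):
    """MSE: average squared difference, used here as the training loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference, used here as the quality metric."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical true ages and network outputs (one linear output neuron).
    ages = [25.0, 40.0, 63.0, 8.0]
    predicted = [27.0, 37.0, 60.0, 12.0]
    print("MSE:", mean_squared_error(ages, predicted))
    print("MAE:", mean_absolute_error(ages, predicted))

    train, test = split_train_test(range(100))
    print("train/test sizes:", len(train), len(test))
```

MSE penalizes large errors more heavily than small ones, which makes it a common choice of training loss, whereas MAE is expressed in the same units as the target and is easier to read as a summary of prediction quality.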
References
UTKFace. Kaggle. https://www.kaggle.com/datasets/jangedoo/utkface-new/data
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Communications of the ACM, 60(6), 84–90. https://doi.org/10.1145/3065386
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. 3rd International Conference on Learning Representations (ICLR 2015), 1–14. https://doi.org/10.48550/arXiv.1409.1556
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition, 770–778. https://doi.org/10.1109/CVPR.2016.90
Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2017). Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 31(1), 4278–4284. https://doi.org/10.1609/aaai.v31i1.11231
License
Copyright (c) 2024 Євгеній Вербенко, Ольга Мацуга (Author)
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in the journal Challenges and Issues of Modern Science are licensed under the Creative Commons Attribution 4.0 International (CC BY) license. This means that you are free to:
- Share, copy, and redistribute the article in any medium or format
- Adapt, remix, transform, and build upon the article
as long as you provide appropriate credit to the original work, include the authors' names, article title, journal name, and indicate that the work is licensed under CC BY. Any use of the material should not imply endorsement by the authors or the journal.