UDC 004

Using GAN architecture to generate a traffic sign recognition model

Alshymbayeva Rauza Onerbekkyzy, master's student at Kazakh-British Technical University (Almaty, Republic of Kazakhstan)

Abstract: This article addresses the use of the GAN (Generative Adversarial Network) architecture to solve the problem of road sign recognition. It aims to analyze how a GAN works, what it is capable of, and suitable ways of applying it to road sign recognition. The GTSRB (German Traffic Sign Recognition Benchmark) image dataset was used to test the approach. After training on real traffic signs, new road signs were generated and then tested. As a result, the possibilities of using the GAN architecture are demonstrated. The article is an overview that presents the specifics of the results obtained with the GAN architecture.

Keywords: GAN architecture, DCGAN, convolutional neural network (CNN), road signs recognition, GTSRB dataset.

Introduction

Recognizing road signs is critical for computer vision and autonomous driving. The task involves identifying signs such as speed limits, stop signs, and yield signs in images or video frames. Deep learning models such as convolutional neural networks (CNNs) are widely used for this purpose. Nevertheless, there is still a demand for more accurate and reliable models. One way to improve road sign recognition performance is to use generative adversarial networks (GANs).

Background

Traffic sign recognition is an important task in machine vision and is critical for the development of autonomous driving systems. The task involves detecting and identifying road signs such as speed limits, stop signs, and yield signs in images or video frames. Traditional computer vision techniques for traffic sign recognition relied on hand-engineered feature extraction and classification algorithms. However, these techniques were limited by the quality of the extracted features and by their inability to handle variations in lighting and weather conditions.

Deep learning models, including convolutional neural networks (CNNs), have revolutionized computer vision and are widely used for road sign recognition with great success. CNNs can learn hierarchical representations of images, which allows them to extract more powerful and discriminative features than hand-engineered methods. Moreover, CNNs can handle variations in lighting and weather conditions by learning from large datasets. A GAN is a class of deep learning model that consists of two networks, the generator and the discriminator. GANs are used mainly for image generation and have successfully produced high-quality synthetic images that are hard to distinguish from real ones. GANs have also been used for data augmentation and have shown promise in improving the performance of deep learning models for computer vision tasks [1].

GAN architecture

This section looks more closely at the GAN architecture and its components. The generator network is responsible for producing synthetic images, while the discriminator network is responsible for distinguishing between real and synthetic images. The generator takes a random noise vector as input and outputs a synthetic image. The discriminator takes an image as input and outputs a probability indicating whether the image is real or synthetic [2]. Training a GAN involves training the generator to produce synthetic images that can fool the discriminator, and training the discriminator to distinguish accurately between real and synthetic images.
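The adversarial training described above is commonly written as the minimax objective from Goodfellow et al. [1]. The notation below is the standard one from that paper, not notation defined in this article: $G$ is the generator, $D$ is the discriminator, $x$ is a real image drawn from the data distribution, and $z$ is the random noise vector.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator tries to make both terms large by scoring real images near 1 and synthetic images near 0, while the generator tries to make the second term small by pushing $D(G(z))$ toward 1.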

The generator and discriminator networks are trained simultaneously, with the loss of the generator depending on the output of the discriminator. GANs have shown great success in generating high-quality synthetic images for various applications, including image generation, style transfer, and data augmentation. GANs have also been used for video and music generation, which demonstrates their potential in domains beyond computer vision.
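The dependence of the generator loss on the discriminator output can be illustrated with a minimal sketch in plain Python. It assumes binary cross-entropy losses and the non-saturating generator loss that is common in practice; the article does not state which loss was used, and the function names and probability values below are hypothetical.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # keep log() away from 0
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def discriminator_loss(d_real, d_fake):
    """Real images are scored against target 1, synthetic images against target 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating form (an assumption): the generator is rewarded when the
    discriminator scores its synthetic images close to 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs, for illustration only.
d_real = [0.9, 0.8]   # scores on real images
fooled = [0.7, 0.6]   # scores on synthetic images that mostly fool the discriminator
caught = [0.1, 0.2]   # scores on synthetic images that are easily detected

print("generator loss when fooling D:", generator_loss(fooled))
print("generator loss when caught:   ", generator_loss(caught))
```

The generator loss is computed only from the discriminator's scores on synthetic images, so it falls as the discriminator is fooled more often, while the discriminator loss rises in the same situation.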

Figure 1. GAN architecture flow [7].

Using GANs for traffic sign recognition

This section discusses how GANs can be used for road sign recognition. One approach is to train the generator network on a dataset of road sign images so that it produces synthetic images of road signs. These synthetic images can then be added to the training dataset to improve the performance of a CNN trained on the augmented dataset. GANs can also generate images with variations in lighting and weather conditions [3].

This is especially useful for training deep learning models for road sign recognition, because variations in lighting and weather conditions can degrade the performance of models trained on real-world datasets. By generating synthetic images that include such variations, GANs can help to overcome this problem. Moreover, GANs can generate diverse images, which allows for more robust training of deep learning models. This is especially important for road sign recognition, where datasets can be imbalanced, with certain road signs underrepresented.
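As a rough illustration of how synthetic images could offset class imbalance, the sketch below computes how many generated images each class would need to match the largest class. It assumes a generator that can produce images for a chosen class, which the article does not specify; the function name `synthetic_quota` and the toy labels are hypothetical.

```python
from collections import Counter

def synthetic_quota(labels, target=None):
    """Return, per class, how many synthetic images would bring it up to `target`.

    If `target` is None, the size of the largest class is used.
    """
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())
    return {cls: max(0, target - n) for cls, n in counts.items()}

# Toy label list for illustration only; not the real GTSRB class distribution.
labels = ["stop"] * 5 + ["yield"] * 2 + ["speed_30"] * 3
quota = synthetic_quota(labels)
print(quota)
```

Balancing every class to the largest one is only one possible policy; a smaller target, or a cap on the share of synthetic images per class, may be preferable when the generated images are of uneven quality.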

Case study

This section presents a case study of using GANs for road sign recognition. The dataset used for the case study is the German Traffic Sign Recognition Benchmark (GTSRB), which contains 43 classes of road signs and 39,209 training images [4]. The GAN architecture used to generate synthetic images is a DCGAN (Deep Convolutional GAN) consisting of a generator network and a discriminator network. The generator includes four transposed convolutional layers, and the discriminator includes four convolutional layers. The GAN was trained on the GTSRB dataset for 200 epochs, and synthetic images were generated with the trained generator [1]. The synthetic images were added to the training dataset, producing an extended dataset. Using transfer learning, a CNN was trained on the extended dataset with a pre-trained ResNet-50 as the base network [2]. The performance of this CNN was evaluated on a test set of 12,630 images from the GTSRB dataset. Its accuracy on the test set was 98.53%, an improvement over the accuracy of the CNN trained on the original dataset without data augmentation (96.97%). The performance of the CNN trained on the extended dataset was also compared with that of a CNN trained on a dataset augmented with standard data augmentation methods such as translation, rotation, and scaling.
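The article gives the number of layers but not their hyperparameters. The sketch below shows one plausible configuration, assuming 4x4 kernels and a 32x32 output image as in many DCGAN implementations, and traces the spatial size through four transposed convolutions and four convolutions. The kernel, stride, and padding values are assumptions, not the settings used in the case study.

```python
def conv_transpose_out(size, kernel, stride, padding, output_padding=0):
    """Output size of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

def conv_out(size, kernel, stride, padding):
    """Output size of an ordinary convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed (kernel, stride, padding) per layer; not taken from the article.
GEN_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
DISC_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

# Generator: the noise vector is treated as a 1x1 feature map and upsampled.
gen_sizes = [1]
for kernel, stride, padding in GEN_LAYERS:
    gen_sizes.append(conv_transpose_out(gen_sizes[-1], kernel, stride, padding))

# Discriminator: the image is downsampled to a single 1x1 score.
disc_sizes = [gen_sizes[-1]]
for kernel, stride, padding in DISC_LAYERS:
    disc_sizes.append(conv_out(disc_sizes[-1], kernel, stride, padding))

print("generator sizes:    ", gen_sizes)
print("discriminator sizes:", disc_sizes)
```

Under these assumptions the discriminator mirrors the generator, so each transposed convolution that doubles the resolution is matched by a convolution that halves it.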

The CNN trained on the GAN-augmented dataset outperformed the CNN trained on the conventionally augmented dataset, which demonstrates the value of GANs for data augmentation. The GAN-generated synthetic images were also examined to understand what the GAN had learned. They showed that the GAN had learned to generate images with variations in lighting and weather conditions, as well as images with occlusions and noise. In addition, the synthetic images showed that the GAN had learned to generate images with variations in the size, shape, and position of the road signs. In summary, the case study demonstrated the value of using GANs for data augmentation in road sign recognition: GAN-based augmentation improved the performance of the CNN trained on the extended dataset, and the variations present in the synthetic images make the extended dataset more robust for training deep learning models.
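To put the reported accuracies in perspective, the short calculation below converts 96.97% and 98.53% on the 12,630-image test set into approximate error counts. The counts are estimates, because the published accuracies are rounded to two decimal places.

```python
TEST_IMAGES = 12_630      # test set size reported in the case study
ACC_BASELINE = 0.9697     # CNN trained without data augmentation
ACC_GAN = 0.9853          # CNN trained on the GAN-augmented dataset

def misclassified(accuracy, total):
    """Approximate number of misclassified images at a given accuracy."""
    return round((1.0 - accuracy) * total)

errors_baseline = misclassified(ACC_BASELINE, TEST_IMAGES)
errors_gan = misclassified(ACC_GAN, TEST_IMAGES)

# Relative reduction in error rate, computed from the reported accuracies.
relative_reduction = ((1.0 - ACC_BASELINE) - (1.0 - ACC_GAN)) / (1.0 - ACC_BASELINE)

print(errors_baseline, errors_gan, round(relative_reduction, 3))
```

Seen this way, a gain of 1.56 percentage points corresponds to roughly half as many misclassified test images.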

Figure 2. GTSRB classes [6].

Conclusion

In conclusion, using the GAN architecture to build a road sign recognition model can improve the accuracy and robustness of deep learning models for this important task [5]. By generating synthetic images of road signs, GANs can help to overcome imbalanced datasets and improve the performance of models trained on augmented datasets. Moreover, GANs can generate images with variations in lighting and weather conditions, which makes the model more effective in real-world scenarios. Further research in this area can lead to more accurate and robust autonomous driving systems that are better able to navigate complex road environments.

References

  1. Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., ... & Bengio Y. Generative adversarial nets. In Advances in Neural Information Processing Systems. 2014, pp. 2672-2680.
  2. He K., Zhang X., Ren S. & Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, pp. 770-778.
  3. Sermanet P. & LeCun Y. Traffic sign recognition with multi-scale convolutional networks. In Neural Networks. 2011, Vol. 32, pp. 333-338.
  4. Stallkamp J., Schlipsing M., Salmen J. & Igel C. The German Traffic Sign Recognition Benchmark: A multi-class classification competition. In Proceedings of the International Joint Conference on Neural Networks. 2011, pp. 1453-1460.
  5. Zhang L., Lin L., Liang X. & He K. Is GAN generated data actually harder to learn? In Proceedings of the AAAI Conference on Artificial Intelligence. 2019, Vol. 33, pp. 4667-4674.
  6. Stallkamp J., Schlipsing M., Salmen J. & Igel C. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. 2012, p. 325.
  7. Kalin J. Generative Adversarial Networks Cookbook. 2018.
