Printer Friendly

Lessons in training neural nets: Limited data is a common problem when training CNNs in industrial imaging applications. Petra Thanner and Daniel Soukup, from the Austrian Institute of Technology, discuss ways of working with CNNs when data is scarce.

Deep learning--or neural networks --has had a triumphal march over the last decade. Since sufficient computing power has become available with GPUs, neural networks have been trained to handle many tasks in various areas, from language processing in text and sound, to image processing, classification, and anomaly detection. Numerous publications are reporting successful adoption of deep learning for various datasets and tasks. Today's widespread opinion about deep learning is that the only necessary ingredients are a data sample and a task. After that, the neural network is able to identify the relevant patterns in the data all by itself and give highly sophisticated and reliable predictions on complex data problems. Unfortunately, this is not always correct.

Industrial inspection

In industrial inspection, image processing has always been a core discipline. To solve inspection problems during the pre-deep learning era, machine vision engineers often analysed only a handful of image samples and developed elaborate mathematical models that approximately described the visible, relevant geometrical or statistical structures. This was a painstaking procedure and had to be repeated each time new insights emerged. Today's convolutional neural networks (CNN) promise to make this process much easier and, above all, more cost-effective, since all relevant information is automatically extracted from sample images. And the CNN training procedure can easily be re-run, whenever new and extended training data are available. The only necessary component is computing power.

However, there are also pitfalls with drawing all information only from training samples. Characteristic appearance variations not covered in the training data will probably not be recognised correctly in the CNN's inference phase. Therefore, training data for neural networks must contain samples covering as many variations in appearance as possible.

While CNNs are indeed remarkably capable of generalising the presented training samples to a certain extent, they also might over-fit on scarce training data, and so jeopardise their prediction performance on unseen data. In addition, object classes that are missing entirely cannot usually be handled properly. Only in special settings, or when special precautions have been taken, will CNNs exhibit some awareness about novelty in the data. This means being able to identify anomalous data with regards to what has been presented during the training phase.

The academic community has compiled multiple thorough and balanced sample datasets, based on which the potential of new deep learning algorithms are explored, tested and published. For industrial applications, large databases--of images of production defects, for example--are not often readily available. Unbalanced and insufficient data contain a bias, so that the CNN over-fits on certain aspects of a task that are covered by the training data, but is blind to aspects that were inadvertently overlooked. Thanks to the amount and complexity of large data, one should always be aware of the potential for data bias. Therefore, training data should be diligently maintained, extended, as well as repeatedly re-training and evaluating the CNNs.

While neural networks require exhaustive data for training, lack of data is an inherent problem for real-world problems. In 2014, a new method for artificial data generation with CNNs was published: generative adversarial networks (GANs). Two CNNs are trained to work against each other, a counterfeiter network and a discriminator network. The counterfeiter network is trained to become better at generating random images, plausibly mimicking the available real training data but providing some new structures. The discriminator is trained for revealing whether images are artificially generated or real data. As the discriminator improves on revealing counterfeits, the generator network improves on generating plausible looking images until the discriminator can't distinguish between the counterfeit images and the real images.

The emergence of GANs raised hopes that scarce data samples could simply be augmented with enough variation to make up for the lack of real data, for example when training CNNs for inspection tasks. Indeed, in some settings, such as entertainment or fashion applications, remarkable results can be achieved where augmented images are indistinguishable from real images--even humans can't tell the difference. This work uses relatively large real training databases containing enough structural variation for the GAN to learn. In fact, a GAN cannot generate substantially new artificial data without a clue from real data. For industrial inspection where precision and reliability count, the artificially generated data should be really in line with real image distributions, not only those that look plausible. Nevertheless, even measuring and proving generated data matches real data cannot reliably be accomplished.

For some applications, one way to use a GAN is to make it simpler for the GAN by combining it with conventional artificial data generation methods such as rendering. This generates a large amount of geometrical variation but usually fails to imitate the characteristics of the real image acquisition systems correctly. GANs can be used for doing the so-called style transfer, learning and applying the statistical camera properties to the rendered images without disturbing their geometrical content.

Image statistics are another potential pitfall when working with neural networks. It has been shown in various publications that adversarial noise patterns exist that deteriorate the predictions of CNNs. An image showing a truck is correctly classified as a truck by a CNN, but by adding an adversarial noise pattern to that image the CNN fails, although the image content has not changed visibly from a human perspective. That disturbing vulnerability to adversarial noise has been shown to be inherent to neural networks. Therefore, one must take care of properly setting up the acquisition system and image pre-processing pipelines, to thoroughly ensure that only properly processed data are used.

As deep learning is a fully data-driven method, one is totally dependent on the data quality. In an industrial environment, one might not be willing to entirely rely on the quality of a training dataset. We ask users to integrate deep learning cautiously into diligent data acquisition and image-processing pipelines, potentially using neural networks only in sub-parts, keeping conventional image processing methods whenever they work satisfactorily. For example, modern photometric and multi-view acquisitions, together with computational imaging, might be a valuably enriched input for a CNN to infer with. In that manner, one can exploit the controllability of data quality and information content with image processing, together with the ability to reveal complex, high-dimensional interdependencies, which no other method can grasp like deep learning.

The effort of developing and maintaining a reliable inspection system has not diminished. However, the bulk of the work is now managing and acquiring data instead of designing models: the better the training data, the better the predicted results. With a good representative training database--optimally real, not artificially generated data--deep learning methods are capable of surpassing conventional image processing-based systems by far. Reliably integrating deep learning into an actual industrial visual inspection process requires a lot of experience in deep learning, plus image acquisition and processing.

The high-performance imaging processing group at Austrian Institute of Technology works on industrial inspection aspects of deep learning, with methods such as one-class learning and data augmentation to solve challenges with training neural networks when data is limited. Petra Thanner is senior research engineer, Daniel Soukup is a scientist in the group.
COPYRIGHT 2019 Europa Science, Ltd.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2019 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:Thanner, Petra; Soukup, Daniel
Publication:Imaging and Machine Vision Europe
Date:Aug 1, 2019
Previous Article:Imaging bites back: Keely Portway investigates how vision technology is being used in the dental sector, from initial diagnosis, to quality control...

Terms of use | Privacy policy | Copyright © 2021 Farlex, Inc. | Feedback | For webmasters