Accelerating large-scale convolutional neural networks with parallel graphics multiprocessors

Proceedings of the International Conference on Artificial Neural Networks (ICANN)

Training convolutional neural networks (CNNs) on large sets of high-resolution images is too computationally intensive to be performed on commodity CPUs. Such architectures, however, achieve state-of-the-art results on low-resolution machine vision tasks such as recognition of handwritten characters. We have adapted the inherent multi-level parallelism of CNNs to Nvidia’s CUDA GPU architecture to accelerate training by two orders of magnitude. This dramatic speedup makes it feasible to apply CNN architectures to pattern recognition tasks on datasets of high-resolution natural images.
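
As an illustration of the kind of per-output parallelism the abstract refers to, the following is a minimal sketch, not taken from the paper, of a CUDA kernel that assigns one GPU thread to each output pixel of a convolutional layer. The single-map, single-filter "valid" convolution and all names and sizes are illustrative assumptions, not the authors' implementation.

```cuda
// Sketch: one thread per output pixel of a single-channel "valid" convolution.
// All names and sizes are illustrative; this is not the paper's implementation.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void conv2d_valid(const float* in, const float* filt, float* out,
                             int inW, int inH, int fW, int fH)
{
    int outW = inW - fW + 1;
    int outH = inH - fH + 1;
    int x = blockIdx.x * blockDim.x + threadIdx.x;   // output column
    int y = blockIdx.y * blockDim.y + threadIdx.y;   // output row
    if (x >= outW || y >= outH) return;

    float acc = 0.0f;
    for (int fy = 0; fy < fH; ++fy)                  // accumulate filter window
        for (int fx = 0; fx < fW; ++fx)
            acc += in[(y + fy) * inW + (x + fx)] * filt[fy * fW + fx];
    out[y * outW + x] = acc;
}

int main()
{
    const int inW = 512, inH = 512, fW = 5, fH = 5;
    const int outW = inW - fW + 1, outH = inH - fH + 1;

    float *dIn, *dFilt, *dOut;
    cudaMalloc(&dIn,   inW * inH * sizeof(float));
    cudaMalloc(&dFilt, fW * fH * sizeof(float));
    cudaMalloc(&dOut,  outW * outH * sizeof(float));
    cudaMemset(dIn, 0, inW * inH * sizeof(float));    // placeholder data
    cudaMemset(dFilt, 0, fW * fH * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((outW + block.x - 1) / block.x, (outH + block.y - 1) / block.y);
    conv2d_valid<<<grid, block>>>(dIn, dFilt, dOut, inW, inH, fW, fH);
    cudaDeviceSynchronize();
    printf("convolution kernel finished\n");

    cudaFree(dIn); cudaFree(dFilt); cudaFree(dOut);
    return 0;
}
```

In a full CNN layer this mapping would additionally parallelize over output maps and training examples, which is where the multi-level parallelism mentioned above comes from.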