Sony’s Artificial Intelligence Development: It Has Achieved the Fastest*1 Deep Learning Speeds in the World

The following video presents “Accelerating Deep Learning with GPUs” by Stan Seibert, from AnacondaCon 2018:
GPU acceleration has become critical for deep learning as models have become more complex and datasets have grown in size. Although not initially designed for deep learning, GPUs were recognized early on as having an architecture well suited to speeding up the massively parallel array calculations at the heart of deep learning. Speedups of 10x or more during training are often seen with GPUs, and many models can be scaled up to use multiple GPUs. GPU manufacturers, like NVIDIA, are starting to release GPUs with deep learning-specific features to further speed up model training and improve the throughput of deployed models…

PRESS RELEASE

November 13, 2018

Sony Achieves World’s Fastest*1 Deep Learning Speeds through Distributed Learning

Reaches Efficiency Milestone for AI Development

Tokyo, Japan – Sony Corporation (hereafter “Sony”) today announced that by utilizing its deep learning development framework “Core Library: Neural Network Libraries” in addition to the AI Bridging Cloud Infrastructure (ABCI), a world-class computing infrastructure for AI processing constructed and operated by Japan’s National Institute of Advanced Industrial Science and Technology (AIST), it has achieved the fastest*1 deep learning speeds in the world.

Deep learning is a method of machine learning which uses neural networks modeled after the human brain. By harnessing deep learning, image and sound recognition capabilities have seen rapid growth in recent years, even outperforming humans in certain domains. However, both the amount of data used for learning and the number of model parameters used to improve recognition accuracy have been increasing, causing a corresponding rise in calculation times. In some cases, it has taken weeks or even months to conduct a single learning session. Because AI development requires a continuous process of trial and error, shortening this learning time is of the utmost importance. To this end, distributed learning using multiple GPUs is emerging as a popular way to shorten learning times.

When the number of GPUs used for distributed learning is increased, there are cases where the resulting increase in batch size (the amount of data processed at one time) stalls the learning process, and other cases where learning actually slows down because of delays caused by data transmission between GPUs. Sony addressed both problems. Technology that determines the optimal batch size and an appropriate number of GPUs based on the current state of the learning process makes it possible to carry out learning even in large-scale GPU environments such as ABCI, and data synchronization technology optimized for ABCI’s system structure increases transmission speeds between GPUs. Sony implemented these technologies in “Neural Network Libraries” and carried out learning using ABCI computing resources provided through AIST’s “ABCI Grand Challenge.” As a result, it completed ImageNet/ResNet-50*2 training (the general industry benchmark used to measure distributed learning speeds for deep learning) in approximately 3.7 minutes (using as many as 2,176 GPUs), the world’s fastest speed to date. The results of this research have been published as “ImageNet/ResNet-50 Training in 224 Seconds” (PDF).
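To make the idea of distributed learning concrete, here is a minimal sketch of one data-parallel training step, simulated on a single machine in plain Python. It is an illustration only: it is not Sony's implementation, it does not use the Neural Network Libraries API, and every function name and the toy data are assumptions made for this example. Each simulated worker stands in for one GPU, and the averaging function stands in for the gradient synchronization that takes place between GPUs.

```python
# Illustrative sketch only: simulates data-parallel training on one machine.
# Not Sony's method and not the Neural Network Libraries API; all names are hypothetical.

def gradient(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def all_reduce_mean(values):
    """Stand-in for the synchronization between GPUs: every worker gets the mean."""
    return sum(values) / len(values)

def data_parallel_step(w, global_batch, num_workers, lr):
    """Split the global batch into equal shards, one per worker,
    compute a local gradient on each shard, average them, and update w."""
    shard_size = len(global_batch) // num_workers
    shards = [global_batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    local_grads = [gradient(w, shard) for shard in shards]
    return w - lr * all_reduce_mean(local_grads)

# Toy data generated from y = 3x (8 samples, so 4 workers get 2 samples each).
data = [(float(x), 3.0 * x) for x in range(1, 9)]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, num_workers=4, lr=0.01)
```

With equal shards, the averaged gradient matches the gradient computed over the whole global batch, so adding workers leaves the update unchanged while each worker handles fewer samples. The global batch size is the per-worker batch size multiplied by the number of workers, which is why scaling to thousands of GPUs raises the batch-size and synchronization issues described above.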

The results of this experiment demonstrate that learning/execution carried out using Neural Network Libraries can achieve world-class speeds, and that by utilizing the same framework, it is possible to conduct technology development using deep learning with a shorter trial-and-error period. Moving forward, Sony will continue development on related technologies and seek to contribute to the development of society using AI technology.

*1 As of November 13, 2018 (according to Sony research)
*2 ImageNet is a widely used data set for image recognition, and ResNet-50 is a widely used image recognition model. We used the ImageNet data set to train the ResNet-50 model.