What happened with Capsule Neural Networks?


Together with Yann LeCun and Yoshua Bengio, Geoffrey Hinton is referred to as one of the godfathers of deep learning. Hinton is most famously credited with popularising backpropagation. Author Cade Metz writes in his book Genius Makers that LeCun first developed his ideas on convolutional neural networks (CNNs) with Hinton during his time in Toronto, before LeCun moved to Bell Labs to give the idea its definitive shape.

Though widely popular for computer vision applications, CNNs suffer from a fundamental problem: they lack the effortless generalisation of the human visual system they aim to recreate. As Hinton mentioned during a talk, neural networks should be able to ‘generalise effortlessly’. “If they learned to recognise something, and you make it ten times as big, and you rotate it 60 degrees, it shouldn’t cause them any problem at all. We know computer graphics is like that, and we’d like to make neural nets more like that,” he said.

CNNs have so far failed to achieve that. To remedy the problem, Hinton and his team introduced capsule neural networks in 2017, and in 2020, Google filed a patent on the technique. In this article, we trace the development of this new neural network type over the course of five years.

Even as one of the earliest proponents of neural networks, Hinton has always cautioned that the idea has its own limitations. To this end, Hinton and his team introduced an alternative mathematical model called the capsule neural network, which tackles the generalisation problem by looking at the world in three dimensions. In an earlier interview, he said capsules were a way of doing visual perception using reconstruction and routing information to the right places. “In standard neural nets, the information, the activity in the layer, just automatically go somewhere; you don’t decide where to send it. The idea of capsules was to make decisions about where to send information,” he said.

Google filed a patent claiming that capsules can be used in place of conventional CNNs. Capsules differ from conventional convolutional layers in three main ways: vector outputs, the squash function, and routing.
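For reference, the squash function from the 2017 paper “Dynamic Routing Between Capsules” scales each capsule’s raw output vector so that its length lies between 0 and 1, read as the probability that the entity the capsule detects is present, while preserving the vector’s direction. A minimal NumPy sketch (the small `eps` term is our addition for numerical stability):

```python
import numpy as np

def squash(s, eps=1e-8):
    # Squash non-linearity: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    # Short vectors shrink towards length 0; long vectors approach length 1.
    norm = np.linalg.norm(s, axis=-1, keepdims=True)
    return (norm ** 2 / (1.0 + norm ** 2)) * (s / (norm + eps))
```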

Capsules preserve spatial information and other important features, overcoming the loss of information caused by pooling in CNNs. Each capsule gives a vector (with a direction) as output: if the orientation of the input image changes, the vector moves in the same direction. This is an important property for classifying objects in computer vision tasks. For example, an image has a better chance of being classified as a cat based on the position of its whiskers, and in fewer steps, than with a CNN.
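The routing Hinton describes can be sketched as the routing-by-agreement loop from the same 2017 paper: each lower-level capsule predicts the output of every higher-level capsule, and coupling coefficients are iteratively shifted towards the higher-level capsules whose outputs agree with those predictions. Below is a hedged NumPy sketch reusing the `squash` function above; the shapes, variable names and iteration count are illustrative rather than the paper’s exact implementation:

```python
def dynamic_routing(u_hat, num_iters=3):
    # u_hat[i, j] is lower capsule i's prediction for upper capsule j's output;
    # shape: (num_lower, num_upper, dim_upper).
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))      # routing logits, initially uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over upper capsules for each lower capsule.
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted vote per upper capsule
        v = squash(s)                            # upper-capsule output vectors
        b += (u_hat * v[None]).sum(axis=-1)      # reward predictions that agree with v
    return v

# Toy usage: 8 lower capsules routing to 3 upper capsules of dimension 4.
v = dynamic_routing(np.random.randn(8, 3, 4))
```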

Since their introduction, capsule neural networks have found applications in a few areas. In a 2021 paper in Scientific Reports, researchers Vittorio Mazzia, Francesco Salvetti and Marcello Chiaberge introduced Efficient-CapsNet, a capsule network architecture with only 160K parameters. The proposed architecture achieved state-of-the-art results with just 2 per cent of the original capsule network’s parameters, demonstrating the effectiveness of the methodology and the capability of capsule networks to embed visual representations that generalise well.

In a paper titled “Capsule Neural Network-based Height Classification using Low-Cost Automotive Ultrasonic Sensors”, researchers demonstrated a capsule network that provides a detailed height analysis of detected objects, applying re-sorting and re-shaping methods to the sensor data. The method achieved a validation accuracy of 99 per cent with a runtime of 0.2 ms.

Other major research works include capsule neural networks for sentiment analysis, remaining useful life estimation, and biometric recognition systems.

Despite being around for some time now, we do not see much of capsule neural networks. Compare this with Transformers, which were introduced around the same time and have since been used actively for a wide range of applications, including large language models and computer vision. In a popular ML forum discussion started in 2019, the Google Brain team said that capsule neural networks weren’t “quite dead yet”. The team added that there was still a long way to go: the approach first needs to be scaled up to real-world problems before it can become a standard part of the machine learning toolbox.

Hinton also said in a 2019 interview, “Now, since I started working on capsules, some other very smart people at Google invented Transformers, which are doing the same thing. They’re deciding where to route information, and that’s a big win.”

Clearly, the Transformer has won the popularity contest here. Despite the capsule neural network’s efficiency and performance, the model has remained in the shadows, at least until now. That said, some of the greatest inventions have taken off only after being neglected for a long period of time. The same could prove true for capsule neural networks.

