Deep learning is an approach to artificial intelligence (AI) in which computers use many layers of neural networks or algorithms to “learn” from large amounts of data and autonomously make predictions, perform analytics, or otherwise transform the data. In traditional machine learning (the broader field that includes deep learning), the performance of the algorithms typically plateaus as the amount of input data increases. In contrast, by using faster processors and intricately layered algorithms, deep learning’s performance continues to scale with the size of the dataset.
These developments have exciting implications for numerous fields. Deep learning research has unearthed applications in fields as wide-ranging as speech recognition, image processing, and financial modeling. Business leaders are also turning to deep learning to improve processes and efficiency.
AI vs machine learning vs deep learning
A common misconception is that AI, machine learning, and deep learning are separate, unrelated approaches. In fact, machine learning is a subfield of AI, the umbrella term used to describe methods of designing algorithms to approximate human intelligence, and deep learning is in turn a subfield of machine learning. Each subfield is best suited to a specific set of problems, and rapid growth in cloud technologies makes each field much more approachable for the average practitioner. Don’t miss the deep dive into machine learning with the Fundamentals of Machine Learning on-demand webinar.
Deep learning and neural networks
Since the mid-20th century, deep learning has existed under many names (e.g., hierarchical feature learning, deep neural learning, deep neural networks, deep structured learning, and artificial neural networks, or ANNs), but the core goal has been the same: theorists have pursued a machine that could provide an appropriate output without an explicit set of directions. In fact, deep learning’s pioneers found that even a simple human task, such as recognizing a human face, proved quite difficult to translate into explicit code. Another method was needed.
One approach to solving this problem was the neural network. In 1943, Warren McCulloch and Walter Pitts published a paper introducing a preliminary model of neural networks, but it wasn’t until 1986 that Rina Dechter introduced the term deep learning to describe the developments in the field. In an attempt to approximate human intelligence, the design of neural networks was loosely based on biological models of the brain.
In a neural network, information feeds through different nodes, or neurons. Based on the outcome at one node, that node triggers another, and so on until a final outcome or output is given. By layering neural nets, practitioners were able to advance from shallow, restricted computer learning to a scalable, deeper learning. Notably, progress in the neural network field has stalled at several points during its development due to limitations in the technology. But in 2006, deep learning began a sustained resurgence. Only recently have computational power and available data advanced enough to match the demands.
How neural networks enable deep learning
The activity of neural networks is central to the advancement of deep learning. In order to jumpstart a deep learning system, the algorithms must be trained with lots of examples called training data. Usually training data requires some manipulation before the system can process it. However, depending on the task and overall method chosen for the deep learning system, deep neural networks can flexibly use variably prepared data.
Labeling data: Supervised vs Unsupervised Learning
AI systems’ learning begins with the input of massive amounts of data. In the early days of machine learning, much of the data had to be labeled in order for the system to make inferences from the data. This method of building and handling an AI system is called supervised learning. Supervised learning still has applications today and can be very useful in algorithms intended for classification tasks such as image recognition. During the training process, the algorithm essentially is able to check its guesses against the pre-labeled dataset. To improve the system, adjustments are made when the algorithm guesses incorrectly.
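The check-and-adjust loop described above can be sketched in a few lines. This is a minimal illustration rather than a production training routine: the toy dataset (logical AND), the function names, and the use of the classic perceptron update rule are all assumptions made for the example.

```python
# Toy labeled dataset (an assumption for illustration): inputs paired with
# the "right answer", here the logical AND of the two inputs.
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Guess 1 if the weighted sum of the inputs is above zero, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(data, epochs=20, learning_rate=1):
    """Check each guess against its label and adjust only when it is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, label in data:
            error = label - predict(weights, bias, inputs)  # 0 when correct
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)
predictions = [predict(weights, bias, inputs) for inputs, _ in TRAINING_DATA]
labels = [label for _, label in TRAINING_DATA]
print(predictions, labels)
```

After a handful of passes over the data, the guesses match the labels and no further adjustments are made.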
On the other hand, engineers who build a system using unsupervised learning methods provide the system with unlabeled data. This method is useful when the particular features of the data are not known. In this case, the algorithms find similarities within the training data and can then use those similarities either to generate new data instances or to evaluate data instances based on what they have learned. Some practitioners of deep learning also use hybrid methods which give systems both labeled and unlabeled data.
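To make "finding similarities in unlabeled data" concrete, here is a small sketch of clustering, one common unsupervised technique. It is a simplified one-dimensional, two-group version of k-means, chosen for brevity. The function name and the sample readings are hypothetical.

```python
def two_means(values, iterations=10):
    """Split unlabeled numbers into two groups by repeated averaging.

    Assumes the list contains at least two distinct values.
    """
    low, high = min(values), max(values)  # initial guesses for group centers
    for _ in range(iterations):
        # Assign every value to whichever center it is closer to.
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the average of its group.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


# No labels are supplied; the algorithm discovers the two groups itself.
readings = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers = two_means(readings)
print(centers)
```

A new reading can then be evaluated by checking which center it sits closer to.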
Neurons and nodes
The essential unit of the neural network is the neuron or node. Each neuron runs an algorithm that takes a set of inputs, gives each one an appropriate mathematical weight, and provides a single output. Neural networks make these determinations many times in succession, and each decision point is a neuron or a node. The fundamental neuron that machine learning systems employ is the perceptron. A perceptron takes binary inputs, computes their weighted sum, and outputs 1 or 0 depending on whether that sum clears a threshold. Because of this hard cutoff, perceptrons are highly sensitive near the threshold: a slight change to the weights or inputs can flip the output entirely.
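The sensitivity of a perceptron is easy to see in a short sketch. The weights, bias, and inputs below are made-up values chosen so that the weighted sum sits right at the decision boundary.

```python
def perceptron(inputs, weights, bias):
    """Weight each input, add them up, and output a single 0 or 1."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum > 0 else 0


inputs = (1, 0, 1)

# The two weight sets differ only by 0.001 in the last weight.
before = perceptron(inputs, weights=(0.5, 0.25, 0.5), bias=-1.0005)
after = perceptron(inputs, weights=(0.5, 0.25, 0.501), bias=-1.0005)
print(before, after)
```

The output jumps from 0 to 1 even though almost nothing changed, which is the all-or-nothing behavior that makes networks of perceptrons hard to tune gradually.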
Sigmoid neurons, also called logistic neurons, address the sensitivity in the perceptron model by replacing the hard cutoff with a smooth curve, so small changes to the inputs produce small changes in the output. Despite the name, multilayer perceptrons are usually neural networks composed of sigmoid neurons. The layer of neurons that receives data is called the input layer. The layer that yields results is the output layer. The layers in between are the hidden layers. Each layer can have hundreds or thousands of neurons depending on the problem. As hidden layers accumulate, a deep learning network is born.
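As a rough sketch of how these layers fit together, the following passes an input through one hidden layer and one output layer of sigmoid neurons. The network shape (2 inputs, 3 hidden neurons, 1 output) and every weight value are arbitrary assumptions. A real network would learn its weights during training.

```python
import math


def sigmoid(z):
    """Smooth alternative to a hard threshold; output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """One layer: each neuron weights all inputs and emits one sigmoid output."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the input layer's values through each later layer in turn."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical weights: a hidden layer (2 -> 3) and an output layer (3 -> 1).
NETWORK = [
    ([[0.5, -0.5], [0.25, 0.75], [-1.0, 1.0]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]

output = forward([1.0, 0.0], NETWORK)
nudged = forward([1.0, 0.001], NETWORK)  # tiny change to one input
print(output, nudged)
```

Unlike a perceptron, the nudged input moves the output only slightly, which is what allows weights to be adjusted in small steps.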
Feedforward deep network
In this type of neural network, information moves only forward through the network. Each node receives an input and passes its determination forward in the system until a final determination is made.
The convolutional neural network (CNN) is an example of a feedforward deep network. It is commonly used for perceptual tasks and is often applied to image-type data.
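The core operation of a CNN is sliding a small grid of weights (a kernel) across an image. The sketch below shows that operation on a tiny made-up image. The kernel here is hand-picked to respond to vertical edges, whereas a trained CNN would learn its kernels from data. As in most deep learning libraries, the kernel is applied without flipping it.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the elementwise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kernel_h) for j in range(kernel_w))
             for c in range(out_w)]
            for r in range(out_h)]


# Hypothetical 3x4 image: dark on the left (0), bright on the right (1).
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
VERTICAL_EDGE = [[-1, 1],
                 [-1, 1]]

feature_map = convolve2d(IMAGE, VERTICAL_EDGE)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The resulting feature map is nonzero only where the image changes from dark to bright, which is how early CNN layers pick out simple visual features.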
Recurrent neural network (RNN)
In this type of neural network, information can move both forward and backward through the network of nodes. Specifically, feedback loops within the system allow neurons to “fire” when information is passed back to them. This system is significantly more complex than a feedforward deep network. It stores information within its nodes to process sequences of inputs. It is often used in processing time-dependent data and can be applied to human speech and language prediction. A key RNN architecture that has enabled language processing is long short-term memory (LSTM).
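The way an RNN "stores information within its nodes" can be sketched with a single recurrent neuron whose output is fed back in at the next step. This is a bare-bones recurrent step, not an LSTM, and the weight values are arbitrary assumptions.

```python
import math


def rnn_step(x, hidden, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Combine the new input with the previous hidden state (the feedback loop)."""
    return math.tanh(w_input * x + w_hidden * hidden + bias)


def run_sequence(sequence):
    """Process inputs in order, carrying the hidden state from step to step."""
    hidden = 0.0
    states = []
    for x in sequence:
        hidden = rnn_step(x, hidden)
        states.append(hidden)
    return states


states_a = run_sequence([1.0, 0.0, 0.0])  # a signal at the first time step
states_b = run_sequence([0.0, 0.0, 0.0])  # no signal at all
print(states_a, states_b)
```

Both sequences end with the same two inputs, yet the final states differ because the first sequence still carries a fading trace of its opening value.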
Deep learning methods
As a field, deep learning improves upon many of the methods of machine learning. Still, traditional deep learning is not without its faults. Over the past several years, researchers have developed new approaches to deep learning that address the problems inherent in processing massive amounts of data.
Residual neural networks (ResNets)
As noted, in traditional feedforward neural networks, each layer passes on information to the next layer until a determination is reached. However, as more hidden layers are added, the performance of the system as a whole can plateau and then degrade. One cause is the vanishing gradient problem: the training signal shrinks as it is passed back through many layers, which can leave a very deep network unable to learn even simple functions.
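A back-of-the-envelope sketch shows why the training signal fades. It assumes a simplified chain of sigmoid neurons, one per layer, each with a weight of 1 and a pre-activation of 0, which is where the sigmoid's slope is at its maximum of 0.25. The function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_slope(z):
    """Derivative of the sigmoid; never larger than 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)


def gradient_scale(depth, pre_activation=0.0, weight=1.0):
    """Factor by which a training signal shrinks after passing back through
    `depth` sigmoid layers (chain rule: multiply the per-layer slopes)."""
    scale = 1.0
    for _ in range(depth):
        scale *= weight * sigmoid_slope(pre_activation)
    return scale


for depth in (1, 5, 10):
    print(depth, gradient_scale(depth))
```

Even in this best case for the sigmoid, ten layers shrink the signal to less than a millionth of its original size, so the earliest layers barely learn.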
In response, residual networks, or ResNets, pass information forward through the system but also have the ability to bypass layers as necessary. Called skip connections, shortcuts, or residual connections, these bypasses help mitigate the vanishing gradient problem. This is especially useful during the training phase when the algorithm must still learn the appropriate weighting to give to different layers of the neural network.
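The idea of a skip connection can be sketched as "output = layer(input) + input". In the illustration below, `untrained_layer` is a hypothetical stand-in for a layer that has not yet learned anything useful and simply outputs zeros.

```python
def residual_block(x, transform):
    """Add the block's input back onto its output (the skip connection)."""
    return [xi + ti for xi, ti in zip(x, transform(x))]


def untrained_layer(x):
    """Stand-in for a layer that contributes nothing yet."""
    return [0.0 for _ in x]


def run_stack(x, depth, use_skip):
    """Pass x through `depth` layers, with or without skip connections."""
    for _ in range(depth):
        if use_skip:
            x = residual_block(x, untrained_layer)
        else:
            x = untrained_layer(x)
    return x


signal = [1.0, -2.0, 3.0]
with_skip = run_stack(signal, depth=50, use_skip=True)
without_skip = run_stack(signal, depth=50, use_skip=False)
print(with_skip, without_skip)
```

With skip connections the signal survives all 50 layers unchanged, so each layer only has to learn a useful adjustment on top of what it receives rather than reconstruct the whole signal.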
Generative Adversarial Networks (GANs)
GAN algorithms are composed of two parts: the generator and the discriminator. As their names imply, the generator creates new data instances and the discriminator determines whether those instances are real or fake. The generator’s data instances are fed to the discriminator along with a stream of real data instances from the training set. The discriminator learns what a real data instance looks like from the training data, and the generator learns to create passable fakes based on its feedback loop with the discriminator. Since the two parts of the model train against each other, this is an adversarial task, a zero-sum game.
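The zero-sum relationship can be sketched with the value function from the original GAN formulation, log D(x) + log(1 - D(G(z))), which the discriminator tries to maximize and the generator tries to minimize. The sketch below scores a single real and a single generated instance. The function names and the example scores are assumptions, and a full GAN would compute these scores with two neural networks.

```python
import math


def value(score_real, score_fake):
    """GAN value for one real and one generated instance.

    Each score is the discriminator's estimated probability (strictly
    between 0 and 1) that the instance it was shown is real.
    """
    return math.log(score_real) + math.log(1.0 - score_fake)


def discriminator_payoff(score_real, score_fake):
    return value(score_real, score_fake)


def generator_payoff(score_real, score_fake):
    return -value(score_real, score_fake)


# Discriminator spots the fake (0.1) versus being fooled by it (0.9).
spotted = discriminator_payoff(0.9, 0.1)
fooled = discriminator_payoff(0.9, 0.9)
print(spotted, fooled)
```

Whatever one side gains the other loses, so the two payoffs always sum to zero.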
GANs learn from unlabeled training data, so they are usually classed as a type of unsupervised deep learning. They have been used to generate stunning pieces of original art in the style of distinctive time periods or artists.
Geometric deep learning
From images to text to audio to numbers, most deep learning is performed with 2D data or with multidimensional data represented in 2D form. The problem with this approach is that these representations are lossy. In transforming multi-dimensional data into a lower dimension representation, valuable data attributes are discarded since the system cannot process them.
Geometric deep learning addresses this problem by analyzing non-Euclidean data, i.e., data such as graphs and 3D surfaces that does not fit naturally into a regular grid. Drawing on graph theory, geometric deep learning offers a variety of algorithms (e.g., graph neural networks, graph convolutional networks) that build on the traditional algorithms with graphs as data types. Overall, this method of deep learning allows engineers to make fuller use of the collected data. To date, the popular focuses of this deep learning approach are molecular modeling and 3D modeling.
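A minimal sketch of how a graph neural network treats a graph as its data type: each node updates its value by combining it with the values of its neighbors. The tiny graph, the feature values, and the plain averaging rule are assumptions for illustration. Real graph neural network layers also apply learned weights and a nonlinearity.

```python
# Hypothetical graph: node "a" is connected to "b" and "c".
GRAPH = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
FEATURES = {"a": 1.0, "b": 3.0, "c": 5.0}


def message_passing_step(graph, features):
    """Replace each node's value with the average over itself and its neighbors."""
    updated = {}
    for node, neighbors in graph.items():
        group = [features[node]] + [features[n] for n in neighbors]
        updated[node] = sum(group) / len(group)
    return updated


updated = message_passing_step(GRAPH, FEATURES)
print(updated)  # {'a': 3.0, 'b': 2.0, 'c': 3.0}
```

Because the update follows the graph's connections directly, the structure never has to be flattened into a 2D grid first.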
Deep reinforcement learning
A traditional deep learning system learns from examples and then takes a series of actions to give the appropriate outputs. The purpose of a deep reinforcement learning system, however, is to optimize the actions themselves until the goal of the system is achieved. These systems learn by trial and error. A reinforcement learning system receives rewards and punishments for reaching or failing to reach certain goals and learns to adjust its actions accordingly. Because feedback arrives as rewards rather than as labeled examples, this type of learning is usually treated as its own category alongside supervised and unsupervised learning.
One type of deep reinforcement learning system, the deep Q-learning system, has been used to teach systems to play Atari video games.
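The reward-driven adjustment at the heart of Q-learning can be sketched with a small lookup table. This is the simpler tabular form of the algorithm, where deep Q-learning replaces the table with a neural network. The state names, action names, and parameter values are hypothetical.

```python
def q_update(q_table, state, action, reward, next_state,
             learning_rate=0.5, discount=0.9):
    """Nudge the estimated value of (state, action) toward the reward received
    plus the discounted value of the best action available afterwards."""
    best_next = max(q_table[next_state].values())
    target = reward + discount * best_next
    q_table[state][action] += learning_rate * (target - q_table[state][action])


# Every action starts with an estimated value of zero.
q_table = {
    "start": {"left": 0.0, "right": 0.0},
    "goal": {"left": 0.0, "right": 0.0},
}

# Trial and error: moving right from "start" reached the goal and earned 1.0.
q_update(q_table, "start", "right", reward=1.0, next_state="goal")
print(q_table["start"])  # {'left': 0.0, 'right': 0.5}
```

After enough trials, choosing the highest-valued action in each state yields the behavior that collects the most reward.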
Dropout
Overfitting is a common challenge faced by deep learning practitioners. During the training phase, neurons can become co-dependent, skewing the output. Dropout provides an antidote to this challenge. Applying a dropout technique to a deep learning system lets training run while randomly chosen nodes or neurons are temporarily ignored. This creates a smaller neural network for each training step, which lets neurons learn independently of one another.
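Dropout itself takes only a few lines. The sketch below uses the common "inverted dropout" variant, which scales the surviving values up so their expected total stays the same. That scaling choice, the function name, and the seed are assumptions for the example.

```python
import random


def dropout(activations, drop_probability, rng):
    """Zero out each activation with the given probability during training.

    Survivors are divided by the keep probability so the expected sum of the
    layer's output is unchanged (assumes 0 <= drop_probability < 1).
    """
    keep = 1.0 - drop_probability
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only so the example is repeatable
layer_output = [1.0] * 8
thinned = dropout(layer_output, drop_probability=0.5, rng=rng)
unchanged = dropout([1.0, 2.0], drop_probability=0.0, rng=rng)
print(thinned, unchanged)
```

A different random subset of neurons is ignored on every training step, and dropout is switched off once training is finished.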
Why deep learning matters
According to a recent study, by 2020, humans will have generated approximately 44 zettabytes of data. All of this information is more than a lone researcher or lone algorithm can handle. But deep learning systems are uniquely designed not only to handle this magnitude of data but to transform it into usable outputs. And as the data available increases, the deep learning system’s performance increases with it.
This can only be a positive thing. Deep learning systems can automate burdensome tasks, make predictions much faster than a human, and come to more accurate conclusions. This will lead to increased efficiency in businesses and applications that were previously unheard of due to the sheer time required to make sufficient strides.
Naturally, a few drawbacks do still remain. For one, deep learning systems require expensive graphics processing units (GPUs). Also, some worry the applications of deep learning have been overstated. Do these deep learning algorithms truly have predictive power? If the algorithms have never seen an example of an input, will they discount it altogether? Lastly, will tiny changes in input data wildly skew the results of certain algorithms?
Of course, many deep learning methods directly address these problems, but time will show how effective these methods are. In the meantime, deep learning continues to prove its value with impressive results. Learn how machine learning techniques can drive business success with Speed and Scale Advanced Analytics with Machine Learning, an on-demand webinar.
8 deep learning examples and applications
Machine learning has already made an impression on our daily lives and deep learning is quickly following. From meteorology to image recognition to healthcare, researchers are uncovering the wide-ranging benefits of deep learning. Here are a few exciting growth areas.
Natural language processing
A subfield of AI, natural language processing (NLP) seeks to improve algorithms’ comprehension of and interaction with human language. It allows computers to analyze, comprehend, and ultimately communicate using human language. In the past, NLP algorithms were developed through shallow statistical machine learning methods which suffered from the curse of dimensionality. In contrast, natural language processing with deep learning benefits from large training datasets and neural nets which can support more in-depth probabilistic methods.
Speech recognition
Smartphones and smart home devices, by now a familiar technology, recognize human speech. While these systems are highly reliable (~95%), they are far from perfect. Moving the needle from highly reliable to almost entirely reliable could revolutionize the way humans interact with computers. To do this, a lot more data is needed. As this data is collected, it can be processed with recurrent neural networks to refine the interpretation of human speech by computers. The neural network can process small chunks of audio at the syllabic level to predict which syllable may come next.
Financial modeling
Financial modeling is a complex field rife with abstractions and models with many non-linear influencing inputs. Though some critics of deep learning in the financial realm point to low predictability due to the general unpredictability of financial markets themselves, advances in other areas where deep learning has been applied suggest that deep learning systems can reliably model non-linear relationships in data.
Deep learning can provide the benefit of process automation for modelers. In addition, as the neural network learns features, it can suggest improvements and ask vital questions during the modeling process, complex actions that typical financial modeling software could not perform.
Natural language processing also plays a part in financial modeling by improving efficiency in document analyses that rely on quantitative strategies.
Deep learning in healthcare
The benefits of deep learning can be reaped in improvements to the patient experience as well as various updates to diagnostic processes.
Convolutional neural networks rival diagnosticians at interpreting features of MRIs, x-rays, and dermatological images. In one study in which images of possible melanoma were analyzed, even without specific patient history, the deep learning model’s accuracy surpassed that of dermatologists who had access to this information.
Other researchers created a deep learning system that could accurately detect a handful of acute neurological disorders from a CT scan 150 times faster than a doctor could.
Deep learning also enables clinicians and researchers to create powerful educational tools. Using GANs, they can create images of rare pathological disorders so students have working examples to learn from.
Image recognition
Image recognition is a typical classification problem that deep neural nets are uniquely equipped to solve.
Deep neural networks can identify images as a whole (such as in the medical example above), and they can localize specific parts of images—e.g., detecting unique items in a picture of a busy street.
This field also has enticing reconstructive capacities, such as restoring corrupted parts of images and converting old grayscale or black-and-white photos to full-color renditions.
Customer service
Many companies employ some version of automated customer service, often in the form of chat boxes and automated phone operators. The usual models, while generally reliable and well intentioned toward customer engagement, can be faulty or fail to address the true consumer need.
Deep learning seeks to improve the customer experience by offering predictive services that lead customers through the buying process or the help process. Some deep learning based systems include intelligent call routing, customer voice authentication, self-improving chat bots, and customer ticketing driven by social listening.
These services can cut down on the repetitive tasks that fill up customer service agents’ time and allow them to redirect their attention to more specialized tasks. These smarter automated systems lead to lower training costs and happy customers who receive answers to their problems much faster.
Autonomous vehicles
Autonomous car technology relies heavily on many applications of deep learning to function safely. Self-driving cars must process images to navigate traffic, understand signals, and avoid accidents. Thus, autonomous cars use many deep learning technologies such as image recognition, voice/speech recognition, and motion detection to interact with pedestrian and vehicular traffic.
Predicting crises: Seismology
In the aftermath of an earthquake, aftershocks, smaller earthquake events set off by the larger event, pose a significant risk to areas surrounding the impact area. Researchers have created predictive models which detail “how big” the aftershocks may be and “when” they might occur. Deep learning adds an additional element to this equation: where these shocks might happen.
A seismic event can be influenced by a plethora of factors, and a predictive model, regardless of its complexity, might have trouble pulling workable projections from an extensive dataset. Deep learning algorithms are specially equipped to analyze this highly complex data and provide projections. While these algorithms are still being perfected, their successful implementation suggests significant life-saving benefits.
Deep learning, Talend, and the cloud
Deep learning has the capabilities to enable and streamline business processes. While researchers are continuing to develop new methods with varied applications from scratch, a newcomer can reap many of the benefits from deep learning software with off-the-shelf products and cloud-enabled data tools.
Talend Data Fabric is a suite of apps with built-in machine learning and deep learning capabilities. Built on the powerful Hadoop and Apache Spark, Talend Data Fabric eliminates the need for data science training and leverages drag-and-drop developer components. With deep learning techniques such as AI APIs, facial recognition, computer vision, and natural language processing methods integrated into the service, Talend Data Fabric can bring deep learning capabilities to your business. Try Talend Data Fabric today.