Live Online training

Applied Deep Learning for coders with Apache MXNet

Hands-on deep learning in Computer Vision and Natural Language Processing

Sandeep Krishnamurthy

Get started with deep learning by grasping its fundamental principles and developing hands-on techniques using Apache MXNet. You'll learn why deep learning matters, survey its ecosystem, and develop an intuition for its fundamental principles. We'll explore applications of deep learning in computer vision and natural language processing. You'll learn the ideas behind common concepts, including training data, validation data, features, model training, model validation, loss functions, and optimization. You'll also gain an understanding of tensors, multi-layer perceptrons, convolutional neural networks, and recurrent neural networks. After learning the foundational concepts, you'll get hands-on with MXNet, installing its components and working with its powerful Gluon interface. You'll learn about MXNet components such as data loading and manipulation, context (CPU, GPU, and multi-GPU), blocks, trainers, optimizers, loss functions, model training, model parameters, and more.

You'll apply what you've learned about convolutional neural networks to build a facial emotion recognition model that detects an emotion from a picture of a person's face. You'll also use recurrent neural networks to build an emoji prediction model that suggests an emoji for a given sentence.

We'll also touch on advanced applications of deep learning in the fields of computer vision and natural language processing. You'll learn about two advanced MXNet toolkits for cutting-edge research: GluonNLP and GluonCV. We'll finish with an introduction to MXNet Model Server for serving deep learning models in production at scale.

What you'll learn, and how you can apply it

  • Learn common terms and concepts used in the field of AI, including: training data, validation data, features, model training, model validation, loss functions, and optimization
  • Explore fundamental concepts in deep learning, including: tensors, multi-layer perceptron (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN)
  • Learn how to install and use Apache MXNet Gluon, and utilize components such as NDArray, autograd, data loaders and iterators, data transformations, block, trainer, model training, and model inference
  • Build a facial emotion recognition model using CNNs
  • Create an emoji prediction model for a given sentence using RNNs
  • Learn how to use two advanced MXNet toolkits – GluonNLP and GluonCV
  • Deploy deep learning models in production with MXNet Model Server

This training course is for you because...

You are a data engineer or data scientist new to the field of AI, and you want to learn the foundational principles of deep learning techniques, and how to deploy a deep learning model to production at scale.


Prerequisites

  • Programming experience in Python
  • Docker support
  • We will be sharing a Docker image for CPU/GPU with pre-configured MXNet, Jupyter notebooks, and MXNet Model Server


Materials, downloads, or Supplemental Content needed in advance

Set up a Conda environment on your machine by following the installation instructions: https://conda.io/docs/user-guide/install/index.html


About your instructor

  • Sandeep Krishnamurthy is a Deep Learning Engineer with Amazon AI, where he builds tools and technologies that enable millions of developers to use AI. He considers his mission complete when developers use AI as routinely as they use file I/O today! Sandeep is an active open source contributor in the area of deep learning tools and technologies. Recently, he worked on the Apache MXNet backend for Keras, and he is also an active contributor and committer for Apache MXNet, one of the most popular, scalable, and easy-to-use deep learning frameworks. In the past, he built large-scale machine learning and data processing platforms at Amazon. He is an active speaker at various meetups and learning groups, teaching deep learning to engineers. He holds a Master of Technology degree in Computer Science from IIIT-Bangalore.


Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1

SECTION 1: Introduction to Artificial Intelligence and Deep Learning – 60 Minutes

  • Introduction to Artificial Intelligence (AI)
  • 3 broad categories of AI: Computer Vision, Natural Language Processing, Speech Recognition
  • Neural Networks and Deep Learning (DL)
  • Hardware and software for DL
  • Why DL now?
  • Common concepts in the AI field – training data, validation data, features, model, model training, epoch, overfitting, model validation, accuracy, loss functions, optimization
  • Tensors: Scalars, Vectors, Matrices, Higher-Order Tensors, Images, Sequence Data
  • Tensor operations: Dot product, norm (distance)
  • Deep Learning Basics: Forward Propagation, Backward Propagation, and Differentiation
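
The tensor operations and the forward/backward passes listed above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions, not course material: it uses a single linear neuron with a squared-error loss, and the helper names (`dot`, `norm`, `distance`, `forward`, `backward`) are hypothetical and chosen for this example.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(dot(u, u))

def distance(u, v):
    """Euclidean distance: the norm of the difference of two vectors."""
    return norm([a - b for a, b in zip(u, v)])

def forward(w, b, x):
    """Forward propagation for one linear neuron: prediction = w . x + b."""
    return dot(w, x) + b

def backward(w, b, x, y):
    """Backward propagation for the loss L = (prediction - y)^2.

    By the chain rule, dL/dw_i = 2 * (prediction - y) * x_i
    and dL/db = 2 * (prediction - y).
    """
    error = forward(w, b, x) - y
    grad_w = [2 * error * xi for xi in x]
    grad_b = 2 * error
    return grad_w, grad_b

# Toy values, assumed for illustration only.
w, b = [0.5, -1.0], 0.0
x, y = [2.0, 1.0], 3.0

print(dot([1, 2, 3], [4, 5, 6]))  # 32
print(norm([3, 4]))               # 5.0
print(forward(w, b, x))           # 0.0
print(backward(w, b, x, y))       # ([-12.0, -6.0], -6.0)
```

A framework such as MXNet computes these gradients automatically; writing them by hand once is mainly useful for seeing what differentiation contributes to training.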

Break – 10 minutes

SECTION 2: Getting started with Apache MXNet – 30 Minutes

  • Installation: Anaconda, MXNet, Jupyter notebook for experimentation
  • DL programming paradigms – Symbolic vs. Imperative
  • NDArray: N-Dimensional Tensors. Context – CPU / GPU
  • NDArray operations
  • Autograd
  • MXNet Gluon: Blocks, Loss, Optimizers, Parameters, Trainer
  • Data loading and preparation: DataLoader, NDArrayIter, Dataset
  • First Deep Learning Model: Multi-Layer Perceptron (MLP) for Fashion MNIST
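
As a rough picture of what a multi-layer perceptron computes, the sketch below runs a forward pass through one hidden layer in plain Python. It does not use the MXNet or Gluon API, and the layer sizes and weights are toy assumptions (a Fashion MNIST model would take 784 pixel inputs and produce 10 class scores). The names `dense`, `relu`, `softmax`, and `mlp_forward` are hypothetical helpers for this example.

```python
import math

def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    """Element-wise ReLU activation: max(0, value)."""
    return [max(0.0, a) for a in v]

def softmax(v):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(x, params):
    """Input -> dense -> ReLU -> dense -> softmax."""
    hidden = relu(dense(x, params["w1"], params["b1"]))
    return softmax(dense(hidden, params["w2"], params["b2"]))

# Toy parameters, assumed for illustration: 2 inputs, 3 hidden units, 2 classes.
params = {
    "w1": [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]],
    "b1": [0.0, 0.0, 0.0],
    "w2": [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    "b2": [0.0, 0.0],
}

probs = mlp_forward([2.0, 1.0], params)
print(probs)  # two class probabilities; the first class is the more likely one
```

In Gluon, the dense layers would be blocks whose parameters are updated by a trainer; here the parameters are fixed so that only the forward computation is visible.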

Break – 10 minutes

Lab 1: NDArray operations, Autograd, Data Loaders, Data Iterators, Dataset; MLP model for Fashion MNIST (30 mins)

Break – 10 minutes

SECTION 3: Convolutional Neural Networks – 60 Minutes

  • Problems with Multi-Layer Perceptron
  • The intuition behind Convolutional Neural Networks (CNN)
  • Convolution basics: Kernel, Convolution, Strides, Padding
  • Pooling
  • Batch Normalization
  • Dropout
  • Activation: ReLU
  • Importance of Image Augmentation: Rotate,
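
The convolution basics listed above (kernel, stride, padding) and pooling can be illustrated with a small plain-Python sketch. It assumes a single-channel image, a square kernel, and zero padding, and it computes cross-correlation, which is what deep learning frameworks usually call convolution. The function names are hypothetical and chosen for this example.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

def conv2d(image, kernel, stride=1, padding=0):
    """2D cross-correlation of a single-channel image with a square kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    # Zero-pad the image on all four sides.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = image[i][j]
    out_h = conv_output_size(h, k, stride, padding)
    out_w = conv_output_size(w, k, stride, padding)
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Slide the kernel over the window starting at (i*stride, j*stride).
            for di in range(k):
                for dj in range(k):
                    acc += padded[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling (the stride equals the window size)."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, w - size + 1, size)]
            for i in range(0, h - size + 1, size)]

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]  # responds to change along the diagonal

print(conv2d(image, kernel))  # 3x3 map, every entry is -5.0
print(max_pool2d(image))      # [[6, 8], [14, 16]]
```

With a 3x3 kernel, stride 1, and padding 1, the output keeps the input size, which is one reason that combination is common in CNN layers.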

Lab 2: Application of CNN - Facial Emotion Recognition (30 mins)

SECTION 4: Advanced Applications of CNN – 10 Minutes

  • CNN in Self Driving Cars
  • CNN in Health Care
  • CNN in Manufacturing
  • GluonCV Toolkit


Day 2

SECTION 1: Introduction to Sequence Models with RNN – 60 Minutes

  • Sequence Data
  • Recurrent Neural Network
  • Language Model
  • One hot encoding
  • Word Embeddings – Word2Vec
  • Cosine Similarity
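
To see why word embeddings are preferred over one-hot encoding, it helps to compare cosine similarity under both representations. The sketch below is plain Python; the three-word vocabulary and the two-dimensional embedding values are made up for illustration and are not taken from a trained Word2Vec model.

```python
import math

def one_hot(word, vocab):
    """One-hot vector: 1 at the word's index in the vocabulary, 0 elsewhere."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: (u . v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

vocab = ["happy", "joyful", "sad"]

# One-hot vectors of different words are always orthogonal,
# so they carry no notion of similar meaning.
print(one_hot("joyful", vocab))  # [0, 1, 0]
print(cosine_similarity(one_hot("happy", vocab), one_hot("joyful", vocab)))  # 0.0

# Hypothetical dense embeddings (values invented for this example).
embeddings = {
    "happy": [0.9, 0.1],
    "joyful": [0.8, 0.2],
    "sad": [-0.7, 0.3],
}
print(cosine_similarity(embeddings["happy"], embeddings["joyful"]))  # close to 1
print(cosine_similarity(embeddings["happy"], embeddings["sad"]))     # negative
```

With learned embeddings, words used in similar contexts end up with a high cosine similarity, which is the property a model can exploit when mapping sentences to emojis.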

Break – 10 minutes

Lab 4: Application of RNN - Emoji prediction for a given sentence (60 mins)

SECTION 2: Advanced Applications of RNN – 10 Minutes

  • Language Translation
  • Time series forecasting
  • Automatic Speech Recognition
  • GluonNLP Toolkit

Break – 10 minutes

SECTION 3: Deploying Deep Learning Models in Production with MXNet Model Server – 30 Minutes

  • Deep Learning Models – Params (weights), Symbols (network)
  • Inference / Predictions
  • Deep Learning Models in production – Edge devices, mobile, web servers
  • MXNet Model Server (MMS): Serving MXNet models in production

Lab 3: Deploying the Facial Emotion Recognition model in production with MMS (30 mins)