Live Online training

Artificial Intelligence for Big Data

Connecting big data with machine learning models to solve real problems

Fernando Damasio

By learning Artificial Intelligence for Big Data, you will be able to apply machine learning algorithms to solve real problems, building your own portfolio of projects.

You will dive into examples that use different techniques and approaches to handle massive amounts of data, extract information, and perform intelligent actions. We will cover supervised, unsupervised, reinforcement, and deep learning techniques.

How does a neural network work? How do you classify images using CNNs? How do you pre-process data before feeding it to a model? What are the key differences between the different kinds of algorithms? These are a few of the questions you will be able to answer after taking this course.

The pipeline for solving problems with AI is common to most machine learning algorithms. You will learn the step-by-step process for solving these problems, along with the underlying concepts and how to apply models to generate value.

Using Python notebooks and AI libraries such as scikit-learn, TensorFlow, and Keras, you will get hands-on practice with real problems.
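
As a rough illustration of the pipeline described above, the sketch below walks through four steps (pre-process, split, train, evaluate) in plain Python. It is not taken from the course materials: the toy data, the helper names, and the choice of a nearest-centroid classifier are all assumptions made for illustration, and in practice a library would normally replace these hand-written helpers.

```python
import random


def min_max_scale(rows):
    """Rescale every feature column to the [0, 1] range."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(value - lo) / (hi - lo) if hi > lo else 0.0
         for value, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def train_test_split(features, labels, test_ratio=0.25, seed=0):
    """Shuffle the row indices and hold out a fraction for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([features[i] for i in train_idx], [features[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])


def fit_centroids(features, labels):
    """'Train' a nearest-centroid model: one mean point per class."""
    centroids = {}
    for label in set(labels):
        members = [f for f, y in zip(features, labels) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the class whose centroid is closest to the row."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, features, labels):
    """Fraction of rows whose predicted class matches the true label."""
    hits = sum(predict(centroids, f) == y for f, y in zip(features, labels))
    return hits / len(labels)


# Made-up data: [age, monthly_income] -> 1 if the person donated, else 0.
raw_features = [[22, 1200], [25, 1500], [28, 1800], [31, 2100],
                [45, 5200], [48, 5600], [52, 6100], [55, 6500]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scaled = min_max_scale(raw_features)                                   # 1. pre-process
X_train, X_test, y_train, y_test = train_test_split(scaled, labels)    # 2. split
model = fit_centroids(X_train, y_train)                                # 3. train
score = accuracy(model, X_test, y_test)                                # 4. evaluate
print(f"test accuracy: {score:.2f}")
```

The same four stages apply whichever model is plugged into step 3, which is why the pipeline carries over between algorithms.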

What you'll learn and how you can apply it

  • Pre-process the data before feeding it to the machine learning models
  • Compare different AI approaches to solve a problem
  • Implement the best solution using the best model
  • Understand the different technologies that are moving the industry forward
  • Showcase your portfolio of projects

This training course is for you because...

  • You are a Data Scientist or an Engineer interested in solving problems using artificial intelligence.

Prerequisites

  • Basic knowledge of Python or another object-oriented language
  • Basic knowledge of statistics and math
  • Ability to build a Python environment for coding

Recommended preparation

Materials to download in advance of class:

https://github.com/fernandodamasio/ai-for-big-data

About your instructor

  • Fernando Damasio is an accomplished Senior Executive and thought-leader with more than 15 years of success across the technology, automotive, education, logistics, marketing, and steel industries. Leveraging extensive experience excelling in competitive markets, he is a valuable asset for a business developing its digital transformation and go-to-market strategy. His broad areas of expertise include technical skills, leadership, relationship management, competitive analysis, and methodology.

    Throughout his career, Fernando has held various leadership positions, including Engineer at Odebrecht AS, Project Leader at Vale AS, Session Lead at Udacity, and CEO of CashFlix. Currently, he is the Product Leader and Founder of Skoods, Principal Consultant at Data Riders, and Mentor for Udacity and Singularity University.

    Fernando has had tremendous success over the years and has served as a key contributor to numerous organizational achievements. He was responsible for leading three of the largest projects at Vale, all of which were delivered on time and within budget. Fernando founded CashFlix in 2014 after raising investment for the company, which provides a purchasing solution that uses machine learning to read text in photos of customer purchase vouchers. In 2017, he founded Data Riders, a digital transformation consultancy, and in 2018 he is founding a new company, Skoods, a crowdsourced self-racing car team.

    Fernando received his Bachelor’s Degree in Automation and Control Engineering from Pontificia Universidade Católica de Minas Gerais and his Master of Business Administration in Project Management from Fundação Dom Cabral. He regularly participates in continuing education and professional development opportunities and has completed programs in Port and Harbor Engineering, Machine Learning Engineering, Self-Driving Car Engineering and Digital Strategies for Business.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1

Session 1: Introductions, Concept and history of Artificial Intelligence (45 minutes)

  • Types of Machine Learning algorithms:
  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning
  • Deep Learning
  • Quiz: Types of ML algorithms
  • The Machine Learning Pipeline
  • Quiz: About the pipeline

Session 2: Pre-processing the data, Code Practice, Supervised Learning (2 Hours 30 minutes)

  • Pre-processing the data
  • Exploration
  • Feature selection
  • Normalization
  • Scaling
  • Encoding
  • Quiz on pre-processing
  • Outliers
  • Dimensionality reduction: Feature importance, PCA
  • Quiz on dimensionality reduction
  • Exercise: Finding Donors project
  • Next steps on Pre-processing Data
  • Code Practice:
  • Supervised Learning:
  • How it works
  • Applications
  • Algorithms
  • Data split
  • Underfitting and overfitting
  • Quiz: Underfitting and overfitting
  • Score function
  • Defining a benchmark
  • Choosing a model
  • Tuning a model (Grid search)
  • Quiz: Grid Search
  • Showing the results
  • Exercise: Insurance project
  • Next steps on Supervised Learning
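
To give a feel for the "Data split" and "Tuning a model (Grid search)" topics in Session 2, here is a minimal sketch in plain Python. It is an assumption-laden illustration rather than the course's own code: the data is invented, the model is a hand-written k-nearest-neighbors classifier, and the search scores each candidate on a single validation split instead of using cross-validation.

```python
def knn_predict(train_X, train_y, row, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], row)),
    )
    votes = [label for _, label in by_distance[:k]]
    return max(set(votes), key=votes.count)


def accuracy(train_X, train_y, eval_X, eval_y, k):
    """Fraction of evaluation rows that k-NN labels correctly."""
    hits = sum(knn_predict(train_X, train_y, row, k) == label
               for row, label in zip(eval_X, eval_y))
    return hits / len(eval_y)


def grid_search(train_X, train_y, val_X, val_y, k_values):
    """Score every candidate k on the validation set and keep the best one."""
    scores = {k: accuracy(train_X, train_y, val_X, val_y, k) for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Invented one-feature data. The point [2.5] carries a deliberately "noisy"
# label, so a model that trusts a single neighbor (k=1) overfits to it.
train_X = [[1.0], [2.0], [2.5], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]]
train_y = [0, 0, 1, 0, 0, 1, 1, 1, 1]

# Held-out validation rows, kept separate from the training data.
val_X = [[2.4], [3.6], [6.5], [8.5]]
val_y = [0, 0, 1, 1]

# Odd values of k avoid tied votes when there are two classes.
best_k, scores = grid_search(train_X, train_y, val_X, val_y, k_values=[1, 3, 5])
print("validation accuracy per k:", scores)
print("best k:", best_k)
```

Because the search only ever sees the validation rows, a separate test set would still be needed to report an unbiased final score.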

Session 3: Unsupervised Learning, Reinforcement Learning (45 mins)

  • Unsupervised Learning:
  • How it works
  • Applications
  • Algorithms
  • Score function
  • Quiz
  • Exercise: Movie recommendation
  • Next steps on Unsupervised Learning
  • Reinforcement Learning:
  • How it works
  • Applications
  • Algorithms
  • Score function
  • Quiz
  • Exercise: Simple AI game player
  • Next steps on Reinforcement Learning

Day 2

Session 4: Image processing, Image classification with Deep Learning (2 Hours)

  • Image processing
  • Color spaces
  • Color channels
  • Thresholds and filters
  • Edge detection
  • Case: Steel industry application
  • Next steps on Image processing
  • Image classification with Deep Learning
  • Neural Networks
  • Quiz
  • Gradient descent
  • Parameters
  • CNNs
  • Pre-processing
  • Sliding window
  • Exercise: Traffic Sign classifier
  • Next steps on Image Classification
  • Code Practice (20 mins), Q&A + Break (10 mins)
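
Gradient descent, one of the Session 4 topics, can be sketched in a few lines of plain Python. The example below is only an illustration under simple assumptions (a one-parameter model y = w * x, invented data, and a hand-picked learning rate); real neural networks apply the same update rule to many parameters at once.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=200):
    """Fit y = w * x by repeatedly stepping against the gradient of the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Derivative of mean((w*x - y)^2) with respect to w is mean(2 * (w*x - y) * x).
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Move w a small step in the direction that lowers the error.
        w -= learning_rate * gradient
    return w


# Invented data generated with a true weight of 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = gradient_descent(xs, ys)
print(f"learned weight: {w:.4f}")
```

The learning rate is the main parameter to watch here: too large a value makes the updates overshoot and diverge, while too small a value makes convergence slow.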

Session 5: Natural Language Processing with Deep Learning (30 mins)

  • How it works
  • Applications
  • Algorithms
  • Case
  • Next steps on Natural Language Processing

Session 6: Other technologies, Managing the transformation (1 Hour 15 minutes)

  • Other technologies
  • Keras, TensorFlow, Watson, Azure, AWS, DominoLab
  • Quiz
  • Managing the transformation
  • Scope
  • Schedule
  • Cost
  • Quality
  • Communication
  • People (different roles)
  • Quiz
  • Code Practice
  • Q&A (15 mins)