Live Online Training

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook

Learn just the essentials of Python-based Machine Learning on AWS with Jupyter Notebook

Noah Gift

There is an overwhelming demand to learn business-focused, Python-based Machine Learning. This training is about learning how to apply Machine Learning techniques in Python to common business applications. Examples range from classifying the types of users registered on a shopping site to using regression to predict next month's sales.

The live training shows how to get started with the basics of Python via Jupyter notebooks, then dives into the nuts and bolts of Data Science libraries in Python. EDA, or exploratory data analysis, is at the heart of the Machine Learning feedback loop, and this series will highlight how to perform it in Python and Jupyter Notebook.

Finally, AWS will be used to extend the machine learning concepts to real-world environments in the cloud. The Machine Learning on AWS material will cover batch-based job workflows for Machine Learning pipelines, as well as the use of the boto library.

What you'll learn and how you can apply it

  • Python fundamentals
  • Jupyter notebook fundamentals with Pandas, scikit-learn, and seaborn
  • AWS fundamentals for Python and Machine Learning
  • Machine Learning concepts and applications

This training course is for you because...

  • You are a business and analytics professional with some SQL experience and are looking to move to the next generation of Data Science.
  • You are a Junior Data Scientist who is looking to expand into cloud-based Machine Learning concepts on AWS.
  • You’re a software developer who wants to understand how to get more deeply involved in the Data Science movement.
  • You’re a technical leader who wants to understand Machine Learning in Python to effectively manage teams that perform these actions.
  • You’re currently involved in Data Science, Analytics or Machine Learning training and are looking for additional material to supplement your learning.

Prerequisites

  • Some previous programming experience
  • Basic understanding of statistics and probability

Recommended preparation:

Course Set-up:

  • Jupyter notebook and/or Colab Notebook (In Google Chrome)
  • Python 3.6 or greater
  • (Optional) AWS account

Resources List:

https://github.com/noahgift/functional_intro_to_python

About your instructor

  • Noah Gift is a lecturer and consultant at both the UC Davis Graduate School of Management MSBA program and the Graduate Data Science program, MSDS, at Northwestern. He teaches and designs graduate machine learning, AI, and Data Science courses and consults on Machine Learning and Cloud Architecture for students and faculty. These responsibilities include leading a multi-cloud certification initiative for students. He has published close to 100 technical publications, including two books, on subjects ranging from Cloud Machine Learning to DevOps. Gift received an MBA from UC Davis, an M.S. in Computer Information Systems from Cal State Los Angeles, and a B.S. in Nutritional Science from Cal Poly San Luis Obispo.

    Professionally, Noah has approximately 20 years’ experience programming in Python. He is a Python Software Foundation Fellow, AWS Subject Matter Expert (SME) on Machine Learning, AWS Certified Solutions Architect and AWS Academy Accredited Instructor, Google Certified Professional Cloud Architect, and Microsoft MTA on Python. He has worked in roles including CTO, General Manager, Consulting CTO and Cloud Architect. This experience has been with a wide variety of companies including ABC, Caltech, Sony Imageworks, Disney Feature Animation, Weta Digital, AT&T, Turner Studios and Linden Lab. In the last ten years, he has been responsible for shipping many new products at multiple companies that generated millions of dollars of revenue and had global scale. Currently he is consulting for startups and other companies.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1 Introductory Concepts in Python, Jupyter and Colab

Part One: 1 Hour + 30 Minutes

A. Introductory Concepts

  • Using IPython, Jupyter, Colab and Python executable
  • Colab Notebook Key Features
  • Colab Hacks: Mounting GDrive, Using Kaggle Data
  • Procedural statements
  • Strings and String formatting
  • Numbers and arithmetic operations
  • Data Structures: Lists, Dictionaries, Sets and operations on them.
  • Writing and Running Scripts

B. Functions

  • Writing Functions
  • Function arguments: positional, keyword
  • Closures and Functional Currying
  • Lazy Evaluated Functions (Generators)
  • Partial Functions
  • Decorators: Functions that wrap other functions
  • Making Classes Behave Like Functions
  • Applying Functions to Pandas DataFrames
  • Lambdas
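
As a rough preview of several topics in the list above, the sketch below shows a closure, a generator, a partial function, a decorator, a callable class, and a lambda side by side. Every name in it (`make_multiplier`, `count_up_to`, `logged`, `Scaler`, and so on) is made up for illustration and is not taken from the course material.

```python
import functools


def make_multiplier(factor):
    """Closure: the inner function remembers `factor` after this call returns."""
    def multiply(value):
        return value * factor
    return multiply


def count_up_to(limit):
    """Generator: yields 1..limit lazily, one value at a time."""
    current = 1
    while current <= limit:
        yield current
        current += 1


def logged(func):
    """Decorator: wraps `func` and records the arguments of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper


@logged
def add(x, y):
    return x + y


class Scaler:
    """A class whose instances can be called like functions."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


double = make_multiplier(2)
add_ten = functools.partial(add, 10)  # partial: pre-fills x=10
square = lambda n: n * n              # lambda: a small anonymous function

print(double(21))
print(list(count_up_to(3)))
print(add_ten(5))
print(Scaler(3)(4))
print(square(5))
```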

C. Homework Exercises (Recommended Next Steps)

Q&A: 15 Minutes

Break: 15 Minutes

Part Two: 45 Minutes

A. Understanding Libraries, Classes, Control Structures, and Regular Expressions

  • Modules
  • Writing a library in Python
  • Importing a library in Python and using namespaces
  • Using other libraries with pip install
  • Mixing third party libraries with your code.
  • Understanding Python Classes
  • Making simple objects and interacting with them
  • Writing classes basics
  • Understanding inheritance
  • Interacting with Special Class Methods
  • Immutability concepts with Objects
  • Control Structures
  • For loops
  • While loops
  • If/else statements
  • Try/except
  • Generator expressions
  • List Comprehensions
  • Dictionary Comprehensions
  • Understanding Sorting
  • Python Regular Expressions
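
To give a feel for how a few of the control-structure topics above fit together, here is a minimal sketch that combines try/except, a generator expression, list and dictionary comprehensions, sorting, and a regular expression. The `records` data and the `parse_age` helper are invented for this example.

```python
import re

# Made-up input: "name:age" strings, one of which is malformed.
records = ["alice:34", "bob:27", "carol:not-a-number", "dave:41"]


def parse_age(record):
    """Return (name, age), or None if the age is not an integer."""
    name, _, raw_age = record.partition(":")
    try:
        return name, int(raw_age)
    except ValueError:
        return None


# List comprehension over a generator expression, skipping bad records.
parsed = [pair for pair in (parse_age(r) for r in records) if pair is not None]

# Dictionary comprehension: name -> age.
ages = {name: age for name, age in parsed}

# Generator expression consumed by sum().
total_age = sum(age for age in ages.values())

# Sorting names by age, oldest first.
oldest_first = sorted(ages, key=ages.get, reverse=True)

# Regular expression that accepts only "word:digits" records.
pattern = re.compile(r"^(\w+):(\d+)$")
regex_names = [m.group(1) for m in map(pattern.match, records) if m]

print(ages, total_age, oldest_first, regex_names)
```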

B. Homework Exercises (Recommended Next Steps)

Q&A: 15 Minutes

Day 2 Applied Python for AWS for Data Science and ML (180 minutes)

Part One: 1 Hour + 30 Minutes

A. IO Operations in Python and Pandas and Data Science Project Exploration

  • Working with Files
  • Serialization Techniques
  • Use Pandas DataFrames
  • Concurrency in Python
  • GPU programming and Numba
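
As one possible illustration of the file and serialization topics above, the sketch below writes a dictionary to a JSON file and reads it back using only the standard library. JSON is just one serialization option, and the `save_json` and `load_json` helpers and the sample data are assumptions made for this example.

```python
import json
import os
import tempfile


def save_json(obj, path):
    """Serialize `obj` to a UTF-8 JSON file at `path`."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(obj, fh, indent=2, sort_keys=True)


def load_json(path):
    """Read a JSON file and return the deserialized Python object."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Made-up sample record.
player = {"name": "example player", "points": [10, 22, 31]}

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "player.json")
    save_json(player, path)
    restored = load_json(path)

print(restored == player)
```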

B. Walking through Social Power NBA Data Science Project

  • Importing and merging DataFrames in Pandas
  • Creating correlation heatmaps
  • Using seaborn lmplot
  • Using linear regression in Python
  • Using ggplot
  • Using yellowbrick road
  • Doing KMeans clustering
  • Using PCA with sklearn
  • Using ML auto-sklearn
  • Using Plotly for interactive Data Visualization
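
For readers new to linear regression, the idea can be sketched without any libraries. The example below fits a straight line by ordinary least squares and forecasts the next month's sales. The sales figures are invented, `fit_line` is a hypothetical helper, and in practice the libraries named in this outline (for example scikit-learn) would normally do this work.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented monthly sales figures, months numbered 1..5.
months = [1, 2, 3, 4, 5]
sales = [110.0, 120.0, 130.0, 140.0, 150.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # prediction for month 6

print(slope, intercept, forecast)
```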

Q&A: 15 Minutes

Break: 15 Minutes

Part Two: 45 Minutes

A. AWS Cloud-Native Python for ML/AI

  • Introduction to Amazon Web Services: Creating accounts, Creating Users and Using Amazon S3
  • Recap AWS re:Invent 2018 Features
  • Loading AWS API Keys into Colab
  • Using Python Boto
  • Brief overview of AWS Python Lambda development with Chalice
  • Overview of AWS Step Functions
  • Overview of AWS Batch for ML Jobs
  • Using AWS SageMaker
  • Using AWS Comprehend
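
As a small taste of the S3 and Boto topics above, here is a hedged sketch of listing object keys in a bucket. It assumes the boto3 library, whose S3 client provides `list_objects_v2`; the `list_keys` helper is hypothetical, and `FakeS3Client` is a stand-in invented so the example runs without an AWS account or credentials.

```python
def list_keys(s3_client, bucket, prefix=""):
    """Collect every object key under `prefix`, following pagination."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = s3_client.list_objects_v2(**kwargs)
        keys.extend(item["Key"] for item in response.get("Contents", []))
        if not response.get("IsTruncated"):
            return keys
        # More results remain: ask for the next page.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]


class FakeS3Client:
    """Hypothetical stand-in that mimics two pages of list_objects_v2 output."""

    def list_objects_v2(self, **kwargs):
        if "ContinuationToken" not in kwargs:
            return {
                "Contents": [{"Key": "data/a.csv"}],
                "IsTruncated": True,
                "NextContinuationToken": "page-2",
            }
        return {"Contents": [{"Key": "data/b.csv"}], "IsTruncated": False}


# With real credentials, the client would instead come from:
#   import boto3
#   client = boto3.client("s3")
client = FakeS3Client()
print(list_keys(client, "example-bucket", prefix="data/"))
```

Passing the client in as an argument keeps the helper easy to test, since a real boto3 client and a stub can be swapped freely.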

B. Homework Exercises (Recommended Next Steps)

  • Software Carpentry (Bonus Material #1)
  • Creating a Data Engineering API with Flask and Pandas (Bonus Material #2)
  • Creating Command-line Machine Learning Tools (Bonus Material #3)
  • Managed Machine Learning Systems and Internet of things (Bonus Material #4)

Q&A: 15 Minutes