O'Reilly logo
live online training icon Live Online training

Intermediate SQL for data analysis

Doing more with relational databases

Thomas Nield

SQL is often viewed as a means for accessing and writing data and performing rudimentary analysis, but most database platforms offer powerful analytics capabilities beyond aggregations and simple joins. Common table expressions, regular expressions, windowing functions, recursive queries, and cross joins make SQL a powerful platform for data analytics. In addition, subqueries and exotic join patterns can transform data in less obvious ways, and windowing functions can flexibly aggregate on contexts not possible with GROUP BY. Doing intensive analysis with SQL also allows you to put the onus of churning data on the database engine, which is optimized for that purpose. This course will also briefly cover how to invoke SQL from Java, Python, and R so your applications and data science models can work with data sources directly.

Join expert Thomas Nield for a deep dive into advanced SQL techniques for data analysis. You’ll learn how SQL can answer most common analytics questions—and do it more efficiently.

What you'll learn-and how you can apply it

By the end of this live, online course, you’ll understand:

  • How SQL can be used for deeper and more complex analytical tasks, including windowing functions, common table expressions, regular expressions, and cross joins
  • When to use SQL versus another technology (like R or Python) for a given task

And you’ll be able to:

  • Flexibly derive sets of data and query off of queries
  • Create more flexible and legible SQL using tools like common table expressions
  • Leverage the power of the regular expression to qualify complex text patterns
  • Use SQL joins, including recursive and cross joins, in more abstract and powerful ways
  • Leverage windowing functions to get contextual aggregations
  • Use SQL with Python, R, or Java

This training course is for you because...

  • You’re an analyst or data science professional with fundamental SQL proficiency who wants to leverage SQL for more advanced analysis.
  • You’re a programmer or developer who needs to process large amounts of analytical data and wants to leverage more advanced SQL to put that work back on the database.
  • You’re a database administrator who wants to better understand advanced analysis features like windowing functions.

Prerequisites

  • Proficiency in basic SQL operators including WHERE, GROUP BY, INNER JOIN, and LEFT JOIN

Required materials and setup:

Recommended preparation:

About your instructor

  • Thomas Nield (author of Getting Started with SQL) is a business consultant for Southwest Airlines in Schedule Initiatives. Early in his career, he became fascinated with technology and its role in business analytics. After becoming proficient in Java, Kotlin, Python, SQL, and reactive programming, he became an open-source contributor as well as an author/trainer for O’Reilly Media. He is passionate about sharing what he learns and enabling others with new skill sets. He enjoys making technical content relatable and relevant to those unfamiliar with or intimidated by it.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing

Course setup: Installing SQLiteStudio or SQLiteOnline (10 minutes)

Subqueries, Derived Tables, and Unions (25 minutes)

  • Lecture:: Subqueries; derived tables; common table expressions; joining derived tables; unions; group concatenation
  • EXERCISE: Joining averages to individual records:

Regular expressions (25 minutes)

  • Lecture: Using group_concat(); introduction to regular expressions; using a regular expression to qualify records
  • EXERCISE: Qualifying patterns in street addresses
  • Break (10 minutes)

Advanced joins (40 minutes)

  • Lecture: Inner join and left join review; creating a volatile table; doing a regex join; self joins; recursive queries; cross joins;
  • EXERCISE: Finding totals for all customers, even with no orders
  • Break (10 minutes)

Windowing functions (40 minutes)

  • Lecture: Partitioning a simple aggregation; partitioning a rolling aggregation; applying a windowed sort; moving averages;
  • EXERCISE: Calculating rolling average of orders across customers

Programming with SQL (40 minutes)

  • Lecture: SQL versus programming languages—pros and cons; using SQL with Python; using SQL with R; using SQL in Java, Scala, and Kotlin
  • SQL Injection Demo
  • EXERCISE: Quiz questions
  • Final Q&A