Live Online training

Graphs and Network Algorithms from Scratch

Explore and Analyze Network Data in Python

Bruno Gonçalves

Trees, graphs, and networks are fundamental data structures that underlie many recent developments in data science and computer science algorithms. Technologies and applications such as social networks, cloud and distributed computing, cryptocurrencies, and traffic routing and directions all rely on the proper use of graph concepts.

In this course we will build, step by step, a mini toolkit of network representations and algorithms that will allow students to understand the fundamental ideas and concepts at the base of state-of-the-art algorithms (such as PageRank and recommendation systems), technologies (such as graph databases), and tools (such as web crawlers).

What you'll learn and how you can apply it

  • Understand the similarities and differences between trees and graphs
  • Identify data sets and problems that can be best represented in graph form
  • Apply graph algorithms to practical data science problems
  • Use graph algorithms for recommendation and optimization
  • Build the rudiments of a graph library using nothing but basic Python
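As a rough illustration of the last point, a graph library written in nothing but basic Python might start from a dictionary that maps each node to the set of its neighbors. The sketch below is hypothetical: the `Graph` class and its method names are assumptions made for this example, not the toolkit built in the course.

```python
from collections import defaultdict


class Graph:
    """Undirected graph stored as a dictionary of neighbor sets (hypothetical sketch)."""

    def __init__(self):
        # Maps each node to the set of nodes it is connected to.
        self.adj = defaultdict(set)

    def add_edge(self, u, v):
        # Store the edge in both directions because the graph is undirected.
        self.adj[u].add(v)
        self.adj[v].add(u)

    def neighbors(self, node):
        # Use .get so that asking about an unknown node does not create it.
        return sorted(self.adj.get(node, set()))

    def degree(self, node):
        # The degree of a node is the number of neighbors it has.
        return len(self.adj.get(node, set()))


g = Graph()
for u, v in [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]:
    g.add_edge(u, v)

print(g.neighbors("C"))
print(g.degree("D"))
```

A dictionary of sets keeps edge insertion and neighbor lookup cheap, which is one common reason to prefer it over an adjacency matrix for sparse networks.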

This training course is for you because...

  • You want to use network features to improve ML predictive models
  • You work with data that describes the relationships between elements
  • You want to understand the effect of network topology on the performance of graph algorithms
  • You are curious about the best way to model social and technological networks
  • You want to apply networks to recommendation and optimization problems

Prerequisites

  • Basic Python
  • NumPy
  • Matplotlib
  • Jupyter

Course Set-up:

  • A scientific Python distribution such as Anaconda

Recommended Preparation:

Watch Learning Data Structures and Algorithms (video)

Recommended Follow-up:

About your instructor

  • Bruno Gonçalves is currently a Senior Data Scientist working at the intersection of Data Science and Finance. Previously, he was a Data Science fellow at NYU's Center for Data Science while on leave from a tenured faculty position at Aix-Marseille Université. Since completing his PhD in the Physics of Complex Systems in 2008, he has been pursuing the use of Data Science and Machine Learning to study Human Behavior. Using large datasets from Twitter, Wikipedia, web access logs, and Yahoo! Meme, he studied how we can observe both large-scale and individual human behavior in an unobtrusive and widespread manner. The main applications have been to the study of Computational Linguistics, Information Diffusion, Behavioral Change, and Epidemic Spreading. In 2015 he was awarded the Complex Systems Society's Junior Scientific Award for "outstanding contributions in Complex Systems Science" and in 2018 he was named a Science Fellow of the Institute for Scientific Interchange in Turin, Italy.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Segment 1 - Networks and Graphs (50 min)

  • Graph theory
  • Network and graph examples
  • Types of graphs
  • Tree and graph representations
  • Break (10 min)

Segment 2 - Graph Properties (40 min)

  • Degree distributions
  • Nearest neighbors
  • Weight distributions
  • Degree and weight correlations
  • Break (10 min)
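To give a flavor of the "Degree distributions" topic in Segment 2, the degree distribution of a small network can be computed with only the standard library. This is a minimal sketch on a made-up toy network; the `adj` dictionary and the `degree_distribution` function are assumptions for this example, not course material.

```python
from collections import Counter

# Hypothetical toy network: each node maps to the set of its neighbors.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def degree_distribution(adj):
    """Return {degree: fraction of nodes that have that degree}."""
    # Count how many nodes have each degree.
    counts = Counter(len(neighbors) for neighbors in adj.values())
    n = len(adj)
    # Normalize the counts so that the fractions sum to 1.
    return {k: counts[k] / n for k in sorted(counts)}


dist = degree_distribution(adj)
print(dist)
```

In this toy network, half of the nodes have degree 2, and one node each has degree 1 and degree 3.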

Segment 3 - Graph Algorithms (60 min)

  • Paths and walks on graphs
  • Epidemic and viral spreading
  • Graph sampling
  • Shortest paths and maximum spanning trees
  • Graph diameter and friendship paradox
  • Random walks and Markov chains
  • Break (10 min)
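As an example of the "Shortest paths" topic in Segment 3, a breadth-first search finds a path with the fewest edges between two nodes of an unweighted graph. The sketch below assumes the graph is given as a dictionary of neighbor lists; the `shortest_path` function and the toy network are hypothetical and chosen only for illustration.

```python
from collections import deque


def shortest_path(adj, source, target):
    """Return a fewest-edges path from source to target, or None if unreachable."""
    if source == target:
        return [source]
    # parent[node] records the node from which we first reached `node`.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor in parent:
                continue  # already visited
            parent[neighbor] = node
            if neighbor == target:
                # Walk back through the parents to rebuild the path.
                path = [neighbor]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy network stored as lists so the traversal order is deterministic.
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

print(shortest_path(adj, "A", "E"))
```

Breadth-first search visits nodes in order of increasing distance from the source, so the first time the target is reached the path is guaranteed to use the fewest edges. Weighted graphs need a different approach, such as Dijkstra's algorithm.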

Segment 4 - Applications to Empirical Social and Technological Networks (40 min)

  • Temporal networks
  • Multi-layer networks
  • Bipartite networks and recommender systems
  • Graphs and optimization