Despite the widespread adoption of websites like Facebook and Twitter in recent years, social networks are not new. The study of social networks dates back to the early twentieth century, and networks have proven to be an extremely interesting model of human behavior. However, because of these popular social network websites, data scientists have access to much more data than the anthropologists who studied networks of tribes!

Because networks take a relationship-centered view of the world, the data structures that we will analyze model real-world behaviors and communities. Using a suite of algorithms derived from mathematical graph theory, we can measure properties of individuals and communities and predict their behavior. This has practical applications ranging from recommendation to law enforcement to election prediction.

What You Will Learn

In this course, we will ingest data and construct a social network using Python. We will learn analyses that compute centrality, as well as traversal and querying techniques on the graph, and even compute clusters to detect communities. Besides learning the basics of graph theory, we will also make predictions and create visualizations from our graphs so that we can easily leverage social networks in larger data products.
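
As a rough illustration of the ingestion and analysis steps described above, the sketch below builds an undirected graph from a handful of made-up friendship records and computes normalized degree centrality. The names in the data, the `build_graph` and `degree_centrality` helpers, and the use of plain dictionaries in place of a graph library are all assumptions made for illustration; they are not the course's own code.

```python
from collections import defaultdict

# Hypothetical friendship records: each pair is one undirected tie.
records = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("alice", "dave"),
    ("bob", "carol"),
    ("erin", "frank"),
]


def build_graph(pairs):
    """Return an adjacency mapping {node: set of neighbors}."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


graph = build_graph(records)
centrality = degree_centrality(graph)

# The node with the highest score is a candidate "key player".
most_central = max(centrality, key=centrality.get)
print(most_central, round(centrality[most_central], 2))
```

Degree centrality is only one of several centrality measures; it is used here because it can be computed directly from the adjacency mapping without any traversal.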

Course Outline

The course will cover the following topics:

  • Transforming data into graph format.

  • Creating graphs using NetworkX.

  • Serializing and deserializing NetworkX graphs.

  • An introduction to graph theory.

  • Finding strong ties through link weighting.

  • Computing centrality and key players (celebrities).

  • Finding communities through clustering techniques.

  • Visualizing graphs with matplotlib.

Upon completion of the course, you will understand how to conduct graph analyses on networks. You will also have built a library for analysis on a social network!
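
To make a few of the outline topics more concrete, here is a minimal, dependency-free sketch of link weighting, a very crude form of community detection, and a serialization round trip. Everything in it is an assumption for illustration: the interaction log is invented, the strong-tie threshold is arbitrary, connected components stand in for the clustering techniques the course covers, and JSON stands in for whatever formats the course uses with NetworkX.

```python
import json
from collections import Counter, defaultdict

# Hypothetical interaction log: each entry is one message between two people.
interactions = [
    ("alice", "bob"), ("bob", "alice"), ("alice", "bob"),
    ("alice", "carol"),
    ("erin", "frank"), ("frank", "erin"),
]

# Link weighting: count interactions per unordered pair.
weights = Counter(tuple(sorted(pair)) for pair in interactions)

# Strong ties: pairs at or above an arbitrary, assumed threshold.
STRONG_TIE_THRESHOLD = 2
strong_ties = sorted(
    pair for pair, weight in weights.items() if weight >= STRONG_TIE_THRESHOLD
)


def connected_components(edges):
    """Group nodes that can reach one another (a crude stand-in for clustering)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


communities = connected_components(weights)

# Serialize the weighted edge list to JSON and read it back.
payload = json.dumps(
    [{"source": a, "target": b, "weight": w} for (a, b), w in sorted(weights.items())]
)
restored = Counter(
    {(edge["source"], edge["target"]): edge["weight"] for edge in json.loads(payload)}
)

print(strong_ties)
print(communities)
print(restored == weights)
```

Storing the graph as a weighted edge list keeps the serialized form small and easy to inspect, at the cost of rebuilding the adjacency structure whenever the graph is loaded.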

Course Requirements

Attendees should be familiar with Python and with the command line before participating in this course. They should also have the required software installed and operational on their computers.