Sabbatical 2018 Week 10: What Happened to Weeks 7-9?
Boy, time sure flies when you’re busy. It’s already week 10. I had to go back to August to count the weeks because I barely know what day it is, let alone how many weeks have passed in the semester. This of course is all good. While on sabbatical, I’ve also been renovating a vacation home we purchased up in Happy Jack, so my life has been consumed with data and renovations for the last three months. Thankfully one of those projects is almost complete. And that would not be the data project. On we roll.
Big Data is still my world at the moment. I’m currently in course 5 of the Big Data Specialization on Coursera. Course 5 is Graph Analytics for Big Data. I’m learning how real-world data science problems can be modeled as graphs, along with various tools and techniques for working with them. The biggest thing I’ve learned so far is that most people don’t know what graphs are. Most people think graphs are these pretty pie charts.
These are not graphs apparently. These are pie charts. I knew that. I love pie charts. We are not learning how to make pie charts in the Graph Analytics for Big Data course. We’re learning how to make this below. This is a graph with nodes and edges.
I should have known this was not going to be simple. Graph theory is a branch of math, and graphs are “mathematical structures used to model pairwise relations between objects.” “Graphs can be used to model many types of relations and processes in physical, biological, social and information systems” (Wikipedia).
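To make "nodes and edges" a bit more concrete, here is a minimal sketch in Python. The names and connections are made up for illustration and are not from the course. It stores an undirected graph as an adjacency list, where each node maps to the set of nodes it shares an edge with.

```python
# Hypothetical example data: each pair is an edge between two nodes.
edges = [("Ann", "Bob"), ("Bob", "Cara"), ("Ann", "Cara"), ("Cara", "Dev")]


def build_adjacency(edge_list):
    """Build an undirected adjacency list: node -> set of neighboring nodes."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


graph = build_adjacency(edges)

# The degree of a node is the number of edges that touch it.
degrees = {node: len(neighbors) for node, neighbors in graph.items()}
```

Here `degrees` counts how many connections each node has, which is about the simplest question you can ask of a graph.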
A good example of how graphs can be used is with fraud detection. Graph databases are uniquely positioned to spot the connections between large data sets and identify patterns, a useful trait when it comes to spotting complex, modern fraud techniques. A better example is the product recommendations you get on Amazon and other online retail sites. Amazon can pull together product, customer, inventory, supplier and social sentiment data into a graph database to spot patterns and make smarter recommendations to you.
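As a rough sketch of the recommendation idea, and definitely not how Amazon actually does it, the Python below treats customers and products as two kinds of nodes, with an edge for each purchase. The customer IDs, products, and the `recommend` function are all hypothetical. It suggests products bought by other customers who share at least one purchase with you.

```python
from collections import Counter

# Hypothetical purchase data: customer node -> set of product nodes.
purchases = {
    "cust_a": {"tent", "lantern"},
    "cust_b": {"tent", "stove"},
    "cust_c": {"tent", "stove", "cooler"},
    "cust_d": {"novel"},
}


def recommend(customer, purchases_by_customer):
    """Rank products bought by customers who share a purchase with `customer`."""
    own = purchases_by_customer[customer]
    counts = Counter()
    for other, items in purchases_by_customer.items():
        # Skip the customer themselves and anyone with no product in common.
        if other == customer or not (own & items):
            continue
        # Count only products the customer does not already own.
        counts.update(items - own)
    return [item for item, _ in counts.most_common()]


suggestions = recommend("cust_a", purchases)
```

A real graph database would answer the same kind of question with a query that walks from a customer to their products, out to other customers, and back to new products.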
I’m still wrapping my head around how graphs can be useful in education. For an assignment I designed a graph around a peer review assignment for students. It’s pretty basic, but in my mind this might be useful data to find patterns to help students improve their work.
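Here is one way the peer review idea could be sketched in Python. This is not my actual assignment graph: the students, scores, and the `summarize` function are invented for illustration. Each review is a directed edge from the reviewer to the author, with the score stored on the edge.

```python
from collections import defaultdict

# Hypothetical peer review edges: (reviewer, author, score out of 5).
reviews = [
    ("Ana", "Ben", 4),
    ("Ben", "Cy", 3),
    ("Cy", "Ana", 5),
    ("Ana", "Cy", 2),
]


def summarize(review_edges):
    """Return reviews given per reviewer and average score received per author."""
    given = defaultdict(int)
    received = defaultdict(list)
    for reviewer, author, score in review_edges:
        given[reviewer] += 1            # outgoing edge from the reviewer
        received[author].append(score)  # incoming edge to the author
    average_received = {a: sum(s) / len(s) for a, s in received.items()}
    return dict(given), average_received


reviews_given, average_received = summarize(reviews)
```

Patterns like "who reviews a lot but gets low scores" or "who never gets reviewed" fall out of counting the edges going in and out of each student.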
Later in this course we will be learning how to use Neo4j, a graph database management system, and GraphX, Apache Spark’s API for graphs and graph-parallel computation. So I imagine my graphs in another week will be much better.
Next post I’ll share some information about Canvas Data Portal, as I now have access to Maricopa’s instance. It’s so exciting even though I don’t really know how to “look” at the data yet, but I can see all the flat files. I just need a database to magically appear with a data scientist attached to help. 🙂