by Allie Terrell
Relational databases have been the paradigm of choice for storing information for most commercial systems. They can be thought of as storing data in a spreadsheet. Until recently, the majority of data we recorded fit neatly into this model. Now, as we start looking to store more complicated structures like chemical compounds and social networks, relational databases don't always suffice. Such data needs a more flexible structure for holding it, and graphs allow for an ease of extensibility which accommodates that.
But all of that flexibility comes at a price - finding the solution set to queries to the database is an NP-complete problem, meaning that there is no known algorithm for it than can be run in a reasonable amount of time. This rises from having to match the query graph to have graph in the database to find matches. In order to alleviate this time problem, indexing is used to eliminate a large portion of graphs in the database using other heuristics before having to check for answers.
My thesis focuses on different indexing schemes for graph databases, identifying which algorithms work best for certain types of databases. One of the most interesting aspects of it is seeing all of the different applications of it to real world data - graph databases have the potential to let us think about storing data in a different way than is currently available, allowing for us to store things we previously couldn't.