[00:00:07]
>> Thank you, Jim NG. So I actually wear a second hat: I'm still affiliated with the College of Computing as adjunct research faculty, and that's why I can get away today with slides that still have the Georgia Tech logo on them. And so, being that this is a class, and I know it's also a joint seminar with child, we can have a bit more fun, and if you have questions as we go along, please raise your hand.
[00:00:36]
So what I want to talk about today is scalable graph analytics on GPU accelerators, and I think the emphasis here is on the word analytics. I think it's important for any person who's dealing with, whether it's graph analytics or health informatics in general: the tools are very much repetitive. Now, in terms of why the word GPU is in there, well, it's because GPUs now offer some of the best performance for a wide range of analytics, whether it's machine learning or deep learning or just regular analytics. But one thing that we sort of have...
[00:01:15]
has not been entirely adopted just yet: the field of graph analytics on GPUs. That may be my last bullet point of what I'll be talking about today: I really want to show that GPUs are good for sparse and dynamic graph operations. Now, the thing that I'll probably spend a bit of my time today talking about is two different data structures, or frameworks, that we put together here at Georgia Tech. One of them is called Hornet and the other is Hornets Nest. Hornet is a sparse and dynamic graph data structure for, well, not for G.B., it's just a dynamic data structure, and Hornets Nest is a framework that we put together to implement analytics in general on the GPU. By the way, I see a few of my students from the massive graph analytics class, so they've probably seen some of these slides before, but they're going to be different from what I've talked about in class.
[00:02:12]
The thing I want to say here is that people who develop analytics are very different from people who do HPC in general, and there's a good reason for that. People who focus on data science want to focus on what they feel comfortable with; they want to focus on the application. The users usually have a dataset, and the question is not about how do I get this to go faster, but how do I analyze that dataset and make these things practical. So about three or four years ago we started putting together some frameworks at Georgia Tech. They had different names at the time, but they're now in the form of Hornet and Hornets Nest. The question was how do we make graph analytics, and analytics in general, accessible to users, and that's where these two frameworks came together. Our focus was really on these evolving graphs, what we're calling dynamic graphs.
[00:03:06]
I'll talk a bit more about that in the upcoming slides, but I think it's really important for people who are doing health informatics because, and I'll give you a few examples in a bit, the world is not static. Everything is constantly changing, and so we really need to be able to deal with dynamic data. By the way, most dynamic data problems are still very much open and available; those are the highest-end projects that people are dealing with right now, so any solutions that you can come up with, people will be very interested in hearing. Now, maybe I should have highlighted this bullet point over here in red, but what really is my takeaway from today's talk, if I want you to leave with something today, is that not all static graph problems are really static. What I'm going to show you is that, by using tools that we've developed over the last probably two decades, those of us who have been doing graph analytics have been seeing that we develop solutions for dynamic graphs. Now really this goes to my next slide: what is a dynamic graph? For many years I used this slide and I said that a dynamic graph is a graph that could change over time, but I've stopped believing that. I no longer believe that it has to do with any sort of temporal information; rather, a dynamic graph is simply a graph that can change, period. It doesn't have to involve temporal information. It's about the fact that the graph can have a change in the topology, the number of vertices, the number of edges; the graph is constantly changing. So I don't want you to think about the word time here anymore, because that's what we've become accustomed to thinking: that it has to be something associated with time.
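To make the definition above concrete, here is a minimal Python sketch of a graph whose topology can change with no notion of time attached. The class name `DynamicGraph` and its methods are hypothetical, chosen only for illustration; this is not Hornet's API, and a dictionary of sets stands in for the efficient storage the talk has in mind.

```python
class DynamicGraph:
    """Undirected graph whose vertex and edge sets can change (hypothetical sketch)."""

    def __init__(self):
        self.adj = {}  # vertex -> set of neighbouring vertices

    def add_vertex(self, v):
        self.adj.setdefault(v, set())

    def remove_vertex(self, v):
        # Drop v and every edge incident to it.
        for u in self.adj.pop(v, set()):
            self.adj[u].discard(v)

    def add_edge(self, u, v):
        if u == v:
            raise ValueError("self-loops are not supported in this sketch")
        self.add_vertex(u)
        self.add_vertex(v)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        self.adj.get(u, set()).discard(v)
        self.adj.get(v, set()).discard(u)

    def num_vertices(self):
        return len(self.adj)

    def num_edges(self):
        # Each undirected edge is stored once per endpoint.
        return sum(len(nbrs) for nbrs in self.adj.values()) // 2
```

Note that none of the update methods takes a timestamp: the graph changes because vertices and edges are inserted or deleted, not because a clock advances.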
[00:04:55]
And one of the reasons that we've done that is we've sort of branched off from the theory world, where they use the term streaming graphs. That really gave us this intuition that our graph was changing as a function of time, that it had something to do with the word time, and really it referred to graphs that change at very high rates. So I really want to focus on dynamic graphs, and then I'm going to tell you that your static graph is still static, but your static graph problem can still very much be dynamic, and that's really the focus of my talk today.
[00:05:32]
So maybe just to give a few examples of where we have dynamic graphs: a financial network, where I could have transactions between players, and I could have different types of players in my network. A communication network, that's a very high-end, very fast-changing network, where I have
[00:05:55]
You know, my routers are constantly moving messages at rates of hundreds of thousands of packets per second, and so if I want in any way to find a way to represent that data through a graph, I have to have a data structure that can change very quickly. And this slide, I mean, I've had this example in here for a long time, but it's super relevant for this class: it's health care networks. We have different types of players in the graph: we have doctors, we have nurses, we have patients, we have orderlies. We want to be able to do pattern matching and epidemic monitoring, and the size of these networks is constantly changing, so we have a lot of great use cases here.
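As a rough illustration of the health care example, the Python sketch below keeps a typed graph, applies batches of edge updates, and re-runs a simple pattern query after each batch. Every name in it (`vertex_type`, `apply_batch`, `patients_sharing_staff`, and the vertex labels) is hypothetical and not taken from the talk; it only shows the shape of the problem, not how Hornet or Hornets Nest handle it.

```python
from collections import defaultdict

# Hypothetical vertices and their types, for illustration only.
vertex_type = {
    "dr_lee": "doctor",
    "nurse_kim": "nurse",
    "pat_1": "patient",
    "pat_2": "patient",
    "pat_3": "patient",
}

adj = defaultdict(set)  # vertex -> set of neighbouring vertices


def apply_batch(insertions=(), deletions=()):
    """Apply a batch of undirected edge insertions, then deletions."""
    for u, v in insertions:
        adj[u].add(v)
        adj[v].add(u)
    for u, v in deletions:
        adj[u].discard(v)
        adj[v].discard(u)


def patients_sharing_staff(patient):
    """Match the pattern patient - (doctor or nurse) - other patient."""
    matches = set()
    for staff in adj[patient]:
        if vertex_type[staff] in ("doctor", "nurse"):
            for other in adj[staff]:
                if other != patient and vertex_type[other] == "patient":
                    matches.add(other)
    return matches
```

The answer to the pattern query depends on the current topology, so it has to be recomputed (or incrementally maintained) each time a batch of updates arrives.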
[00:06:35]
We know that the problems are hard, but how do we deal with them? Now, if anyone says, well, let's deal with them using Python, we know that's not going to scale. If you say I'm going to write my solution in Python, that gives me a lot of great tools, but if the underlying data structures are going to be Python, you get nowhere really, really fast. So you have to have efficient data structures
[00:06:58]
to use them. And the way that the world is going nowadays is we can create those data structures in C++ and we can wrap them up in Python, and that's really sort of the objective that we put together many years ago. We never found ourselves actually wrapping up the code because