So today’s podcast episode may well be one of the most interesting that I have ever published – and we have had some darn interesting episodes, if you ask me 🙂 … I got to know our guest, Daniel Himmelstein, through his great graphgist on “Drug repurposing by hetnet relationship prediction”. Really interesting stuff – and Daniel actually got his PhD on this topic too. I found this video of his Thesis Seminar if you want more detail:
But for now we will just have a nice conversation about his work. More interesting links below in the transcription – as usual.
Here’s the transcript of our talk:
RVB: 00:02.882 Hello, everyone. My name is Rik Van Bruggen from Neo Technology and today I am recording another episode for our Graphistania podcast. And I’m being joined by Daniel Himmelstein from University of Pennsylvania. You’re a postdoc fellow there, right, Daniel?
DH: 00:19.828 That is true. I just got my PhD in San Francisco and then moved east to Philadelphia.
RVB: 00:26.980 Fantastic, why don’t you introduce yourself a little bit, Daniel, and your work and your connection to the wonderful world of graphs.
DH: 00:34.620 Okay, I guess I could introduce myself with my Twitter tagline which is, “Digital craftsman of the biodata revolution.”
RVB: 00:44.286 Wow, that sounds great [chuckles].
DH: 00:48.558 Wow [laughter]. What I really do is– I’m a scientist working on integrating a lot of medical data and making predictions about biology and disease. It’s an exciting time, because there’s so much data that’s becoming available, and we need ways to organize and store that data and learn from it. And that’s where Neo4j has filled the gap for us.
RVB: 01:15.906 How did you get into the world of Neo4j? How did you get to know us?
DH: 01:20.650 I work with what I call hetnets, and a hetnet I define as a network with multiple node or relationship types. And when I started doing this research about four or five years ago, I looked at Neo4j a little bit, but it didn’t really suit my needs then. I don’t think Cypher was mature at that time, which is a query language. So I wrote a small package in Python to work with graphs with multiple types of relationships, because a lot of the built-in Python packages, or more mature packages, didn’t really do a good job representing types on a network. So that’s how I got interested in it. Then several years down the road, I reevaluated Neo4j and I said this will solve a lot of the problems we were having.
It’ll take a huge development burden off of our shoulders, and we’re going to be part of this great ecosystem.
RVB: 02:29.816 You met some of our people at a meet-up in San Francisco – Nicole White and those types of people, right?
DH: 02:39.084 That’s right. It was a fun meet-up. And it just really clicked with me, because Nicole was going over the basic concepts, like how each relationship has a type, each node has one or more labels, edges are directed. I was like, “Wow, this is what we need. This is a database for hetnets.” Even though I don’t think anyone– I asked Nicole, “Do you know the term hetnet?” and she didn’t. I think in Neo4j speak, you call it a property graph.
RVB: 03:09.286 Yes. Well, hetnet – I’m from Belgium, and my mother tongue is Dutch, and hetnet means “the net.” [laughter] So “het” is the– how do you say it here? Is the equivalent of “the.” [chuckles] So that’s a bit funny in my language, but– [laughter]
DH: 03:30.122 I like it.
RVB: 03:31.121 Yeah, exactly – “the net.” So, can you tell us a little bit more about it? Why is it such a good fit for hetnets? You describe it in your GraphGists, and you made a public instance of Neo4j available which I’ll obviously link to from the podcast, but why is it such a good fit, Daniel?
DH: 03:51.768 Yeah, so, I guess, to answer that I’ll tell you a little bit more about what we’re doing. We’re trying to encode as much of the knowledge produced by biomedical research in the past 50 years as possible. So we take information from millions of medical and scientific studies and we condense it into a network. And traditionally, people have done this, but they’ve done it with a single type of node and single type of relationship.
So, for example, people would make networks with genes and they would connect the genes if they interacted inside of a cell. But obviously, biology’s very complex. And given that complexity, it helps to model it with the actual diversity of types that are involved in health and disease.
RVB: 04:42.522 Can you give an example of the different types of interactions?
DH: 04:46.171 Yeah. So what we’ve created is something we call Hetionet. Version 1.0 has 11 different types of nodes and 24 types of relationships. So what these would be, would be like a compound or a drug, so that’s something like Aspirin. Then we have diseases, so a disease would be Multiple Sclerosis, diabetes, et cetera. We have the symptoms of diseases. We have the side effects of diseases. And those are all node types. But then we’ll have relationships. So, for example, the compound is known to cause different side effects. And that’s information that’s actually extracted from the drug labels – the little package you get on the inside of your medication when you pick it up from the pharmacy. And then, of course, we have genes. So in the past decade, there’s been a lot of research on how different compounds affect genes in your body. Does giving someone a drug or compound make more or less of a given gene? So we have that type of relationship. We also have a relationship for which genes does a compound target in the body. So, how are the compounds designed to act?
RVB: 06:11.524 So, you put all this information in a graph– in a property graph, in a hetnet? And what are the types of questions that you want to ask of that? Is it about drug interaction, or is it about new treatment paradigms, or what’s the end goal there?
DH: 06:28.729 Yes.
The question that we’ve been asking most recently is, “Can we systematically learn why drugs work?” So, traditionally drug development is often very serendipitous. So, people observe that a drug has a certain effect. Oftentimes, for a lot of the blockbuster pharmaceutical therapies, it is not entirely known why they work, just that they were observed to have a positive effect on a disease. Traditional pharmacology, when actually looking at why compounds work, or why drugs work, is done on a single drug disease level. So, they look at a single therapy and try to understand why it works. But we’re looking for patterns across all drugs that work. So, from a machine-learning perspective, what makes compound disease pairs that actually are efficacious? What makes them different from non-efficacious compound disease pairs?
RVB: 07:34.190 Wow, that sounds like there could be a lot of potential there. A lot of new drugs that could be re-purposed or new applications. Is that what you’re looking for?
DH: 07:46.543 Totally. So, the end result of our algorithm is we make about 200,000 predictions, and each one of those predictions is for a compound disease pair and we give a probability we think that that compound disease pair represents a treatment. So, if you’re interested, you can go to our website and you can browse by a compound or disease and see all of the predictions. Actually, what’s cool is that when you want a specific prediction you’re interested in, you can click on it and it takes you to a guide in our public Neo4j browser. So you can see what parts of the network contribute to that prediction. The specific network paths that we think provide evidence or support that a drug treats a disease.
RVB: 08:37.137 I’ve seen that. I thought that was so well done. Congratulations on that.
Really, really cool, actually. So this sounds like a mountain of gold. Is this all in the public domain, or is this just academic research, or does it have business applications as well?
DH: 08:55.247 So, we’re part of an open science movement where we release all the code for what we do under open source licenses, we release all the data as openly as possible. So everything, if possible, is put into the public domain, and we’re really looking to get people to use the research we create. It’s fine if they profit off of it, that would be great. We just want to create something that people find useful. I guess, because I’m a publicly funded scientist, I get to do [chuckles] what I want and make it available for free.
RVB: 09:29.303 I think that is just so admirable and we really, really applaud that for you. We were talking about it earlier, right? So this podcast is going to be published on the Creative Commons license as well because that’s how you want to publish your work. I really applaud that; that’s fantastic. Really, we appreciate it.
DH: 09:47.535 Thanks, yeah. I guess [chuckles] it may just be a selfish thing that I like when my work is reused [laughter].
RVB: 09:54.100 No, I think it’s a– especially in the type of data that you’re dealing with and this type of research that you’re doing. I mean, this could save lives, right? I think it’s important that people do stuff like that and congratulate you on that. Really, we do.
DH: 10:12.112 Thanks, yeah. Well, I’ve also experienced from both sides, because we had to take data from about 30 different resources to integrate it into Hetionet. And a lot of them would have licenses, even though they were publicly funded academic research projects, that made it really hard to integrate the data.
So that taught me the hard way the importance of having permissive open licenses.
RVB: 10:38.555 So let’s talk about the future, Daniel. Where is this going, what are your plans with graphs and with Neo4j? Where do you want to take this?
DH: 10:48.302 Yeah, so right now, Hetionet has about 2.5 million edges or relationships. And I’d like to not only grow that number but start to get more meaningful edges. So I think we can grow the network quite a bit, and we can look at new applications. So we were predicting whether a compound treats a disease, but we could also predict, say, new side effects of compounds, or we could start to get a more nuanced algorithm. So part of my work is developing algorithms on these hetnets, so that’s also of interest. As far as Neo4j goes, I’ve been really excited about the guide technology. So you briefly mentioned that, but we have this public Neo4j instance which lets anyone just go to the URL – which is neo4j.het, which is H-E-T, dot I-O, and then immediately see a Neo4j browser with our network in it, and we have guides which are like a little kind of web page, or HTML tutorial that just shows up naturally in the browser and can inform you about the network. So, I think that really will help biologists and pharmacologists interact with their network to have these guides.
RVB: 12:23.428 Well, I’ll put some links to this, with the podcast transcript. So hopefully you’ll get more people visiting it. And I really thought it was very impressive what you did there and much more impressive than– I did a beer guide [chuckles]–
DH: 12:40.783 I think I’ve seen that.
RVB: 12:44.274 Which is a lot less interesting, but that’s the only thing I know anything about. So–
DH: 12:52.192 I did listen to one of the previous podcasts. I think it was a network of movies, it was like date night. Two people would put in the movies they liked and they would get an intermediate movie. That was cool.
RVB: 13:06.729 Yeah. That actually got a Webby award recently and the rest is from– I forgot the name. Ben Nussbaum was the guy that I interviewed about it, there on the west coast. Well, Daniel, thank you so much for coming online and doing this talk with me and I wish you best of luck with all your research, and hopefully it will lead to lots of new treatments and new interesting research. Thank you so much and hope to meet you some day at one of the GraphConnect conferences, maybe.
DH: 13:43.515 Yeah, totally. I’m excited. I think the whole community is developing so quickly. We use Docker to deploy our cloud instance and the Neo4j support there is good. Just a really fast moving project. So, exciting.
RVB: 14:02.672 Great. Thank you so much. I want to keep this digestible and short, so I’m going to wrap up here and I’ll talk to you soon.
DH: 14:10.268 Okay. Toodle-oo.
RVB: 14:11.482 Toodle-oo [chuckles], exactly. Bye.
Subscribing to the podcast is easy: just add the rss feed or add us in iTunes! Hope you’ll enjoy it!
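If you want to get a feel for Daniel’s definition of a hetnet (a network with multiple node or relationship types, where every edge is directed), here is a minimal Python sketch of the idea. It is only an illustration and not Daniel’s own package: the `Hetnet` class, its method names, and the toy node and relationship names are all assumptions made up for this example, and they carry no biological meaning.

```python
from collections import defaultdict


class Hetnet:
    """Toy hetnet: every node and every directed edge carries a type.

    Hypothetical illustration only; this is not the API of any real package.
    """

    def __init__(self):
        self.node_kind = {}                 # node id -> node type
        self.out_edges = defaultdict(list)  # node id -> [(edge type, target id)]

    def add_node(self, node_id, kind):
        self.node_kind[node_id] = kind

    def add_edge(self, source, kind, target):
        # Both endpoints must exist first, so every edge connects typed nodes.
        if source not in self.node_kind or target not in self.node_kind:
            raise KeyError("add both nodes before connecting them")
        self.out_edges[source].append((kind, target))

    def neighbors(self, source, kind):
        """Targets reachable from `source` over edges of a single type."""
        return [target for edge_kind, target in self.out_edges[source]
                if edge_kind == kind]


# Made-up placeholder data, not taken from Hetionet.
net = Hetnet()
net.add_node("compound_A", "Compound")
net.add_node("gene_X", "Gene")
net.add_node("side_effect_Y", "SideEffect")
net.add_edge("compound_A", "TARGETS", "gene_X")
net.add_edge("compound_A", "CAUSES", "side_effect_Y")

print(net.neighbors("compound_A", "TARGETS"))
```

In Neo4j terms, the node type in this sketch plays the role of a label and the edge type plays the role of a relationship type, which is the match Daniel describes in the conversation.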
All the best