How connecting data may lead to discoveries in medical research: Dr Alexander Jarasch and Professor Martin Hrabe de Angelis explain that novel research methods produce tremendous amounts of data that cannot be analysed with classic analysis tools. Scientists need to look for new approaches, such as graph technology.
Similar ambitions exist in many other countries; this is also a major question for all researchers worldwide at the moment, including in Germany. If we are serious about dealing with the challenges that chronic diseases represent for patients, society and healthcare systems as a whole, we need to study these diseases in much more depth in order to provide novel methods for the prevention and treatment of diabetes.
We believe these new technologies will be crucial in gaining new insights into the workings and causes of these chronic conditions and diseases. The problem faced by everyone trying to do this is that the analysis methods we have been relying on may have reached their limits due to the vast amount of data produced by novel research methods (e.g. omics). The really promising avenue is to work at big-data scale, combining and better connecting data in order to go further.
Integrate and link together more and more data points
That's complicated by the fact that nowadays, research, especially in life sciences, is not limited to one technology or even one discipline. The German Centre for Diabetes Research, where we work, is a multi-centre organisation that combines all the different data that originates from different studies, reports, surveys and research projects at different locations in the country. We have masses of data from clinical trials and patient information, and our data covers various disciplines, from studies at the molecular level to pathway analyses and animal models.
To answer the interesting and suggestive biomedical questions about diabetes, we have to connect this data and look for new insights, patterns and correlations. That's because we realise it is no longer enough to answer a biological or medical question from one direction; we need to integrate and link more and more data.
This is the next step, not just in biomedicine, but also in the healthcare sector, which is increasingly turning away from general blockbuster drugs and moving to individualised treatment or precision medicine. For this to progress, it is necessary to network significantly more and, above all, look at as many aspects of the problem as we can. This is why the DZD and other researchers think graph databases--the technology that powered the Paradise Papers investigation--could help in the prevention, discovery of new subtypes, early diagnosis and treatment of major illnesses. It's important to know that we aren't just using Excel or standard business relational (table) databases any more--we add a whole new layer with graph databases. The standard technology we use in each of our research locations in Germany is a relational database, as well as spreadsheets and document files. But once we realised more and more of that data is connected, we started looking for a solution to bring our data closer in relation to each other, and create an overall context for our research.
Relational databases have their merit. However, we needed something to bring these data silos together and uncover connections--to be able to jump from one data point to another is crucial for us. That's why we turned to graph technology.
To see why, consider that diabetes is a metabolic disease, but it's not sufficient for researchers to look only at metabolic data. They also have to take into account data from other disciplines, such as genomics or proteomics. In the human body, everything is connected in metabolic pathways: a gene encodes a protein that is active in a metabolic pathway and metabolises a metabolite, which in turn is able to regulate another gene. In a way, our metabolism is a network of thousands of components that are connected with each other, which maps naturally onto a graph data model.
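The chain described above (gene, protein, pathway, metabolite, next gene) can be sketched as a small labelled graph. This is only an illustration in plain Python, not the DZD's actual data model or Neo4j code: the identifiers (`gene:G1`, `protein:P1` and so on), the relation names and the helper functions `build_graph` and `find_path` are all hypothetical placeholders.

```python
from collections import defaultdict, deque

# Hypothetical placeholder entities and relations (assumptions for illustration,
# not taken from any real dataset).
EDGES = [
    ("gene:G1", "ENCODES", "protein:P1"),
    ("protein:P1", "ACTIVE_IN", "pathway:PW1"),
    ("protein:P1", "METABOLISES", "metabolite:M1"),
    ("metabolite:M1", "REGULATES", "gene:G2"),
    ("gene:G2", "ENCODES", "protein:P2"),
]


def build_graph(edges):
    """Turn (source, relation, target) triples into an adjacency list."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search along directed edges.

    Returns a list alternating node, relation, node, ... or None if the
    goal cannot be reached from the start node.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for relation, target in graph.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append(path + [relation, target])
    return None


if __name__ == "__main__":
    graph = build_graph(EDGES)
    # How does the first gene influence the second one?
    print(find_path(graph, "gene:G1", "gene:G2"))
```

Following the edges from one node to the next is the "jump from one data point to another" mentioned earlier; in a relational layout the same question would typically need one join per hop.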
Link diabetes research with Alzheimer's
That's why it's so important to be able to uncover these connections and to create a new layer of analysis on top of this data, so we use technology from the graph database world called Neo4j. The great thing about Neo4j is that it has a visual interface we can use for queries and experimentation. We are using it to deepen our 'map' of diabetes--to uncover hidden relationships and pursue the resulting new questions.
For example, we do a lot of important research on animal models to study processes and then compare them to humans, so there is a lot of animal data from mice and pigs. This can generate a hypothesis we want to pursue--for example, 'In the pig model, is the prediabetes type X due to causes A and B?' Is this regulated similarly? Are there similar processes?
We think we can link the molecular human data from the basic research with the highly standardised animal model data. In a graph representation, abnormalities, patterns or connections can then be recognised, which will then lead to further research questions. In the long term, it would also be interesting if data from diabetes research could be used for other areas, such as cancer or Alzheimer's research, in order to uncover possible connections.
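As a rough sketch of what linking human and animal model data could look like, the snippet below treats each dataset as a list of edges and looks for entities that both datasets point to. Everything here is an assumption made for illustration: the node names, the `MEASURED` relation and the `shared_targets` helper are hypothetical and do not describe DZD data or tooling.

```python
# Hypothetical edge lists; identifiers are placeholders, not real study data.
HUMAN_EDGES = [
    ("human_cohort:H1", "MEASURED", "metabolite:M1"),
    ("human_cohort:H1", "MEASURED", "gene:G1"),
]
PIG_EDGES = [
    ("pig_model:X", "MEASURED", "metabolite:M1"),
    ("pig_model:X", "MEASURED", "gene:G3"),
]


def shared_targets(edges_a, edges_b):
    """Return the sorted entities that appear as edge targets in both datasets."""
    targets_a = {target for _, _, target in edges_a}
    targets_b = {target for _, _, target in edges_b}
    return sorted(targets_a & targets_b)


if __name__ == "__main__":
    # Entities observed in both the human and the pig data act as bridges
    # between the two datasets and are candidates for follow-up questions.
    print(shared_targets(HUMAN_EDGES, PIG_EDGES))
```

In a graph database the shared entity would be a single node with edges from both studies, so the overlap is found by traversal rather than by comparing separate tables.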
This isn't the only advanced technology we see as being useful. For example, we will definitely use machine learning techniques with graph software to identify unknown patterns, such as new subtypes of diabetes we find discussed in the literature. Another example is Natural Language Processing--we'd like to build a system that automatically reads scientific texts from literature databases, analyses them and, together with our research data, generates hypotheses that can be evaluated by DZD scientists. Also conceivable: predictive models that can predict the course of the disease to a certain degree of probability.
This is all coming, and we are certain that our data management and analysis approach will take us to the next level in precision medicine, prevention and treatment of diabetes. In general, technology and data absolutely have a central role in meeting the Grand Challenges that the UK wants to take on.
Dr Alexander Jarasch is head of data and knowledge management at the Munich head office of the German Centre for Diabetes Research, the DZD (Das Deutsche Zentrum für Diabetesforschung)
Professor Dr Martin Hrabe de Angelis is speaker and member of the board of the DZD, director of the Institute of Experimental Genetics, Helmholtz Zentrum Munich; and Chair of Experimental Genetics, School of Life Science Weihenstephan, Technical University of Munich, Germany
|Title Annotation:||LABORATORY INFORMATICS|
|Author:||Jarasch, Alexander; de Angelis, Martin Hrabe|
|Publication:||Scientific Computing World|
|Date:||Aug 1, 2018|