NFDI DeBioData

DeBioData (DBD) is a consortium assembled from German life science institutions and associated research infrastructures working in the field of pre-clinical research and human disease biology. The consortium's interests extend from basic scientific studies on molecular targets and their linkage to disease through to the analysis of late-stage in vivo proof-of-concept studies and their translation towards clinical investigations.

Consortium members

Official spokesperson of the consortium:

Dr. Philip Gribbon
Fraunhofer ITMP Drug Discovery

Contact within EU-OPENSCREEN ERIC:



  1. Mathias Wilhelm, Stephanie Heinzlmeir and Bernhard Kuster, Technical University of Munich
  2. Ursula Bilitewski and Mark Brönstrup, Helmholtz Centre for Infection Research, Braunschweig
  3. Katja Herzog and Wolfgang Fecke, EU-OPENSCREEN ERIC, Berlin
  4. Matthias Rarey, Zentrum für Bioinformatik Hamburg, Universität Hamburg

Key objectives

  • stimulate and support basic scientific studies into human disease biology (including infectious diseases), and thereby help to improve the currently poor translation between the lab bench and the bedside
  • enable a robust, efficient and qualified network of data infrastructures that extends knowledge about disease-relevant biological mechanisms by facilitating the sharing of relevant pre-existing qualified data (with DOI, metadata, validated workflows) which have yet to be elevated to FAIR standards (Findable, Accessible, Interoperable and Reusable)
  • whilst acknowledging the complexity of data types and standards, develop DBD into a qualified infrastructure which links together a network of pre-existing indication, chemical, biological, ‘omics and target-centric databases as well as any novel upcoming relevant resources, which collectively will form a unified resource for University, SME and large pharma industry based researchers
  • work together with tool providers, data originators and data users to define realisable standards for FAIR data in the disease target through to pre-clinical research domains by developing requirements and guidelines on the FAIRification processes
  • deliver robust FAIR data methods, aligned to emerging European standards which individual German Institutions can adopt to integrate their own data resources into the DBD networked databases, tools and workflows
  • provide data originators/owners with secure and compliant means of assuring data integrity, based on novel blockchain-like approaches
  • allow scientific users to have a “one-stop shop” web-based service to search, find, collect and aggregate FAIR data important for realising their disease and pre-clinical projects
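
The objective on blockchain-like data integrity can be illustrated with a hash chain, in which every entry stores the hash of its predecessor so that a later change to any record becomes detectable. This is a minimal sketch of the general idea only, not the consortium's design: the function names, the record fields and the all-zero genesis value are assumptions made for illustration.

```python
import hashlib
import json

# Assumed placeholder standing in for the "previous hash" of the first entry.
GENESIS_HASH = "0" * 64


def entry_hash(record: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the preceding entry."""
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain: list, record: dict) -> None:
    """Append a record, linking it to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"record": record, "prev_hash": prev_hash, "hash": entry_hash(record, prev_hash)}
    )


def verify_chain(chain: list) -> bool:
    """Return True only if every link and every stored hash is consistent."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical records describing two deposited datasets.
chain = []
append_record(chain, {"dataset_id": "demo-1", "file_sha256": "placeholder-a"})
append_record(chain, {"dataset_id": "demo-2", "file_sha256": "placeholder-b"})
print(verify_chain(chain))
```

A real deployment would additionally need distributed storage or signatures so that the chain itself cannot simply be rebuilt by whoever alters a record; the sketch covers only tamper detection.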

Targeted data types

Quantitative Bioassay related data analyses including but not restricted to:

  • Compound Primary, Secondary and Selectivity screening
  • Pathogen profiling results (MIC and time-kill analyses, etc.)
  • Resistance profiling (e.g. from clinical isolates)
  • Toxicity and liability results (cytotoxicity, etc.)
  • In-vitro safety assays (cytochrome P450, hERG, cardiac ion channel panels)
  • Chemo-informatic analyses (cLogP, TPSA, Ro5 parameters, etc.)
  • Chemical and biological descriptors (structures, sequences etc)
  • In-vitro ADME studies
  • Physico-chemical assessments (solubility, logP, etc.)
  • Imaging data covering cellular phenotypes
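
As an illustration of the Ro5 parameters mentioned in the list above, Lipinski's rule of five can be evaluated from four precomputed descriptors. The sketch below assumes the descriptors are already available (for example from a cheminformatics toolkit); the class name, the field names and the numeric values are hypothetical and do not describe a real compound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (field names are assumptions)."""
    mol_weight: float        # molecular weight in Da
    clogp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(d: Descriptors) -> list:
    """Return the rule-of-five criteria that the compound violates."""
    checks = {
        "molecular weight > 500 Da": d.mol_weight > 500,
        "cLogP > 5": d.clogp > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def passes_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """Common reading of the rule: at most one violation is tolerated."""
    return len(ro5_violations(d)) <= max_violations


# Hypothetical descriptor values, for illustration only.
candidate = Descriptors(mol_weight=350.0, clogp=2.5, h_bond_donors=2, h_bond_acceptors=5)
print(ro5_violations(candidate), passes_ro5(candidate))
```

The tolerance of one violation is exposed as a parameter because stricter or looser readings of the rule are also in use.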

Computational modelling and simulation (selected):

  • Molecular structure results (docking, homology models, MD, conformational search, similarity, etc.)
  • Parameter files
  • Scripts and meta-data
  • Chemo-informatic descriptors (cLogP, TPSA, Ro5 parameters, fingerprints)
  • Binding modes and pharmacophores

Discovery (in-vitro) structural and 'omics-related data (selected):

  • Genetic data
  • Proteomic data
  • Transcriptomic data
  • Protein structural information
  • Structural and 'Omics data (other)

Planned measures and services

Our concept is to strategically align data providers, users, tool developers and relevant infrastructure platforms at major German research sites that share our common research goals. Moreover, the network-focussed approach means that all German research institutes in any region can be associated with our activities, which will serve to maximise the impact of DBD across the scientific community. Users will be able to address their scientific questions using a larger data repository than was previously available and, within the network, will have access to a defined collection of machine-learning and artificial intelligence-based tools and workflows. This will allow users to generate, test and validate general prediction models and/or processes in their specific data domain. The higher aggregation levels achievable in this way will pave the way to more precise models and enhance our capabilities to probe and understand the fundamental determinants of cell and tissue function and the deviations associated with (pre-)disease states.

Together with interested German stakeholders and in cooperation with other NFDI projects, the consortium will identify the technologies that can best be adapted and adopted by individual institutions to allow individual researchers to:

  • be encouraged to share (un)published data without losing provenance or acknowledgement
  • retain data on owner servers whilst the data are also made accessible through maintained and sustainable web services or a documented application programming interface (API) for data retrieval
  • have access to common ontologies, standards and quality controls for their data (e.g. detailed metadata and the Resource Description Framework (RDF))
  • use a central access process that enables them to search expanded datasets, integrate these with their own workflows and subsequently aggregate them with complementary resources
  • have access to the full spectrum of qualified target biology and pre-clinical research data from target genetics, medicinal chemistry, antibody design, bioinformatics workflows, “omics” databases, cellular phenotype data resources, to in-vivo efficacy, toxicological and ADME studies
  • stimulate drug discovery by improving knowledge about biological mechanisms which will allow for development of novel therapeutic options
  • predict toxic effects caused by interference with biological systems and/or ecosystems, and identify targets for therapy or for the avoidance of toxicities
  • drive translation potential by means of future integration and alignment with data infrastructures dealing with clinical data, a feature which will come increasingly into focus as GDPR-compliant data are routinely deployed
  • share and annotate data, going beyond uploading and subsequently forgetting about them, and eventually integrate their data/tools with any other data/tools in the resource.
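
The points above on provenance, acknowledgement and common metadata standards can be made concrete with a machine-readable dataset description. The sketch below builds a JSON-LD record (JSON-LD being one serialisation of RDF) using the schema.org `Dataset` vocabulary. It is an assumption about one possible representation, not a format chosen by the consortium; the helper name `build_dataset_metadata`, the DOI and all field values are placeholders.

```python
import json


def build_dataset_metadata(name, doi, creators, license_url, description):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": f"https://doi.org/{doi}",  # persistent identifier (findability)
        "name": name,
        "description": description,
        "identifier": doi,
        "license": license_url,            # reuse conditions
        # Creators are kept with the record so acknowledgement is preserved.
        "creator": [{"@type": "Person", "name": person} for person in creators],
    }


# All values below are placeholders for illustration, not real records.
record = build_dataset_metadata(
    name="Example primary screening dataset",
    doi="10.0000/placeholder",
    creators=["A. Researcher", "B. Researcher"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
    description="Hypothetical compound screening results.",
)
print(json.dumps(record, indent=2))
```

Because the record travels with the data, the originating researchers remain identifiable wherever the dataset is aggregated.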