Master Thesis: Enabling a data management system to support "good laboratory practice"

This page describes the intention behind the above-mentioned master thesis. It also provides background information and describes the methodology and approach of the thesis.

The thesis will be written in English; therefore, this page is in English as well.

Start of the thesis is: 2010-11-02

More information: Master Thesis Wiki page (https://wiki.sistec.dlr.de/DataFinderOpenSource/DataFinderThesis/EnablingGLP)

Section 1: Introduction and Relevance of Thesis:

With the "Principles of Good Laboratory Practice and Compliance Monitoring", the OECD provided research institutes with guidelines and a framework to ensure good and reliable research. In it, "Good Laboratory Practice" is defined as "a quality system concerned with the organizational process and the conditions under which non-clinical health and environmental safety studies are planned, performed, monitored, recorded, archived and reported" ([8] p.14). This definition can be extended to other research fields as well, since being able to prove the quality of research is highly relevant for credibility and reliability in the research community. Besides organizational processes and environmental guidelines, part of good laboratory practice is to write and maintain a laboratory notebook when conducting an experiment. (In the principles it is called the study plan and is described in part 9.)

As computer-aided experiments become more powerful, the data they produce become more elaborate and voluminous, and handling these amounts of data becomes increasingly complicated. To get a grip on this situation, the German Aerospace Center (DLR), Germany's largest research institute, developed an open source data management application. This data management system is intended to help researchers manage the data they produce. It supports heterogeneous storage backends, flexible extensions to its interfaces, and metadata [9]. The next step is to extend this data management client so that it can also be used as a continuous, integrated platform supporting good laboratory practice, in short, as an electronic laboratory notebook (see [6] on what is important in a laboratory notebook). The focus of this master thesis will be to describe the prerequisites for a laboratory notebook and to integrate some of its features into a data management application.

Section 2: Background and important prior research:

Part of the master thesis will be the analysis of existing, predominantly proprietary electronic laboratory notebooks. These notebooks serve as input for the requirements analysis, since costs and other disadvantages make them impractical for use in the research sector. Another foundation for the analysis of laboratory notebooks will be the comparison to medical patient documentation [1]. For documenting the origin of data, work based on provenance will be consulted and used [2,3,5]. This research field explores how to implement the provenance idea in an application as well as what needs to be considered when recording the history of data.

Section 3: Research Objectives

When converting a data management client into a laboratory notebook, several questions need to be answered. For example:
  • What are the main criteria a laboratory notebook must meet to conform to good laboratory practice?
  • How can these criteria be implemented in a data management application/electronic laboratory notebook?
  • What other issues might be interesting when managing research data electronically?
Considering these questions, three main points emerge. In a laboratory notebook it is necessary to:
  • show and prove the data origin
  • provide the ability to validate data
  • archive the data for proof and reproducibility

Section 4: Research Methodology and Approach:

To gain a closer understanding of a laboratory notebook, this master thesis will analyze several existing electronic laboratory notebooks, as well as common procedures for working with a laboratory notebook. This analysis is intended to clarify the requirements that need to be met, in addition to the points stated before. The analysis will be conducted through literature study, analysis of software and unstructured interviews with researchers. In addition, the master thesis will provide an extension to the data management application "DataFinder" [9] that covers the aforementioned points (origin, validity and permanency).
  • To show and prove the origin of data, provenance is used [5]; this will build on the findings and implementation details described in [2].
  • To validate data, a mechanism will be implemented to add digital signatures to the metadata of a data item.
  • To archive data for proof and reliability, the option to store data via the web service developed in the BeLab project will be implemented [10].
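The idea of attaching a signature to the metadata of a data item can be sketched as follows. This is only an illustration under stated assumptions, not the DataFinder implementation: the metadata is modeled as a plain dictionary, the property names `signature` and `content_sha256` are hypothetical, and a keyed hash (HMAC-SHA256 from the Python standard library) stands in for a real asymmetric digital signature, which would additionally require a key pair and a cryptographic library.

```python
import hashlib
import hmac
import json

# Hypothetical metadata property names, chosen for this sketch only.
SIGNATURE_KEY = "signature"
CONTENT_HASH_KEY = "content_sha256"


def _canonical_bytes(metadata):
    """Serialize the metadata deterministically, leaving out any signature."""
    unsigned = {k: v for k, v in metadata.items() if k != SIGNATURE_KEY}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_metadata(metadata, content, secret):
    """Return a copy of the metadata with a content hash and a signature added.

    HMAC-SHA256 is a symmetric stand-in here, not an asymmetric signature.
    """
    signed = dict(metadata)
    signed[CONTENT_HASH_KEY] = hashlib.sha256(content).hexdigest()
    signed[SIGNATURE_KEY] = hmac.new(
        secret, _canonical_bytes(signed), hashlib.sha256
    ).hexdigest()
    return signed


def verify_metadata(metadata, content, secret):
    """Check that neither the content nor the metadata changed after signing."""
    if SIGNATURE_KEY not in metadata:
        return False
    if metadata.get(CONTENT_HASH_KEY) != hashlib.sha256(content).hexdigest():
        return False
    expected = hmac.new(
        secret, _canonical_bytes(metadata), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, metadata[SIGNATURE_KEY])
```

Because the signature covers both the content hash and all other metadata properties, a later change to either the data item or its metadata makes `verify_metadata` return `False`.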

Section 5: Presentation of expected outcome and its importance

The master thesis should conclude that extending a data management application into an electronic laboratory notebook is possible. It should also present concepts and a prototype implementation that meet the main requirements for an electronic laboratory notebook, such as provenance, validity and permanency. If the data management application is able to fulfill these criteria, it could help many researchers simplify and improve their work.

Section 6: Conclusion and expected work:

The work to be done for this master thesis is to extract the requirements for a laboratory notebook and then to develop a generic workflow of an experiment process and express it in a provenance model [4]. This model and process need to be integrated into the data management application [3], together with supporting functions such as signing and securely archiving data.
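To make the provenance modeling step more concrete, the following minimal sketch records an experiment workflow as a graph in the style of the Open Provenance Model [4]. The edge names `used`, `wasGeneratedBy` and `wasControlledBy` come from the OPM vocabulary; everything else (the `ProvenanceGraph` class, the node names, the example workflow) is a hypothetical illustration and not the model the thesis will eventually define.

```python
from dataclasses import dataclass, field

# Edge types from the Open Provenance Model vocabulary.
USED = "used"
WAS_GENERATED_BY = "wasGeneratedBy"
WAS_CONTROLLED_BY = "wasControlledBy"


@dataclass
class ProvenanceGraph:
    """Tiny in-memory provenance graph (hypothetical helper for this sketch)."""

    nodes: dict = field(default_factory=dict)  # node id -> "artifact" | "process" | "agent"
    edges: list = field(default_factory=list)  # (effect, relation, cause) triples

    def add_node(self, node_id, kind):
        self.nodes[node_id] = kind

    def add_edge(self, effect, relation, cause):
        # OPM edges point from the effect to its cause.
        if effect not in self.nodes or cause not in self.nodes:
            raise KeyError("both ends of an edge must be known nodes")
        self.edges.append((effect, relation, cause))

    def lineage(self, node_id):
        """Return every node the given node depends on, directly or indirectly."""
        seen = []
        stack = [node_id]
        while stack:
            current = stack.pop()
            for effect, _relation, cause in self.edges:
                if effect == current and cause not in seen:
                    seen.append(cause)
                    stack.append(cause)
        return seen


def build_example():
    """Model a generic experiment: setup -> run -> raw data -> evaluation -> result."""
    graph = ProvenanceGraph()
    graph.add_node("researcher", "agent")
    for artifact in ("setup", "raw_data", "result"):
        graph.add_node(artifact, "artifact")
    for process in ("run_experiment", "evaluate"):
        graph.add_node(process, "process")
    graph.add_edge("run_experiment", USED, "setup")
    graph.add_edge("run_experiment", WAS_CONTROLLED_BY, "researcher")
    graph.add_edge("raw_data", WAS_GENERATED_BY, "run_experiment")
    graph.add_edge("evaluate", USED, "raw_data")
    graph.add_edge("result", WAS_GENERATED_BY, "evaluate")
    return graph
```

Calling `build_example().lineage("result")` walks the edges from effect to cause and returns the evaluation step, the raw data, the experiment run, its setup and the responsible researcher, which is the kind of origin information a laboratory notebook has to be able to show.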

Section 7: Literature References

[1] Tobias Paasche: Parallelen in der Datenrepräsentation zwischen Laborbüchern und sensorerweiterten elektronischen Patientendokumenten (Bachelor thesis, TU Braunschweig, PTB, and Medizinische Hochschule Hannover (PLRI))

[2] Heinrich Wendel: Using Provenance to Trace Software Development Processes (Master thesis, University of Bonn)

[3] Munroe, S., Miles, S., Groth, P., Jiang, S., Tan, V., Moreau, L., Ibbotson, J. and Vazquez-Salceda, J. (2006) PrIMe: A Methodology for Developing Provenance-Aware Applications. Technical Report, ECS, University of Southampton. (http://eprints.ecs.soton.ac.uk/13215/)

[4] Open Provenance Model (OPM) http://openprovenance.org/

[5] Why and Where? A characterization of data Provenance http://repository.upenn.edu/cis_papers/210/

[6] Bliefert, C. & Ebel, H. F. (1998). Schreiben und Publizieren in den Naturwissenschaften, 4th completely revised edition. Weinheim: WILEY-VCH Verlag. Chapter 1.3 (Google Books)

[8] OECD GLP No 1: OECD Principles on Good Laboratory Practice (http://www.oecd.org/document/63/0,2340,en_2649_34381_2346175_1_1_1_37465,00.html)

[9] DataFinder - flexible data management on Launchpad launchpad.net/datafinder

[10] BeLab: Beweissicheres elektronisches Laborbuch http://www.belab-forschung.de/
