V+Ü (Lecture + Tutorial) Empirical Evaluation in Informatics (Empirische Bewertung in der Informatik), Summer Semester (SS) 2014

This is the homepage of the lecture (Vorlesung) "Empirische Bewertung in der Informatik" (Empirical Evaluation in Informatics) and its corresponding tutorial (Übung).

Description

As an engineering discipline, Informatics is constantly developing new artifacts such as methods, languages/notations or concrete software systems. In most cases, the efficiency and effectiveness of these solutions for their intended purpose are not obvious -- especially not in comparison to existing solutions for the same or a similar purpose.

For this reason, methods for evaluating the efficacy of these solutions must be a routine part of Informatics -- a fact that, unfortunately, is only slowly being recognized. Evaluation is needed by those who create new solutions (that is, in research and development), but also by users, who need to evaluate the expected efficacy specifically for their own situation. These evaluations need to be empirical (that is, based on observation), because the problems are nearly always too complicated for an analytical (that is, a purely thought-based) approach.

This lecture presents the most important empirical evaluation methods and explains, with examples, where they have been used and where they should be used, how to apply them, and what to consider when doing so.


Administration

Lecturers

Requirements/target group, classification, credit points etc.

see entry in the KVV course syllabus

Registration

  • KVV (course syllabus): All participants need to have registered in the KVV.

  • For the tutorials, every participant needs to have registered in the KVV.
    • Subscribe to »Empirische Bewertung in der Informatik SS 14«

Dates

Examination modalities

Necessary criteria for obtaining the credit points:


Content

Most of the linked documents and videos can only be accessed from within the FU Berlin network. From outside you will receive a 403/Forbidden error ("You don't have permission to access ..."); use a VPN connection in this case.

Lecture topics

The lecture is divided into three sections:

The individual lectures:
  1. (14.4.2014) Introduction - The role of empiricism:
    • Term "empirical evaluation"; theory, construction, empiricism; status of empiricism in Informatics
    • Hypothetical examples of use
    • quality criteria: reliability, relevance
    • Note: scale types
  2. (28.4.2014) The scientific method: (Video 2014-04)
    • Science and methods for gaining insights; classification of Informatics
    • The scientific method; variables, hypotheses, control; internal and external validity; validity, reliability, relevance
  3. (5.5.2014) How to lie with statistics: (Video 2014-05)
    • When looking at somebody else's conclusions from data: What is actually meant? What specifically? How can they know it? What is not said?
    • Does the measurement distort the meaning? Is the sample biased? Etc.
    • Material: book on the topic; study on alternative ink; article with arguments against hypothesis testing: "The earth is round (p < 0.05)".

  4. (12.5.2014) Empirical approach:
    • steps: formulate aim and question; select method and design study; create study situation; collect data; evaluate findings; interpret results.
    • example: N-version programming (article, reply to the criticisms against it)
  5. (19.5.2014) Survey: (Video 2014-05)
    • example: relevance of different topics in Informatics education (article)
    • method: selection of aims; selection of group to be interviewed; design and validation of the questionnaire; execution of the survey; evaluation; interpretation
  6. (26.5.2014) Controlled experiment: (Video 2014-05)
    • example 1: flow charts versus pseudo-code (article, criticized prior work)
    • method: control and constancy; problems with reaching constancy; techniques for reaching constancy
    • example 2: use of design pattern documentation (article)
  7. (2.6.2014) Quasi experiment:
    • example 1: comparison of 7 programming languages (article, detailed technical report)
    • method: like controlled experiment, but with incomplete control (mostly: no randomization)
    • example 2: influence of work place conditions on productivity (article)
  8. (16.6.2014) Benchmarking: (Video 2014-06)
    • example 1: SPEC CPU2000 (article)
    • Benchmark = measurement + task + comparison; problems (costs, task selection, overfitting); quality characteristics (accessibility, effort, clarity, portability, scalability, relevance) (article)
    • example 2: TREC (article)

  9. (23.6.2014) Data analysis - basic terminology: (Video 2014-06)
  10. (30.6.2014) Data analysis - techniques:
    • Samples and populations; average value; variability; comparison of samples: significance test, confidence interval; bootstrap; relations between variables: plots, linear models, correlations, local models (loess)
    • Article: "A tour through the visualization zoo"

  11. (7.7.2014) Case study: (Video 2014-07)
    • example 1: Familiarization with a software team (article)
    • method: characteristics of case studies; what is the 'case'?; use of many data types; triangulation; validity dimensions
    • example 2: An unconventional method for requirements inspections (article)
  12. (14.7.2014) Other methods: (Video 2014-07)
    • The method landscape; simulation; software archeology (studies on the basis of existing data); literature study
    • example simulation: scaling of P2P file sharing (article)
    • example software archeology: code decline (article)
    • example literature study: a model of the effectiveness of reviews (article)

  13. (oops, term is already over!) Summary and advice:
    • Role of empiricism; quality criteria; generic method; advantages and disadvantages of the methods; practical advice (for data analysis; for conclusion-drawing; for final presentation); outlook
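
As a small illustration of one technique listed under lecture 10 (bootstrap, confidence interval), the following is a minimal sketch of a percentile bootstrap confidence interval for a sample mean. It is not taken from the lecture material. The function name `bootstrap_ci`, its parameters, and the sample values are assumptions made up for this example.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000,
                 level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(sample).

    Hypothetical helper for illustration only: resample with replacement,
    compute the statistic on each resample, and read the interval bounds
    off the sorted estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = sorted(
        stat([rng.choice(sample) for _ in range(n)])
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower_index = int(alpha * n_resamples)
    upper_index = min(n_resamples - 1, int((1.0 - alpha) * n_resamples))
    return estimates[lower_index], estimates[upper_index]


# Made-up task completion times in minutes (not real study data).
times = [12.1, 9.8, 15.3, 11.0, 10.4, 13.7, 9.9, 14.2, 12.8, 10.9]
low, high = bootstrap_ci(times)
print(f"mean = {statistics.mean(times):.2f}, "
      f"95% bootstrap CI = [{low:.2f}, {high:.2f}]")
```

The percentile method is only one of several bootstrap interval variants. It is chosen here because it needs nothing beyond resampling and sorting.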

Aims of the tutorials

Practice sheets

(These links will be added continuously as the course proceeds.)

Changes over the years

Literature

