Named Entity Recognition (NER) is the practice of identifying and classifying named entities, such as persons, organizations, and locations, in unstructured text. NER is also referred to as entity extraction.
A variety of tools and APIs are available to support the task of extracting entities from text.
This work should produce a comprehensive survey of entity extraction tools and APIs. A subset of tools and APIs should be selected for further study.
To test the APIs and tools, an extensible piece of software should be created that can execute an API call for each of the surveyed tools and APIs.
The features and results of the surveyed tools and APIs should be compared in a structured and consistent manner.
Consultation of search engines and study of related scientific literature
Selection of a subset of the surveyed NER tools and APIs to investigate further
Development of extensible software to test each tool with a sample text. A suitable design pattern is the adapter pattern with a common interface.
Testing each tool and API with the software and collecting the results
Comparison of results
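The adapter pattern named in the development step could be sketched roughly as follows. This is only a minimal illustration, not a prescribed design: the names `Entity`, `NerAdapter`, `GazetteerAdapter`, and `run_all` are hypothetical, and the gazetteer lookup is a toy stand-in for a real tool or API call, which would sit behind the same `extract` method.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entity:
    """One extracted entity in a tool-independent format (hypothetical)."""
    text: str
    label: str
    start: int
    end: int


class NerAdapter(ABC):
    """Common interface that every tool-specific adapter implements."""
    name: str = "unnamed"

    @abstractmethod
    def extract(self, text: str) -> List[Entity]:
        """Return the entities the wrapped tool finds in `text`."""


class GazetteerAdapter(NerAdapter):
    """Toy adapter: looks up fixed terms instead of calling a real NER tool."""
    name = "gazetteer-baseline"

    def __init__(self, gazetteer: Dict[str, str]):
        self._gazetteer = gazetteer  # maps surface form -> entity label

    def extract(self, text: str) -> List[Entity]:
        entities = []
        for term, label in self._gazetteer.items():
            for match in re.finditer(re.escape(term), text):
                entities.append(Entity(term, label, match.start(), match.end()))
        return sorted(entities, key=lambda e: e.start)


def run_all(adapters: List[NerAdapter], text: str) -> Dict[str, List[Entity]]:
    """Run every adapter on the same sample text and collect results by tool name."""
    return {adapter.name: adapter.extract(text) for adapter in adapters}


sample = "Angela Merkel visited Berlin in May."
adapters = [GazetteerAdapter({"Berlin": "LOCATION", "Angela Merkel": "PERSON"})]
results = run_all(adapters, sample)
for tool, entities in results.items():
    print(tool, [(e.text, e.label) for e in entities])
```

Supporting a further tool would then mean writing one more `NerAdapter` subclass that translates the tool's response into `Entity` objects, while the test harness in `run_all` stays unchanged.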
M.Sc. students will additionally need to consider ways of detecting and reducing false positives and false negatives in the results, e.g. by involving the crowd.
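One possible starting point for detecting false positives and false negatives is to compare a tool's output with a manually annotated reference. The sketch below assumes that such a gold-standard annotation exists and that entities count as correct only on an exact match of text and label. Both are assumptions made for illustration, and the function name `compare_to_gold` is hypothetical.

```python
from typing import Dict, Set, Tuple

EntityKey = Tuple[str, str]  # (entity text, entity label)


def compare_to_gold(predicted: Set[EntityKey], gold: Set[EntityKey]) -> Dict[str, object]:
    """Split a tool's output into true/false positives and false negatives."""
    true_positives = predicted & gold
    false_positives = predicted - gold   # reported by the tool, absent from gold
    false_negatives = gold - predicted   # present in gold, missed by the tool

    precision = len(true_positives) / len(predicted) if predicted else 0.0
    recall = len(true_positives) / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0

    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


gold = {("Angela Merkel", "PERSON"), ("Berlin", "LOCATION")}
predicted = {("Berlin", "LOCATION"), ("May", "PERSON")}
report = compare_to_gold(predicted, gold)
print(report["false_positives"])  # entities to reject, e.g. via crowd review
print(report["false_negatives"])  # entities the tool missed
print(report["precision"], report["recall"], report["f1"])
```

Where no gold standard is available, the sets of false positives and false negatives cannot be computed this way; crowd judgments on individual tool outputs would be one way to approximate them.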
Please contact Jonas Oppenländer (firstname.lastname@example.org), Königin-Luise-Str. 24-26, room 115, for further information.
Marrero, M., Urbano, J., Sánchez-Cuadrado, S., Morato, J., Gómez-Berbís, J.M. (2013): Named Entity Recognition: Fallacies, challenges and opportunities. Computer Standards & Interfaces, Volume 35, Issue 5, pp. 482-489.
Nadeau, D., Sekine, S. (2007): A survey of named entity recognition and classification. Lingvisticæ Investigationes, Volume 30, Issue 1, pp. 3-26.
Braunschweig, K., Thiele, M., Eberius, J., Lehner, W. (): Enhancing Named Entity Extraction by Effectively Incorporating the Crowd. BTW Workshops, 13, pp. 181-195.