Sounds exciting? Then we look forward to receiving a compelling application via our partner Campusjäger.
As the digitization of the world's libraries and print archives continues steadily, the demand for automated processing of such documents grows. Researchers and practitioners would like to process these documents digitally with tools from computer vision (CV) and optical character recognition (OCR). They would also like to search and filter by document metadata. However, all of this presumes that such extracted features and metadata are available. As state-of-the-art machine learning (ML) classifiers still do not reach the desired accuracy levels, especially on old documents or those from fringe contexts, manual labeling effort is required.
For the scope of this thesis, we limit the context to segmenting advertisements from scanned pages of newspapers and magazines. This is an interesting use case for advertising researchers, for instance. Associated colleagues at the University of Mannheim (UniMA) have already manually created a labeled set of 9000 segmented pages of the US magazine "The Economist", ranging from the 1840s to today. We expect the thesis student to develop an interactive labeling system that supports extending this segmentation training dataset to many more pages. Interactive labeling strives to combine automatic steps (e.g. the trained model) with incremental user input. The work packages entail:
Design science research is a well-established methodology in the information systems field that takes a scientific view of artifacts, such as the labeling system to be developed during this thesis. So-called design knowledge can be derived from both the development process and the finished artifact.
We expect the student to be familiar with web development. The system should be developed with a modern web application frontend framework (e.g. Vue with Vuetify) or be forked from an existing open-source segmentation system. Furthermore, we expect the model to be trained using standard Python frameworks; experience in this regard is required as well.
Does our job offer Interactive Labeling of Scan Segmentations sound promising?
Then we look forward to your application via Campusjäger. With our partner Campusjäger, you can apply for this job offer in just a few minutes, without a cover letter, and track the status of your application live.
Apply directly here: https://www.campusjaeger.de/s/Y1JmR0x-interactive-labeling-of-scan-segmentations
For more information about us, see our company profile on Campusjäger: