Research Software Engineer

University of Amsterdam Institute for Logic, Language and Computation

Research Software Engineer – Project: From Books to Knowledge Graphs

Publication date 28 January 2021

Closing date 28 February 2021

Level of education PhD

Hours 19 hours per week

Salary indication €2,790 to €4,012 gross per month, based on 38 hours per week

Vacancy number 21-052

The Institute for Logic, Language and Computation (ILLC) and the Department of Media Studies of the Faculty of Humanities are looking for a research software engineer with a background in machine learning, knowledge engineering and/or digital humanities, to join the industry collaboration (KIEM 2020) project From Books to Knowledge Graphs.

This project, a collaboration among UvA (Dr. Giovanni Colavizza), the University of Lausanne, Switzerland (Dr. Matteo Romanello) and the publisher Brill, aims to develop an open framework to extract knowledge graphs from scholarly publications, using Brill’s classics catalog as a case study.

Project From Books to Knowledge Graphs

The scientific publishing industry is rapidly transitioning towards information analytics. This shift is disproportionately benefiting large companies. These can afford to deploy digital technologies like knowledge graphs that can index their contents and create advanced search engines. Small and medium publishing enterprises, instead, often lack the resources to fully embrace such digital transformations. This divide is acutely felt in the arts, humanities and social sciences. Scholars from these disciplines are largely unable to benefit from modern scientific search engines, because their publishing ecosystem is made of many specialized businesses which cannot, individually, develop comparable services.

In this project, we aim to start bridging this gap by democratizing access to knowledge graphs – the technology underpinning modern scientific search engines – for small and medium publishers in the arts, humanities and social sciences. Their contents, largely made of books, already contain rich, structured information – such as references and indexes – which can be automatically mined and interlinked. We plan to develop an open-source framework for extracting structured information and creating knowledge graphs from it. The framework will be released under a commercial-friendly license to encourage its re-use. As much as possible, we will consolidate existing proven technologies into a single codebase instead of reinventing the wheel.

What are you going to do?

As a research software engineer, you will:

  • help design an information extraction pipeline which extracts structured entity information from scholarly publications (e.g., references and indexes) and performs entity linkage, in close collaboration with industry partners (Brill);
  • develop this pipeline, by re-using existing solutions or developing novel ones;
  • test and apply the pipeline on Brill’s classics publication catalog;
  • document work and code via comments, tests and technical documentation;
  • publish code and data openly and support early adopters in re-using them;
  • assist and take an active role in the execution of other project-related activities (e.g., outreach).

What do we require?

The following is required:

  • a degree in a related field, including but not limited to digital humanities, computer science, machine learning or knowledge engineering. Alternatively, a humanities (or other) degree with proven experience in research software engineering;
  • excellent written and spoken English;
  • proven capacity to design, develop and document research software. While the project members mostly use the Python stack, other options are also possible.

The following is preferred:

  • previous experience with machine learning, and in particular information extraction tasks (Named Entity Recognition, entity linkage, etc.);
  • previous experience with knowledge engineering;
  • a PhD in a related field (Computer Science, Machine Learning, Knowledge Engineering, Digital Humanities; see above);
  • an interest in the SME publishing sector;
  • an interest in Classics and/or digital humanities.

Our offer

You will be appointed as Research Software Engineer, ufo-profile ICT developer, for 19 hours per week (0.5 FTE) for a period of 12 months at the Media Studies Department of the Faculty of Humanities. Remote work is the default, given the current situation. The starting date of the contract is 1 April 2021. The gross monthly salary, on a full-time basis, will range from €2,790 to €4,012, depending on experience and qualifications, in accordance with the Collective Labour Agreement of Dutch Universities. This is exclusive of the 8% holiday allowance and the 8.3% end-of-year bonus.


If you have any questions, feel free to contact:

  • Dr Giovanni Colavizza

Job application

The UvA is an equal-opportunity employer. We prioritise diversity and are committed to creating an inclusive environment for everyone. We value a spirit of enquiry and perseverance, provide the space to keep asking questions, and promote a culture of curiosity and creativity.

Does this profile sound like you? If so, we are eager to receive your application.

Applications should include:

  • a letter of motivation (max 1 A4 page);
  • a short CV (2-3 A4 pages, you can include extras as an appendix);
  • a link to previous work (e.g., GitHub profile or personal website) and/or the outcomes of a previous project relevant to the opening. Please make clear what your role and contributions in that work actually were;
  • the names and contact details of two referees (no need for letters).

Applications must be sent as attachments via "Apply Now" (see below) before 28 February 2021.

Shortlisted candidates will be interviewed by early March 2021.

