This demonstration shows how contextualized document vectors can be used to retrieve information from a large healthcare dataset, such as the 11,000+ documents on COVID-19 released by Semantic Scholar (2020). The model is trained on Wikipedia data; see our WWW2020 paper and GitHub for more details on the implementation.


How to use?


Enter the name of a disease and optionally a specific aspect into the query field. The system will retrieve the top 25 passages from the dataset that answer your query.
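As a rough illustration of the retrieval step, the sketch below ranks passages by cosine similarity between a query vector and passage vectors and keeps the best k. It is not the demo's actual implementation: the function names, the toy three-dimensional vectors, and the choice of cosine similarity are assumptions made for this example, and a real system would obtain the vectors from the trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_passages(query_vec, passage_vecs, k=25):
    """Return the k (passage_id, score) pairs most similar to the query.

    passage_vecs maps a passage id to its vector. Both the ids and the
    vectors here are hypothetical stand-ins for model output.
    """
    scored = [(cosine(query_vec, vec), pid) for pid, vec in passage_vecs.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(pid, score) for score, pid in scored[:k]]


# Toy data: three passages in a made-up 3-dimensional vector space.
passages = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.1, 0.9, 0.0],
    "p3": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

results = top_k_passages(query, passages, k=2)
for pid, score in results:
    print(pid, round(score, 3))
```

With k left at its default of 25, the same call would return the top 25 passages, matching the number shown in the demo.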


Browse through the dataset and analyze how relevant each sentence in a document is for your query. The shade of blue visualizes the relevance score of a sentence.
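One way to turn a sentence's relevance score into a shade of blue is to interpolate linearly between white and full blue. The sketch below does that. The score range of 0 to 1, the function name, and the white-to-blue color scale are assumptions for illustration; the demo may use a different mapping.

```python
def relevance_to_blue(score, lo=0.0, hi=1.0):
    """Map a relevance score to a hex color between white and blue.

    Scores at or below `lo` give white (#ffffff); scores at or above `hi`
    give full blue (#0000ff). The range and color scale are assumed.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Normalize the score into [0, 1], clamping out-of-range values.
    t = min(max((score - lo) / (hi - lo), 0.0), 1.0)
    # Lower the red and green channels as relevance grows; blue stays at 255.
    channel = round(255 * (1.0 - t))
    return f"#{channel:02x}{channel:02x}ff"


# Hypothetical per-sentence scores for one document.
sentence_scores = [0.0, 0.25, 1.0]
colors = [relevance_to_blue(s) for s in sentence_scores]
print(colors)
```

A more relevant sentence gets a darker, more saturated blue, while an irrelevant one stays white.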


Dataset used in this demo: 11.1K articles from PMC Open Access (CC BY-NC-SA)

Access a different dataset:

  1. Wikipedia (Encyclopedia articles about diseases)
  2. CORD-19 (COVID-19 Open Research Dataset)


CORD19 Healthcare Retrieval