jbesomi/Korono

👑 Korono

A Question-Answering system for COVID-19 papers

Introduction • Getting started • Under the hood • Server and Client API

Introduction

Too much information. The number of documents related to COVID-19 is increasing exponentially. With such a massive amount of material, it is getting harder for the research community to find the relevant pieces of information.

Search-engine-on-steroids. Korono is a question-answering platform designed to make it easier to search for information about COVID-19. You can think of Korono as a search-engine-on-steroids.

Working principle. The Korono engine works in two phases: the search engine phase and the question-answering phase. First, given a query q, the search engine returns the papers relevant to that query. Then, an answer is extracted from each returned paper and displayed.
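
The two phases can be pictured as a small composition of functions. The sketch below is illustrative only and is not Korono's code: `answer_query`, `keyword_search` and `first_sentence` are hypothetical names, and the toy search and extraction steps merely stand in for the real search engine and question-answering model.

```python
from typing import Callable, List, Tuple


def answer_query(
    query: str,
    papers: List[str],
    search: Callable[[str, List[str]], List[str]],
    extract: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Phase 1: keep the relevant papers. Phase 2: extract one answer per paper."""
    relevant = search(query, papers)
    return [(paper, extract(query, paper)) for paper in relevant]


# Toy stand-ins (assumptions for illustration, not the real components).
def keyword_search(query: str, papers: List[str]) -> List[str]:
    # Keep every paper that shares at least one word with the query.
    terms = set(query.lower().split())
    return [p for p in papers if terms & set(p.lower().split())]


def first_sentence(query: str, paper: str) -> str:
    # Pretend the answer is always the first sentence of the paper.
    return paper.split(".")[0]


papers = [
    "Coronavirus is an infectious disease. It spreads through droplets.",
    "Okapi BM25 ranks documents.",
]
results = answer_query("what is coronavirus", papers, keyword_search, first_sentence)
for paper, answer in results:
    print(answer)
```

Passing the two phases in as functions keeps the flow readable and lets either step be swapped out independently.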

Getting started

You can either use the online version (coming soon) or run your own server.

Run a server locally:

./run_server.sh

Run the client and ask a question:

> from korono import client
> client.get_answers("What is coronavirus?")

Under the hood

Search engine. The search engine uses a ranking algorithm known as Okapi BM25, where BM stands for best matching. BM25 is a bag-of-words retrieval function that ranks documents based on the query terms appearing in each document.
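
To make the ranking idea concrete, here is a minimal, self-contained sketch of BM25 scoring. It is not Korono's implementation: the function name `bm25_scores`, the whitespace tokenizer, the smoothed IDF variant, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document against the query with a basic BM25 formula."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturated by k1 and normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "coronavirus is an infectious disease",
    "the search engine ranks documents",
    "coronavirus papers are increasing",
]
scores = bm25_scores("coronavirus disease", docs)
# Document indices ordered from most to least relevant.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

The first document matches both query terms, so it ranks above the third, which matches only one. The second matches none and scores zero.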

Question answering. The answers are extracted from the corpus using Transformers, large neural-network language models. As of now, only the distilbert-base-uncased-distilled-squad model is supported. We plan to support more models soon.
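
As a rough illustration of extractive question answering with this model, the sketch below assumes the Hugging Face `transformers` library and its `pipeline("question-answering", ...)` helper. Whether Korono calls the model this way is an assumption. `extract_answer` is a hypothetical name, and the optional `qa` argument exists only so the function can be exercised without downloading the model.

```python
from typing import Callable, Optional


def extract_answer(question: str, context: str, qa: Optional[Callable] = None) -> str:
    """Return the span of `context` that the model picks as the answer."""
    if qa is None:
        # Lazy import: the model is only loaded when no pipeline is supplied.
        from transformers import pipeline

        qa = pipeline(
            "question-answering",
            model="distilbert-base-uncased-distilled-squad",
        )
    # The pipeline returns a dict with "answer", "score", "start" and "end".
    result = qa(question=question, context=context)
    return result["answer"]
```

Calling `extract_answer("What is coronavirus?", "coronavirus is an infectious disease")` loads the model on first use, so a real application would probably build the pipeline once and reuse it.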

Server and Client API

Server API

  • load_data.get_df() Returns the underlying dataset.

  • load_data.get_metadata_df() Returns the CORD-19 metadata pandas DataFrame.

  • korono_model.answer_question(question, context) Given a question and a context, returns the answer.

  • korono.model.get_summary(text) Given a text, returns its abstractive summary.

  • korono_model.find_start_end_index_substring(context, answer) Returns the start and end indices of the answer string in the context string, if they exist.
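
The last helper locates an answer inside its context. A minimal sketch of that idea follows. It is not the project's code: `find_answer_span` is a hypothetical name, and returning `None` for a missing answer and using an exclusive end index are assumptions, since the original does not specify either.

```python
from typing import Optional, Tuple


def find_answer_span(context: str, answer: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first occurrence of `answer`, or None if absent."""
    if not answer:
        return None
    start = context.find(answer)  # str.find returns -1 when there is no match
    if start == -1:
        return None
    # End index is exclusive, so context[start:end] == answer.
    return start, start + len(answer)


context = "coronavirus is an infectious disease"
span = find_answer_span(context, "an infectious disease")
print(span)
```

Such indices are handy for highlighting the answer within the displayed context.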

Client API

  • client.get_answers_json(question) Returns a JSON object of the form:
      {
         "results": [
            {
               "context": "coronavirus is an infectious disease",
               "question": "what is coronavirus?",
               "answer": "an infectious disease"
            }
         ]
      }
  • client.get_answers(question) Returns a list of all answers.
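
The relation between the two client calls can be sketched by pulling the answers out of a response of that shape. This is only an illustration: `answers_from_json` is a hypothetical helper, the payload is the sample object from the list above, and it assumes the response arrives as a JSON string rather than an already-parsed dictionary.

```python
import json
from typing import List

# Sample response, written as valid JSON.
payload = """
{
  "results": [
    {
      "context": "coronavirus is an infectious disease",
      "question": "what is coronavirus?",
      "answer": "an infectious disease"
    }
  ]
}
"""


def answers_from_json(raw: str) -> List[str]:
    """Collect the "answer" field of every entry under "results"."""
    data = json.loads(raw)
    return [item["answer"] for item in data.get("results", [])]


answers = answers_from_json(payload)
print(answers)
```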