The University of Texas Health Science Center at Houston (UTHealth) has teamed up with the White House Office of Science and Technology Policy and the National Institute of Standards and Technology (NIST) to develop search engines that will help streamline COVID-19 research for healthcare experts fighting the virus.
UTHealth is one of just four institutions in the country working with NIST on the TREC-COVID initiative. The Allen Institute for Artificial Intelligence (AI2), the National Library of Medicine (NLM), and Oregon Health & Science University (OHSU) are also part of the joint effort.
“There is currently a lot of information coming out about COVID-19,” said Kirk Roberts, PhD, assistant professor at UTHealth School of Biomedical Informatics. “With so much going on, it makes it difficult for anyone to handle that much information. To help solve this information overload problem, the best thing to do is to develop a search engine much like Google or PubMed where clinicians and scientists can access evidence-based COVID-19 information they need very quickly.”
The effort follows in the footsteps of prior information retrieval evaluations under NIST’s Text Retrieval Conference (TREC) paradigm, including the ongoing TREC Precision Medicine track led by Roberts.
“UTHealth School of Biomedical Informatics is a national and international leader in artificial intelligence application in medicine and health care,” said Jiajie Zhang, PhD, dean and The Glassell Family Foundation Distinguished Chair in Informatics Excellence of the School of Biomedical Informatics. “During this pandemic, we are mobilizing and utilizing our large pool of expertise in data mining, machine learning, drug discovery, and natural language processing to help with the detection, tracking, mitigation, and treatment of COVID-19 patients. The TREC-COVID challenge led by Roberts and colleagues across the nation is one of many initiatives we are undertaking at the school and we are proud to partner with such organizations.”
TREC-COVID team members from UTHealth, NLM, and OHSU will develop and release a series of sample queries for registered participants. Those who register will run the initial queries on their search systems against the COVID-19 Open Research Dataset (CORD-19) document set, a resource of more than 44,000 research articles and related data about COVID-19 and the coronavirus family of viruses. Results will be submitted to NIST. Biomedical experts from NIST, UTHealth, AI2, NLM, and OHSU will then assess the results and evaluate the retrieval systems’ overall performance. Participants will have one week to submit their search results, and NIST will post results approximately a week after the submission deadline. New rounds are therefore expected to be released about two weeks apart.
The team initially expects to conduct about five consecutive rounds of search system assessments. Organizations interested in participating can sign up on the NIST website.
“Health experts used to worry that pandemics would spread without people being aware of it, but during this COVID-19 crisis there is no shortage of information out there,” Roberts said. “In fact, nowadays we’re swamped with data and opinions. One of the key roles for artificial intelligence to play in this and future epidemics is to cut through the noise to provide the best available evidence to clinicians and policy decision-makers.”