The deluge of Covid-related data and the AI helping doctors make sense of it
Last February, as the world woke up to the fact it was dealing with a deadly pandemic, something else was starting to go viral - misinformation.
From theories over the origins of the coronavirus to advice on how to contain its spread and treat it, the conflicting narratives emerged almost immediately.
“One of the things that was really making me crazy as a mother and as a human being on Earth, was that I didn’t know who to trust,” says Jennifer Marsman, an AI engineer and cognitive search specialist at Microsoft based in Canton, Michigan.
Across the country, a suburb away from Microsoft’s Redmond headquarters in Washington state, one of the first major US clusters of Covid-19 infections was rapidly growing at the Kirkland Life Care Centre.
The virus ripped through the residents and staff of the nursing home, eventually killing 35 people – more than New Zealand’s entire Covid-19 death toll to date. It was to be just the beginning.
In March 2020, the French authorities advised against using the popular anti-inflammatory drug ibuprofen for fear that it might worsen Covid-19. It was a spurious claim, based on a handful of anecdotes. But its impact was immediate and global.
“Non Ibuprofen medications were skyrocketing on Amazon because nobody was going to the grocery store at that point,” Marsman remembers.
“I give my kids Ibuprofen. So I’m freaking out.”
As more conflicting information about the pandemic emerged, Marsman, who was once voted “techie whose innovation will have the biggest impact”, started formulating a plan.
“I’m a machine learning person, a data person. Where is the data I can trust?” thought Marsman, who will present the keynote speech at next week’s Aotearoa AI Summit in Auckland.
“I’m never going to cure Covid but maybe I can make a difference. Maybe I can use machine learning to help get information in the hands of these medical professionals who can.”
At this point, research papers on various aspects of Covid-19 were already filtering out from the scientific community. The Covid-19 Open Research Dataset (CORD-19) was set up as a global repository for scientific data and papers and now includes over 400,000 scholarly articles.
While the effort helped collate scientific knowledge on Covid-19, Marsman wanted to use it to assist front line doctors who were having to make life and death decisions in hospitals as Covid patients filled their wards. Clinicians were scrambling to keep up to date with the latest evidence-based techniques to fight the virus and save lives. Marsman set out to make that task easier.
She drew on the open-access CORD-19 data set to feed Microsoft’s Cognitive Search tools, which run on the Azure cloud platform. These tools allow unstructured data contained in a range of sources, from SQL databases and SharePoint directories to PDFs, Office document files and JSON files, to be quickly analysed to create a high-quality search index.
Entity extraction, a text analysis technique that uses natural language processing (NLP) to pull specific data from unstructured text, allowed Marsman to focus on a medical taxonomy of terms that doctors would want to quickly locate and read up about.
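To give a rough sense of what entity extraction does, here is a minimal Python sketch. It is not the code behind Covid-19 Search: the tiny taxonomy, the sample abstract and the function name `extract_entities` are all made up for illustration, and simple dictionary matching stands in for the trained NLP models a production service would use.

```python
import re
from collections import defaultdict

# Hypothetical mini-taxonomy for illustration only; a real medical
# vocabulary would contain many thousands of terms.
TAXONOMY = {
    "drug": ["ibuprofen", "remdesivir"],
    "organ": ["kidney", "lung"],
    "symptom": ["fever", "cough"],
}

def extract_entities(text):
    """Return {category: sorted list of taxonomy terms found in text}."""
    found = defaultdict(set)
    lowered = text.lower()
    for category, terms in TAXONOMY.items():
        for term in terms:
            # Word boundaries stop "lung" from matching inside "plunge".
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                found[category].add(term)
    return {category: sorted(terms) for category, terms in found.items()}

# Made-up sentence standing in for a paper abstract.
abstract = "Patients given ibuprofen reported fever and cough; kidney injury was rare."
entities = extract_entities(abstract)
print(entities)
```

Once each paper is tagged this way, a search index can offer the categories as filters, so a clinician can jump straight to papers that mention a particular drug or organ.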
Use of semantic recognition improved the effectiveness of searches. Azure’s machine learning allows the system to improve as more searches are undertaken and the number of indexed articles grows.
“If I say something that is very akin to kidney but not kidney per se, it can pull in related things,” explains Marsman.
Marsman describes Covid-19 Search, the Azure-based website she and her colleague Liam Cavanagh built, as “ugly but functional”. The aim, after all, was to let clinicians access accurate scientific information quickly.
One of the first feature requests, from a hospital that became an early user of the website, was for a date range search option. With new scientific articles uploaded daily to CORD-19 through much of last year, clinicians wanted the ability to limit searches to the newest and most relevant scientific papers as understanding of Covid-19 evolved.
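A date range option of this kind boils down to a simple filter over each paper's publication date. The Python sketch below shows the idea under stated assumptions: the paper records and the function name `filter_by_date` are hypothetical, and the real site would apply the filter inside the search service rather than over an in-memory list.

```python
from datetime import date

# Made-up records for illustration; these are not real CORD-19 entries.
PAPERS = [
    {"title": "Early clinical features", "published": date(2020, 2, 15)},
    {"title": "Ibuprofen and outcomes", "published": date(2020, 4, 3)},
    {"title": "Ventilation strategies", "published": date(2020, 9, 21)},
]

def filter_by_date(papers, start, end):
    """Keep papers published between start and end (inclusive), newest first."""
    in_range = [p for p in papers if start <= p["published"] <= end]
    return sorted(in_range, key=lambda p: p["published"], reverse=True)

# Limit results to papers published from April 2020 to the end of that year.
recent = filter_by_date(PAPERS, date(2020, 4, 1), date(2020, 12, 31))
titles = [p["title"] for p in recent]
print(titles)
```

Sorting newest first reflects what the clinicians asked for: the latest evidence at the top of the results.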
Graph visualisations in Covid-19 Search also allow users to gain a visual snapshot of the relationship between topics, paper authors and journals.
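One common way to build such a graph, sketched here in Python, is to count how often two topics appear in the same paper and draw a link for each pair. This is an assumption about the general technique, not a description of how Covid-19 Search builds its graphs, and the sample data and the function name `cooccurrence_edges` are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Made-up topic tags for three hypothetical papers.
PAPERS = [
    {"topics": ["kidney", "ibuprofen"]},
    {"topics": ["kidney", "ventilation"]},
    {"topics": ["ibuprofen", "kidney", "fever"]},
]

def cooccurrence_edges(papers):
    """Count how many papers mention each pair of topics together."""
    edges = Counter()
    for paper in papers:
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(paper["topics"])), 2):
            edges[(a, b)] += 1
    return edges

edges = cooccurrence_edges(PAPERS)
for (a, b), weight in edges.most_common():
    print(a, "<->", b, "shared papers:", weight)
```

Each pair becomes a line in the graph, and the count can set how thick that line is drawn. The same counting works for authors or journals in place of topics.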
Marsman and Cavanagh put the code base for Covid-19 Search on GitHub, where it is available for anyone to use or modify to suit their own needs.
“If people want to fork the codebase or continue to work on it they can,” says Marsman, who plans to maintain Covid-19 Search and sees wider scope for AI-powered cognitive search across the healthcare sector.
“There’s definitely a need for this beyond the scope of Covid,” she says.
“For instance, how do I get information about clinical trials out to the patients who want to know the results of it?”
Modelling Covid’s many scenarios
Here in New Zealand, Azure was being put to a different use in the effort to contain the pandemic. The University of Auckland-based research group Te Pūnaha Matatini was on board with the Government early in the crisis, using its data science expertise to model scenarios for the spread of the virus.
In Wellington, software start-up CloseAssociate was also drawn into the effort, a major pivot for a company that specialises in developing software for membership organisations.
“It’s not part of the core business of CloseAssociate. They’ve just kind of done this because they’ve seen the need and wanted to participate,” says Matt Bostwick, Partner Director at Microsoft New Zealand.
CloseAssociate was able to take the data from Te Pūnaha Matatini and use a free AI-powered modelling tool hosted in Azure to generate scenarios that helped health leaders plan the Covid-19 response.
“They were able to take that data, build some AI models and enable the planners from different parts of logistics system across New Zealand to go, well, what does it mean for us?” says Bostwick.
“You’ve now got interactive website-based insights that anyone can kind of use and interact with.”
Microsoft supported the CloseAssociate work with an AI for Good grant. Bostwick says AI and machine learning have huge potential to draw insights from disparate sets of data held by healthcare providers.
“We’re building products that kind of can become the backbone of the health system. You’re actually looking at connected datasets and looking at data across multiple different parts of the health system.”
Jennifer Marsman will appear via video at the Aotearoa AI Summit on Wednesday, May 12.