Authors: Vafa Bayat and Steven Phelps (Bitscopic, CA, USA)
In this interview, we speak to Dr Steven Phelps, the Director of Data Sciences at Bitscopic (Palo Alto, CA, USA), and Dr Vafa Bayat, the Head of Research & Development at Bitscopic, about the model they have created that uses data collected from standard laboratory tests to diagnose COVID-19 in patients via machine learning. The work has been published in Clinical Infectious Diseases.
Please could you introduce yourselves by giving a brief background of your career to date?
Vafa Bayat: I joined Baylor College of Medicine (Houston, TX, USA) as an MD/PhD student and received my PhD from Hugo Bellen’s lab, where I helped identify dozens of novel genes associated with neurological and neurodegenerative diseases. That work laid the groundwork for the lab’s involvement in the creation of the NIH’s Undiagnosed Diseases Network program. After completing my MD, I came to Stanford University (CA, USA) for a pathology residency and a postdoc. In my postdoc, I worked on identifying pathological mechanisms for a rare but very severe disease called pontocerebellar hypoplasia. After completing my postdoc and an internship at 23andMe (Sunnyvale, CA, USA), I joined Bitscopic as its Head of R&D.
Steven Phelps: I studied low-temperature physics at Stanford University (CA, USA) under Nobel Laureate Douglas D Osheroff, and received a PhD in physics from Princeton University (NJ, USA) under Nobel Laureate PJE Peebles. After a postdoc and academic appointments at the Technion in Haifa, Israel, where I pursued a new method of estimating the masses of nearby galaxies, I returned to the USA, where I have been involved in startups in various fields as a Data Scientist.
Please could you give an overview of your recent work, ‘A COVID-19 prediction model from standard laboratory tests’?
After noticing back in January that COVID-19 patients described in some of the early published scientific articles had characteristic abnormal results in certain laboratory tests and vital signs, we had the idea that a suitably trained machine learning algorithm might be able to identify a ‘diagnostic fingerprint’ for COVID-19 based on small, typical shifts across an ensemble of lab results. It is the kind of problem that machine learning is very well suited to solve, and with the help of a state-of-the-art algorithm we were able to narrow a field of over 70 possible predictors down to the 20 most important. This enabled us to define a quick and easy-to-implement test for COVID-19 with an overall prediction accuracy of 86%, a strong result considering that the method relies only on inexpensive, generic lab tests.
What was the state of testing and diagnostics at the start of the COVID-19 pandemic and what were the shortcomings associated with these?
When we began the project, the CDC and WHO were in the process of creating PCR tests, but these tests had numerous problems, especially a lack of sensitivity. The agencies also ran into problems distributing the tests to hospitals, with some hospitals complaining that the supplies they received were contaminated. Even today, using PCR tests alone can result in a high rate of false negatives. In addition, throughout the past six months there have been resource constraints on widespread testing, resulting in long delays in areas experiencing high demand and, more recently, emergency approval for pooling of samples.
Can you explain the advantages of using this machine-learning model to diagnose COVID-19?
Our machine learning model has been trained on the lab results of tens of thousands of patients who took a COVID-19 test. It only requires 15 or more of 20 standard lab tests to make its prediction, one of which is simply patient temperature. The others are part of the CBC and Chemistry lab sets that hospitals perform routinely on patients, such as during standard checkups. We have confirmed that our machine learning algorithm works equally well across ages and genders and is agnostic to coexisting conditions. In addition, we showed that when multiple negative COVID-19 molecular test results were obtained from the same patient over a period of a few days, followed by a positive result, our prediction model identified the initially negative results as consistent with COVID-19 infection two-thirds of the time. This suggests that the prediction model might be used as a fully independent complement to molecular testing and help pinpoint false negatives.
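The ‘15 or more of 20’ requirement can be pictured as a simple completeness check that runs before any prediction is attempted. The following is a minimal sketch, not the authors’ implementation: apart from temperature, which the interview mentions, the input names are hypothetical placeholders (the interview does not list the 20 predictors), and the function names are likewise assumptions.

```python
from typing import List, Mapping, Optional

# Hypothetical input names: only "temperature" comes from the interview;
# the remaining 19 predictors are not listed there, so placeholders are used.
PANEL: List[str] = ["temperature"] + [f"lab_{i:02d}" for i in range(1, 20)]

# "15 or more of 20 standard lab tests", as stated in the interview.
MIN_AVAILABLE = 15


def available_inputs(results: Mapping[str, Optional[float]]) -> List[str]:
    """Return the panel inputs that have a recorded (non-missing) value."""
    return [name for name in PANEL if results.get(name) is not None]


def can_score(results: Mapping[str, Optional[float]]) -> bool:
    """True when enough panel inputs are present to attempt a prediction."""
    return len(available_inputs(results)) >= MIN_AVAILABLE
```

A patient record with five missing values would still pass this check, while one with six missing values would not. How the model itself handles the missing inputs is not described in the interview.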
Is it possible to use data from tests that don’t require specialist labs and equipment so that the model could be used globally, including in low-income countries with less developed healthcare?
This is another potential use case for our test. Since it is based on lab values obtained from a common blood draw, such as platelet and white blood cell counts, even less well-equipped hospitals should be able to administer it affordably and with a rapid turnaround time. It should be noted that, with a sensitivity of 82% and a specificity of 87%, it does not approach the accuracy of the PCR test, but it could be considered when no other option exists.
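To see what a sensitivity of 82% and a specificity of 87% mean in practice, it helps to convert them into predictive values, which depend on how common the infection is among those tested. The sketch below applies the standard definitions of positive and negative predictive value; the prevalence values are illustrative assumptions, not figures from the study.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(infected | positive result)
    npv = true_neg / (true_neg + false_neg)  # P(not infected | negative result)
    return ppv, npv


SENSITIVITY = 0.82  # from the interview
SPECIFICITY = 0.87  # from the interview

# Hypothetical prevalence values, chosen only for illustration.
for prevalence in (0.01, 0.05, 0.20):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

The general pattern is that a positive result is more informative when the infection is common in the tested population, while a negative result is most reassuring when it is rare.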
In your opinion, what advancements are required to further enhance the accuracy of your model?
The most serious challenge to improving the accuracy of our test is that it has been trained on the results of the PCR test. This limits us in two significant ways: first, the PCR test is suspected to have a high false negative rate, and second, since most of those who take the PCR test are symptomatic (owing to the scarcity of testing capacity), our test’s accuracy in asymptomatic cases is not well known. There is a need to adjust our algorithm to predict the presence of the virus in asymptomatic populations.
Finally, is there anything else you would like to add about your research or thoughts on the field?
We are encouraged by the results we have seen so far and believe that machine learning methods such as the one we have employed have potentially far-reaching implications for the detection of a host of conditions beyond COVID-19, turning the often blunt instrument of lab results into a precision tool.
The opinions expressed in this interview are those of the interviewees and do not necessarily reflect the views of Infectious Diseases Hub or Future Science Group.