The Microbe Directory – an interview with Heba Shaaban and David Westfall

The Microbe Directory is a collective research effort aiming to profile over 7,000 microbes in order to provide further information for any researchers carrying out metagenomic analyses, or anyone interested in a specific microbe!

Following a recent publication in Gates Open Research describing this inventory, we spoke to two of the Project Leaders, Heba Shaaban (Hunter College, NY, USA) and David Westfall (Weill Cornell Medicine, NY, USA), about the project, how it’s being used in research and how you can get involved.

Listen to the podcast, or read the interview in full below.

01.05 – Introducing the Microbe Directory

02.03 – What makes this database different from existing projects, such as MicrobeWiki?

02.49 – Building the inventory

04.32 – Using the directory

05.20 – Why is it important to be able to summarize characteristics such as biofilm formation and antimicrobial resistance?

06.06 – Why were the characteristics seen on the database chosen?

06.45 – Who do you envisage this being useful to? Do you have any examples of projects that have used the directory?

08.20 – Do you have any future plans for improving the directory, or expanding further?

Martha: Hi and welcome to Infectious Diseases Hub, I’m Martha, the Editor, and today I’m joined by Heba Shaaban and David Westfall,

Heba: So, my name is Heba Shaaban and I’m a student at City University of New York-Hunter College pursuing a bachelor’s degree in biochemical sciences and anthropology, and I’ve been working in Christopher Mason’s lab for about 2 years now.

David: And I’m David, so I’m a third-year medical student at Weill Cornell Medicine.

Martha: Both of whom are project leaders for a collective research effort called the Microbe Directory, which we’ll hopefully be hearing a little bit more about today.

In many metagenomic analyses a sample is processed for the DNA it contains and the microbes present in the sample are reported, however, many such analyses stop at this stage. The Microbe Directory allows this research to be taken a step further and we’re about to learn a little bit more about what it comprises.

Heba: So the Microbe Directory is basically an inventory that profiles around 7500 unique microbial species. It’s a large aggregation of data on various microbiological characteristics, such as optimal pH, optimal temperature, Gram stain and so on, and our project is basically a database that can be used downstream of large-scale metagenomic analyses but it can also be used by any individual who wishes to learn more about a particular microbe.

And the reason we started this project, it started when our principal investigator, Dr Christopher Mason would discover hundreds or thousands of bacterial species from the different projects that he’d work on and he was always frustrated because he’d end up searching for a particular microbe for hours to learn more about it. So he decided that there should be a better, faster way to learn more about a particular set of microbes, and that’s when the project really started.

Martha: Great, what makes this database different from existing projects such as MicrobeWiki and similar things?

David: So, I mean, MicrobeWiki’s a great resource but it’s better if you are looking up one particular bacterium, because it’s all basically rich text and it’s very qualitative. The Microbe Directory differs in that, again, we’re a quantitative resource. So if I had a sample of, say, 300–400 bacteria, I couldn’t use MicrobeWiki to compute the optimal pH of that sample, or cluster on any of the other parameters.
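As a rough illustration of the kind of computation David describes, the sketch below averages the optimal pH over the species found in a sample. The table layout, the field names (`optimal_ph`, `biofilm_forming`), the species names and the values are all assumptions made up for the example; they are not taken from the Microbe Directory or its actual file format.

```python
from statistics import mean

# Hypothetical trait table: species name -> curated traits.
# Names and values are illustrative placeholders, not real database entries.
TRAITS = {
    "Species A": {"optimal_ph": 7.0, "biofilm_forming": True},
    "Species B": {"optimal_ph": 6.0, "biofilm_forming": False},
    "Species C": {"optimal_ph": None, "biofilm_forming": True},  # pH not curated
}


def mean_optimal_ph(species, traits):
    """Average optimal pH over the sampled species that have a curated value."""
    values = [
        traits[name]["optimal_ph"]
        for name in species
        if name in traits and traits[name]["optimal_ph"] is not None
    ]
    # Return None rather than failing when no species has a usable value.
    return mean(values) if values else None


# "Species D" is absent from the table and is simply skipped.
sample = ["Species A", "Species B", "Species C", "Species D"]
sample_ph = mean_optimal_ph(sample, TRAITS)
```

Species with no curated value, or not in the table at all, are left out of the average instead of being treated as zero, which is one reasonable choice among several.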

Martha: 7667 microbes is no small number, and Heba and David told me a bit more about who they recruited to put together the project and how it was done.

So, it’s got upward of 7000 microbes in it, right? That must have been a lot of work!

Heba: 7667 to be exact! We decided that we really needed a large workforce to work on it, so we recruited students from other institutions. We decided to recruit students to partake in this project because we kind of wanted to make science more equitable and accessible – you know, we have many bright young scientists that can really make significant contributions to science if given the chance. So with a large project that requires a large workforce, it provides a great opportunity for young scientists to showcase their skills independently.

And what we basically did is we held training sessions at Weill Cornell and we had tutorials for them and guidelines to follow about how they should curate the internet for these specific parameters, and they did this independently for a total of 20 weeks until the database was complete.

Martha: So, how many students approximately took part, do you know?

Heba: 47 students.

Martha: Okay great, so quite a lot of microbes per student!

Heba: Yes! So, we divided it into only 10 species per week, so it doesn’t get too overwhelming. Each student had to curate for 4 to 5 hours and send their entries back to us, and for the first couple of weeks we heavily monitored their entries to make sure there were no significant errors and then after that they were just randomly checked.

Martha: With the emphasis very much on community interaction, there are several ways that you could get involved with the Microbe Directory, not just viewing it online but also contributing, and David explains a little bit more about these.

David: So probably the easiest way to use it is to go on our web interface. That kind of provides an interface somewhat similar to, say, MicrobeWiki – where you get online, you can search for a bacterium and it will pull up information.

Also, through our website you can get on to our GitHub page, where we host basically different versions of the database, for example, a Python version and an Excel version. So those are the versions that are really going to be useful for bioinformaticians to use.

Martha: The Microbe Directory may be used downstream of metagenomic, taxonomic analyses by collecting data on various microbe characteristics which allows one to link simple taxonomic classifications to much more interesting parameters.

So, why do you think it’s important to be able to summarize characteristics such as biofilm formation and antimicrobial resistance?

David: It’s really just additional data points for researchers. So let’s say you swab your mouth or saliva and you get a list of microbial species; now, it’s great to have that list but that doesn’t really tell you that much about the species. If you were able to get additional data points – so they’re all non-biofilm forming or they’re all susceptible to antibiotics and they grow near a pH of 7 – this I think could provide a lot of different insights to what you’re studying. So again it’s additional data, and I think the whole point is that it really could drive new ways to curate your research.

Martha: Why were the characteristics that were on the database chosen?

David: So we actually started with the more qualitative free-form text input, but the problem with that, we found, was that it was really hard to standardize. The different students would basically all do something slightly different, so it was really hard to make meaningful predictions based off of that. So by keeping the database simple, to parameters that are easily quantifiable, for instance, does it form a biofilm, does it not – a binary on/off parameter – it basically made it a lot easier to standardize and collect.
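To make the standardization problem concrete, here is a minimal sketch of mapping free-form curator entries onto a binary on/off value, with anything unrecognized marked as unknown. The accepted vocabulary and the function name `normalize_binary` are assumptions for illustration only; the interview does not describe how the project actually validated entries.

```python
# Hypothetical vocabularies; a real curation guide would define its own.
YES_WORDS = {"yes", "y", "true", "1", "positive"}
NO_WORDS = {"no", "n", "false", "0", "negative"}


def normalize_binary(entry):
    """Map a free-form entry to True, False, or None when it is unrecognized."""
    text = str(entry).strip().lower()
    if text in YES_WORDS:
        return True
    if text in NO_WORDS:
        return False
    # Free text that is not a clear yes/no is flagged as unknown, not guessed.
    return None


raw_entries = ["Yes", " no ", "TRUE", "forms thick biofilms on catheters", ""]
normalized = [normalize_binary(entry) for entry in raw_entries]
```

Entries that come back as `None` could then be routed to a person for review, which fits the point made above that free text from many contributors is hard to compare directly.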

Martha: So, who do you envisage this being useful to and do you have any examples of projects that are using the directory?

Heba: Yeah, so we envision the Microbe Directory as being a widely used platform for metagenomic analyses, or for any researcher who discovers a novel species they’ve been working on and wishes to catalogue it along with a reference to their work or their publication. And we hope that with individual contributions to the site it’s going to refine our understanding of the microbial community.

In terms of examples, it’s actually being used in two projects currently: one is MetaSUB and the other is ‘Stuck on You’, and they’ve both been projects that have been going on in our lab for a while.

So ‘Stuck on You’ is basically a longitudinal project, and it’s held at the JP Morgan Conference in San Francisco (CA, USA) in January every year, which has attendees from all over the world. Basically our team of scientists swab and sequence attendees’ phones to try and learn more about the activities, habits and even travel histories of the people that are represented by these microscopic genetic traces left on their phones.

And so what the Microbe Directory does is, after the samples have been sequenced, it is used to profile the microbial community on individual phones – we do this every year at the JP Morgan Conference to see how these communities change over time. And if you go to the ‘Stuck on You’ site you can actually see the taxonomic breakdown of the individual samples, and they’re anonymous – so you can check that out if you’d like.

Martha: Definitely, that sounds fascinating – I’d hate to think what was on my phone!

Heba: We get that all the time!

David: Probably better not to know sometimes!

Martha: So finally, do you have any future plans for expanding the directory or improving it further?

David: So, as I kind of mentioned earlier, one of the things we really wanted to do with this directory was make it a community project. If you go onto our web interface, you can actually submit edits to the values for any of the microbes. Basically, you submit your edits along with a citation, and we have a sort of administrative view where we can review the changes and either merge them into the database or reject them with reasons.

Alternatively, we also have that GitHub page, so if you wanted to make a more substantial structural change to the database we have the means to do that. And then finally we also have a Bitbucket page that hosts the source code for our website, so basically anyone who’s interested could come on and contribute and really make some substantial changes.

Martha: Okay, great, so it’s all about people just getting involved and contributing and using it in their own way.

David: Exactly, I mean these microbial communities are way too big for any small group of people to do, you really need a community to step in and help out.
