Good's coverage index

Beatriz Gil Pulido

Sep 20, 2016, 11:48:56 AM
to Qiime 1 Forum
Hi, 

Could someone help me explain Good's coverage index? I'm using it to support the rarefaction curves in my analysis, but I'm not really sure how to explain the index. I would appreciate any tips. 

Thanks! 

Colin Brislawn

Sep 20, 2016, 12:38:20 PM
to Qiime 1 Forum
Good morning,

Here is the official documentation of Good's coverage, from scikit-bio:

Good’s coverage estimator is defined as
1 - (F1 / N)
where F1 is the number of singleton OTUs and N is the total number of individuals (sum of abundances for all OTUs).
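
As a rough illustration of the formula above (not the scikit-bio implementation itself), here is a minimal Python sketch. The function name goods_coverage_sketch and the example counts are made up for this example.

```python
def goods_coverage_sketch(counts):
    """Good's coverage, 1 - F1 / N, from a list of per-OTU read counts."""
    n_reads = sum(counts)                          # N: total reads in the sample
    if n_reads == 0:
        raise ValueError("sample has no reads")
    singletons = sum(1 for c in counts if c == 1)  # F1: OTUs seen exactly once
    return 1 - singletons / n_reads


# Hypothetical sample: 7 OTUs, 100 reads, 3 singleton OTUs
example_counts = [50, 30, 15, 2, 1, 1, 1]
print(round(goods_coverage_sketch(example_counts), 2))
```

Note that the denominator is the read total, not the number of OTUs.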


Colin

Wale Adebayo

Sep 20, 2016, 1:25:45 PM
to Qiime 1 Forum
So, a Good's coverage index of 0.96 means approximately just 4% of your OTUs are probably not covered during sequencing. As Colin said, it uses singletons in the calculation. Other similar indices, like Huang, also use doubletons, but Good's is popular.

Colin Brislawn

Sep 20, 2016, 5:11:24 PM
to Qiime 1 Forum
Hello Wale,

"So, a Good's coverage index of 0.96 means approximately just 4% of your OTUs are probably not covered during sequencing."
Be careful! Let's make sure we get this definition right! 

1 - (F1 / N)
N does NOT equal the number of OTUs in the sample. N equals the number of reads in the sample (which is the sum of counts from all OTUs). 

So if a sample has a Good's coverage of 0.96, this means...
Correct answer: 4% of your reads in that sample are from OTUs that appear only once in that sample.
Wrong answer: 4% of your OTUs in that sample appear only once. 
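
To make the difference between the two readings concrete, here is a small worked example in Python. The counts are invented so that the numbers come out round; nothing here comes from a real dataset.

```python
# Made-up sample: reads per OTU (10 OTUs, 100 reads, 4 singleton OTUs)
counts = [40, 25, 15, 10, 4, 2, 1, 1, 1, 1]

n_reads = sum(counts)                          # N = 100
n_otus = len(counts)                           # 10 OTUs
singletons = sum(1 for c in counts if c == 1)  # F1 = 4

coverage = 1 - singletons / n_reads            # Good's coverage
share_of_reads = singletons / n_reads          # what 1 - coverage measures
share_of_otus = singletons / n_otus            # NOT what it measures

print("Good's coverage: %.2f" % coverage)                             # 0.96
print("Reads from singleton OTUs: %.0f%%" % (100 * share_of_reads))   # 4%
print("OTUs that are singletons: %.0f%%" % (100 * share_of_otus))     # 40%
```

In this toy sample the coverage is 0.96 even though 40% of the OTUs are singletons, because those singletons account for only 4% of the reads.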

These stats are tricky, and it's important to understand how they work and what specifically they are measuring. 

Lots of other alpha diversity metrics use singletons and doubletons too. Chao1 is pretty common, and ACE is a variant of Chao that uses an abundance threshold of 10 instead of 2. 

You can also calculate the number of singletons or doubletons:
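
For illustration, counting singletons and doubletons from a list of per-OTU read counts takes only a few lines of Python. The sketch below also plugs them into the bias-corrected form of Chao1, S_obs + F1(F1 - 1) / (2(F2 + 1)). The function names and counts are hypothetical, and this is not QIIME's own code.

```python
def singles(counts):
    """Number of OTUs observed exactly once (F1)."""
    return sum(1 for c in counts if c == 1)


def doubles(counts):
    """Number of OTUs observed exactly twice (F2)."""
    return sum(1 for c in counts if c == 2)


def chao1_bias_corrected(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU counts."""
    s_obs = sum(1 for c in counts if c > 0)  # observed OTUs
    f1 = singles(counts)
    f2 = doubles(counts)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical sample: 8 observed OTUs, 3 singletons, 2 doubletons
sample_counts = [12, 7, 3, 2, 2, 1, 1, 1, 0]
print(singles(sample_counts), doubles(sample_counts))
print(chao1_bias_corrected(sample_counts))
```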

Keep in touch,
Colin

Beatriz Gil Pulido

Sep 21, 2016, 12:16:47 PM
to Qiime 1 Forum
Hi Colin, 

Thanks for your answer and suggested reads. I saw in a publication that Good's coverage index was used to support the rarefaction curves. The paper states: " ... 0.99 suggested the coverage degree of the MIseq sequencing was high and anticipant". 

So, what I understood from that is that the closer the Good's coverage index is to 1, the better your sampling covered the community. I don't know if I am wrong about that. That is why I asked for a definition of the coverage index. 

Regards, 

Beatriz. 

Beatriz Gil Pulido

Sep 21, 2016, 12:24:33 PM
to Qiime 1 Forum
I asked this some time ago but cannot find the reference: when you calculate the alpha metrics, at which taxonomic level are the indices calculated?

Thanks, 

Bea. 

Colin Brislawn

Sep 21, 2016, 12:29:50 PM
to Qiime 1 Forum
Hello Beatriz,

"So, what I understood from that is that the closer the Good's coverage index is to 1, the better your sampling covered the community."
Yes, that's correct. 

"when you calculate the alpha metrics, at which taxonomic level are the indices calculated?"
By default, they are calculated at the OTU level (which depends on the 3% clustering distance, not on taxonomic classification). 

I hope that helps,
Colin


Beatriz Gil Pulido

Sep 21, 2016, 12:34:41 PM
to Qiime 1 Forum
Hi Colin, 

Many thanks, it helps a lot. 

Regards, 

Bea. 

Beatriz Gil Pulido

Sep 22, 2016, 11:01:23 AM
to Qiime 1 Forum
Hi Colin, 

Is there any way to calculate the alpha metrics at a different similarity cut-off? Let's say at 5% instead of 3%?

Thanks, 

Bea. 

Colin Brislawn

Sep 22, 2016, 12:46:05 PM
to Qiime 1 Forum
Hey Bea,

"Is there any way to calculate the alpha metrics at a different similarity cut-off? Let's say at 5% instead of 3%?"
I don't think there is a qiime script to do that, but you can take a look here: http://qiime.org/scripts/ 

If you wanted to try that, you could pick OTUs at different levels (say 10%, 5%, 3%, 1%, 0% (dereplication)), then calculate alpha metrics for each of these. 
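
One way to sketch that workflow is a small Python helper that only builds the command lines, one set per similarity level, without running anything. It assumes the QIIME 1 scripts pick_otus.py (with its -s similarity option), make_otu_table.py and alpha_diversity.py. The helper name, the directory layout and the assumed name of the OTU map file are my own guesses, so check each script's help output before running the printed commands.

```python
import os


def build_commands(seqs_fp, similarities, metrics="goods_coverage,chao1"):
    """Return shell command strings for each similarity level (nothing is run)."""
    base = os.path.splitext(os.path.basename(seqs_fp))[0]
    commands = []
    for sim in similarities:
        out_dir = "otus_%d" % round(sim * 100)              # e.g. otus_97
        # Assumed name of the OTU map written by pick_otus.py
        otu_map = os.path.join(out_dir, base + "_otus.txt")
        table = os.path.join(out_dir, "otu_table.biom")
        alpha = os.path.join(out_dir, "alpha.txt")
        commands.append("pick_otus.py -i %s -s %.2f -o %s" % (seqs_fp, sim, out_dir))
        commands.append("make_otu_table.py -i %s -o %s" % (otu_map, table))
        commands.append("alpha_diversity.py -i %s -m %s -o %s" % (table, metrics, alpha))
    return commands


# Similarity 0.90, 0.95, 0.97, 0.99 correspond to 10%, 5%, 3%, 1% distance
for cmd in build_commands("seqs.fna", [0.90, 0.95, 0.97, 0.99]):
    print(cmd)
```

Comparing the resulting alpha.txt files across the output directories would then show how each metric changes with the clustering threshold.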

Colin
