Higher Chao1 Values (20000-30000)


KV

Jul 18, 2016, 4:58:37 AM
to Qiime 1 Forum
Dear Team, 

I have a fundamental question about Chao1 values. The values below were obtained from Illumina sequencing data analyzed with QIIME, and they are much higher than the number of observed OTUs, which seems unusual. All samples were normalized to 44170 sequence reads.


I am curious to know whether Chao1 values around 20000 are acceptable. Please see the table below, which lists the alpha diversity measures for the samples.

I would like to know whether these Chao1 values make sense and, if so, what could explain them.

 I'll greatly appreciate your kind guidance regarding this issue. 

Thank you in advance.

Kind Regards,
Krishna


Sample  Observed OTUs  Chao1 richness  Shannon (H)  Simpson (D)  Phylogenetic diversity (PD)  Good’s coverage
1       3449           20017           5.0          0.87         210                          0.94
2       3629           17890           4.0          0.65         197                          0.93
3       4221           22226           4.8          0.81         230                          0.92
4       4083           23825           5.9          0.93         244                          0.92
5       4845           26178           6.6          0.93         287                          0.91
6       5157           26517           7.1          0.96         309                          0.91
7       5566           26663           7.3          0.97         332                          0.90
8       5563           29650           6.6          0.94         333                          0.90
0       4702           24495           6.1          0.92         275                          0.92







Colin Brislawn

Jul 18, 2016, 1:31:12 PM
to Qiime 1 Forum
Hello Krishna,

Thanks for getting in touch with us. 

When I calculate chao1 values, they are almost always higher than observed OTUs because of how the Chao metric works. Here is a pretty good resource that explains how alpha diversity metrics are calculated, and why chao1 is often higher. Basically:
If a sample contains many singletons, it is likely that more undetected OTUs exist, and the Chao 1 index will estimate greater species richness than it would for a sample without rare OTUs.

Observed: 3449, chao1: 20017
It looks like many OTUs appear in that sample only once, which greatly increases the chao1 metric.
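
One rough way to check this against the table is to back out the singleton count from Good's coverage. The sketch below assumes the usual definition, coverage = 1 - n1 / N, where n1 is the number of singleton OTUs and N is the number of reads per sample; `estimated_singletons` is a hypothetical helper name, not a QIIME function. Because the coverage in the table is rounded to two decimals, the result is only approximate (anywhere from roughly 2,400 to 2,900 singletons for sample 1).

```python
def estimated_singletons(goods_coverage, total_reads):
    """Invert Good's coverage, C = 1 - n1 / N, to approximate n1.

    Assumes the standard definition of Good's coverage; the input
    coverage is rounded in the table, so treat the result as a rough
    estimate only.
    """
    return round((1.0 - goods_coverage) * total_reads)


# Values for sample 1, taken from the table in the original post.
reads_per_sample = 44170
observed_otus = 3449

n1 = estimated_singletons(0.94, reads_per_sample)
singleton_fraction = n1 / observed_otus

# Approximate singleton count and the share of observed OTUs they make up.
print(n1, round(singleton_fraction, 2))
```

Under that assumption, roughly three quarters of the observed OTUs in sample 1 would be singletons, which is consistent with a chao1 estimate far above the observed count.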

Let me know if that helps,
Colin

KV

Jul 20, 2016, 9:38:52 AM
to Qiime 1 Forum
Hi Colin, 

Thank you so much for your immediate response. It was very helpful.
I would like to know whether we can represent these data (chao1) as such or singletons need to be removed for chao1 analysis.

Regards
Krishna

Colin Brislawn

Jul 20, 2016, 12:39:21 PM
to Qiime 1 Forum
Hello Krishna,

"I would like to know whether we can represent these data (chao1) as such or singletons need to be removed for chao1 analysis."
Singletons should not be removed, because chao1 uses those singletons in its calculations! If you were to filter out singletons, then chao1 would simply equal the number of observed species.
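
A minimal sketch of that effect, using the classic (uncorrected) form of the estimator and made-up toy counts. `chao1_classic` is a hypothetical helper written for illustration; QIIME's own implementation may apply a bias correction and will not give exactly the same numbers.

```python
def chao1_classic(counts):
    """Classic Chao1: S_obs + n1^2 / (2 * n2).

    counts is a list of per-OTU read counts for one sample.
    """
    present = [c for c in counts if c > 0]
    s_obs = len(present)                        # observed OTUs
    n1 = sum(1 for c in present if c == 1)      # singletons
    n2 = sum(1 for c in present if c == 2)      # doubletons
    if n2 == 0:
        # Bias-corrected form, used here only to avoid dividing by zero.
        return s_obs + n1 * (n1 - 1) / 2
    return s_obs + n1 * n1 / (2 * n2)


# Toy data (not from the thread): 11 OTUs, 6 singletons, 2 doubletons.
toy_counts = [50, 20, 7, 2, 2, 1, 1, 1, 1, 1, 1]
without_singletons = [c for c in toy_counts if c != 1]

chao1_with = chao1_classic(toy_counts)              # 11 + 36 / 4
chao1_without = chao1_classic(without_singletons)   # 5 + 0

print(chao1_with, chao1_without, len(without_singletons))
```

With the singletons filtered out, n1 is zero, so the correction term vanishes and the estimate collapses to the observed count.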

This paper does a good job explaining the chao1 metric: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC93182/
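
The sentence quoted next defines the symbols of an equation that did not come through in this post. Assuming it is the classic Chao1 estimator, it reads as follows in that notation:

\[
S_{\mathrm{Chao1}} = S_{\mathrm{obs}} + \frac{n_1^{2}}{2\,n_2}
\]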

"where Sobs is the number of observed species, n1 is the number of singletons (species captured once), and n2 is the number of doubletons (species captured twice)."



I hope this helps,

Colin



