BLINK optimization

san...@gmail.com

Apr 21, 2020, 9:14:44 PM

to BLINK Forum

The BLINK user manual v1.01 states (p. 14, section 7.4, Optimization) that bin_size and bin_selection can be optimized. Could you please explain how these values are derived and used?

My understanding is that the bin_size array starts with the number of bp in the whole genome, based on cumulative positional data across the chromosomes. The example 3 50 5 0.5 would then mean 3e+6 bp in the whole genome, with bin lengths of 50e+6, 5e+6 and 0.5e+6 bp. Is this correct? If so, how can a bin length be greater than the genome?

The bin_selection example of 3 10 20 30 would mean a bin_selection array with 3e+6 bp for the whole genome (as above) and 10, 20 and 30 QTNs (SNPs) selected from each bin. Is this correct?

If LD (Pearson correlation?) is used to remove QTNs from bins, should the bin sizes be a portion of each chromosome, perhaps up to the LD decay length (for example, 300-1000 kb)? Is this the idea behind bin size optimization?
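To make the question concrete, here is a minimal Python sketch of how I picture bin size affecting which SNPs survive. It assumes one representative SNP (the smallest p-value) is kept per bin and that positions are cumulative across chromosomes. Both are my assumptions, not taken from the manual or the BLINK source, and the function name and data are hypothetical.

```python
def best_snp_per_bin(positions, p_values, bin_size):
    """Return indices of the most significant SNP in each bin.

    positions: cumulative genome-wide positions in bp (assumed input layout)
    p_values:  association p-values, one per SNP
    bin_size:  bin length in bp
    """
    best = {}  # bin number -> index of the best SNP seen so far
    for i, (pos, p) in enumerate(zip(positions, p_values)):
        b = pos // bin_size
        if b not in best or p < p_values[best[b]]:
            best[b] = i
    return sorted(best.values())


# Made-up example data: five SNPs on a cumulative bp scale.
positions = [100_000, 450_000, 900_000, 1_200_000, 5_600_000]
p_values = [1e-3, 1e-6, 1e-4, 1e-5, 1e-2]

small_bins = best_snp_per_bin(positions, p_values, 500_000)    # 0.5e+6 bp bins
large_bins = best_snp_per_bin(positions, p_values, 5_000_000)  # 5e+6 bp bins

print(small_bins)  # [1, 2, 3, 4]
print(large_bins)  # [1, 4]
```

In this toy case the 0.5e+6 bp bins keep four candidate SNPs while the 5e+6 bp bins keep only two, which is why the choice of bin size looks important to me.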

I have also read the paper in GigaScience which states BLINK only optimizes the number of QTNs.

I also understand that the LD threshold can be adjusted, with a default value of 0.7 (LD=0.7). Is this correct? If it is relaxed to, say, LD=0.8, more QTNs would be retained and tested in each bin. This makes a large difference to the output if the bins are not also optimized.
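As an illustration of what I mean by the LD threshold, here is a small Python sketch. It assumes that LD is measured as the absolute Pearson correlation between genotype vectors and that SNPs are visited from most to least significant, each one being dropped if it is correlated above the threshold with any SNP already kept. This is my guess at the logic, not BLINK's actual implementation, and the names and genotypes are made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def prune_by_ld(genotypes, threshold):
    """Greedy pruning: keep a SNP only if |r| <= threshold with every kept SNP.

    genotypes is assumed to be ordered from most to least significant.
    """
    kept = []
    for i, g in enumerate(genotypes):
        if all(abs(pearson_r(g, genotypes[j])) <= threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical genotypes (0/2 coding with one heterozygote) for 8 individuals.
genotypes = [
    [0, 0, 0, 0, 2, 2, 2, 2],  # SNP 0, most significant
    [0, 0, 0, 0, 2, 2, 2, 0],  # SNP 1, r with SNP 0 is about 0.77
    [0, 0, 0, 0, 1, 2, 2, 2],  # SNP 2, r with SNP 0 is about 0.94
    [0, 2, 0, 2, 0, 2, 0, 2],  # SNP 3, r with SNP 0 is 0
]

kept_07 = prune_by_ld(genotypes, 0.7)
kept_08 = prune_by_ld(genotypes, 0.8)

print(kept_07)  # [0, 3]
print(kept_08)  # [0, 1, 3]
```

With these made-up genotypes, raising the threshold from 0.7 to 0.8 lets SNP 1 through, so one extra QTN would be retained and tested.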

It seems to me that optimization of bin size is critical, but my understanding is either completely wrong or not clear enough to make proper use of it. I would appreciate any advice or explanation. Many thanks.

Dr Garth M. Sanewski

Principal Horticulturist, Horticulture & Forestry Science

Department of Agriculture and Fisheries

----------------------------------------------------------------------------------------------------------

T 07 53811333 Fax 54535901 E garth.s...@daf.qld.gov.au W www.daf.qld.gov.au

47 Mayers Rd, Nambour. QLD 4560

SCMC 5083, Nambour QLD 4560
