How to include covariates in Mega2famSKATRC

26 views
Skip to first unread message

Baojian Fan

unread,
May 17, 2020, 10:36:13 PM5/17/20
to Mega2
I attempted to incorporate covariates (age and sex) in Mega2famSKATRC by reading a customized table into R. However, it returned an error message "tryFn() <simpleError:: argument is of length zero>". I guess that it probably caused by the fact that R failed to link the individual ID between the Mega2 'SQLite' database and the covariates table. I also tried to include the covariates when generating the Mega2 'SQLite' database from PLINK files but I could not figure out how to do it. Can anyone help on this? Thank you.

Dan Weeks

unread,
May 18, 2020, 10:01:26 AM5/18/20
to Mega2
As long as the covariate data frame is set up to match the order of individuals in your phenotype file envir$phe, it looks to me like the code should work properly if your covariate matrix is passed to the Mega2famSKATRC function via the 'covariates' argument.  This is how that matrix is passed by Mega2R to the 'famSKAT_RC' function:

    argn = list(PHENO=envir$phe[,pheno], genotypes=geno, id=id, fullkins=envir$KIN,

               covariates=covariates, sqrtweights_c=sqrtweights_c, sqrtweights_r=sqrtweights_r,

               binomialimpute=binomialimpute, acc=acc, maf=maf, phi=phi)


    tim = system.time ({

        skat = do.call(famSKAT_RC, argn)

    })


Hope that helps,
Dan

Dan Weeks

unread,
May 18, 2020, 10:05:22 AM5/18/20
to Mega2
To include covariates when generating the Mega2 'SQLite' database fro PLINK files, put them in a 'phe' PLINK phenotype file:

0) Done with this menu - please proceed

 1) Select input file format:                  PLINK binary PED format (bed)

 2) Enter PLINK parameters:                     --missing-phenotype -9 --trait default

 3) Input file stem:                           bed

 4) Binary data file:  (PLINK .bed) [required] bed.bed

 5) Pedigree file:     (PLINK .fam) [required] bed.fam

 6) PLINK map file:    (PLINK .bim) [required] bed.bim

 7) Phenotype file:    (PLINK .phe) [optional] _

Dan Weeks

unread,
May 18, 2020, 10:09:26 AM5/18/20
to Mega2
Also, note that there may already be a column named 'sex' in the envir$phe data frame.  If so, if you want to include sex as a covariate, you may need to give it a different name (like 'gender') in your covariate matrix.

Baojian Fan

unread,
May 18, 2020, 10:05:43 PM5/18/20
to Mega2
Thanks Dan for your prompt response.

I did pass the covariate matrix (see below) to the Mega2famSKATRC function via the 'covariates' argument.

--------------------------------
> head(Cov)
  FID    IID Age Gender
1 401 401__1  -9      2
2 401 401__2  -9      1
3 401 401__3  -9      1
4 401 401__4  -9      2
5 401 401__5  -9      1
6 401 401__6  60      2
--------------------------------

The phenotype file imported from the Mega2 'SQLite' database looks like this:
---------------------------------
> head(ENV$phe)
  FID    IID    LT
1 401 401__1 -9.00
2 401 401__2 -9.00
3 401 401__3 -9.00
4 401 401__4 -9.00
5 401 401__5 -9.00
6 401 401__6  4.52
-------------------------------

Here are the R scripts that I used:
---------------------------------
ENV$verbose = TRUE

Mega2famSKATRC(pheno = 3, covariates = Cov$Age + Cov$Gender, genes = c("ARPP21"), envir = ENV, sqrtweights_c=wuweights_c, sqrtweights_r=wuweights_r, binomialimpute = TRUE, acc = 1e-06, maf = 0.05, phi = c(0, 0.2, 0.5, 0.9))
-----------------------------------

However, R still returned an error message "tryFn() <simpleError:: argument is of length zero>".

Can you let me know your advice? Thank you.

Baojian Fan

unread,
May 18, 2020, 10:30:22 PM5/18/20
to Mega2
I attempted to include age in a 'phe' PLINK phenotype file (see below) when generating the Mega2 'SQLite' database. I was using  Mega2 v5.0.0 because it seems that Mega2R v1.0.5 (the latest version) does not accept a 'SQLite' database generated by Mega2 v6.0.0 (the latest version).

--------------------------
FID     IID     Age
401     1       -9
401     2       -9
401     3       -9
401     4       -9
401     5       -9
401     6       60
401     7       -9
401     8       -9
401     9       -9
401     10      36
.
.
.
----------------------

But I got an error (see below). I am confused why there is a duplicate marker name 'Age' in the .map file. Any comments would be helpful. Thanks. 

===== Errors/warnings of type "duplicate_markers":
ERROR: Duplicate marker name 'Age' in 'Indian_WGS_autoSNP_v6_chr3.map'
===== 1 total records of type "duplicate_markers" are in MEGA2.ERR

Duplicate locus names, see MEGA2.ERR for details.
ERROR: Found fatal errors in locus file.
ERROR: Please fix errors and restart.
ERROR: annotated_ped_file.cpp:4489 Mega2 terminated. Error "INPUT_DATA_ERROR" (#4).
===========================================================
Run parameters stored in batch file 2020-5-18-22-11/MEGA2.BATCH
See run summaries in directory 2020-5-18-22-11
   MEGA2.LOG, MEGA2.ERR
The script 'mega2log2html.pl' exited normally.
To view the HTML-formatted run summaries, open
/n/data2/meei/op/fan/Indian_WGS/Mega2/chr3/2020-5-18-22-11/MEGA2run.html
in a web browser.
===========================================================
FAILURE! FAILURE! FAILURE! due to ERROR messages noted previously in the LOG and ERR files.



Dan Weeks

unread,
May 19, 2020, 9:46:46 AM5/19/20
to Mega2
I think the 'Cov' matrix should only include columns of the covariates you'd like to adjust for, and instead of passing in as

'covariates = Cov$Age + Cov$Gender,'

should be passed in as

'covariates = Cov'


Dan Weeks

unread,
May 19, 2020, 12:50:09 PM5/19/20
to Mega2
Regarding the problem of Mega2 6.0.0 databases not working with Mega2R, this issue has been fixed in Version 1.0.7 of Mega2R, which I have just submitted to CRAN so hopefully it will be there soon.  If you would like it sooner, you could install it from the BitBucket repository:  https://bitbucket.org/dweeks/mega2r/src/master/

Dan Weeks

unread,
May 19, 2020, 3:18:02 PM5/19/20
to Mega2
Regarding the "there is a duplicate marker name 'Age' in the .map file" problem, I am not sure why that happened.  But Mega2 v 5.0.0 is old and so it would be better to use the current version.  

If I use the bed.* example PLINK files, and add a 'bed.phe' file that looks like this:

FID IID P1 P2 T1 T2

1  1  0  0  1  21.2

1  2  0  0  2   1.3

1  3  0  0  1   0.9


then Mega2 Version 6.0.0 converts it to a 'dbmega2.db' which is read in properly by Mega2R Version 1.0.7 with no map file complaint:

> ENV <- read.Mega2DB("dbmega2.db")

> p <- mkphenotype(ENV)

> head(p)

  FID IID P1 P2 T1   T2 default

1   1   1  0  0  1 21.2       0

2   1   2  0  0  2  1.3       0

3   1   3  0  0  1  0.9       0

4   1   4  1  2  2 19.1       0

5   1   5  1  2  1 18.3       0

6   1   6  0  0  2  0.7       0

Baojian Fan

unread,
May 19, 2020, 3:46:55 PM5/19/20
to Mega2
When I was importing plink binary files (i.e., .bed, .bim and .fam) to mega2 v6.0.0 or v5.0.0, there is a bug saying "segmentation fault" , but it worked well for .ped and .map plink files.
Reply all
Reply to author
Forward
0 new messages