Gact/UTL package for calling SNPs

Rachel Tavares

Dec 21, 2023, 1:47:13 PM
to R/qtl discussion
Hi Karl, 

I am trying to use the Gact/UTL package to convert my VCF files (founders and samples) into an R/qtl genfile. I wasn't sure if you were familiar with this package, but I am having trouble with the last line of my code, which runs the conversion function. I uploaded the input files and a blank geno.csv file, set the sample and founder IDs, set the alleles, and gave the maximum sequence length. The final call stops with the error "file.exists(file) is not TRUE", and I am not sure where the problem lies. I know this is a separate R package (https://github.com/gact/utl/blob/master/R/convert_vcf_to_genfile.R); however, if you have any insights I would greatly appreciate it.

Rachel

CODE:

infiles <- c('RIL.vcf, parents.vcf')
genfile <- 'geno.csv'
sample.ids <- utl::read_samples_from_vcf('RIL.vcf')
founder.ids <- utl::read_samples_from_vcf('parents.vcf')
alleles <- utl::mapping(c(FOUNDER1='A', FOUNDER2='B'))
max.seqlength <- 43273689

convert_vcf_to_genfile <- function(infiles, genfile, samples, founders, alleles=NULL,
                                   max.seqlength=NULL, na.string=c('-', 'NA')) {

    stopifnot( is_single_string(genfile) )
    na.string <- match.arg(na.string)

    clashing.ids <- intersect(samples, founders)
    if ( length(clashing.ids) > 0 ) {
        stop("sample/founder ID clash - '", toString(clashing.ids), "'")
    }
    sample.data <- read_snps_from_vcf(infiles, samples=samples, max.seqlength=max.seqlength,
                                      require.any=TRUE, require.polymorphic=TRUE)

    founder.data <- read_snps_from_vcf(infiles, samples=founders, max.seqlength=max.seqlength,
                                       require.all=TRUE, require.polymorphic=TRUE)

    geno.mat <- make_geno_matrix(sample.data, founder.data, alleles=alleles)
    snp.loc <- parse_snp_marker_ids(colnames(geno.mat))

    id.col <- c('id', '', rownames(geno.mat))
    geno.mat <- rbind(colnames(geno.mat), snp.loc$chr, geno.mat)
    geno.mat <- cbind(id.col, geno.mat)

    utils::write.table(geno.mat, file=genfile, na=na.string, sep=',',
                       quote=FALSE, row.names=FALSE, col.names=FALSE)

    return( invisible() )
}

utl::convert_vcf_to_genfile(infiles, genfile, sample.ids, founder.ids,
                            alleles=alleles, max.seqlength=max.seqlength)
Error in FUN(X[[i]], ...) : file.exists(file) is not TRUE
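
The error text indicates that a `file.exists()` check failed on some path. One possible cause, offered as a guess rather than a confirmed diagnosis: `c('RIL.vcf, parents.vcf')` is a single string containing a comma, so R treats it as one file name, not two. Relative paths that do not match the working directory would produce the same symptom. The sketch below shows the difference in base R; `check_infiles` is a hypothetical helper written for illustration and is not part of the utl package.

```r
# The vector as written in the post: ONE string that happens to contain a comma.
infiles_as_posted <- c('RIL.vcf, parents.vcf')

# Two separate strings, one per VCF file.
infiles_split <- c('RIL.vcf', 'parents.vcf')

length(infiles_as_posted)  # 1
length(infiles_split)      # 2

# Hypothetical helper (not part of utl): stop early and name every
# path that cannot be found, along with the current working directory.
check_infiles <- function(paths) {
    absent <- paths[!file.exists(paths)]
    if (length(absent) > 0) {
        stop("input file(s) not found: ", toString(shQuote(absent)),
             " (working directory: ", getwd(), ")")
    }
    invisible(paths)
}
```

Running `check_infiles(infiles)` just before the call to `utl::convert_vcf_to_genfile()` would show which path, if any, R cannot see from the current working directory.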
