HiC-Pro bowtie mapping issue


Xiaoyong Fu

Dec 11, 2020, 10:41:11 PM
to HiC-Pro
Hi, I am using hic-pro/2.11.4 on an HPC cluster. Splitting the fastq inputs and creating the hg38 HindIII digestion map both ran fine. But when I submitted step 1 (qsub HiCPro_step1**.sh), the job failed with this in the log:

##HiC-Pro mapping
stat: Bad file descriptor
Warning: Could not open read file "rawdata/00_SRR6493702_1.fastq" for reading; skipping...
Error: No input read files were valid
(ERR): bowtie2-align exited with value 1

My config-hicpro.txt sets the input file extensions as:

PAIR1_EXT = 1
PAIR2_EXT = 2

My paired fastq files are named: 00_SRR6493702_1.fastq, 00_SRR6493702_2.fastq,...

and the reads in 00_SRR6493702_1.fastq look like:

@SRR6493702.9999996.1 9999996 length=76
TATTATCTTCTTCTTCAGATTTTTTAACATGCTCAACATATTCTGTTACATCAACATTATGCTTAATGGCATTCTA
+SRR6493702.9999996.1 9999996 length=76
AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE6

I would really appreciate any help with this error.

Thanks!

Xiaoyong Fu

Baylor College of Medicine

Houston, TX

nservant

Dec 12, 2020, 9:04:05 AM
to HiC-Pro
Hi,
Could you please show me your command line and how your input folder is organized?
Note that you should use PAIR1_EXT = _1 and PAIR2_EXT = _2
Best
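
To illustrate why the underscore matters, here is a minimal sketch. It assumes the extension is matched as a plain substring of the file name, which is only my reading of the advice above and not a statement about HiC-Pro's internals. The function name split_mates is hypothetical and not part of HiC-Pro.

```python
def split_mates(names, pair1_ext, pair2_ext):
    """Classify FASTQ names by which pair extension they contain.

    Assumption: a plain substring match, used here only for illustration.
    Returns (mate1, mate2, ambiguous), where 'ambiguous' holds names that
    match both extensions.
    """
    mate1 = [n for n in names if pair1_ext in n]
    mate2 = [n for n in names if pair2_ext in n]
    ambiguous = sorted(set(mate1) & set(mate2))
    return mate1, mate2, ambiguous


# File names taken from the thread (two of the split chunks).
names = [
    "00_SRR6493702_1.fastq", "00_SRR6493702_2.fastq",
    "10_SRR6493702_1.fastq", "10_SRR6493702_2.fastq",
]

# Bare digits also match the chunk prefix and the accession number.
print(split_mates(names, "1", "2")[2])
# With the underscore, each file matches exactly one extension.
print(split_mates(names, "_1", "_2")[2])
```

Under that assumption, the bare values 1 and 2 cannot tell the mates apart, because the digits also occur elsewhere in the names, while _1 and _2 do.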


Xiaoyong Fu

Dec 13, 2020, 11:47:36 AM
to HiC-Pro
Hi, 

Thank you for your reply. I actually tried the extensions _1 and _2 as well, and the same error still occurred.

I used the command line: 
HiC-Pro -i ./Jin_MCF7_P -o ./Jin_MCF7_P_outputs -c ./config-hicpro.txt -p

I ran this command from the directory /project/schiff/XF/HiC. After running the command, the directory structure is:

/project/schiff/XF/HiC
├── chrom_hg38.sizes
├── config-hicpro.txt
├── fastq_dump_1.pbs
├── fastq_dump.pbs
├── hg38_hindiii.bed
├── HiC.pbs
├── Jin_MCF7_P
│   └── rawdata
├── Jin_MCF7_P_outputs
│   ├── bowtie_results
│   ├── config-hicpro.txt
│   ├── HiCPro_step1_MCF7_P_split.sh
│   ├── HiCPro_step2_MCF7_P_split.sh
│   ├── inputfiles_MCF7_P_split.txt
│   ├── logs
│   ├── rawdata -> /project/schiff/XF/HiC/Jin_MCF7_P
│   └── tmp
├── Jin_MCF7_TamR
│   └── SRR6493751.fastq.gz
├── split_reads_1.pbs
├── split_reads.pbs
├── SRR6493702_1.fastq.gz
├── SRR6493702_2.fastq.gz
└── SRR6493702.fastq.gz

The directory structure of Jin_MCF7_P is:

/project/schiff/XF/HiC/Jin_MCF7_P
└── rawdata
    ├── 00_SRR6493702_1.fastq
    ├── 00_SRR6493702_2.fastq
    ├── 01_SRR6493702_1.fastq
    ├── 01_SRR6493702_2.fastq
    ├── 02_SRR6493702_1.fastq
    ├── 02_SRR6493702_2.fastq
    ├── 03_SRR6493702_1.fastq
    ├── 03_SRR6493702_2.fastq
    ├── 04_SRR6493702_1.fastq
    ├── 04_SRR6493702_2.fastq
    ├── 05_SRR6493702_1.fastq
    ├── 05_SRR6493702_2.fastq
    ├── 06_SRR6493702_1.fastq
    ├── 06_SRR6493702_2.fastq
    ├── 07_SRR6493702_1.fastq
    ├── 07_SRR6493702_2.fastq
    ├── 08_SRR6493702_1.fastq
    ├── 08_SRR6493702_2.fastq
    ├── 09_SRR6493702_1.fastq
    ├── 09_SRR6493702_2.fastq
    ├── 10_SRR6493702_1.fastq
    ├── 10_SRR6493702_2.fastq
    ├── 11_SRR6493702_1.fastq
    ├── 11_SRR6493702_2.fastq
    ├── 12_SRR6493702_1.fastq
    ├── 12_SRR6493702_2.fastq
    ├── 13_SRR6493702_1.fastq
    ├── 13_SRR6493702_2.fastq
    ├── 14_SRR6493702_1.fastq
    ├── 14_SRR6493702_2.fastq
    ├── 15_SRR6493702_1.fastq
    ├── 15_SRR6493702_2.fastq
    ├── 16_SRR6493702_1.fastq
    ├── 16_SRR6493702_2.fastq
    ├── 17_SRR6493702_1.fastq
    ├── 17_SRR6493702_2.fastq
    ├── 18_SRR6493702_1.fastq
    ├── 18_SRR6493702_2.fastq
    ├── 19_SRR6493702_1.fastq
    ├── 19_SRR6493702_2.fastq
    ├── 20_SRR6493702_1.fastq
    ├── 20_SRR6493702_2.fastq
    ├── 21_SRR6493702_1.fastq
    └── 21_SRR6493702_2.fastq

My config file is as follows:

#######################################################################
## SYSTEM - PBS - Start Editing Here !!
#######################################################################
N_CPU = 2
LOGFILE = hicpro.log

JOB_NAME = MCF7_P_split
JOB_MEM = 64gb
JOB_WALLTIME = 12:00:00
JOB_QUEUE = batch
JOB_MAIL = xiao...@bcm.edu

#########################################################################
## Data
#########################################################################

PAIR1_EXT = _1
PAIR2_EXT = _2

#######################################################################
## Alignment options
#######################################################################

FORMAT = phred33
MIN_MAPQ = 0

BOWTIE2_IDX_PATH = /project/schiff/XF/Bowtie2Index
BOWTIE2_GLOBAL_OPTIONS = --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder
BOWTIE2_LOCAL_OPTIONS =  --very-sensitive -L 20 --score-min L,-0.6,-0.2 --end-to-end --reorder

#######################################################################
## Annotation files
#######################################################################

REFERENCE_GENOME = hg38
GENOME_SIZE = chrom_hg38.sizes

#######################################################################
## Allele specific
#######################################################################

ALLELE_SPECIFIC_SNP =

#######################################################################
## Digestion Hi-C
#######################################################################

GENOME_FRAGMENT = hg38_hindiii.bed
LIGATION_SITE = AAGCTAGCTT
MIN_FRAG_SIZE = 100
MAX_FRAG_SIZE = 100000
MIN_INSERT_SIZE = 100
MAX_INSERT_SIZE = 600

#######################################################################
## Hi-C processing
#######################################################################

MIN_CIS_DIST =
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 1
RM_SINGLETON = 1
RM_MULTI = 1
RM_DUP = 1

#######################################################################
## Contact Maps
#######################################################################

BIN_SIZE = 500000 1000000
MATRIX_FORMAT = upper

#######################################################################
## ICE Normalization
#######################################################################
MAX_ITER = 100
FILTER_LOW_COUNT_PERC = 0.02
FILTER_HIGH_COUNT_PERC = 0
EPS = 0.1

Thanks again for your response and help!

Best,

Xiaoyong

nservant

Dec 13, 2020, 1:07:25 PM
to HiC-Pro
Hmm ... I am just wondering whether something is wrong with the 'rawdata' folder.
Could you please:
- Put your fastq files in something like /project/schiff/XF/HiC/Jin_MCF7_P/SRR6493702/*.fastq
- Run HiC-Pro with -p again
- Check that it finds the list of input files in Jin_MCF7_P_outputs/inputfiles_MCF7_P_split.txt
- Submit the job to your cluster from the Jin_MCF7_P_outputs folder, for instance `qsub HiCPro_step1_MCF7_P_split.sh`
Thanks
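
As a rough way to check the layout before rerunning, the sketch below lists the entries one might expect to see in inputfiles_MCF7_P_split.txt. It assumes one sub-folder per sample under the input folder and "sample/file" entries for the first mate only, which is modeled on the file names in this thread rather than on HiC-Pro's own code. The helper name expected_input_list is hypothetical.

```python
import tempfile
from pathlib import Path


def expected_input_list(input_dir, pair1_ext="_1"):
    """Return 'sample/file' entries for mate-1 FASTQ files.

    Assumption: the input folder holds one sub-folder per sample, and a
    file belongs to mate 1 if its name contains pair1_ext.
    """
    root = Path(input_dir)
    entries = []
    for sample in sorted(p for p in root.iterdir() if p.is_dir()):
        for fq in sorted(sample.glob("*.fastq*")):
            if pair1_ext in fq.name:
                entries.append(f"{sample.name}/{fq.name}")
    return entries


# Demo on a throwaway directory that mimics the suggested layout.
with tempfile.TemporaryDirectory() as tmp:
    sample_dir = Path(tmp) / "SRR6493702"
    sample_dir.mkdir()
    for chunk in ("00", "01"):
        for mate in ("1", "2"):
            (sample_dir / f"{chunk}_SRR6493702_{mate}.fastq").touch()
    demo = expected_input_list(tmp)
    print("\n".join(demo))
```

On a real run you would point expected_input_list at the folder passed to -i and compare the result with the generated inputfiles list.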



Xiaoyong Fu

Dec 13, 2020, 5:13:23 PM
to nservant, HiC-Pro
Thank you for your reply. I moved the input files as you suggested. But after submitting the step 1 job to the cluster with qsub, I got this error:

qsub: Unknown queue MSG=cannot locate queue


The inputfiles_MCF7_P_split.txt file contains:

SRR6493702/00_SRR6493702_1.fastq
SRR6493702/01_SRR6493702_1.fastq
SRR6493702/02_SRR6493702_1.fastq
SRR6493702/03_SRR6493702_1.fastq
SRR6493702/04_SRR6493702_1.fastq
SRR6493702/05_SRR6493702_1.fastq
SRR6493702/06_SRR6493702_1.fastq
SRR6493702/07_SRR6493702_1.fastq
SRR6493702/08_SRR6493702_1.fastq
SRR6493702/09_SRR6493702_1.fastq
SRR6493702/10_SRR6493702_1.fastq
SRR6493702/11_SRR6493702_1.fastq
SRR6493702/12_SRR6493702_1.fastq
SRR6493702/13_SRR6493702_1.fastq
SRR6493702/14_SRR6493702_1.fastq
SRR6493702/15_SRR6493702_1.fastq
SRR6493702/16_SRR6493702_1.fastq
SRR6493702/17_SRR6493702_1.fastq
SRR6493702/18_SRR6493702_1.fastq
SRR6493702/19_SRR6493702_1.fastq
SRR6493702/20_SRR6493702_1.fastq
SRR6493702/21_SRR6493702_1.fastq


By the way, my cluster runs Rocks 6.1 (Emerald Boa).


Thanks again for your help!

Best

Xiaoyong



nservant

Dec 14, 2020, 2:54:51 AM
to HiC-Pro
You must fill in these fields in the config file:

#######################################################################
## SYSTEM AND SCHEDULER - Start Editing Here !!
#######################################################################
N_CPU = 2
SORT_RAM = 768M
LOGFILE = hicpro.log

JOB_NAME = 
JOB_MEM = 
JOB_WALLTIME = 
JOB_QUEUE = 
JOB_MAIL = 

The error you are seeing occurs because JOB_QUEUE is not defined.
Best
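
A quick way to spot empty scheduler fields before submitting is sketched below. It assumes the config uses plain "KEY = value" lines with "#" comments, as in the files quoted in this thread. The function names and the list of required keys are my own illustration, not part of HiC-Pro. Note that a non-empty JOB_QUEUE is presumably only part of it: the name also has to be a queue that your scheduler actually provides.

```python
def parse_config(text):
    """Parse 'KEY = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Keys taken from the scheduler section quoted above (assumed all required).
SCHEDULER_KEYS = ("JOB_NAME", "JOB_MEM", "JOB_WALLTIME", "JOB_QUEUE", "JOB_MAIL")


def missing_scheduler_fields(text):
    """Return the scheduler keys that are absent or have an empty value."""
    settings = parse_config(text)
    return [key for key in SCHEDULER_KEYS if not settings.get(key)]


example = """
## SYSTEM AND SCHEDULER
N_CPU = 2
JOB_NAME = MCF7_P_split
JOB_MEM = 64gb
JOB_WALLTIME = 12:00:00
JOB_QUEUE =
JOB_MAIL =
"""

print(missing_scheduler_fields(example))
```

For a real check, read the text from the copy of config-hicpro.txt in the output folder and pass it to missing_scheduler_fields.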