rsem-sam-validator

Xiaofeng Qian

Feb 18, 2024, 6:01:48 PM
to RSEM Users
I want to convert my alignment file (.bam) so that it satisfies RSEM's requirements.
However, when I run the conversion, the resulting file fails validation with an error.
Can anyone help?

Here is the log:
samtools sort -n -@ 1 -m 1G -o PA_5.AddRG.Reorder.convert_forRSEM.tmp.bam /storage2/workspace/RNAseq_SED_vs_PA/data/F23A430002092_MUSfxwlT/Alignment_hisat/bam/PA_5.AddRG.Reorder.Sort.bam
[bam_sort_core] merging from 13 files and 1 in-memory blocks...

rsem-scan-for-paired-end-reads 1 PA_5.AddRG.Reorder.convert_forRSEM.tmp.bam PA_5.AddRG.Reorder.convert_forRSEM.bam
.......................
Finished!

Conversion is completed. PA_5.AddRG.Reorder.convert_forRSEM.bam will be checked by 'rsem-sam-validator'.
rsem-sam-validator PA_5.AddRG.Reorder.convert_forRSEM.bam
.
The two mates of paired-end read V350220273L3C001R00100223784 are not adjacent!
The input file is not valid!
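
For readers unsure what "not adjacent" means here, the following is a minimal sketch of an adjacency check over SAM text records, for example the output of `samtools view` on the converted file. It is not RSEM's implementation: the function name `first_nonadjacent_pair` is hypothetical, and the rule it applies (every paired record must be immediately followed or preceded by a record with the same read name) is a simplifying assumption that ignores the first/second mate flags.

```python
from typing import Iterable, Optional


def first_nonadjacent_pair(sam_lines: Iterable[str]) -> Optional[str]:
    """Return the name of the first paired read whose two records are not
    on consecutive lines, or None if every pair is adjacent.

    Simplified assumption: a pair is two consecutive records that share
    a read name (QNAME) and have the 'paired' flag bit (0x1) set.
    """
    pending = None  # read name of a record still waiting for its mate
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if not flag & 0x1:
            # Unpaired record: a waiting mate can no longer be adjacent.
            if pending is not None:
                return pending
            continue
        if pending is None:
            pending = qname      # first record of a new pair
        elif qname == pending:
            pending = None       # mate found on the very next record
        else:
            return pending       # a different read interrupted the pair
    return pending               # non-None if the last pair was incomplete


if __name__ == "__main__":
    import sys
    # Example: samtools view file.bam | python this_script.py
    print(first_nonadjacent_pair(sys.stdin))
```

Under these assumptions, a read name returned by the function corresponds to the kind of record the validator complains about in the log above.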

Xiaofeng Qian

Feb 18, 2024, 6:21:37 PM
to rsem-...@googlegroups.com
This is the header of my BAM file:

$ samtools view -H PA_5.AddRG.Reorder.Sort.bam
@HD VN:1.6 SO:coordinate
@SQ SN:chr1 LN:195154279 M5:ebdcb59bd671b01de5a67d9803f500ea UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr10 LN:130530862 M5:cc1e9a05f16cacc4955f56bd0f579267 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr11 LN:121973369 M5:227171e89d468ee817a7fb0ccaba38cd UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr12 LN:120092757 M5:b0f91aa89b888e3c1f29263818b02210 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr13 LN:120883175 M5:fe93f607754e075f2849b35af77af07d UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr14 LN:125139656 M5:5587dd4c57372cc4fe78fe1f6f8b68d0 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr15 LN:104073951 M5:99f61c36433d8fc45bd265cffa4be357 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr16 LN:98008968 M5:8d8e220195ef6f4b400548600131d0d6 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr17 LN:95294699 M5:ca46671b6cafa344dddca2e6b0db0991 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr18 LN:90720763 M5:717a1f751989489983e41c7f195c202a UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr19 LN:61420004 M5:96b19db237dfb928d0695a3b366acf40 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr1_GL456210v1_random LN:169725 M5:0cc560d98f9f22f4385397db82e1c108 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr1_GL456211v1_random LN:241735 M5:36e85680c669756c9a1554cf31c9de03 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr1_GL456212v1_random LN:153618 M5:4c4dc3bfe987e3bc4ef4756bef269373 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr1_GL456221v1_random LN:206961 M5:e21da65d7276b256b8edf92660a928b0 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr1_MU069434v1_random LN:8412 M5:9de130f8b9ca81fa376dac13e6044f0b UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr1_GL456239v1_random LN:40056 M5:026a56195f744bfc337bd35c5c406ca3 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr2 LN:181755017 M5:08bc6a9988099d0d4cc457341933249a UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr3 LN:159745316 M5:2399e48fe19b768e0738efcedbc04b7a UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr4 LN:156860686 M5:c998850568fb66c8a0438a521965bf9a UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr4_JH584295v1_random LN:1976 M5:ebc2f8438cbd080b53dc1cf528bf070e UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr5 LN:151758149 M5:ded8e8ee2aae6a3e8c081e1a7c2e7c26 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr5_JH584296v1_random LN:199368 M5:9b5b5f3af54ac1c2e91964a2c8b3f9ee UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr5_JH584297v1_random LN:205776 M5:efb1b00ffad6dd710ffd5d46ce94a25c UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr5_JH584298v1_random LN:184189 M5:1910644b4393b414d16774d3a1b73c49 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr5_GL456354v1_random LN:195993 M5:61643b629b3105fd2f32cc82871ca8e0 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr5_JH584299v1_random LN:953012 M5:b6bc88bfe26ef155b5fe2a7b90830ca5 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr6 LN:149588044 M5:314397656eb3d25aaa70d407df67c6a8 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr7 LN:144995196 M5:cdc994b8172f4e1e3099a2b9a41f4684 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr7_GL456219v1_random LN:175968 M5:a9a02958a866bdbde864f45dba4678e9 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr8 LN:130127694 M5:33a6d892dcc698e5af446bd3cce98d61 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chr9 LN:124359700 M5:dce50387ea191577d2f47408be52d093 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrM LN:16299 M5:11c8af2a2528b25f2c080ab7da42edda UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456359v1 LN:22974 M5:603a636ed64b91e321f4016e95496e2e UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456360v1 LN:31704 M5:e920aab56a78dad48b143260fdd46def UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456366v1 LN:47073 M5:5a13a662666e5216edc9777123a90721 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456367v1 LN:42057 M5:c515676047df349605e1f0b9a2c931e2 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456368v1 LN:20208 M5:aafc72058b568c532611be3ed4de1421 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456370v1 LN:26764 M5:b1497fc4d9c50793ce42287b807023a6 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456372v1 LN:28664 M5:63d6c279d38036a35a614ebf3b5aff04 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456378v1 LN:31602 M5:f2e4730a3ebda87af4e300991b5b1fdd UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456379v1 LN:72385 M5:2e317d895c77ed23c1ead5d964787944 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456381v1 LN:25871 M5:cd52161045aaf547b16aa269806d0fa2 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456382v1 LN:23158 M5:0a6e2bdf8e33bbcf41a1f22d28b31040 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456383v1 LN:38659 M5:957a72650aef5448bb88d22c95091a3f UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456385v1 LN:35240 M5:14d999bf71d97ab109e0d120421813e1 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456387v1 LN:24685 M5:ee8d8a56353bfd1e03794413a6de4790 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456389v1 LN:28772 M5:33a8ae4e638372e6786e2139b868c9de UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456390v1 LN:24668 M5:98ab2165ef0892fb13b2826abef47f28 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456392v1 LN:23629 M5:4a6c8910adf1323d66f179e7b22c12b3 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456394v1 LN:24323 M5:d2fc329384f30769e5bb8e9e5547f88b UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_GL456396v1 LN:21240 M5:00bc9e8162a01525c883724d0f152c3a UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_JH584304v1 LN:114452 M5:3da6902c6bfae6e54184fe309460b5ce UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrUn_MU069435v1 LN:31129 M5:168e4d99aa1fa20c9740db4a2d0d4340 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrX LN:169476592 M5:2f96da7dbcddf6f0866de651f42417bb UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrX_GL456233v2_random LN:559103 M5:fea40855e95eb187dcbe56ae210d41dc UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrY LN:91455967 M5:493dcdf262cba21cae56d4092d5c9202 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrY_JH584300v1_random LN:182347 M5:63017c95f5328b37975b41ca1019ae14 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrY_JH584301v1_random LN:259875 M5:6b2fb727b7f71259d4b2aa53f921f83e UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrY_JH584302v1_random LN:155838 M5:8265ab6c29db6b3f7edbd3d0bf9f6649 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@SQ SN:chrY_JH584303v1_random LN:158099 M5:23d8b4c1dbfa3cc96e2e2d5f56914a31 UR:file:/ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic.fa
@RG ID:PA_5 LB:PA_5_library PL:illumina SM:PA_5 PU:machine
@PG ID:hisat2 PN:hisat2 VN:2.2.1 CL:"/ifswh4/BC_PUB_T1/Pipeline/Software/miniconda3/envs/hisat2/bin/hisat2-align-s --wrapper basic-0 --phred33 --sensitive --no-discordant --no-mixed -I 1 -X 1000 -p 8 -x /ifswh4/BC_PUB_T2/BioSysDB/BGI/v2201/Index/10090/UCSC/mm39/Index_geneome_Hisat2/genomic --novel-splicesite-outfile /ifswh7/BC_COM_RNA/F23A430002092/MUSfxwlT/result/backups/Alignment_hisat/PA_5/PA_5.junction -1 /tmp/65019.inpipe1 -2 /tmp/65019.inpipe2"
@PG ID:samtools PN:samtools PP:hisat2 VN:1.15.1 CL:/ifswh4/BC_PUB_T1/Pipeline/Software/miniconda3/envs/samtools/bin/samtools view -b -S -o /ifswh7/BC_COM_RNA/F23A430002092/MUSfxwlT/result/process/Alignment_hisat/PA_5/PA_5.bam -
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.19.2 CL:samtools view -H PA_5.AddRG.Reorder.Sort.bam
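
The two header fields most relevant to this problem are the sort order on the `@HD` line and the chain of programs on the `@PG` lines. As a rough sketch, they can be pulled out of `samtools view -H` output as follows. The function name `summarize_header` is hypothetical, and the code assumes the real header is tab-separated, as the SAM format specifies, even though the paste above shows spaces.

```python
from typing import Iterable, List, Optional, Tuple


def summarize_header(
    header_lines: Iterable[str],
) -> Tuple[Optional[str], List[Tuple[Optional[str], Optional[str]]]]:
    """Return (sort order from @HD, [(program name, version), ...] from @PG)."""
    sort_order = None
    programs = []
    for line in header_lines:
        fields = line.rstrip("\n").split("\t")
        # Each header field after the record type has the form TAG:value.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@HD":
            sort_order = tags.get("SO")
        elif fields[0] == "@PG":
            programs.append((tags.get("PN"), tags.get("VN")))
    return sort_order, programs


if __name__ == "__main__":
    import sys
    # Example: samtools view -H file.bam | python this_script.py
    print(summarize_header(sys.stdin))
```

For the header posted above, this would report the sort order `coordinate` and the programs hisat2 2.2.1, samtools 1.15.1 and samtools 1.19.2, which confirms that the input to `samtools sort -n` was coordinate-sorted HISAT2 output.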
