A question about finding conserved elements using phastCons


lenis vasilis

Oct 17, 2013, 7:28:51 AM
to gen...@soe.ucsc.edu
Hello everybody,

My name is Vasilis and I am a PhD student. I am a computer scientist, and unfortunately my knowledge of biology is limited.
I want to study conserved elements among mammals, and more specifically among ruminants.
To do that, I first need a way to identify them. I used the phastCons files that you provide for 46 species (with the human genome as reference). From these files I extracted runs of consecutive nucleotides with a conservation probability above 99%, using different minimum lengths (more than 100, 150, 200, 250, 300, and 350 bp). I then extract these subsequences from the reference genome (human) and BLAST them back against each genome.
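
As a rough sketch of the run-finding step, assuming the phastCons scores are in fixedStep wiggle format with step=1 and only `fixedStep` header lines (the function name `conserved_runs` and its parameters are illustrative and not part of any UCSC tool):

```python
def conserved_runs(lines, min_score=0.99, min_len=100):
    """Yield (chrom, start, end) for runs of consecutive bases whose
    score is above min_score and whose length is at least min_len.
    Coordinates are 1-based and inclusive, as in fixedStep headers.
    Assumption: step=1, so run length equals end - start + 1."""
    chrom, pos, step = None, 0, 1
    run_start = run_end = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("fixedStep"):
            # Simplification: a new block always closes any open run,
            # even if it happens to be contiguous with the previous one.
            if run_start is not None and run_end - run_start + 1 >= min_len:
                yield chrom, run_start, run_end
            run_start = None
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
            continue
        if float(line) > min_score:
            if run_start is None:
                run_start = pos
            run_end = pos
        else:
            # A low-scoring base ends the current run.
            if run_start is not None and run_end - run_start + 1 >= min_len:
                yield chrom, run_start, run_end
            run_start = None
        pos += step
    # Flush a run that reaches the end of the input.
    if run_start is not None and run_end - run_start + 1 >= min_len:
        yield chrom, run_start, run_end
```

Calling it once per minimum length (100, 150, and so on) on an open file handle would give the candidate intervals for each setting.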
I find around 49,000 conserved elements longer than 100 bp across the human genome, but when I BLAST them against another genome, for example the mouse genome, only 445 of them are found. (I keep only the hits with 100% similarity and the same length as the query.)
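
The hit filter can be sketched as follows, assuming BLAST was run with tabular output (`-outfmt 6`) and its default column order, where column 1 is the query ID, column 3 the percent identity, and column 4 the alignment length. The function name `perfect_hits` and the `query_lengths` dictionary are illustrative:

```python
def perfect_hits(blast_lines, query_lengths):
    """Return the set of query IDs that have at least one hit with
    100% identity over the full length of the query.
    Assumes default -outfmt 6 columns: qseqid, sseqid, pident, length, ..."""
    found = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.rstrip("\n").split("\t")
        qseqid = cols[0]
        pident = float(cols[2])
        aln_len = int(cols[3])
        # Keep the query only if the alignment is identical and full length.
        if pident == 100.0 and aln_len == query_lengths.get(qseqid):
            found.add(qseqid)
    return found
```

The size of the returned set is the number of elements counted as found in the other genome under this strict criterion.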
I assumed that all of these conserved elements should also be found in the other species.
Could you tell me whether the methodology I am following is appropriate?
I am trying to work out whether there is a problem with my methodology or whether I am miscalculating something.

Thank you very much in advance,
Vasilis.

Luvina Guruvadoo

Oct 17, 2013, 4:13:30 PM
to lenis vasilis, gen...@soe.ucsc.edu
Hi Vasilis,

Unfortunately, your question is beyond the scope of this mailing list as
we do not offer advice on scientific direction. This list is intended
to address questions concerning the use of the Genome Browser.

We do have an existing Conserved Elements track on the human hg19
assembly, which you can read more about here:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons46way. The papers
listed under "References" may also be useful.

If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible
forum. If your question includes sensitive data, you may send it instead
to genom...@soe.ucsc.edu.

- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group



Vasileios Panagiotis Lenis [vpl]

Jun 9, 2017, 11:30:22 AM
to gen...@soe.ucsc.edu
Hello UCSC genome browser team,

I am trying to use the bamToPsl tool from Kent's toolbox, and even though my BAM file is small (around 2 MB), I get a Segmentation fault (core dumped).
It generates a PSL file (a few KB) but then crashes. I tried sorting the BAM file first; the PSL file it generates is larger, but it crashes again.
Am I doing something wrong?

Thank you in advance,
Vasilis.




Brian Lee

Jun 13, 2017, 4:19:26 PM
to Vasileios Panagiotis Lenis [vpl], gen...@soe.ucsc.edu
Dear Vasilis,

Thank you for using the UCSC Genome Browser and for your question about a segmentation fault when using bamToPsl. We see a similar issue when testing a BAM file of about 1.2 GB, where unaligned sequences cause a crash, although we do not run into problems with smaller files.

Thank you again for reporting this issue and using the UCSC Genome Browser. A fix for this bug has been implemented in our test environment and should be released to our public site with our next release of v351 after Wednesday July 5th (http://hgdownload.soe.ucsc.edu/admin/exe/).


All the best,

Brian Lee
UCSC Genomics Institute


