Issue with current JASPAR native tracks


Rafael Riudavets Puig

Aug 9, 2023, 12:35:47 PM
to gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello,


We were recently in touch regarding an issue with the current JASPAR native tracks. In brief, the current tracks appear to be out of sync with the latest version of our tracks (the relevant part of our previous e-mail is quoted below).


--

I also wanted to bring to your attention that the current native tracks for the 2022 release seem to be outdated. For example, this TFBS is annotated as TFAP4::ETV1, with matrix ID MA1779.1. However, this matrix ID in JASPAR corresponds to a completely different profile. The last update of the UCSC tracks seems to have happened on 2021-12-23, but our latest version of the tracks dates from 2022-05-07 (you can find the bigBed files here). My best guess is that there was a miscommunication when the tracks were updated on our side, so the update never happened on the UCSC side.

--
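A mismatch of the kind described above (a track item whose matrix ID and TF name disagree with the JASPAR database) could be detected with a check along these lines. This is only a minimal sketch, not the JASPAR or UCSC procedure. It assumes the PFMs are in JASPAR text format, where each header line has the form `>MATRIX_ID NAME`, and that the track items have already been reduced to (matrix ID, TF name) pairs, for example after converting the bigBed to text with bigBedToBed. The function names, the matrix IDs, and the TF names in the sample data are all hypothetical.

```python
def parse_jaspar_headers(text):
    """Map matrix ID -> TF name using JASPAR-format header lines ('>ID NAME')."""
    profiles = {}
    for line in text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            if len(parts) == 2:
                matrix_id, name = parts
                profiles[matrix_id] = name.strip()
    return profiles


def find_mismatches(track_items, profiles):
    """Return (matrix_id, track_name, database_name) for items that disagree.

    database_name is None when the matrix ID is missing from the PFM file.
    """
    mismatches = []
    for matrix_id, track_name in track_items:
        database_name = profiles.get(matrix_id)
        if database_name != track_name:
            mismatches.append((matrix_id, track_name, database_name))
    return mismatches


# Hypothetical sample data, for illustration only (not real JASPAR profiles).
pfm_text = """>MX0001.1 TFA
A [ 1 2 ]
C [ 0 1 ]
G [ 3 0 ]
T [ 0 1 ]
>MX0002.1 TFB::TFC
A [ 2 2 ]
C [ 1 0 ]
G [ 0 1 ]
T [ 1 1 ]
"""
track_items = [("MX0001.1", "TFA"), ("MX0002.1", "TFD"), ("MX0003.1", "TFE")]

profiles = parse_jaspar_headers(pfm_text)
for matrix_id, track_name, database_name in find_mismatches(track_items, profiles):
    print(f"{matrix_id}: track says {track_name!r}, database says {database_name!r}")
```

Running a check like this whenever either side updates its files would flag a stale track before users notice it.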


Could you help us with this? Also, to make this type of issue easier to tackle in the future, we would appreciate it if we could define a standard workflow for fixing the genome browser tracks whenever needed, or for updating them with a new release.


Thank you very much in advance for your help,


Rafael Riudavets Puig


Jairo Navarro Gonzalez

Aug 11, 2023, 6:28:59 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

We have created an internal ticket to track the release of the JASPAR update. Unfortunately, I cannot estimate when the files will be updated on the live servers. In the future, for major releases, it would be great if you could create the hub first and then send us the URL to the hub.txt file so we can review and incorporate the dataset into the Genome Browser.

If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser



Rafael Riudavets Puig

Sep 13, 2023, 12:22:48 PM
to gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Dear UCSC Genome Browser team,


We finally have a track hub for the JASPAR 2024 release which is ready for updating the native tracks. You can find the track hub here: https://frigg.uio.no/ftp/mathelier/JASPAR_genome_browser_tracks/2024/hub.txt


Just to note, some of the HTML pages will have small changes (e.g. reference to the latest publication, etc) once the article describing the new release is published. How difficult would it be to get these updated on your side when the article and new database release are published?


Best regards,


Rafael




Jairo Navarro Gonzalez

Sep 20, 2023, 6:33:27 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

We have created an internal ticket to update the native JASPAR track and will send an announcement when the update is complete. In the meantime, we recommend that you add the 2024 update to the JASPAR TFBS hub that is available on the Public Hubs page:

http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt

Adding the 2024 update to the public hub will allow users to access this data while we work on updating the native tracks on our side.

Once the HTML pages are updated, send us another email, and we can pull in the changes once they are ready.

I hope this is helpful.

Jairo Navarro
UCSC Genome Browser

Jairo Navarro Gonzalez

Oct 2, 2023, 5:04:13 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute
Hello, 

We were looking to get the updated PFMs file from the downloads page, https://jaspar.genereg.net/downloads/, but the latest file is from the 2022 release. Have the PFMs been updated in this release? If so, can you send us a link to the file? 

Jairo Navarro 
UCSC Genome Browser

Rafael Riudavets Puig

Oct 3, 2023, 12:01:21 PM
to Jairo Navarro Gonzalez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hi,


Sorry about that. The new data is currently available at our test page (https://testjaspar.uio.no/downloads/) until the review process for this release's article is completed. Would it work for you to use the test URL?


Best,


Rafael



Jairo Navarro Gonzalez

Oct 6, 2023, 7:06:26 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute
Hello,

Thank you for sending us a URL to the data. It works perfectly for our purposes. We are brainstorming internally about ways to automate the addition of future JASPAR updates, so we will need a way to access this data along with the hub files in the future.


Jairo Navarro
UCSC Genome Browser

Rafael Riudavets Puig

Oct 9, 2023, 1:27:33 PM
to Jairo Navarro Gonzalez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello,


As an idea, would it work if we added the required data in a directory within the track hub URL? As an example for this instance, the URL of the PFMs could be something like: https://frigg.uio.no/ftp/mathelier/JASPAR_genome_browser_tracks/2024/PFMs.zip (note that this URL does not exist at the moment). We have automated the process of generating the track hubs, so adding this file should be simple enough. In this way, all the required data would be there whenever the tracks are updated. We could let you know when there is a new version, and we would be sure that the PFMs and the tracks are always in sync, since they would be generated together.


Best,


Rafael



Maximilian Haeussler

Oct 10, 2023, 9:39:18 AM
to Rafael Riudavets Puig, Jairo Navarro Gonzalez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute
This would work. Also, instead of a /2024/ URL, could this be made a /current/ URL? Same file formats, no change to file names, etc. Just a stable archive. We can then automate the update and won't have to change anything.

I wonder if this is worth the effort, you may not update every year in the future...?

Rafael Riudavets Puig

Oct 25, 2023, 12:32:21 PM
to Maximilian Haeussler, Jairo Navarro Gonzalez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hi,


For now at least, we plan to create new genome browser tracks for every new release. We created a /current/ URL (https://frigg.uio.no/JASPAR/JASPAR_genome_browser_tracks/current/hub.txt) that will always point to the latest version (in this case, 2024). I hope this works for you. One more question, just to make sure I add the correct file to the directory: which exact file are you using from the downloads page? I am asking because we have the data in different formats (one single txt file with all PFMs, zipped files, etc.) and I want to make sure I am adding the correct one.


Best,


Rafael



Jairo Navarro Gonzalez

Oct 31, 2023, 7:03:09 PM
to Rafael Riudavets Puig, Maximilian Haeussler, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello,

Thank you for creating the /current URL that we can use in the future. We obtained the zip version of the JASPAR CORE PFMs. Specifically, we used this URL to obtain the latest PFMs:

https://testjaspar.uio.no/download/data/2024/CORE/JASPAR2024_CORE_non-redundant_pfms_jaspar.zip

I hope this is helpful.

Jairo Navarro
UCSC Genome Browser

Rafael Riudavets Puig

Nov 1, 2023, 1:08:12 PM
to Jairo Navarro Gonzalez, Maximilian Haeussler, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Dear Jairo,


Thanks for your answer. I just added the file at the following URL: https://frigg.uio.no/JASPAR/JASPAR_genome_browser_tracks/current/PFMs.zip

For each release, we can place the PFMs in a file named PFMs.zip located in the same directory as the hub.txt file. This way you would know where to find the required information. Does that sound good?
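The convention proposed here (PFMs.zip always sitting next to hub.txt) means the PFM archive location can be derived from the hub URL alone. A minimal sketch of that derivation follows; the helper name `companion_urls` is hypothetical, and nothing is downloaded.

```python
from urllib.parse import urljoin


def companion_urls(hub_url, pfm_filename="PFMs.zip"):
    """Return the hub.txt URL and the PFM archive URL located next to it."""
    # urljoin replaces the last path segment (hub.txt) with the file name.
    return {"hub": hub_url, "pfms": urljoin(hub_url, pfm_filename)}


current = companion_urls(
    "https://frigg.uio.no/JASPAR/JASPAR_genome_browser_tracks/current/hub.txt"
)
print(current["pfms"])
# https://frigg.uio.no/JASPAR/JASPAR_genome_browser_tracks/current/PFMs.zip
```

With a stable /current/ directory, an automated update would only need the one hub URL as configuration.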


Best,


Rafael



Jairo Navarro Gonzalez

Nov 1, 2023, 5:56:45 PM
to Rafael Riudavets Puig, Maximilian Haeussler, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello,

Thank you for sending your follow-up. Yes, the current directory contains everything we need to automate future updates. We will let you know if any issues occur on our side.

Jairo Navarro Gonzalez

Nov 7, 2023, 7:53:37 PM
to Rafael Riudavets Puig, Maximilian Haeussler, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Rafael Riudavets Puig

Nov 8, 2023, 12:14:34 PM
to Jairo Navarro Gonzalez, Maximilian Haeussler, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Dear Jairo, 


Apologies for that. I just replaced the bigBed files with new ones following the schema from 2022. I hope these work, but please let me know if something is wrong.


Best,


Rafael



Gerardo Perez

Jan 30, 2024, 4:14:20 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello, Rafael.

My name is Gerardo Perez, and I am from the UCSC Genome Browser. We are working on adding the JASPAR 2024 data to the UCSC Genome Browser. We noticed an issue with the JASPAR2024_hg38.bb. We suspect that the bigBed was built using an older version of the bedToBigBed utility. When running bigBedInfo with this bigBed, we don’t get zoom levels:

$ bigBedInfo https://frigg.uio.no/JASPAR/JASPAR_genome_browser_tracks/current/hg38/JASPAR2024_hg38.bb
version: 4
fieldCount: 7
hasHeaderExtension: yes
isCompressed: yes
isSwapped: 0
extraIndexCount: 0
itemCount: 18,230,308,889
primaryDataSize: 170,623,017,979
zoomLevels: 0
chromCount: 455
basesCovered: 3,049,277,220
meanDepth (of bases covered): 49.359647
minDepth: 1.000000
maxDepth: 1095.000000
std of depth: 47.300427

Whereas the JASPAR2022.bb does have zoom levels:

$ bigBedInfo https://hgdownload.soe.ucsc.edu/gbdb/hg38/jaspar/JASPAR2022.bb                          
version: 4
fieldCount: 7
hasHeaderExtension: yes
isCompressed: yes
isSwapped: 0
extraIndexCount: 0
itemCount: 12,589,080,473
primaryDataSize: 121,096,743,436
primaryIndexSize: 789,538,396
zoomLevels: 10
chromCount: 194
basesCovered: 2,948,541,081
meanDepth (of bases covered): 45.715023
minDepth: 1.000000
maxDepth: 993.000000
std of depth: 42.837314

Could you rebuild the bigBed using the latest binaries? You can find our utilities with the latest binaries in the utilities directory on our downloads server: https://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads
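The zoom-level check described above can be scripted before a bigBed is published. The sketch below only parses text in the `key: value` layout that bigBedInfo prints in this message and reads the `zoomLevels` field. The function names are hypothetical, and the sample strings are excerpts of the two outputs quoted above. In practice the text could be captured by running bigBedInfo through `subprocess.run`, which is left out here so the example runs without the UCSC utilities installed.

```python
def parse_bigbed_info(text):
    """Parse 'key: value' lines from bigBedInfo output into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def zoom_level_count(info):
    """Return the number of zoom levels, or 0 if the field is absent."""
    return int(info.get("zoomLevels", "0").replace(",", ""))


# Excerpts of the two bigBedInfo outputs quoted in this message.
jaspar_2024 = """version: 4
itemCount: 18,230,308,889
zoomLevels: 0
chromCount: 455
"""
jaspar_2022 = """version: 4
itemCount: 12,589,080,473
zoomLevels: 10
chromCount: 194
"""

for label, text in (("JASPAR2024_hg38.bb", jaspar_2024), ("JASPAR2022.bb", jaspar_2022)):
    levels = zoom_level_count(parse_bigbed_info(text))
    status = "ok" if levels > 0 else "no zoom levels, consider rebuilding"
    print(f"{label}: zoomLevels={levels} ({status})")
```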

I hope this is helpful.

Gerardo Perez
UCSC Genomics Institute


Rafael Riudavets Puig

Feb 5, 2024, 2:40:05 PM
to Gerardo Perez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Dear Gerardo,


I am currently looking into this. I will come back to you as soon as the new bigBeds are finished.


Best,


Rafael



Rafael Riudavets Puig

Feb 9, 2024, 12:45:11 PM
to Gerardo Perez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Dear Gerardo,


We just finished rebuilding the bigBed files with the latest version of the bedToBigBed binary. Hopefully this will have fixed the issue, but let us know if the error still persists. 


Best regards,


Rafael



Gerardo Perez

Feb 21, 2024, 2:24:24 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello, Rafael.

Thank you for rebuilding the bigBed files with the latest version of the bedToBigBed binary. 

We took another look at the JASPAR2024 tracks for hg38, hg19, mm39, and mm10 and noticed that the filterValues.name items are not in alphabetical order. Here is how the filterValues.name items are listed for hg38: https://frigg.uio.no/JASPAR/JASPAR_genome_browser_tracks//2024/hg38/trackDb.txt

This will show the "Filter by Transcription factor gene name" items out of alphabetic order on the track description page:

jaspar_filterValues2024.jpg

The JASPAR2022 has the Transcription factor gene names in alphabetical order, which helps to search for a Transcription factor. Is the current order of Transcription factor gene names intended for the 2024 release?

Please include gen...@soe.ucsc.edu in any replies to ensure visibility by the team.

Gerardo Perez
UCSC Genomics Institute

Gerardo Perez

Feb 21, 2024, 4:06:23 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute
Hi Rafael,

Yes, the fix will require ordering the TF names from the trackDb.txt files.
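For illustration, the reordering could be done with a small script like the one below. It is a sketch under stated assumptions, not the procedure either team used: it assumes the `filterValues.name` setting sits on a single line as a comma-separated list, and that a case-insensitive sort is the desired order. The function name and the sample stanza (including its TF names) are hypothetical.

```python
def sort_filter_values(trackdb_text, setting="filterValues.name"):
    """Return trackDb text with the comma-separated values of `setting` sorted."""
    out = []
    for line in trackdb_text.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        parts = stripped.split(None, 1)
        if len(parts) == 2 and parts[0] == setting:
            names = [v.strip() for v in parts[1].split(",") if v.strip()]
            names.sort(key=str.casefold)  # assumption: case-insensitive order
            line = f"{indent}{setting} {','.join(names)}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical stanza, for illustration only.
sample = """track exampleTrack
type bigBed 6 +
filterValues.name ZNF143,Arnt,CTCF,Ahr
"""

print(sort_filter_values(sample), end="")
```

All other lines, including indentation, pass through unchanged, so the script can be rerun safely on an already sorted file.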


I hope this is helpful.

Gerardo Perez
UCSC Genomics Institute



On Wed, Feb 21, 2024 at 12:11 PM Rafael Riudavets Puig <r.r....@medisin.uio.no> wrote:
Hi Gerardo,

That is not intended, thanks for pointing it out. If I remember correctly, fixing this would only require ordering the TF names in the trackDb.txt files, correct? We can quickly fix that if this is the case.

Best,

Rafael



Rafael Riudavets Puig

Feb 27, 2024, 12:09:21 PM
to Gerardo Perez, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Dear Gerardo, 


We just updated the trackDb.txt files so that the TF names are in alphabetical order. I hope this solves the issue.


Best,


Rafael



Gerardo Perez

Mar 4, 2024, 5:56:11 PM
to Rafael Riudavets Puig, gen...@soe.ucsc.edu, Anthony Mathelier, Ieva Rauluseviciute

Hello, Rafael.

Thanks for updating the trackDb.txt files. With this update, the TF names are now in alphabetical order, and everything else looked good. We went ahead and made the JASPAR 2024 tracks available on our main site. We are planning to send a release announcement but first wanted to check with you. Below is a draft announcement. Let us know if any of the information seems inaccurate and whether it acknowledges the proper group(s).

New JASPAR tracks: Human (hg19/hg38) - Mouse (mm10/mm39)

We are excited to announce the new JASPAR 2024 tracks for human (GRCh37/hg19 and GRCh38/hg38) and mouse (GRCm39/mm39 and GRCm38/mm10). These tracks represent genome-wide predicted binding sites for transcription factor binding profiles in the JASPAR CORE collection. JASPAR CORE is an open-source database containing a curated, non-redundant set of binding profiles derived from published collections of experimentally defined transcription factor binding sites for eukaryotes. The JASPAR 2024 update expanded the JASPAR CORE collection by 20% (329 added and 72 upgraded profiles). JASPAR continues to uphold its core principles (i) providing high-quality TF binding profiles, (ii) fostering open access, and (iii) ensuring ease of use, which has been useful for the scientific community in studying gene transcription regulation.

The JASPAR database is a joint effort between several labs (please see the latest JASPAR paper). Binding site predictions and UCSC tracks were computed by the Wasserman Lab. We would like to thank Jairo Navarro and Gerardo Perez at UCSC for building and testing these tracks.


Twitter draft announcement:

We are excited to announce the new JASPAR 2024 tracks for hg19, hg38, mm10, and mm39 which represent genome-wide predicted binding sites for transcription factor binding profiles in the JASPAR CORE collection (@jaspar_db).

Learn more about the release: <URL to our News Archives>



Gerardo Perez
UCSC Genomics Institute



Anthony Mathelier

Mar 5, 2024, 2:44:08 PM
to Gerardo Perez, Rafael Riudavets Puig, gen...@soe.ucsc.edu, Ieva Rauluseviciute
Dear Gerardo.

Thanks for your email and all the support provided for the JASPAR tracks.

I modified your original text below.

Thanks
Best
AM

New JASPAR tracks: Human (hg19/hg38) - Mouse (mm10/mm39)

We are excited to announce the new JASPAR 2024 tracks for human (GRCh37/hg19 and GRCh38/hg38) and mouse (GRCm39/mm39 and GRCm38/mm10). These tracks represent genome-wide predicted binding sites for transcription factors with binding profiles in the JASPAR CORE collection. JASPAR CORE is an open-source database containing a curated, non-redundant set of binding profiles derived from collections of experimentally defined transcription factor binding profiles. The JASPAR 2024 update expanded the JASPAR CORE collection by 20% (329 added and 72 upgraded profiles). JASPAR continues to uphold its core principles (i) providing high-quality TF binding profiles, (ii) fostering open access, and (iii) ensuring ease of use, which has been useful for the scientific community in studying gene transcription regulation.

The JASPAR database is a joint effort between several labs (please see the latest JASPAR paper). Binding site predictions and UCSC tracks were computed by the Computational Biology & Gene Regulation group. We would like to thank Jairo Navarro and Gerardo Perez at UCSC for building and testing these tracks.


Twitter draft announcement:

We are excited to announce the new JASPAR 2024 tracks for hg19, hg38, mm10, and mm39 which represent genome-wide predicted binding sites for transcription factors with binding profiles in the JASPAR CORE collection (@jaspar_db).

Learn more about the release: <URL to our News Archives>





-- 
Anthony Mathelier, PhD
Associate Director - Centre for Molecular Medicine Norway (NCMM)
Group Leader - Computational Biology & Gene Regulation Group, NCMM
Professor II - Centre for Bioinformatics, University of Oslo
Adjunct Researcher - Dept. of Medical Genetics, Oslo University Hospital
http://mathelierlab.com
anthony....@ncmm.uio.no
(+47) 22840561

Gerardo Perez

Mar 5, 2024, 3:03:24 PM
to Anthony Mathelier, Rafael Riudavets Puig, gen...@soe.ucsc.edu, Ieva Rauluseviciute
Hello, Anthony.

Thank you for the edits. We are planning to send out the announcement release today.


Gerardo Perez
UCSC Genomics Institute

Anthony Mathelier

Mar 5, 2024, 3:24:08 PM
to Gerardo Perez, Rafael Riudavets Puig, gen...@soe.ucsc.edu, Ieva Rauluseviciute
Awesome. Thanks very much!

Best
AM