Question about the goldfish blat search


Zhang, Suiyuan (NIH/NHGRI) [C]

May 16, 2022, 3:57:30 PM
to gen...@soe.ucsc.edu

Hello:

We currently host the goldfish genome blat server on our data server (https://goldfish.nhgri.nih.gov). The gfServer at this location runs well from the command line. The track hub is available at the following link: https://genome.ucsc.edu/cgi-bin/hgTracks?hubUrl=https://goldfish.nhgri.nih.gov/carAur01/hub.txt&genome=hub_259251_carAur01&position=lastDbPos

 

We recently found that the goldfish blat search on UCSC no longer works. We had upgraded the server; the domain name remains the same, but the IP address changed.


 

The following error was displayed:

 

[Screenshot of the error message; image not preserved]

 

 

Do you have any suggestions to fix this?

 

Thank you very much!

 

 Best Regards,

 

Suiyuan

 

 

Suiyuan Zhang [C]

Bioinformatics Scientist

Columbus Technologies & Services, Inc.

NHGRI/NIH

Pronouns: He, His, Him

Email: zha...@mail.nih.gov

Office: 301-496-7925 |  Fax: 301-480-1109

50 South Drive,  Bethesda, MD  20892

 


 

 

Brian Lee

May 16, 2022, 5:39:50 PM
to Zhang, Suiyuan (NIH/NHGRI) [C], gen...@soe.ucsc.edu

Dear Suiyuan,

Thank you for using the UCSC Genome Browser to build your assembly hub, including blat servers, and for your question about the error message.

It appears likely that, after the server upgrade, the gfServer ports are no longer open to public access. When I run the gfServer status check against the server and the ports listed in the assembly hub's genomes.txt file, the connection times out:

$ gfServer status goldfish.nhgri.nih.gov 17777
TCP non-blocking connect() to goldfish.nhgri.nih.gov IP 2607:f220:404:2102::9c28:f224 timed-out in select() after 10000 milliseconds - Cancelling!
TCP non-blocking connect() to goldfish.nhgri.nih.gov IP 156.40.242.36 timed-out in select() after 10000 milliseconds - Cancelling!

$ gfServer status goldfish.nhgri.nih.gov 17779
TCP non-blocking connect() to goldfish.nhgri.nih.gov IP 2607:f220:404:2102::9c28:f224 timed-out in select() after 10000 milliseconds - Cancelling!
TCP non-blocking connect() to goldfish.nhgri.nih.gov IP 156.40.242.36 timed-out in select() after 10000 milliseconds - Cancelling!

Please ask your system administrators to ensure that ports 17779 and 17777 are open for connections on goldfish.nhgri.nih.gov. The browser looks for the blat servers as defined in the genomes.txt file here: https://goldfish.nhgri.nih.gov/carAur01/genomes.txt

...
blat goldfish.nhgri.nih.gov 17779
transBlat goldfish.nhgri.nih.gov 17777
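
For anyone who wants to script this lookup, here is a minimal Python sketch that pulls the blat and transBlat host/port pairs out of the text of a genomes.txt file. The function name `blat_servers` is hypothetical, and the sketch assumes a file describing a single genome with whitespace-separated `blat host port` and `transBlat host port` lines like the two shown above; with several genomes in one file, later entries would overwrite earlier ones.

```python
from typing import Dict, Tuple


def blat_servers(genomes_txt: str) -> Dict[str, Tuple[str, int]]:
    """Map 'blat' / 'transBlat' to the (host, port) declared in genomes.txt text.

    Hypothetical helper: assumes one genome stanza per file.
    """
    servers: Dict[str, Tuple[str, int]] = {}
    for line in genomes_txt.splitlines():
        fields = line.split()
        # Only lines of the form "<blat|transBlat> <host> <port> ..." are of interest.
        if len(fields) >= 3 and fields[0] in ("blat", "transBlat"):
            servers[fields[0]] = (fields[1], int(fields[2]))
    return servers


# The two lines quoted above, used as sample input.
sample = """\
blat goldfish.nhgri.nih.gov 17779
transBlat goldfish.nhgri.nih.gov 17777
"""
print(blat_servers(sample))
```

The resulting host/port pairs are the ones the browser will try to contact, so they are the ones that need to be reachable from outside.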

Also, please ask your admins to check for firewalls and to unblock those ports, at least for our public site (the host genome.ucsc.edu has addresses 128.114.119.131 and 128.114.119.132). It may also be worth restarting the gfServers, although it sounds like you already checked them on the command line.
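
If gfServer is not installed on the machine you are testing from, a plain TCP connection attempt gives a rough equivalent of the reachability check. The following is a minimal Python sketch, not a replacement for `gfServer status`: the function name `port_is_reachable` is hypothetical, the 10-second default simply mirrors the timeout seen in the output above, and a successful connection only shows that the port is open from where the script runs, not that a gfServer is answering correctly.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection tries each address the name resolves to (IPv6 and IPv4).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False
```

For example, `port_is_reachable("goldfish.nhgri.nih.gov", 17779)` could be run from a machine outside your own network, since a firewall may treat internal and external clients differently.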

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further public questions, please send new questions to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum to help others find answers to similar questions. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu, which is a private internal list to our support team.

All the best,



Zhang, Suiyuan (NIH/NHGRI) [C]

May 23, 2022, 1:42:07 PM
to Brian Lee, gen...@soe.ucsc.edu

Hi Brian:

Thank you very much for your help! Both ports are now open and ready for blat search. However, the search runs a bit slowly, and I would like to ask whether there is any way to improve it.

Does building an index help (as described in http://genomewiki.ucsc.edu/index.php/Running_your_own_gfServer)? If we did not run this step, will any search started on the website help to establish this index?

 

Best Regards,

 

Suiyuan


 

Brian Lee

May 25, 2022, 3:27:00 PM
to Zhang, Suiyuan (NIH/NHGRI) [C], gen...@soe.ucsc.edu

Dear Suiyuan,

Thank you for using the UCSC Genome Browser and for your question about speeding up gfServer responses for your goldfish assembly hub.

To answer the first question, the index approach will not help with your issue. The index is designed to allow many gfServers to be available for a collection of assembly hubs, with each gfServer activated only on demand. It isn't applicable here, because you have a dedicated gfServer running (as seen with the gfServer status command). Based on that check, you should be getting instantaneous results.

I apologize for the delay in responding. I have been investigating what may be happening, as I can reproduce the delay you are experiencing. After my testing I am more confident that this is likely a firewall issue with your new gfServer, and that you should check with your system administrators.

Here is an example to try. Enter a DNA string for the goldfish assembly on the Blat page for your hub. You will get a result after about 7 seconds. Now go back and submit that same DNA string again, and the results will be instantaneous (you can repeat this a few more times to confirm that you are getting a quick response). Next, return to the Blat page and enter the same DNA, but modify it slightly: type in about 30 extra T's or A's, and before clicking submit, copy your input so you can use it again. With the new input, the delay returns. If you paste that same modified sequence into the Blat page again, the result will be instantaneous. This suggests to me that something, perhaps a kind of firewall, is inspecting new content and triggering the delay you are experiencing.
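
The experiment above can also be scripted so the timings are recorded rather than judged by eye. This is only a sketch under stated assumptions: `with_extra_bases` and `time_calls` are hypothetical helper names, and `search` stands for whatever function you supply to submit one sequence to your blat server and wait for the answer (no such function is provided here).

```python
import time
from typing import Callable, List, Tuple


def with_extra_bases(seq: str, count: int = 30, base: str = "T") -> str:
    """Return `seq` with `count` copies of `base` appended, giving a never-seen query."""
    return seq + base * count


def time_calls(search: Callable[[str], object],
               queries: List[str]) -> List[Tuple[str, float]]:
    """Run `search` on each query in order and record the elapsed seconds."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        search(query)  # hypothetical: submits the query and blocks until results arrive
        timings.append((query, time.perf_counter() - start))
    return timings
```

Passing a list such as `[seq, seq, with_extra_bases(seq), with_extra_bases(seq)]` reproduces the pattern described: if the first and third calls are slow while the second and fourth are fast, the delay follows new content rather than the gfServer itself.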

I would also like to suggest that you submit your sequence to GenBank; we could then build an assembly hub for you, which would include Blat and PCR (you could still attach all your annotations to the assembly hub as a track hub). Last year we announced our new GenArk (Genome Archive) collection of assembly hubs, which also features Blat and PCR servers (using the index method mentioned above, so the gfServers run only while a user is actively searching the DNA). Here is a recent announcement about a new assembly request page: https://genome.ucsc.edu/goldenPath/newsarch.html#052422

On that page we already have one goldfish assembly, and three more currently listed at GenBank that we could build. Alternatively, if you deposit your sequence with GenBank and email us the GCA/GCF accession, we can build your specific assembly hub:

goldFish.png

Here is a link that loads the current GoldFish assembly we do have (the initial blat will be slow as the gfServer index is used, but subsequent searches should be fast): https://genome.ucsc.edu/h/GCF_003368295.1

We do want to help you with your current arrangement as well, where our working hypothesis is a firewall issue. You may also want to restart your gfServers. We suggest checking with your system administrators to ensure that they allow requests from our main public site (the host genome.ucsc.edu has addresses 128.114.119.131 and 128.114.119.132).

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further public questions, please send new questions to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum to help others find answers to similar questions. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu, which is a private internal list to our support team.

All the best,
