Maximum Number of Records for GenBank?

Nathan Lemoine

Jun 28, 2014, 10:12:42 PM

to dendrop...@googlegroups.com

I'm using the GenBank interoperability functions to download sequences from GenBank. I appear to have hit a wall for how many records can be downloaded at once (somewhere right around 550 records). It's not a problem with the accession IDs themselves, because I can download smaller subsets of records no matter how I split them up. But once I get to 550 or 551 records, I get an error:

<TITLE>Request Error</TITLE>

</HEAD>

<BODY>

<big>Request Error (invalid_request)</big>

</TD></TR>

Your request could not be processed. Request could not be handled

</TD></TR>

This could be caused by a misconfiguration, or possibly a malformed request.

</TD></TR>

For assistance, contact your network support team.

</TD></TR>

</TABLE>

</blockquote>

</BODY></HTML>

Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2827, in run_code
    exec code_obj in self.user_global_ns, self.user_ns
  File "<ipython-input-124-41b75487b2bc>", line 1, in <module>
    lep_seq = genbank.GenBankDna(ids = accGuide['COI_Accession'])
  File "/Library/Python/2.7/site-packages/dendropy/interop/genbank.py", line 365, in __init__
    email=email)
  File "/Library/Python/2.7/site-packages/dendropy/interop/genbank.py", line 348, in __init__
    email=email)
  File "/Library/Python/2.7/site-packages/dendropy/interop/genbank.py", line 130, in __init__
    verify=verify)
  File "/Library/Python/2.7/site-packages/dendropy/interop/genbank.py", line 202, in acquire
    gb_recs = GenBankResourceStore.parse_xml(string=xml_string)
  File "/Library/Python/2.7/site-packages/dendropy/interop/genbank.py", line 59, in parse_xml
    root = ElementTree.fromstring(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 7, column 2

It's not a huge problem, because I think I can always split the record downloads up into smaller bits and use the acquire method, but I was just wondering if this is a real cap or if I'm doing something wrong.
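The splitting workaround might look roughly like the sketch below. It is only an illustration: `chunked` and `fetch_in_batches` are hypothetical helpers, not part of DendroPy, and the batch size of 200 is an arbitrary choice that stays well under the point where the error appears.

```python
def chunked(ids, size):
    """Yield successive lists of at most `size` items from `ids`."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_in_batches(ids, fetch, batch_size=200):
    """Call `fetch(batch)` once per batch and return the results in order.

    `fetch` is a hypothetical callable standing in for whatever performs
    the actual download of one batch of accession IDs.
    """
    return [fetch(batch) for batch in chunked(ids, batch_size)]


if __name__ == "__main__":
    # Stand-in data: 551 fake accession IDs, "fetched" by counting them.
    fake_ids = ["ACC%04d" % i for i in range(551)]
    print(fetch_in_batches(fake_ids, len, batch_size=200))
```

With DendroPy, `fetch` could wrap a call to the `acquire` method on an existing `GenBankDna` object, one batch at a time. The exact `acquire` signature is an assumption here, so it is worth checking against the installed version.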

Nate

Jeet Sukumaran

Jun 29, 2014, 11:09:10 AM

to dendrop...@googlegroups.com

We certainly have not implemented any logic to limit the number of
records on the local (DendroPy) side of things. If this is indeed what
is happening, then it must be an issue with the query protocol (we use
`efetch.fcgi`).
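One possible explanation, not confirmed in this thread, is that a GET request carrying several hundred accession IDs produces a URL too long for some server or proxy along the way. NCBI's E-utilities documentation recommends HTTP POST when more than about 200 UIDs are sent. The sketch below builds such a POST request using only the standard library. The helper name `build_efetch_post` is hypothetical, and the `db`, `rettype` and `retmode` values are assumptions that may differ from what DendroPy actually sends.

```python
from urllib.parse import urlencode
from urllib.request import Request

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_post(ids, db="nucleotide", rettype="gb", retmode="xml",
                      email=None):
    """Return a urllib Request that sends the ID list in the POST body.

    Putting the IDs in the body keeps the URL short regardless of how
    many records are requested. Parameter defaults are assumptions.
    """
    params = {
        "db": db,
        "id": ",".join(ids),
        "rettype": rettype,
        "retmode": retmode,
    }
    if email:
        params["email"] = email
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return Request(EFETCH_URL, data=urlencode(params).encode("ascii"))


if __name__ == "__main__":
    req = build_efetch_post(["AB000001", "AB000002"])
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual download; that step is left out so the sketch runs without network access.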

Thank you for pointing this out: I will put it on the list of things to
fix. As we have noted before, though, we are in the middle of getting
DendroPy 4 ready for release, so, assuming that this issue can be fixed,
the fix will probably only happen after that (unless I can squeeze it in
sometime in between). Apologies for the delay.

-- jeet
