Connection timed out

djbur...@gmail.com

Jul 27, 2015, 7:41:25 AM
to fosstrak
Another question/comment...!

The data I'm sending to my EPCIS demo database is taken from another database. I've processed the data into files, one for each day, and am then POSTing the files, one by one, to Fosstrak using wget. The original database has more and more data per day as time goes on: for older data, files are only a few tens of KB in size. For more recent data, the XML files are nearer 3-4 MB and contain something like 250 transactions and maybe 2000-2500 events.

I'm now getting to the point where each file takes about 12-15 minutes to load into Fosstrak (on localhost!) and the connection is timing out:

C:>wget --post-file=xml\file_20150610.xml --header="Content-Type: text/xml" -O log_file_20150610.xml http://localhost:8080/epcis-repository-0.5.0/capture
--2015-07-27 12:02:41--  http://localhost:8080/epcis-repository-0.5.0/capture
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Retrying.

--2015-07-27 12:17:41--  (try: 2)  http://localhost:8080/epcis-repository-0.5.0/capture
Connecting to localhost|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1407 (1.4K) [text/html]
Saving to: `log_file_20150610.xml'

100%[======================================>] 1,407       --.-K/s   in 0s

2015-07-27 12:31:55 (107 MB/s) - `log_file_20150610.xml' saved [1407/1407]

(Note that the file size is not the same as reported by Windows!)
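
As a quick check on the timings, the gaps between the timestamps in the wget output above can be computed directly. This is only a sketch: the three timestamps are copied from the log, and the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by wget in the output above.
FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Return the whole seconds between two wget-style timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# First attempt: from the initial request to the start of the retry.
first_try = elapsed_seconds("2015-07-27 12:02:41", "2015-07-27 12:17:41")
# Second attempt: from the retry to the saved response.
second_try = elapsed_seconds("2015-07-27 12:17:41", "2015-07-27 12:31:55")

print(first_try, second_try)
```

The first attempt was cut off after exactly 900 seconds (15 minutes), while the second attempt completed in 854 seconds (14 minutes 14 seconds).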

The obvious solution is to split the files into smaller ones and/or increase the timeout.

However, is there something fundamental within Fosstrak I could do to speed up loading of files?

I'm using standard Fosstrak 0.5.0 on MySQL 5.6, Tomcat 8.

Thanks,
Dave

marcantoine...@orange.com

Jul 27, 2015, 8:41:55 AM
to foss...@googlegroups.com

Hi Dave,

 

Apart from the obvious solutions that you mentioned (splitting into smaller files containing one or two events is definitely more appropriate), could you first check this: does Fosstrak store only part of the events that come from a single file?

 

You can check epcis-repository.log in Tomcat's log directory to see whether the timeout is due to a long processing time, and query your repository after the timed-out capture to check how many events were stored, if any.

 

HTTP POST capture is synchronous, which is not the best way to process big files. Alternatives may require developing an asynchronous protocol (like the ones proposed in the EPCIS standard) or a front-end that would split big files into single events before processing the capture.
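
A minimal sketch of such a splitting front-end, in Python. It assumes the daily files are EPCIS 1.0 documents whose events are the direct children of a single `EventList` element; the function name `split_epcis_document` and the `max_events` parameter are made up for illustration and are not part of Fosstrak.

```python
import xml.etree.ElementTree as ET

# Assumption: the files use the EPCIS 1.0 namespace. Registering the prefix
# keeps "epcis:" in the output instead of an auto-generated "ns0:".
ET.register_namespace("epcis", "urn:epcglobal:epcis:xsd:1")


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def split_epcis_document(xml_text, max_events=100):
    """Return a list of XML strings, each holding at most max_events events.

    Every output document keeps the original envelope (root element,
    header, body) and differs only in the contents of EventList.
    """
    root = ET.fromstring(xml_text)
    event_list = next(
        (el for el in root.iter() if local_name(el.tag) == "EventList"), None
    )
    if event_list is None:
        raise ValueError("no EventList element found")

    events = list(event_list)
    chunks = []
    for start in range(0, len(events), max_events):
        # Empty the list, then refill it with the next batch of events.
        for child in list(event_list):
            event_list.remove(child)
        event_list.extend(events[start:start + max_events])
        chunks.append(ET.tostring(root, encoding="unicode"))
    return chunks
```

Each returned string can then be written to its own file (or posted directly), so that a single capture request carries only a bounded number of events. A document with an empty `EventList` yields an empty list.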

 

Marc-Antoine

 

From: foss...@googlegroups.com [mailto:foss...@googlegroups.com] On behalf of djbur...@gmail.com
Sent: Monday, 27 July 2015 13:41
To: fosstrak
Subject: [fosstrak] Connection timed out


djbur...@gmail.com

Jul 27, 2015, 10:10:45 AM
to fosstrak, marcantoine...@orange.com
Thanks Marc-Antoine.

Yes, that was what I meant to add in my previous message - wget retries the file and you end up with duplicates in Fosstrak! I now have to detect & delete the duplicates...

Using a separate file for each event is not really practical, due to Windows limitations (unless I create, post and delete the event file in one go, or set up a nested directory structure based on year/month/day or something).

There's nothing in epcis-repository.log to indicate any failure, so I think the timeout is occurring in wget itself. Indeed, according to http://www.gnu.org/software/wget/manual/html_node/Download-Options.html the default read timeout for wget is 900 seconds, which is exactly 15 minutes and matches the gap between the first attempt and the retry in the output I gave previously! Thus, using something like

wget --post-file=xml\file_20150610.xml --header="Content-Type: text/xml" -T 1200 -O log_file_20150610.xml http://localhost:8080/epcis-repository-0.5.0/capture

would be an immediate workaround. But splitting the files up, at least into smaller numbers of events, would be a better longer-term solution.
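
For comparison, here is a hedged sketch of the same POST done from Python's standard library rather than wget. `urllib.request` does not retry on its own, so a timed-out request is not silently re-sent (the retry is what produced the duplicates mentioned above). The function name `post_capture` and the 1200-second default are illustrative assumptions, not anything Fosstrak prescribes.

```python
import urllib.request


def post_capture(url, xml_bytes, timeout_seconds=1200):
    """POST one XML document to a capture URL and return (status, body).

    The timeout applies to each blocking socket operation, so it plays
    the same role as wget's read timeout. There is no automatic retry:
    a timeout raises an exception and the caller decides what to do.
    HTTP error statuses (4xx/5xx) raise urllib.error.HTTPError.
    """
    request = urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status, response.read()


# Example call, using the capture URL from this thread:
# with open(r"xml\file_20150610.xml", "rb") as f:
#     status, body = post_capture(
#         "http://localhost:8080/epcis-repository-0.5.0/capture", f.read()
#     )
```

If a request does time out, it is worth checking the repository for partially or fully stored events before posting the same file again.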

Thanks again!
Dave