Issue using HTTR (and maybe CURL)


ArjunaCap

May 30, 2016, 5:12:53 PM
to manipulatr
my question doesn't lend itself to a great reproducible example (it involves querying a remote server from a virtual machine, pulling data and appending it to a csv file).

nevertheless, I'd like to share a sketch of my code and the associated error msg, and hope for any helpful diagnoses.

the code sketch:
library(dplyr)
library(httr)
library(rvest)
library(readr)

get_data <- function() {
  url <- "httpbin.org/xml"    # test site
  url_response <- GET(url)    # make sure we got a good response

  if (http_error(url_response) == FALSE) { # do nothing on a bad response
    data <- GET("httpbin.org/xml") %>% content(type = 'text/xml') %>% xml_node('title') %>% xml_text()  # get data
    data_frame(responses = data) # one line df; write_csv expects df
  } else {
    Sys.sleep(3) # try again on 401 etc
  }
}

while(lubridate::year(Sys.time()) < 2020) {  # do it for a long time
  x <- get_data()
  write_csv(x, "tmp.csv", append = TRUE) # rudimentary db

  Sys.sleep(5)
}


and the error message:


Error in curl::curl_fetch_memory(url, handle = handle) :
  Couldn't connect to server
Calls: get_and_write_the_data ... request_fetch -> request_fetch.write_memory -> <Anonymous> -> .Call
Execution halted



Any idea what's happening? I don't think it's a memory leak, and system utilities indicate the same.

Ista Zahn

May 30, 2016, 7:23:17 PM
to ArjunaCap, manipulatr

Not sure, but why are you testing one thing and then processing another? Wouldn't something like

get_data <- function() {
  url <- "httpbin.org/xml"    # test site
  url_response <- GET(url)    # make sure we got a good response

  if (!http_error(url_response)) { # do nothing on a bad response
    data <- url_response %>% content(type = 'text/xml') %>% xml_node('title') %>% xml_text()  # get data
    data_frame(responses = data) # one line df; write_csv expects df
  } else {
    Sys.sleep(3) # try again on 401 etc
  }
}

make more sense?

Best,
Ista


Michael Cawthon

May 30, 2016, 8:47:01 PM
to Ista Zahn, manipulatr
Yes, granted, that is better code. The clumsy, redundant (but functional) example is an artifact of imperfectly translating my real [much more complex, non-minimal] case into an MRE in order to get help from others.

In any case, thank you for reading and responding.

Ista Zahn

May 30, 2016, 11:38:46 PM
to ArjunaCap, manipulatr

Your example tries to check for failure and continue, but you checked the wrong thing, so you get an error on failure. Check the right thing and your code should work as intended, I would think.

Best,
Ista

Ista Zahn

May 31, 2016, 9:08:22 AM
to ArjunaCap, manipulatr

Well, I guess I should stop posting late at night. Since the error (presumably) comes from the GET call itself, checking the response for errors won't help: GET throws before it ever returns a response to check.

You could wrap the GET call in 'try' and capture any connection errors that way. If the problem is server-side that's probably the best you can do.
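
A minimal sketch of that idea, using tryCatch rather than try so the error message can be logged. The name `safe_get` and its `fetch` argument are hypothetical (not part of httr); `fetch` defaults to `httr::GET` and is only there so the wrapper can be exercised without a network connection.

```r
# Hypothetical helper: return the response, or NULL when the request itself
# fails (for example "Couldn't connect to server").
safe_get <- function(url, fetch = httr::GET) {
  tryCatch(
    fetch(url),
    error = function(e) {
      # Log the connection error and signal failure with NULL instead of halting
      message("request failed: ", conditionMessage(e))
      NULL
    }
  )
}
```

Inside get_data() the caller would then test `is.null(url_response)` before calling `http_error(url_response)`, and skip the write (or sleep and retry) when it is NULL.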

There are some posts on Stack Overflow about the "Couldn't connect to server" error, e.g., http://stackoverflow.com/questions/31741762/r-error-installing-package-error-in-curlcurl-fetch-memoryurl-handle-ha but there doesn't seem to be much in the way of answers.

Best,
Ista

Hadley Wickham

May 31, 2016, 10:33:51 AM
to Ista Zahn, ArjunaCap, manipulatr
The next version of httr will have RETRY():
https://github.com/hadley/httr/pull/372

You might want to look at the source to see how to tackle this sort of
problem robustly.
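
For illustration, a call might look like the sketch below, assuming an installed httr release that exports RETRY(). The wrapper name `fetch_with_retry` and the chosen values are hypothetical. Whether connection-level errors (as opposed to HTTP error statuses) are retried may depend on the httr version, so it is worth checking the documentation for the release you have.

```r
# Sketch only: assumes an httr release that exports RETRY().
# `fetch_with_retry` is a hypothetical wrapper name, not part of httr.
fetch_with_retry <- function(url, times = 5) {
  # Try up to `times` requests, with pauses that grow between attempts,
  # capped at 60 seconds
  httr::RETRY("GET", url, times = times, pause_base = 1, pause_cap = 60)
}

# Example call (needs network access):
# resp <- fetch_with_retry("httpbin.org/xml")
# if (!httr::http_error(resp)) httr::content(resp, as = "text")
```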

Hadley
--
http://hadley.nz

Michael Cawthon

May 31, 2016, 1:41:58 PM
to Hadley Wickham, Ista Zahn, manipulatr

very helpful

Ista- I now see your original point about my separate GETs; thanks for the clarification. I have also wrapped the GET in try and am testing now to see if it's successful, and I'm considering a more sophisticated retry scheme (a la https://www.awsarchitectureblog.com/2015/03/backoff.html, found in the source code notes from Hadley's github link).
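
A rough sketch of what such a retry scheme could look like: capped exponential backoff with a random ("jittered") sleep between attempts, in the spirit of the linked post. This is not httr's implementation; `retry_with_backoff` and all of its parameters are hypothetical names, and `sleep` is injectable only so the function can be tried without actually waiting.

```r
# Hypothetical helper: call `fetch` until it succeeds or `max_tries` is
# reached. Between failures, sleep a random time drawn from
# [0, min(cap, base * 2^attempt)] seconds.
retry_with_backoff <- function(fetch, max_tries = 5, base = 1, cap = 60,
                               sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    # Capture the error object instead of letting it halt execution
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt < max_tries) {
      sleep(runif(1, min = 0, max = min(cap, base * 2^attempt)))
    }
  }
  stop("all ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Example call (needs network access and httr):
# resp <- retry_with_backoff(function() httr::GET("httpbin.org/xml"))
```

The random sleep spreads retries out so that many clients failing at once do not all hit the server again at the same moment.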

Hadley- thanks for everything

-- 

Michael Cawthon
Chief Investment Officer
Green Street Energy LLC
mcaw...@greenstenergy.com
p: 479-442-1407