accessing server data on shinyapps


Patrick Toche

Apr 12, 2014, 8:07:46 AM
to shiny-...@googlegroups.com
How can I access (download/delete) data saved by my app on the shinyapp server?

I recently migrated to shinyapps.io, but I can't find as much documentation for it as there was for spark.rstudio.com.

What would be the shinyapps.io equivalent of running the following?

# copy any .Rds file to the local directory
scp -r myn...@spark.rstudio.com:ShinyApps/data/*.Rds ~/shiny/data/

# delete remote files (rm must run on the remote host via ssh):
ssh myn...@spark.rstudio.com 'rm ShinyApps/data/*.Rds'

Thanks!

Jeff Allen

Apr 14, 2014, 11:31:03 AM
to shiny-...@googlegroups.com
Hi Patrick,

Good news/bad news here. The bad news is, this specific task will be a bit more difficult in ShinyApps in the short-term. The good news is that once you take the hit of refactoring your code, your apps will be much more scalable and reliable.

ShinyApps is designed around the new model for how servers should be organized (here's one good explanation), but the central tenet is: never trust hardware. No server should be expected to run eternally, no single hard drive should be expected to run without failure, etc. So we built ShinyApps around the model that if any device ever starts acting up, we can shoot it with minimal consequences. The good news is that this will make the service much more reliable in the long run, as we won't have to manually move your applications off of a server if the memory goes bad; rather, we'll just instantly spin your apps up on a better machine.

But unfortunately, that means that you can't trust that your application will always be running on a machine with access to the same hard drives. We will eventually want to come up with ways for you to centrally store data reliably in ShinyApps, but that feature isn't there just yet. So that means that you shouldn't expect files you write to disk in ShinyApps to be there eternally. The documentation discusses this briefly.

(One caveat for on-disk data is data that you bundle into your application at the outset when you deploy it. We make sure that static data is available to your application wherever it's running. So we're just talking about data that your application creates dynamically.)

So how do you write data out of a Shiny application? I usually recommend Amazon's S3 service. There are R packages that help you read/write data here (like this one). The basic idea is that you would read the data in from S3 when your application starts up, then write it out periodically as you change it.
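A rough sketch of that startup-read/periodic-write pattern, using the aws.s3 package (my own choice of package; the bucket and object names below are placeholders, not anything from this thread):

```r
# Sketch: persist survey results to S3 between app restarts.
# Assumes AWS credentials are configured in the environment and that
# the bucket "my-survey-bucket" already exists -- both are placeholders.
library(aws.s3)

bucket <- "my-survey-bucket"
key    <- "survey-results.rds"

# On startup: load prior results if they exist, else start empty.
results <- if (object_exists(key, bucket = bucket)) {
  s3readRDS(object = key, bucket = bucket)
} else {
  data.frame()
}

# After each change (e.g. a completed survey), write the data back.
save_results <- function(df) {
  s3saveRDS(df, object = key, bucket = bucket)
}
```

This keeps the working copy in memory and treats S3 as the durable store, which is exactly the "never trust local disk" model described above.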

Sorry this isn't yet well-documented. We'll be sure to add this to the to-do list; I think an example or two in this area would be helpful.

Let me know if you have any trouble with it.

Jeff

Patrick Toche

Apr 15, 2014, 10:14:34 AM
to shiny-...@googlegroups.com
Thanks for answering, Jeff, and for giving such a thorough explanation!

I get it: I understand the rationale, and I will refactor my code. But I wonder if you have examples of Shiny apps built the way you advocate, so I can adapt my code without reinventing the wheel (something I would be quite incapable of doing anyhow)?

I am not too keen on the Amazon S3 service because it requires a credit card, and if you break the terms of the free tier, they will charge you. Knowing how gaffe-prone I am, you can bet they would manage to charge me into bankruptcy, so I'll steer clear of that.

I don't expect to have to deal with a lot of data. My current project is a survey of about 20 questions to be answered about 4,000 times over the course of several weeks. The size of a single completed survey is trivial. I'm thinking I could FTP the results to some directory where I host my webpage (I can store at least 1 gigabyte, which should be more than enough), or perhaps just email myself the data as it comes in. Is that feasible?

Previously, I saved locally to a "data" directory:

# Save all answers after a click on "submit"
observe({
  if (is.null(input$submit) || input$submit == 0) return()
  isolate({
    # note: a relative "data/" path, not "/data/", which would point at the filesystem root
    filename <- paste0("data/answers-", input$userName, "-", as.numeric(Sys.time()), ".RData")
    A <- values$A  # save() needs a plain named object; values$A alone won't work
    save(A, file = filename, compress = "xz")
  })
})

Could you point me to some code showing how to send the data via FTP and/or by email, instead of saving locally as above?

Thanks!

Patrick.

I described the problem earlier there, but did not get feedback I could use right away.

Jeff Allen

Apr 15, 2014, 11:58:36 AM
to Patrick Toche, shiny-...@googlegroups.com
I agree this would be a useful example to add to our gallery. I'll see what we can whip up.

I understand your hesitation with S3, but I will say it's one of the safer pay-as-you-go services I've worked with. I, too, have made my share of gaffes, but accidentally uploading hundreds of GB of data to S3 would be quite a feat! For the amount of data you're dealing with, I'd be surprised if you were able to muster up an S3 bill over $10, even with a heinous bug.

That being said, if you wanted to send to FTP, it sounds like that's something RCurl can handle, though I've never done it in R myself: http://stackoverflow.com/questions/3620426/how-to-upload-a-file-to-a-server-via-ftp-using-r.
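A minimal sketch of that approach with RCurl's ftpUpload() (the host, path, and credentials below are placeholders of mine, not details from this thread):

```r
# Sketch: upload a saved results file to an FTP server with RCurl.
# "ftp.example.com", the remote path, and the user/password are all placeholders.
library(RCurl)

upload_results <- function(local_file) {
  ftpUpload(
    what = local_file,  # local file to send
    to = paste0("ftp://ftp.example.com/data/", basename(local_file)),
    userpwd = "username:password"  # placeholder credentials
  )
}

# e.g. after saving survey answers locally:
# saveRDS(answers, "answers.rds")
# upload_results("answers.rds")
```

One file per submission keeps each upload small, which suits the trivially sized surveys described above.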

I don't have a full-fledged example for you, but the basic idea would be to read the file in from S3/FTP when Shiny starts, then write it back to update the file whenever you make a change (complete a survey). As long as you only run one instance of this process at a time, you won't have to worry about conflicts or processes overwriting each other's data.

I'll be sure to post back if/when we're able to get an example together.

Jeff

--
You received this message because you are subscribed to a topic in the Google Groups "Shiny - Web Framework for R" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/shiny-discuss/3lGPlkvzWF4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to shiny-discus...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Patrick Toche

Apr 16, 2014, 5:39:22 PM
to shiny-...@googlegroups.com, Patrick Toche
Thanks Jeff!

I'm going to look into the RCurl/FTP solution you suggest, as it seems to be the one I'm most likely to be able to manage.

The surveys are to be conducted by the students at the end of a class, so there will be several instances running simultaneously.

Patrick.

Jeff Allen

Apr 16, 2014, 6:17:06 PM
to shiny-...@googlegroups.com, Patrick Toche
Sounds good, Patrick. Just to make sure it was clear -- multiple sessions (students) doing the quiz at the same time would likely not be an issue. The only problem would come if you were using Shiny Server Pro to run multiple processes of the quiz concurrently. In this case, the processes may be overwriting each other's updates. But if you have a single process and share the results data in the global process scope (http://shiny.rstudio.com/articles/scoping.html) you should be just fine.

Good luck!

Patrick Toche

Apr 17, 2014, 1:17:08 AM
to shiny-...@googlegroups.com, Patrick Toche

> Sounds good, Patrick. Just to make sure it was clear -- multiple sessions (students) doing the quiz at the same time would likely not be an issue.

> the basic idea would be to read out the file from S3/FTP when Shiny starts, then write it back to update the file whenever you make a change (complete a survey). As long as you only run one instance of this process at a time, you won't have to worry about conflicts or processes overwriting each other's data.

Thanks Jeff,

Based on my reading of the process you describe, I thought the problem could be:

2:01pm user 1 reads the file from S3/FTP
2:02pm user 2 reads the file from S3/FTP (the same file user 1 is reading)
2:03pm user 1 writes back to update the file
2:04pm user 2 writes back to update the file: won't this overwrite user 1's version of the file?

However, an equally convenient alternative would be to save one file per user. In an earlier version I had a piece of code to generate unique random IDs based on user info, clock time, and a random generator; I thought I could name the files with those unique names and thus avoid accidental overlap.
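A small sketch of that per-user naming idea (the helper name and the ID format are my own, not code from this thread):

```r
# Sketch: build a collision-resistant filename for each submitted survey.
# Combines the user name, the clock time, and a short random suffix.
unique_survey_filename <- function(user) {
  suffix <- paste(sample(c(letters, 0:9), 8, replace = TRUE), collapse = "")
  paste0("answers-", user, "-", as.numeric(Sys.time()), "-", suffix, ".RData")
}
```

With one file per submission, two students finishing at the same moment can no longer clobber each other's results; the trade-off is that the files must be combined afterwards.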

Jeff Allen

Apr 17, 2014, 11:34:40 AM
to Patrick Toche, shiny-...@googlegroups.com
Certainly. Saving the output to different files would do the trick.

And the issue you describe is exactly the one that concerns me when running multiple processes. If you're only running one process, though, you can manage this with proper scoping. Here's some pseudo-code:

server.R
===========

# getDatasetFromFTP(), getSubmittedSurvey(), and saveDatasetToFTP() are
# pseudo-functions standing in for your own FTP read/write helpers.
myData <- getDatasetFromFTP()

shinyServer(function(input, output){
    observe({
        # Take a reactive dependency on the submit button so this observer fires on submit
        if (is.null(input$submit) || input$submit == 0) return()

        isolate({
            # Supplement the results with the current survey results.
            myData <<- rbind(myData, getSubmittedSurvey())

            # Persist the results to FTP
            saveDatasetToFTP(myData)
        })
    })
})


This arrangement would keep the updated dataset in memory constantly, only reading from FTP once when the application is started. As long as only one of these processes runs at a time, you'll always have the authoritative/updated copy of your results in `myData`.

Clear as mud?

Jeff
