Yeah, I think it might get expensive in the long run. I had a typo there: I meant 100 epochs, not 10 epochs, regarding the 3k bucks (150,000 GB * $0.023/GB).
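For what it's worth, the back-of-envelope math, using the quoted $0.023/GB figure as a rough stand-in (actual S3 pricing depends on storage class, request type, and region):

    dataset_gb = 1.5 * 1000          # 1.5 TB is roughly 1,500 GB
    epochs = 100
    price_per_gb = 0.023             # USD per GB, the figure quoted above

    total_gb = dataset_gb * epochs   # 150,000 GB read over the whole training run
    print(f"~${total_gb * price_per_gb:,.0f}")   # ~ $3,450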
> On Feb 5, 2018, at 1:21 PM, G Reina <gre...@eng.ucsd.edu> wrote:
>
> Thanks to all. I managed to find goofys (https://github.com/kahing/goofys) last night. It allowed me to mount the S3 bucket locally. From my tests with just Python accessing an HDF5 file on the S3 bucket, the lag was well under 1 second to pull out a batch of data. However, I am concerned about the cost, as Sebastian mentioned. That could be the rate-limiting step.
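> In case it helps, here is roughly what that test looked like (bucket name, mount point, and dataset layout below are made up; adjust to your setup):
>
>     # Mount the bucket with goofys (shell):
>     #   goofys my-training-bucket /mnt/s3
>     # Then read batches with h5py; goofys turns these reads into S3 range requests.
>     import h5py
>
>     with h5py.File("/mnt/s3/train.h5", "r") as f:
>         images = f["images"][0:32]   # pull one batch of 32 samples
>         labels = f["labels"][0:32]
>     print(images.shape, labels.shape)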
>
> Best,
> -Tony
>
>
> On Mon, Feb 5, 2018 at 10:07 AM, Toby Boyd <toby...@google.com> wrote:
> Here is the S3 support, not that it would be a good idea in this case. I remember complaining a few weeks ago that it was logging all the time. Sorry, no "how to"; I just wanted you to know it is there and included in default builds.
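> In case a minimal example is useful: with a build that has the S3 filesystem compiled in, s3:// paths can be passed wherever a filename is expected (bucket and file names below are placeholders; credentials come from the usual AWS environment variables):
>
>     import tensorflow as tf
>
>     # Credentials/region are picked up from AWS_ACCESS_KEY_ID,
>     # AWS_SECRET_ACCESS_KEY, and AWS_REGION.
>     print(tf.gfile.Exists("s3://my-training-bucket/train.tfrecords"))
>
>     # tf.data readers accept s3:// URIs directly:
>     dataset = tf.data.TFRecordDataset("s3://my-training-bucket/train.tfrecords")
>     batch = dataset.batch(32).make_one_shot_iterator().get_next()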
>
> Toby
>
> On Mon, Feb 5, 2018 at 10:02 AM, Sebastian Raschka <se.ra...@gmail.com> wrote:
> I think you would need to mount the S3 bucket as a file system to do that, e.g., using tools like S3FS. For the reasons mentioned, this would probably not be a good idea for training the model anyway. I can see two issues with that:
>
> 1) It will be slow due to network transfer speed constraints.
>
> 2) It will become relatively expensive, since S3 has a pay-per-access model. I think it's currently around $0.023 per GB. After 10 epochs, you would be over 3k bucks. Here, it's probably better to invest that money in a machine with a >1.5 TB hard drive ;).
>
> My recommendation would be to just buy a relatively cheap hard drive to put the dataset on. Instead of an external USB hard drive, I'd recommend a regular hard drive with a SATA port plus a docking station (they cost around 15 bucks), which you can connect either to your computer's SATA port, if you have one free, or to USB if it must be (I am using that setup as well; it works great).
>
> Best,
> Sebastian
>
>
> On Feb 5, 2018, at 11:24 AM, 'Toby Boyd' via Discuss <dis...@tensorflow.org> wrote:
>
>> I believe S3 is supported, and it would not matter where you're calling it from, assuming you have the correct permissions and such set up. That said, I suspect this would be really slow, although I am just guessing. If your dataset is 1.5 TB and you do, say, 100 epochs (without local caching), you would be moving 150 TB of data. When I am doing work on AWS, I normally use EFS (mounted as a local disk) or local disk. I could very well be wrong, but I am not sure S3 would be a great option. Disclosure: I have used EFS a good bit; I do not have any direct S3 experience.
>>
>> Toby
>>
>> On Sun, Feb 4, 2018 at 4:06 PM, G Reina <gre...@eng.ucsd.edu> wrote:
>> I've got a 1.5 TB dataset on AWS S3. Is it possible to just point my local installation of TensorFlow to the S3 file in order to train the model on my local machine? Or, can I only do this from an EC2 instance?
>>
>> Thanks.
>> -Tony
>>
>>