cfdirectory very slow against S3 resource


Andrew - CFD

Nov 7, 2013, 5:10:29 PM
to ra...@googlegroups.com


Hi,

I'm finding that using the DirectoryExists function against an S3 bucket to check whether a folder exists is very slow: anywhere from 30 to 60 seconds. The line of code in question looks like this:

<cfif DirectoryExists("/s3bucketmapping/foldername/")>

Here, s3bucketmapping is an S3 mapping set up in the Railo web admin for that site.

I have tried just pulling this single line out into a cfoutput and it does the same.

Am I doing something wrong? Is there another function I should be using to do this instead?

Thanks.

Andrew.

Andrew - CFD

Nov 8, 2013, 4:39:20 PM
to ra...@googlegroups.com
Hi All,

After a bit more playing about, I have found that it seems to be related to the amount of content within the folder. If I create a new empty folder it is nice and quick; however, when tested against a folder that has 1000s of objects in it, it takes considerably longer. I'm not sure if this means there is an implementation issue within Railo or if it is just the way the S3 web service works. Anyone got any ideas?

Kind regards,

Andrew.

Andrew - CFD

Nov 8, 2013, 5:38:24 PM
to ra...@googlegroups.com
More reading done, and it would appear that, unknown to me, S3 is a flat file system in which the idea of "directories" doesn't really exist, despite there being an option on the AWS S3 console for "Create Folder". You can, however, create a feeling of "directories" by prefixing the file name with a slash-separated path, e.g. if you want image.jpg to appear under a directory called photos you would simply make the object name "/photos/image.jpg". Given that, does this explain why Railo takes longer with bigger directories? Is it doing a request to AWS S3 for [directory-name]/* when you use the DirectoryExists function, and therefore, with a very large directory, taking a very long time because it has to wait for S3 to return a list of all the available objects?
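
To make the flat-namespace idea concrete, here is a minimal Python sketch that models a bucket as nothing more than a set of key strings. It is an in-memory illustration only and says nothing about how Railo or S3 is actually implemented; `bucket` and `directory_exists` are made-up names.

```python
# In-memory model of a bucket: just a set of object keys, no real directories.
bucket = {
    "photos/image.jpg",
    "photos/2013/holiday.jpg",
    "docs/readme.txt",
}


def directory_exists(keys, path):
    """A "directory" exists only if at least one key starts with its prefix."""
    prefix = path.strip("/") + "/"
    # any() stops at the first match, so the check does not need to visit
    # every object stored under the prefix.
    return any(key.startswith(prefix) for key in keys)


print(directory_exists(bucket, "/photos/"))  # True
print(directory_exists(bucket, "/videos/"))  # False

# "Creating" a directory is just writing a key that carries the prefix.
bucket.add("videos/clip.mp4")
print(directory_exists(bucket, "/videos/"))  # True
```

In this model an existence check only needs one matching key, so in principle it should not have to wait for a listing of everything under the "directory".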

Andrew.

Alex Skinner

Nov 9, 2013, 2:53:21 AM
to ra...@googlegroups.com

The question is what you are doing with the files. One approach, rather than proactively checking that the file exists in this way, is to keep a separate cache of existing assets which you update asynchronously via a scheduled task or something similar. Another is to code for failure: if you believe the file exists, try to use it, catch the error, and deal with it if it doesn't. Understanding the logic, how often you use the files, and how likely (and under what circumstances) a file is to be missing is important. But the key thing you've noticed is that S3 is exposed as if it were a local resource, yet you need to treat it as a slow remote resource.
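
The "code for failure" idea can be sketched roughly as follows, in Python. `FakeStore`, `ObjectMissing` and `load_or_default` are hypothetical names standing in for whatever remote store and error type are really in use; the point is only that the read is attempted directly and the miss is handled afterwards, so there is no separate existence check.

```python
class ObjectMissing(Exception):
    """Raised by the stand-in store when a key is absent."""


class FakeStore:
    """Hypothetical stand-in for a slow remote store; counts round trips."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.calls = 0

    def read(self, key):
        self.calls += 1  # one simulated remote request
        try:
            return self.objects[key]
        except KeyError:
            raise ObjectMissing(key) from None


def load_or_default(store, key, default):
    """Code for failure: attempt the read, handle the miss afterwards."""
    try:
        return store.read(key)
    except ObjectMissing:
        return default


store = FakeStore({"photos/image.jpg": b"jpeg-bytes"})
print(load_or_default(store, "photos/image.jpg", b""))    # b'jpeg-bytes'
print(load_or_default(store, "photos/missing.jpg", b""))  # b''
print(store.calls)                                        # 2
```

Each lookup costs exactly one simulated request, whether the object is there or not.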

Thanks

A

--
Did you find this reply useful? Help the Railo community and add it to the Railo Server wiki at https://github.com/getrailo/railo/wiki
---
You received this message because you are subscribed to the Google Groups "Railo" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/railo/09980066-53e5-45c3-8b1c-09dcc881b4e6%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Nando Breiter

Nov 9, 2013, 6:22:07 AM
to ra...@googlegroups.com
Andrew,

Untested idea, don't know if it is possible: what about checking whether a file exists with the name "/photos/*", for example? If wildcard or regex patterns are allowed, that might return much more quickly.

If they are not allowed, then you might hack your way around this by always uploading, for example, a test.gif file to a newly created S3 directory, then testing for the existence of the file /directoryPath/test.gif to check whether the directory exists.



Aria Media Sagl
Via Rompada 40
6987 Caslano
Switzerland

+41 (0)91 600 9601
+41 (0)76 303 4477 cell
skype: ariamedia



Alex Skinner

Nov 9, 2013, 6:26:58 AM
to ra...@googlegroups.com

That won't help, because there is no such thing as directories. It's slow because it has to do a scan of the whole bucket; directories are just faked paths. You can't change the performance characteristics of S3; you can only code around them.

Thanks

Alex

Andrew - CFD

Nov 9, 2013, 2:49:10 PM
to ra...@googlegroups.com
Hi Alex and Nando,

OK, I get all of that, so thanks for the replies. However, maybe you can answer me this: if there is no such thing as directories on S3, why, if I attempt to copy a file to a "path" that doesn't exist, do I get the error "No such file or directory"?

Surely if there are no directories I should literally be able to specify ANY path and it shouldn't matter, it should just create the relevant object on S3.

Kind regards,

Andrew.

Alex Skinner

Nov 10, 2013, 2:38:03 AM
to ra...@googlegroups.com

Hi, I've never used the Railo resource mapping or the equivalent in ACF. I'd try http://amazons3.riaforge.org/ and see if you get what you expect, then go from there. If you think it's a bug, it's worth raising it with Micha.

Andrew - CFD

Nov 10, 2013, 4:04:58 AM
to ra...@googlegroups.com
Thanks for the link but that shouldn't be required as Railo has S3 support built in.

Alex Skinner

Nov 10, 2013, 10:28:43 AM
to ra...@googlegroups.com

I'm aware of that, but if you are having problems with the built-in wrapper, working out whether it behaves the same way with the REST API is a good plan. With this you can see what's going on, or you can debug the Java source for the wrapper.

Andrew - CFD

Nov 10, 2013, 3:00:09 PM
to ra...@googlegroups.com
Good thinking, I will give it a try. I also think I have found another issue: when you attempt to delete an object, it appears to cause a Java heap space error and is also extremely slow. I will have to find time to play with that wrapper and see if I get different results.

Kind regards,

Andrew.

Bilal

Nov 10, 2013, 8:30:09 PM
to ra...@googlegroups.com
Folks:
We have worked quite a bit with S3 and found many quirks with both the ACF and Railo libs. In the end we went native to solve them.
This is not unique to Railo though. We had the same issue with C#.

The directory browsing business sounds similar to a missing-delimiter problem in the library call. We experienced this problem on several platforms.
S3 is a case-sensitive, complex key/value object store whose structured key names can be used to mimic directories. It does not behave exactly like a file system. So some wrappers translate it by first scanning the whole key/value store for keys, then parsing the keys, and then presenting you with the portion that would correspond to your directory.

S3 can quickly retrieve only the keys that correspond to your path, and ignore everything else, if you supply both a "prefix" and a "delimiter". We found that when we left off the "delimiter" in our calls, we saw the behavior you are describing.
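
As a rough illustration of what the "prefix" and "delimiter" parameters change, here is a small Python model of a listing over a flat key space. `list_objects` is a made-up function, not a real SDK call, and it only shows what comes back from a listing; the real service does this filtering on the server side.

```python
def list_objects(keys, prefix="", delimiter=None):
    """Rough model of a prefix/delimiter listing over a flat key space.

    Returns (contents, common_prefixes): keys directly under the prefix, and
    one rolled-up entry per "subdirectory" when a delimiter is given.
    """
    contents, common_prefixes = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything below the next delimiter into one entry.
            rolled = prefix + rest.split(delimiter, 1)[0] + delimiter
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
        else:
            contents.append(key)
    return contents, common_prefixes


keys = [
    "photos/a.jpg",
    "photos/2013/b.jpg",
    "photos/2013/c.jpg",
    "docs/readme.txt",
]

# Prefix only: every key below photos/ comes back, however deep it is.
print(list_objects(keys, prefix="photos/"))
# Prefix and delimiter: nested keys collapse into a single common prefix.
print(list_objects(keys, prefix="photos/", delimiter="/"))
```

With the delimiter, a "subdirectory" holding thousands of objects would come back as one entry rather than thousands, which fits the difference in response size described above.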

Thus, I am making a guess, but it could be that the Railo wrapper is also implemented in such a way that it leaves the "delimiter" out of the call. It may need to first scan all keys to make them case insensitive.

I think there is an alternate S3 wrapper on Riaforge. You can try to use it to see whether you get the same response times.

HTH,
Bilal


Sam Jay

Jan 19, 2014, 11:58:39 PM
to ra...@googlegroups.com

I tried the directoryExists() function on two c3.large AWS EC2 instances, both fresh, identical servers.

Railo took close to 20k tick counts to return, while Adobe ColdFusion took less than 100 every time.
I tested the same against different Railo servers outside AWS and they were all far too slow.

1. My “folders” are quite large. I did not test with empty buckets. Also, as Andrew mentioned before, this happens on big buckets, so is it fair to assume Railo is downloading the entire inventory on every single directory call? If that is the case, it is quite DANGEROUS, as it can rack up the S3 bill.

 

2. With ACFM, I can write files into new S3 “folders” directly. In other words, if the destination does not exist when we copy a file, it creates that path. Railo, on the other hand, throws a "destination directory does not exist" error. I do not know exactly how many requests ACFM made to S3 when writing a file and creating a directory at the same time. It is generally better to minimize the number of calls to S3, since every call costs money. In my case, 100 threads copy 100 files into S3, into different folders. With Railo I had to add “if not directoryExists(), then create directory” to my code.

 

3. We could move to the AWS Java SDK, but having this functionality inside CFM makes things very easy. For example, we can zip a file into S3 in a single step.


Thanks

- Sam
