DirectoryExists() on S3 buckets is extremely slow on Lucee5


Todd Kingham

Jul 11, 2016, 7:36:18 PM
to Lucee
Hey Everyone,

I have an application that I use to write files to an S3 bucket. The bucket itself, admittedly, has quite a lot of folders and subfolders nested within it. When a page request is made within my application we check to see if the folder exists and create it if it doesn't. Pretty basic code.

------------------------------------------------
if(! directoryExists('/S3Mapping')){
    directoryCreate('/S3Mapping');
}
------------------------------------------------

When running this code on Lucee 4.5 the operation takes about 1.2 seconds to complete. On Lucee 5 it takes roughly 10x as long: 13 seconds. I've done some testing, and the time it takes appears to grow exponentially with the size/complexity of the bucket's directory structure. I understand that the more folders and subfolders a bucket has, the more Lucee has to do to traverse the path. However, this was (and still is) working much more efficiently on 4.5, so the bucket size alone can't be the issue. Has anyone had similar issues? Any ideas?

Cheers,

Bilal

Jul 13, 2016, 12:19:08 PM
to Lucee

Todd:

When I worked with S3, I found many quirks in both the ACF and Railo/Lucee libraries. In the end we went native to solve them. This is not unique to Lucee, though; we had the same issues with C#.

S3 is a case-sensitive key/value object store whose structured key names can be used to mimic directories. It does not behave exactly like a file system.

To find out whether a directory exists, some wrappers first scan the whole key/value store for keys, then parse the keys, and then check the portion that would correspond to your directory.

Even here there are quite a few performance differences depending on how you set up the API call. For example, if you omit the "prefix" and/or "delimiter" parameters, the call takes substantially longer.
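To make the "prefix" and "delimiter" idea concrete, here is a minimal Java sketch. It does not call S3 at all: the bucket is simulated with a sorted in-memory set of keys, and the class name S3ListingSketch, the list() and directoryExists() helpers, and the sample keys are all made up for illustration. It only shows how "folders" fall out of flat key names once a prefix and a delimiter are supplied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical, in-memory stand-in for a bucket; not the S3 API itself.
class S3ListingSketch {

    /** Result of one simulated listing: plain keys plus rolled-up "folder" prefixes. */
    static final class Listing {
        final List<String> keys = new ArrayList<>();
        final TreeSet<String> commonPrefixes = new TreeSet<>();
    }

    /** Lists keys under prefix, rolling deeper levels up at the first delimiter. */
    static Listing list(TreeSet<String> bucket, String prefix, String delimiter) {
        Listing out = new Listing();
        // Keys are sorted, so all matches are contiguous starting at the first key >= prefix.
        for (String key : bucket.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // past the end of the matching range
            }
            String rest = key.substring(prefix.length());
            int cut = delimiter.isEmpty() ? -1 : rest.indexOf(delimiter);
            if (cut >= 0) {
                // Something deeper than one level: report only the "subfolder" name.
                out.commonPrefixes.add(prefix + rest.substring(0, cut + delimiter.length()));
            } else {
                out.keys.add(key);
            }
        }
        return out;
    }

    /** A "directory" exists if at least one key starts with "dir/". */
    static boolean directoryExists(TreeSet<String> bucket, String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        String first = bucket.ceiling(prefix); // smallest key >= prefix, or null
        return first != null && first.startsWith(prefix);
    }

    public static void main(String[] args) {
        TreeSet<String> bucket = new TreeSet<>(Arrays.asList(
                "photos/2016/a.jpg", "photos/2016/b.jpg",
                "photos/2017/c.jpg", "photos/readme.txt", "videos/x.mp4"));

        Listing l = list(bucket, "photos/", "/");
        System.out.println("keys: " + l.keys);
        System.out.println("folders: " + l.commonPrefixes);
        System.out.println(directoryExists(bucket, "photos/2016"));
        System.out.println(directoryExists(bucket, "photos/2015"));
    }
}
```

With a prefix the lookup only touches the keys under that "folder"; without one, every key in the bucket has to be returned and inspected.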

 

All that said, I did take a peek at the Lucee 5 vs. Lucee 4 code, and here is what I see:

In Lucee 4, the implementation passes a more comprehensive set of parameters to the AWS S3 HTTP API. The S3 API call is implemented by the Lucee team:

e.g.

return s3.listContents(name, prefix, marker, maxKeys);

==> goes through HTTP

In Lucee 5 it looks like the open source jets3t library is used, so the Lucee team is no longer maintaining its own S3 client. The call actually lists the full content of the bucket, like so:

listObjects(bucketName);

Then the list is filtered as needed. The exists() call only filters down and checks whether the result is null:

return get(bucketName, objectName,includePseudoFolder)!=null;

 

 

The better alternative today would be to use a newer method exposed by the AWS Java SDK's AmazonS3 interface that checks for an object's existence: doesObjectExist()

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3.html
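To illustrate why "list everything, then filter" scales so badly compared with a targeted check, here is a rough Java sketch. It is not the Lucee or jets3t code and it makes no S3 calls: the bucket is a sorted in-memory set, and ExistsCostSketch, existsByFullScan(), existsByPrefix() and the keysExamined counter are hypothetical names used only to count how many keys each approach has to look at.

```java
import java.util.TreeSet;

// Hypothetical cost comparison on a simulated bucket; not the Lucee/jets3t implementation.
class ExistsCostSketch {

    /** How many keys the last call had to look at (reset it before each call). */
    static int keysExamined = 0;

    /** Models "fetch the full listing, then filter": every key is visited. */
    static boolean existsByFullScan(TreeSet<String> bucket, String dir) {
        String prefix = dir + "/";
        boolean found = false;
        for (String key : bucket) {
            keysExamined++; // the whole listing is retrieved before filtering
            if (key.startsWith(prefix)) {
                found = true;
            }
        }
        return found;
    }

    /** Models a prefix-bounded request: only the first candidate key is inspected. */
    static boolean existsByPrefix(TreeSet<String> bucket, String dir) {
        String prefix = dir + "/";
        String first = bucket.ceiling(prefix); // smallest key >= prefix, or null
        keysExamined++;
        return first != null && first.startsWith(prefix);
    }

    public static void main(String[] args) {
        TreeSet<String> bucket = new TreeSet<>();
        for (int i = 0; i < 1000; i++) {
            bucket.add(String.format("dir%04d/file.txt", i));
        }

        keysExamined = 0;
        boolean a = existsByFullScan(bucket, "dir0500");
        System.out.println(a + " after examining " + keysExamined + " keys");

        keysExamined = 0;
        boolean b = existsByPrefix(bucket, "dir0500");
        System.out.println(b + " after examining " + keysExamined + " keys");
    }
}
```

The full scan's cost grows with the total number of keys in the bucket, while the targeted check stays flat, which would be consistent with the slowdown getting worse as the bucket grows.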

 

So in short, you can create a ticket or implement your own call. 

For speed, it may be advisable to look at a simpler alternative S3 wrapper and make the calls to S3 yourself.

I think there is an alternate S3 wrapper on Riaforge. You can try to use it to see whether you get the same response times.

 

HTH,

Bilal

kinghamh...@gmail.com

Jul 13, 2016, 6:35:32 PM
to Lucee
Thanks for the reply Bilal,

I think we will be moving towards using the Java SDK and building our own interface. We need to do that for some other AWS services anyway (DynamoDB and SQS), so this will be a more consistent (and maintainable) approach.

Thanks again :)

JP

Feb 1, 2017, 12:39:29 PM
to Lucee
Just created a new bug:


Watch/Vote... hope this helps.