sync problem: FileNotFoundException ... (Not a directory)

TimOnGoogle

Dec 16, 2009, 2:28:43 PM
to JetS3t Users
Hi all...

So I'm trying to synchronize down a bucket from S3, and there seems to
be one problem bucket. I keep getting "FileNotFoundException [path of
file or directory] (Not a directory)".

I noticed there were a bunch of files that had the same names as
directories in the bucket and in subdirs, left over from some previous
upload, possibly by disparate tools. I removed all of those, as a
test, in the folder that I was seeing complaints within, and started
seeing the above error in the same folder, but now, instead of seeing
it for the folder itself, I was seeing it for a file WITHIN the
folder.

What typically causes these problems? Here's the command:

synchronize.sh --properties .s3.properties DOWN mybucket mypath

where the properties are (besides keys):

acl=PUBLIC_READ
uploads.storeEmptyDirectories=true

and the full exception:

ERROR [org.jets3t.service.multithread.S3ServiceMulti$ThreadGroupManager] A thread failed with an exception. Firing ERROR event and cancelling all threads
java.io.FileNotFoundException: /prod-cdn/buckets/qa-microsite.leapfrog.com/gaming_ca/AC_RunActiveContent.js (Not a directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
    at org.jets3t.service.multithread.DownloadPackage.getOutputStream(DownloadPackage.java:162)
    at org.jets3t.service.multithread.S3ServiceMulti$DownloadObjectRunnable.run(S3ServiceMulti.java:2086)
    at java.lang.Thread.run(Thread.java:619)
Exception in thread "main" java.io.FileNotFoundException: /prod-cdn/buckets/qa-microsite.leapfrog.com/gaming_ca/AC_RunActiveContent.js (Not a directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
    at org.jets3t.service.multithread.DownloadPackage.getOutputStream(DownloadPackage.java:162)
    at org.jets3t.service.multithread.S3ServiceMulti$DownloadObjectRunnable.run(S3ServiceMulti.java:2086)
    at java.lang.Thread.run(Thread.java:619)

James Murty

Dec 17, 2009, 1:14:44 AM
to jets3t...@googlegroups.com
Hi,

It looks like you are downloading objects with file hierarchies in the key name but Synchronize is not creating all of the containing directories.

Synchronize, and all JetS3t tools, should automatically create directory hierarchies if necessary when downloading files. This makes me wonder which version of JetS3t you are using. This may be a bug that was fixed a while back; are you using the latest version, 0.7.1?

Another possibility is that Synchronize is confused about some items in your bucket and is creating a file when it should be creating a directory. This could certainly happen if you have used other tools to put objects in the bucket.

You can check this by taking the file path mentioned in one of the errors (/prod-cdn/buckets/qa-microsite.leapfrog.com/gaming_ca/AC_RunActiveContent.js) and following this path on your computer to see whether one of the subdirectory items has been created as an empty file by mistake.
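The check described above can also be scripted. The following is a minimal Java sketch, not part of JetS3t: the class name BlockingFileFinder and its method are made up for illustration, and it uses only the standard java.io.File API. It walks up from the path named in the error and reports the first parent that exists as a regular file where a directory is expected.

```java
import java.io.File;

public class BlockingFileFinder {
    /**
     * Walks from the parent of the given path up towards the root and returns
     * the first ancestor that exists as a regular file, or null if none does.
     * (Hypothetical helper for illustration; not a JetS3t API.)
     */
    static File findBlockingFile(File target) {
        for (File p = target.getAbsoluteFile().getParentFile(); p != null; p = p.getParentFile()) {
            if (p.isFile()) {
                return p;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Defaults to the path from the error message; pass another path as the first argument.
        File target = new File(args.length > 0 ? args[0]
            : "/prod-cdn/buckets/qa-microsite.leapfrog.com/gaming_ca/AC_RunActiveContent.js");
        File blocker = findBlockingFile(target);
        System.out.println(blocker == null
            ? "No parent of " + target + " is a regular file"
            : "Regular file where a directory is expected: " + blocker);
    }
}
```

If a blocking file is reported, it is most likely a directory placeholder object that was downloaded as an empty file.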

Also, if by any chance you uploaded files using the Transmit program made by Panic, you should add the "filecomparer.ignore-panic-dir-placeholders" property to your Synchronize properties and set it to true. This instructs JetS3t programs to ignore directory placeholder objects.

Hope this helps,
James

TimOnGoogle

Dec 18, 2009, 1:38:17 PM
to JetS3t Users
Hi James...

I was using 0.7.1, and saw that some folks had used the latest from
CVS to get things working, so I tried that, too, and it didn't work
either. I enabled 's3service.ignore-exceptions-in-multi', so it
finished, but generated a ton of those Exceptions I mentioned.

Files have been uploaded with jets3t Synchronize, but also with
BucketExplorer by some other people at work here. I think this
particular hierarchy should have been uploaded entirely by jets3t, but
I'm not sure.

I can download the whole bucket *with* BucketExplorer, incidentally,
with no problems.

I just checked, and all the "Not a directory" problems are being
generated because of a single directory that Synchronize downloaded as
a file. When looking at the bucket with some Windows GUI tools
(BucketExplorer, Cloudberry Explorer), the problem directory/file
('gaming_ca') shows up as a directory, no errors.

I'm wondering what jets3t is doing that it sees the 'gaming_ca'
directory as a file, but also as a directory.

- Tim

James Murty

Dec 18, 2009, 10:17:05 PM
to jets3t...@googlegroups.com
Hi,

Since S3 does not have any concept of a directory as opposed to a piece of data, any tool that stores directory structures in S3 must find a way to represent a directory with a data object. Of course, since there is no recommended way to do this, every tool does something different, which leads to the problems you're seeing.

JetS3t uses object metadata to tag an S3 object which represents a directory. Other tools tend to use naming conventions in the object names, such as by ending with a "/" slash character (Transmit) or with the "_$folder$" suffix (S3 Organizer Firefox extension). In many cases it isn't strictly necessary to represent directories explicitly since they can be inferred from hierarchical object names that contain slash characters, like "gaming_ca/another_subdir/myfile.txt". However, to ensure that empty directories are synchronized with S3 a directory place-holder object is required.

Your object corresponding to the 'gaming_ca' directory must have been created with a tool that uses a strategy that JetS3t doesn't recognize, so it is downloading it as an empty file instead of creating a directory with that name.
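To illustrate why an empty file in place of a directory produces this particular exception, here is a small Java sketch that reproduces the situation in a temporary folder. It does not use JetS3t at all; the class name NotADirectoryDemo is made up, and the file names are simply borrowed from the error message.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

public class NotADirectoryDemo {
    /**
     * Creates an empty file named "gaming_ca", then tries to open an output
     * stream for a path beneath it. Returns the resulting exception message,
     * or null if no FileNotFoundException was thrown.
     */
    static String tryWriteBeneathFile() throws IOException {
        File root = Files.createTempDirectory("sync-demo").toFile();
        File placeholder = new File(root, "gaming_ca");
        placeholder.createNewFile(); // stands in for a placeholder downloaded as an empty file
        File child = new File(placeholder, "AC_RunActiveContent.js");
        try (FileOutputStream out = new FileOutputStream(child)) {
            return null; // only reached if the stream could be opened
        } catch (FileNotFoundException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryWriteBeneathFile());
    }
}
```

On Linux and similar systems the message typically ends with "(Not a directory)", matching the stack trace earlier in the thread; the wording is platform-dependent and may differ elsewhere.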

If you can provide me with the naming strategy or metadata tags that are present on this object I may be able to update JetS3t to correctly recognize this approach in future, but for now your best option may be to remove the 'gaming_ca' object that is confusing Synchronize.

Synchronize will cope OK if there are missing place-holder directory objects since it can infer the directory names from the hierarchical object names that I mentioned above. That is why it realizes that there should be a 'gaming_ca' directory even though it didn't recognize the 'gaming_ca' directory place-holder object.
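As a rough illustration of inferring directories from hierarchical object names, the Java sketch below collects every directory implied by the slashes in a list of keys and flags any key whose name is also one of those directories. This is not JetS3t's actual implementation; the class and method names are hypothetical, and the sample keys are only examples.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class KeyHierarchy {
    /** Collects every directory path implied by slash characters in the key names. */
    static Set<String> impliedDirectories(List<String> keys) {
        Set<String> dirs = new TreeSet<>();
        for (String key : keys) {
            int slash = key.indexOf('/');
            while (slash >= 0) {
                if (slash > 0) {
                    dirs.add(key.substring(0, slash));
                }
                slash = key.indexOf('/', slash + 1);
            }
        }
        return dirs;
    }

    /** Returns keys that are objects in their own right but also act as a parent directory of other keys. */
    static Set<String> collidingKeys(List<String> keys) {
        Set<String> dirs = impliedDirectories(keys);
        Set<String> collisions = new TreeSet<>();
        for (String key : keys) {
            if (dirs.contains(key)) {
                collisions.add(key);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "gaming_ca",                            // placeholder object with no trailing slash
            "gaming_ca/another_subdir/myfile.txt",
            "readme.txt");
        System.out.println("Implied directories: " + impliedDirectories(keys));
        System.out.println("Keys that are both object and directory: " + collidingKeys(keys));
    }
}
```

A key flagged by collidingKeys is a candidate for the kind of unrecognized placeholder discussed here: unless the tool identifies it as a directory, it would be written to disk as a file with the same name as a directory that must also exist.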

Whew, that turned into a novel. Hopefully it helped clarify what is going on...

James

TimOnGoogle

Dec 21, 2009, 2:51:50 PM
to JetS3t Users
Thank you, James, for that very complete answer! I am wondering what I
ought to do to fix this in place. The "directory" in question doesn't
appear to have any properties, metadata, or related files that seem
different than other directories in the hierarchy. But I will check
again to make absolutely sure. There might have been something I missed.

- Tim

TimOnGoogle

Dec 21, 2009, 9:59:05 PM
to JetS3t Users
Well, I don't know why this was happening, because other buckets that
have "placeholder" keys for directories don't make Synchronize have
these errors.

Is there some requirement that the placeholder key end in a slash, if
there are other keys that have that key as parent? E.g. if there is a
key "mydir" and a key "mydir/fred", will that confuse Synchronize?

It might have been that in the problem bucket, directory placeholder
keys were not always terminated with a '/'. Not sure though. In any
case, I used Cockpit to *remove* all directory placeholder keys that
had child keys, and that seems to have fixed it. All other buckets
are syncing down properly. We haven't yet tried syncing up - there's
lots of data we are afraid of losing, so we'll need to make sure to
back up our copy of what's in our production buckets before trying a
sync up.

Thanks for all your help - it pointed me in the right direction, tho'
I'm not sure what specifically was wrong.

Does Synchronize get confused if *some* directories have placeholder
keys, and some don't?

- Tim

James Murty

Dec 21, 2009, 10:29:39 PM
to jets3t...@googlegroups.com
Hi,

If the directory placeholder objects were created by a JetS3t application they will have been recognized as such and handled appropriately. My best guess about what happened is that the problematic placeholder object was not created by a JetS3t application whereas the others were.

You can recognize placeholder objects created by JetS3t because they have a Content-Type of "application/x-directory".
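The conventions mentioned in this thread can be summarized in a short Java sketch. It is only an illustration, not JetS3t's actual detection logic: the class and method names are hypothetical, and it checks just the three markers named in the discussion (the "application/x-directory" Content-Type, the "_$folder$" suffix, and a trailing slash).

```java
public class PlaceholderConventions {
    /**
     * Returns a label for the first directory-placeholder convention the
     * object matches, or null if it matches none of the three checked here.
     * (Hypothetical helper for illustration; not a JetS3t API.)
     */
    static String matchConvention(String key, String contentType) {
        if ("application/x-directory".equals(contentType)) {
            return "JetS3t Content-Type";
        }
        if (key.endsWith("_$folder$")) {
            return "_$folder$ suffix";
        }
        if (key.endsWith("/")) {
            return "trailing slash";
        }
        return null;
    }

    public static void main(String[] args) {
        // A placeholder tagged the JetS3t way.
        System.out.println(matchConvention("gaming_ca", "application/x-directory"));
        // A hypothetical object with a generic Content-Type and no naming marker:
        // nothing identifies it as a directory, so a tool would treat it as a file.
        System.out.println(matchConvention("gaming_ca", "application/octet-stream"));
    }
}
```

An object for which this returns null carries no hint that it represents a directory.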

It is unfortunate that your problematic placeholder object didn't have any obvious metadata or naming characteristics to indicate that it was a directory rather than a file. If such information were available I could add some smarts to JetS3t to help it recognize and correctly treat directory placeholder objects created by other applications. Anyhow, I'm glad your sync is working now.

Remember that you can test a synchronize command without causing any damage by including the "--noaction" option on the command line. However, having a safe backup is always the best course of action. A quick way to create a backup would be to use Cockpit's "Copy or Move" feature to copy your production objects into a second backup bucket. You could even do this and test against the copied objects rather than the originals, to be absolutely sure before running a sync against your production data.

Good luck with it,
James


TimOnGoogle

Dec 22, 2009, 2:27:47 PM
to JetS3t Users
Thanks again for all your help, James.
I should definitely have been more thorough in my check for metadata
like Content-Type - I wasn't sure which metadata I should have been
looking for (and should have asked), so as to give you more info. If
this happens again, now I'll know what to look for.

Thanks also for your suggestions for safe up-syncing. Best technical
support evvaar! :-)

- Tim

TimOnGoogle

Dec 23, 2009, 3:12:52 PM
to JetS3t Users
James...

Ok, I found another situation where this happened, and it was, as you
surmised, because Content-Type was set to application/octet-stream.
I changed it to application/x-directory and it worked.
Trailing slash or not didn't seem to make a difference in the way
jets3t interpreted a "directory".

- Tim
