Check if a folder exists?

Tomas Zulberti

Dec 14, 2009, 9:43:31 AM
to boto-users
Hi. I am using boto to access my files on S3. At a given moment I
want to check if a folder exists. For example,
the path is: "s3n://test_bucket/folder/". When I use S3 Firefox
Organizer, I see an empty folder.

How can I check if that folder exists using boto? I tried:
- bucket.get_key('folder/')
- bucket.get_key('folder')

Both return None. I also tried:
- bucket.list('folder/')

The problem with that is that when the folder doesn't exist it
returns an empty iterable... For example, when using:
- bucket.list('folder/this_sub_folder_doesnt_exits')
and
- bucket.list('folder/')
both return the same thing: an object that has no elements.

So the question is: how can I check if a folder exists?

Thanks in advance, and sorry for my bad English,
Tomas Zulberti

PS: I am using boto-1.8d

Chris Moyer

Dec 14, 2009, 10:35:03 AM
to boto-...@googlegroups.com
S3 doesn't have a concept of "Folders" as you put it. It only has a
concept of being able to filter based on the name of the key. The only
real way to create a so-called "folder" is to actually make a key
under that path.
--
Chris Moyer

Maarten Koopmans

Dec 14, 2009, 12:03:05 PM
to boto-...@googlegroups.com
Yep, and if you want to do resource discovery or locking, you might want to consider something like Apache ZooKeeper for that.

--Maarten

Tomas Zulberti

Dec 15, 2009, 2:11:22 PM
to boto-users


On Dec 14, 12:35 pm, Chris Moyer <koper...@gmail.com> wrote:
> S3 doesn't have a concept of "Folders" as you put it. It only has a
> concept of being able to filter based on the name of the key. The only
> real way to create a so-called "folder" is to actually make a key
> under that path.
>

I know that folders don't exist in S3. The question is how to check
if the key exists, because when I use
- bucket.get_key('folder/')
- bucket.get_key('folder')

they both return None, meaning that the key doesn't exist... I don't
know how to check if that key exists.

Thanks in advance,
Tomas Zulberti

mARK bLOORE

Dec 15, 2009, 3:02:19 PM
to boto-...@googlegroups.com
did you in fact create a key called "folder" or "folder/"? creating a
key called "folder/stuff" makes only that one key; it does not also
make one called "folder". the entire string is the key name. there
are tricks available to let you treat strings with certain characters
in them (such as "/", or anything else you want to use) rather like
hierarchical path names, but the keys are still complete strings.
there is no structure implied.
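
To illustrate the point about key names being complete strings, here is a minimal sketch in plain Python that involves no S3 at all. The list `keys` and both helper functions are hypothetical stand-ins: an exact-match lookup behaves roughly like get_key(), while a prefix filter behaves roughly like listing with a prefix.

```python
# Hypothetical key names standing in for the contents of a bucket.
keys = ["folder/stuff", "folder/sub/more", "other.txt"]


def key_exists(name):
    """Exact-match lookup, roughly analogous to what get_key() checks."""
    return name in keys


def any_key_with_prefix(prefix):
    """Prefix filter, roughly analogous to listing with a prefix."""
    return any(k.startswith(prefix) for k in keys)


print(key_exists("folder/"))           # no key has exactly this name
print(any_key_with_prefix("folder/"))  # but some keys start with it
```

Creating "folder/stuff" never creates "folder" or "folder/", which is why the exact-match lookup finds nothing while the prefix filter does.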
mARK bLOORE <mbl...@gmail.com>

Maarten Koopmans

Dec 15, 2009, 5:58:11 PM
to boto-...@googlegroups.com
Here's what I do: compute the SHA256 (or similar) of the file and use that as the filename inside S3.

Emulate a filesystem (really just a directory layout) in anything that can act as a hierarchical database (e.g. ZooKeeper). If something is a "directory" in the hierarchy, it contains a number of named leaves. A leaf is either another directory, or a name containing the hash key that directly points into S3.

So you have

/some/dir
/some/dir/file

with the contents of file being just the SHA256 that points into S3. This is fairly cheap to do and emulates a filesystem easily.

This is why I mentioned ZooKeeper before... this will give you znodes and concurrent access once you scale.
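
A rough sketch of that layout, with assumptions stated up front: two in-memory dicts stand in for S3 (`blob_store`) and for the hierarchical database (`directory`), and the names `make_dir`, `put_file`, `is_directory` and `read_file` are made up for illustration. A real setup would replace the dicts with S3 and something like ZooKeeper.

```python
import hashlib

blob_store = {}  # stands in for S3: SHA256 hex digest -> file contents
directory = {}   # stands in for the hierarchy: path -> digest, or None for a directory


def make_dir(path):
    """Register a directory and all of its parents in the hierarchy."""
    parts = path.strip("/").split("/")
    for i in range(1, len(parts) + 1):
        directory.setdefault("/" + "/".join(parts[:i]), None)


def put_file(path, data):
    """Store data under its SHA256 and record the path -> hash mapping."""
    digest = hashlib.sha256(data).hexdigest()
    blob_store[digest] = data
    parent = path.rsplit("/", 1)[0]
    if parent:
        make_dir(parent)
    directory[path] = digest
    return digest


def is_directory(path):
    """A directory is an entry in the hierarchy with no hash attached."""
    return path in directory and directory[path] is None


def read_file(path):
    """Follow the hash stored at path into the blob store."""
    return blob_store[directory[path]]
```

With this layout, an empty directory can exist (after make_dir) without any object being stored, which is the case that a bare key listing cannot represent.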

Another option would be one of the S3 Fuse filesystems (on Linux at least).

--Maarten

Mitchell Garnaat

Dec 15, 2009, 6:11:52 PM
to boto-...@googlegroups.com
That's an interesting idea, Maarten.

Stephen Depooter

Dec 15, 2009, 6:23:29 PM
to boto-...@googlegroups.com
On Tuesday 15 December 2009 02:11:22 pm Tomas Zulberti wrote:
> On Dec 14, 12:35 pm, Chris Moyer <koper...@gmail.com> wrote:
> > S3 doesn't have a concept of "Folders" as you put it. It only has a
> > concept of being able to filter based on the name of the key. The only
> > real way to create a so-called "folder" is to actually make a key
> > under that path.
>
> I know that folders doesn't exits in S3. The question is who to check
> if the key exists because when I use
> - bucket.get_key('folder/')
> - bucket.get_key('folder')
>

Then you need to realize that there is no key "folder" or "folder/", unless
you created it specifically.

If you want to do what Firefox S3 Organizer etc. are doing, then you need to
look at the S3 docs on listing results, specifically the prefix and
delimiter arguments.

Use "list" with appropriate delimiters and prefixes, and see if
the result set is empty or not. If it is empty, then the "folder" does not
exist.

The docstring for "list" is pasted below from current svn trunk.



----
def list(self, prefix='', delimiter='', marker='', headers=None):
    """
    List key objects within a bucket. This returns an instance of an
    BucketListResultSet that automatically handles all of the result
    paging, etc. from S3. You just need to keep iterating until
    there are no more results.
    Called with no arguments, this will return an iterator object across
    all keys within the bucket.

    :type prefix: string
    :param prefix: allows you to limit the listing to a particular
                   prefix. For example, if you call the method with
                   prefix='/foo/' then the iterator will only cycle
                   through the keys that begin with the string '/foo/'.

    :type delimiter: string
    :param delimiter: can be used in conjunction with the prefix
                      to allow you to organize and browse your keys
                      hierarchically. See:
                      http://docs.amazonwebservices.com/AmazonS3/2006-03-01/
                      for more details.

    :type marker: string
    :param marker: The "marker" of where you are in the result set

    :rtype: :class:`boto.s3.bucketlistresultset.BucketListResultSet`
    :return: an instance of a BucketListResultSet that handles paging, etc
    """
    return BucketListResultSet(self, prefix, delimiter, marker, headers)
----
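
The check described above could be sketched roughly as follows. The helper name `folder_exists` is made up for illustration; the only thing it assumes about `bucket` is that it has a list() method accepting the prefix and delimiter arguments shown in the docstring. Note that it reports a "folder" as existing only if at least one key starts with the prefix, so an empty "folder" shows up only if some placeholder key was actually created for it.

```python
def folder_exists(bucket, prefix):
    """Return True if the listing for this prefix yields at least one item.

    `bucket` is assumed to expose list(prefix=..., delimiter=...) as in
    the docstring above.
    """
    # Without the trailing slash, 'folder' would also match 'folder2/...'.
    if not prefix.endswith("/"):
        prefix += "/"
    # Only the first result matters, so return as soon as one is seen.
    for _ in bucket.list(prefix=prefix, delimiter="/"):
        return True
    return False
```

With a real connection, `bucket` would typically come from something like boto.connect_s3().get_bucket('test_bucket'), after which folder_exists(bucket, 'folder') performs the check.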
Stephen Depooter
<sb...@woot.net>