access denied to bucket 'aws-publicdatasets'

Yuhan Zhang

Feb 25, 2013, 2:05:10 PM
to common...@googlegroups.com
hi all,

I just discovered this public dataset today, but I get access denied when I try to list its contents:

s3cmd ls s3://aws-publicdatasets/common-crawl/parse-output/
ERROR: Access to bucket 'aws-publicdatasets' was denied
s3cmd ls  s3://aws-publicdatasets/common-crawl/crawl-001/
ERROR: Access to bucket 'aws-publicdatasets' was denied
s3cmd ls s3://aws-publicdatasets/common-crawl/crawl-002/
ERROR: Access to bucket 'aws-publicdatasets' was denied

The error message shows a 403 when I try the same thing through hadoop fs:

hadoop fs -ls s3://aws-publicdatasets/common-crawl/parse-output/segment/
ls: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/common-crawl%2Fparse-output%2Fsegment' - ResponseCode=403, ResponseMessage=Forbidden

I can list other public S3 buckets correctly, e.g. s3://datasets.elasticmapreduce/ngrams/books/

Did I miss some setup needed to access the content?


Thank you.

Yuhan

Mat Kelcey

Feb 25, 2013, 2:11:35 PM
to common...@googlegroups.com
hmmm. i didn't think there was any setup to do.

i just ran this on a totally vanilla fresh ec2 instance and had no problems

$ s3cmd ls s3://aws-publicdatasets/common-crawl/parse-output/ | head -n3
                       DIR   s3://aws-publicdatasets/common-crawl/parse-output/checkpoint_staging/
                       DIR   s3://aws-publicdatasets/common-crawl/parse-output/checkpoints/
                       DIR   s3://aws-publicdatasets/common-crawl/parse-output/segment/

???




Yuhan Zhang

Feb 25, 2013, 9:55:07 PM
to common...@googlegroups.com
hi Mat,

I added the bucket to my IAM account, and I can list it now:
s3://aws-publicdatasets/common-crawl/

It is kind of strange, as I was able to access other public buckets without this configuration.

thanks for the help :)

Yuhan
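
For anyone who lands here with the same 403: the thread does not say exactly what was added to the IAM account, so the following is only a sketch of one plausible fix. It assumes the requests were signed with IAM user credentials whose policy did not allow this bucket, and it builds a read-only policy document for it. The function name build_read_only_policy is hypothetical, and the choice of actions (s3:ListBucket, s3:GetObject) is an assumption, not something confirmed by the original poster.

```python
import json

# Bucket name taken from the thread; everything else here is illustrative.
BUCKET = "aws-publicdatasets"


def build_read_only_policy(bucket: str) -> dict:
    """Return an IAM policy document allowing list and read on one bucket.

    Hypothetical helper: the exact policy used in the thread is unknown.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing keys is granted on the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading objects is granted on the keys under the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that could be pasted into an IAM user or group policy.
    print(json.dumps(build_read_only_policy(BUCKET), indent=2))
```

The split into two statements is deliberate: listing applies to the bucket resource, while reading applies to the objects inside it, so the two actions need different Resource values.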
