Amazon S3 is a durable, secure, simple, and fast storage service designed to make web-scale computing easier for developers. Use Amazon S3 if you need low latency or frequent access to your data. Use Amazon Glacier if low storage cost is paramount, your data is rarely retrieved, and data retrieval times of several hours are acceptable.
In the coming months, Amazon S3 will introduce an option that will allow customers to seamlessly move data between Amazon S3 and Amazon Glacier based on data lifecycle policies.
--
You received this message because you are subscribed to the Google Groups "s3ql" group.
To post to this group, send email to s3...@googlegroups.com.
To unsubscribe from this group, send email to s3ql+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/s3ql?hl=en.
On 08/23/2012 10:02 AM, Cliff Stanford wrote:
> On 21/08/12 17:44, David Prothero wrote:
>
>> Not sure if routine s3ql operations would end up incurring more "out"
>> transfers than Glacier is suited for.
>
> Has anyone looked at the API? Is it similar to the S3 one?
The upload API is trivial. It shouldn't take more than about 1-2 hours
to support it in S3QL.
Downloading data is much more complex. The API itself is very easy, but
it does not fit into the S3QL programming model at all. S3QL could
easily create download jobs for the data it needs, but it has no way to
notice when a job is ready for download. Also, when downloading data, it
has to be identified by a changing job id rather than the name it was
stored under. The same object will therefore have a different id every
time it's being downloaded. This means that the S3QL data structures
have to be extended to track the id in addition to the object identifier.
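To make the bookkeeping problem concrete, here is a minimal sketch in Python. It does not use the real Glacier API: `FakeArchiveStore`, `initiate_job`, `fetch_job_output` and `JobTracker` are hypothetical names for an in-memory store where every retrieval goes through a job with a fresh id. It also skips the hard part, which is noticing when a job has become ready.

```python
import itertools


class FakeArchiveStore:
    """Hypothetical in-memory stand-in for a job-based archive store."""

    def __init__(self):
        self._archives = {}                  # object key -> stored bytes
        self._jobs = {}                      # job id -> object key
        self._counter = itertools.count(1)

    def upload(self, key, data):
        self._archives[key] = data

    def initiate_job(self, key):
        # Every retrieval request gets a fresh job id, even for the same key.
        job_id = "job-%d" % next(self._counter)
        self._jobs[job_id] = key
        return job_id

    def fetch_job_output(self, job_id):
        # Data is addressed by job id here, not by the name it was stored under.
        return self._archives[self._jobs[job_id]]


class JobTracker:
    """Extra bookkeeping the file system would need: object key -> pending job id."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def request(self, key):
        self.pending[key] = self.store.initiate_job(key)
        return self.pending[key]

    def read(self, key):
        job_id = self.pending.pop(key)
        return self.store.fetch_job_output(job_id)


store = FakeArchiveStore()
store.upload("s3ql_data_1", b"block contents")
tracker = JobTracker(store)

first = tracker.request("s3ql_data_1")
data = tracker.read("s3ql_data_1")
second = tracker.request("s3ql_data_1")   # same object, different job id
```

The `pending` mapping is the kind of additional state meant above: the object name alone is no longer enough to fetch the data.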
Finally, any read request from userspace would block for several minutes
at least. If a request blocks that long, it's going to create lots of
kernel warning messages. While a program is blocked in this way, it is
also in "uninterruptible sleep", so it cannot even be kill -9'ed.
If I made S3QL only able to write data to Glacier, the problem is how
the file system should react to attempts to read data. Probably an IO
error should be generated, but this means that the highest S3QL layer
(which talks to the FUSE kernel module) suddenly needs to talk to the
lowest S3QL layer (the backend) for every request to determine if the
file system is write-only. This isn't hard to program, but it's going to
introduce a lot of ugly identical boilerplate code in lots of places.
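A rough sketch of what that check might look like, in Python. None of these classes exist in S3QL: `WriteOnlyBackend`, `TopLayer` and the `write_only` flag are hypothetical, and the real request handlers are more numerous than the single `read` shown here.

```python
import errno


class WriteOnlyBackend:
    """Hypothetical backend that can store objects but not read them back."""
    write_only = True

    def __init__(self):
        self.objects = {}

    def store(self, key, data):
        self.objects[key] = data

    def fetch(self, key):
        raise NotImplementedError("reads are not supported")


class TopLayer:
    """Hypothetical stand-in for the layer that answers FUSE requests."""

    def __init__(self, backend):
        self.backend = backend

    def _check_readable(self):
        # The check that every read-type handler would have to repeat.
        if getattr(self.backend, "write_only", False):
            raise OSError(errno.EIO, "backend is write-only")

    def read(self, key):
        self._check_readable()
        return self.backend.fetch(key)

    def write(self, key, data):
        self.backend.store(key, data)


fs = TopLayer(WriteOnlyBackend())
fs.write("file", b"payload")
try:
    fs.read("file")
    read_errno = None
except OSError as exc:
    read_errno = exc.errno   # the top layer turned the read into an I/O error
```

Each handler that can trigger a read would need the same `_check_readable()` call, which is where the repetition comes from.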
“Time flies like an arrow, fruit flies like a Banana.”
However, I believe you will hit performance issues much earlier. S3QL
performance is roughly logarithmic in stored data, so it will not get
significantly slower no matter how much data you store. However, I am
not sure if S3QL is fast enough to read and store petabytes of data in a
reasonable amount of time. Because of its architecture (FUSE and Python)
it doesn't scale nearly as well as an in-kernel file system.
Finally, S3QL cannot upload incremental metadata updates. So every time
you upload the metadata you'd have to upload, say, the entire 1 TB
database, even if you added only 5 MB of data. I can imagine this
becoming very annoying :-).
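To put rough numbers on that example, here is a small back-of-the-envelope calculation in Python. The 1 TB and 5 MB figures are the ones from the paragraph above; the decimal units and the 100 Mbit/s uplink are assumptions made only for illustration, and compression is ignored.

```python
# Figures from the example above, using decimal units (an assumption).
DB_SIZE = 10**12        # 1 TB metadata database
NEW_DATA = 5 * 10**6    # 5 MB of newly added data

# Bytes of metadata uploaded per byte of newly added data.
overhead = DB_SIZE // NEW_DATA

# Assumed uplink of 100 Mbit/s, purely for illustration.
UPLINK_BYTES_PER_S = 100 * 10**6 // 8
upload_hours = DB_SIZE / UPLINK_BYTES_PER_S / 3600

print(overhead)                  # 200000
print(round(upload_hours, 1))    # 22.2
```

Under these assumptions a single metadata upload would take most of a day, regardless of how little data changed.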
I'm not sure I understand what kind of integration you have
in mind. Getting S3QL to store data in SimpleDB has been requested
several times before and would be a great feature to have.
Storing data in Glacier directly would also be nice, but (as discussed a few
weeks ago) the problem isn't so much one of coding as one of architecture.
I deliberately eliminated the use of boto in S3QL a few years ago. Back
then the boto S3 code was (in my opinion) a terrible mess, and got even
worse when GS support was merged.
On 09/16/2012 10:55 AM, Jaka Hudoklin wrote:
> I'm not sure I understand what kind of integration you have
> in mind. Getting S3QL to store data in SimpleDB has been requested
> several times before and would be a great feature to have.
>
>
> I'm not talking about storing data to SimpleDB, but only file metadata,
> because there's no sane way that you can store archive metadata into
> glacier. You need some kind of database. So that's why we decided to use
> SimpleDB.
Yeah, I was talking about metadata too, sorry for the confusion.
Best,
-Nikolaus
--
“Time flies like an arrow, fruit flies like a Banana.”
On Monday, 17 September 2012 02:38:18 UTC+10, Nikolaus Rath wrote:
On 09/16/2012 10:55 AM, Jaka Hudoklin wrote:
> I'm not sure I understand what kind of integration you have
> in mind. Getting S3QL to store data in SimpleDB has been requested
> several times before and would be a great feature to have.
>
>
> I'm not talking about storing data to SimpleDB, but only file metadata,
> because there's no sane way that you can store archive metadata into
> glacier. You need some kind of database. So that's why we decided to use
> SimpleDB.
Yeah, I was talking about metadata too, sorry for the confusion.
What are the architectural issues that prevent supporting SimpleDB instead of SQLite for master and slave?
(I can appreciate the non-architectural concerns, e.g. that backing S3QL metadata into AWS isn't as 'open' and portable as SQLite for non-AWS users. The added option would also burden S3QL maintenance.)
| Attribute | Maximum |
|---|---|
| domains | 250 active domains per account. More can be requested by filling out a form. |
| size of each domain | 10 GB |
| attributes per domain | 1,000,000,000 |
| attributes per item | 256 attributes |
| size per attribute | 1024 bytes |
| Attribute | Maximum |
|---|---|
| items returned in a query response | 2500 items |
| seconds a query may run | 5 seconds |
| attribute names per query predicate | 1 attribute name |
| comparisons per predicate | 22 operators |
| predicates per query expression | 20 predicates |
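As a rough way to see what these limits would mean for file system metadata, here is a small Python sketch. It does not talk to SimpleDB at all; it only encodes the numbers from the tables above. The record layout, the function names, the decimal reading of "10 GB" and the reading of "size per attribute" as a limit on each value are all assumptions.

```python
# Limits copied from the tables above.
MAX_ATTRS_PER_ITEM = 256
MAX_ATTR_BYTES = 1024
MAX_DOMAIN_BYTES = 10 * 10**9   # 10 GB, decimal units assumed
MAX_DOMAINS = 250
MAX_ITEMS_PER_QUERY = 2500


def fits_in_item(record):
    """Return True if a dict of str -> str fits the per-item limits."""
    if len(record) > MAX_ATTRS_PER_ITEM:
        return False
    return all(len(v.encode("utf-8")) <= MAX_ATTR_BYTES for v in record.values())


def query_pages(n_items):
    """Minimum number of query responses needed to list n_items."""
    return -(-n_items // MAX_ITEMS_PER_QUERY)  # ceiling division


# Hypothetical metadata records, not the actual S3QL schema.
inode = {"mode": "33188", "uid": "1000", "size": "4096"}
long_name = {"name": "x" * 2000}   # value exceeds 1024 bytes

# Upper bound on total storage without requesting more domains.
total_capacity = MAX_DOMAINS * MAX_DOMAIN_BYTES
```

With these numbers, listing a million items would need at least 400 query responses, and any single value longer than 1024 bytes would have to be split across attributes.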