Estimating usage for S3 storage plugin


Spadajspadaj

Jan 18, 2021, 2:30:36 AM
to bareos-users
Hi.

I wanted to give the S3 storage plugin a try. For now just to see how it
works, but maybe to use it in production one day. But I have no real
idea how to estimate S3 usage and thus the associated costs. I admit I am
no S3 expert at the moment, so it would be an opportunity to learn about
S3 for me at the same time. Where can I read a bit more about the S3 storage
backend (apart from the manual, where as far as I can see I only find how
to configure the SD for S3)? I don't want to ask too many newbie
questions ;-) Especially about using different S3 tiers for storage (it
would make way more sense to use Glacier or even Glacier Deep Archive for
long-term storage rather than the Frequent Access tier, at least price-wise).

I can of course set up an account and perform some small-scale tests
within the free tier, but I'd like to know what I would be doing ;-)

Best regards

MK

Brock Palen

Jan 18, 2021, 5:28:09 AM
to Spadajspadaj, bareos-users
Disclaimer: I have not used S3 with Bareos, but I have done many cloud calculations.

A few things to think about when using the cloud:
Are you running your SD in the cloud?
Are your backup clients in the cloud?
If not, what's your bandwidth? It will impact your backup and restore times significantly if you have modest WAN capacity for local client servers.

As for S3 pricing, read this carefully:

https://aws.amazon.com/s3/pricing/
You have three components to pricing with S3, and I expect only two move the needle on cost:

Data stored
Bandwidth and retrieval 
Operations

Operations are so cheap that, guessing at how Bareos uses virtual tape volumes, it's probably not a big issue. Someone who has actually used it can speak to that.

Data stored is straight $/GB/month, so you need to estimate your total data stored for all your fulls and incrementals. You're right that these costs decline when you look at Glacier, but there is a trade-off: the cheaper it is to store, the more expensive it is to access.
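
To put rough numbers on that, a quick back-of-the-envelope sketch in Python (the per-GB rates below are placeholders from memory of the pricing page; check the current numbers for your region):

# Rough storage-cost sketch. Rates are placeholders; see
# https://aws.amazon.com/s3/pricing/ for current per-region pricing.

def monthly_storage_cost(full_gb, incr_gb_per_day, retention_days, rate_per_gb_month):
    """Estimate $/month for one set of fulls plus daily incrementals kept for retention_days."""
    stored_gb = full_gb + incr_gb_per_day * retention_days
    return stored_gb * rate_per_gb_month

# Example: 2 TB of fulls, 20 GB of incrementals per day, kept for 30 days.
print(monthly_storage_cost(2048, 20, 30, 0.023))   # S3 Standard at ~$0.023/GB/month -> ~$61/month
print(monthly_storage_cost(2048, 20, 30, 0.004))   # Glacier-class at ~$0.004/GB/month -> ~$11/month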

Retrieval fees come in two forms. The first is bandwidth, which for most people is $0.09/GB (unless your clients and servers are in the same AWS region); for my cloud activities this is 50% of my monthly bill. It's the thing that messes up most cloud calculators for budgeting. That said, if your server is on-prem you likely will never pay this, as long as you don't use Always Incremental or do any restores. So if you're OK paying for restores, maybe it's fine.

The cold tiers like Glacier also charge to access data. Again, maybe fine if you almost never read it. Glacier runs $10/TB or more for retrieval vs. nothing for regular S3; with bandwidth on top you're at ~$100/TB. This is a reason to avoid Deep Archive. Its SLA is many hours to get data back. I don't think Deep Archive is a backup replacement, but rather a compliance-archive replacement.
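
The retrieval side can be sketched the same way (again, placeholder rates; the exact retrieval fee depends on the storage class and how fast you want the data back):

# Rough restore-cost sketch for pulling data back out of AWS to an on-prem SD.
# Rates are placeholders; check current AWS pricing.

def restore_cost(restore_gb, egress_per_gb=0.09, retrieval_per_gb=0.01):
    """Egress (internet data transfer out) plus the per-GB retrieval fee for cold tiers."""
    return restore_gb * (egress_per_gb + retrieval_per_gb)

# Restoring 1 TB from a Glacier-class volume: ~$0.09/GB egress + ~$0.01/GB retrieval
print(restore_cost(1024))                           # ~$102/TB, in line with the ~$100/TB above
# Restoring the same 1 TB from S3 Standard: egress only
print(restore_cost(1024, retrieval_per_gb=0.0))     # ~$92/TB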

Also be aware that Glacier and Deep Archive have minimum storage durations of 90 and 180 days respectively, so you will always pay for at least that much. That's OK if you're keeping fulls for a long time. Look at the auto-tiering/lifecycle options to manage aging volumes.
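
If you want to experiment with that, a minimal lifecycle-rule sketch with boto3 is below. The bucket name and prefix are made up for the example, and whether the Bareos S3 backend can still read a transitioned volume transparently is something to verify before relying on it.

# Minimal lifecycle-rule sketch: transition objects to Glacier after 90 days.
# Bucket name and prefix are hypothetical; verify that a transitioned volume
# can still be read back the way Bareos expects before using this in anger.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bareos-volumes",                 # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "age-out-old-volumes",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},       # apply to all volume objects
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER"}
                ],
            }
        ]
    },
)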

So YMMV. If you are 100% in the cloud, or you don't use Always Incremental, or you have small data volumes, or you just want a DR copy, it works great.

Personally, I run my servers in AWS and my full Bareos setup on-prem with a $400 tape library from eBay. This gives me diversity, and most of the data in the cloud is small (websites, email, text) while the on-prem data is video, photos, and road warriors using Always Incremental.

Sent from my iPhone
Brock Palen

On Jan 18, 2021, at 2:30 AM, Spadajspadaj <spadaj...@gmail.com> wrote:

Hi.

Spadajspadaj

Jan 18, 2021, 7:04:31 AM
to bareos-users
On 18/01/2021 11:28, Brock Palen wrote:
> Disclaimer: I have not used S3 with Bareos, but I have done many cloud
> calculations.
>
> A few things to think about when using the cloud:
> Are you running your SD in the cloud?
> Are your backup clients in the cloud?
> If not, what's your bandwidth? It will impact your backup and restore
> times significantly if you have modest WAN capacity for local client
> servers.

No, no. I was thinking about keeping an extra copy "off-site". I'm
mostly cloud-free at the moment and I do not wish to change it
significantly. I was wondering whether S3 could be an option for
extending my home backup setup.

Of course I understand the impact of bandwidth on the backup/restore
times. :-)

> As for S3 pricing, read this carefully:
>
> https://aws.amazon.com/s3/pricing/
>
> You have three components to pricing with S3, and I expect only two
> move the needle on cost:
>
> Data stored
> Bandwidth and retrieval
> Operations
>
> Operations are so cheap that, guessing at how Bareos uses virtual tape
> volumes, it's probably not a big issue. Someone who has actually used
> it can speak to that.

That's a good observation. Thanks!

> Data stored is straight $/GB/month, so you need to estimate your total
> data stored for all your fulls and incrementals. You're right that
> these costs decline when you look at Glacier, but there is a trade-off:
> the cheaper it is to store, the more expensive it is to access.
>
> Retrieval fees come in two forms. The first is bandwidth, which for
> most people is $0.09/GB (unless your clients and servers are in the
> same AWS region); for my cloud activities this is 50% of my monthly
> bill. It's the thing that messes up most cloud calculators for
> budgeting. That said, if your server is on-prem you likely will never
> pay this, as long as you don't use Always Incremental or do any
> restores. So if you're OK paying for restores, maybe it's fine.
>
> The cold tiers like Glacier also charge to access data. Again, maybe
> fine if you almost never read it. Glacier runs $10/TB or more for
> retrieval vs. nothing for regular S3; with bandwidth on top you're at
> ~$100/TB. This is a reason to avoid Deep Archive. Its SLA is many hours
> to get data back. I don't think Deep Archive is a backup replacement,
> but rather a compliance-archive replacement.


Well, that's what I'm counting on - it's better to have a backup copy and
not need to use it than not to have it ;-)

What I was also interested in was how to approach the long SLA in terms of
Bareos SD operation. Would I have to first request access to the Glacier
data independently of the SD, and only after receiving confirmation of data
availability run a restore job? Or would I just run a restore job from a
storage using a cold-tiered bucket, and the job would simply wait for data
availability (similar to waiting for a tape mount)?

> Also be aware that Glacier and Deep Archive have minimum storage
> durations of 90 and 180 days respectively, so you will always pay for
> at least that much. That's OK if you're keeping fulls for a long time.
> Look at the auto-tiering/lifecycle options to manage aging volumes.

Yes, I noticed that.


>
> So YMMV. If you are 100% in the cloud, or you don't use Always
> Incremental, or you have small data volumes, or you just want a DR
> copy, it works great.
>
> Personally, I run my servers in AWS and my full Bareos setup on-prem
> with a $400 tape library from eBay. This gives me diversity, and most
> of the data in the cloud is small (websites, email, text) while the
> on-prem data is video, photos, and road warriors using Always
> Incremental.


So it all comes down to "try the free tier and see for yourself" :-) I'll
have to do it anyway when I get some spare time, just to see how it works
and get some understanding of achievable throughputs, needed space, and
so on.


Thanks for the valuable insight!

Spadajspadaj

Jan 25, 2021, 4:05:53 AM
to bareos...@googlegroups.com

On 18/01/2021 13:04, Spadajspadaj wrote:
> On 18/01/2021 11:28, Brock Palen wrote:
>> Disclaimer: I have not used S3 with Bareos, but I have done many cloud
>> calculations.
>>
>> A few things to think about when using the cloud:
>> Are you running your SD in the cloud?
>> Are your backup clients in the cloud?
>> If not, what's your bandwidth? It will impact your backup and restore
>> times significantly if you have modest WAN capacity for local client
>> servers.
>
> No, no. I was thinking about keeping an extra copy "off-site". I'm
> mostly cloud-free at the moment and I do not wish to change it
> significantly. I was wondering whether S3 could be an option for
> extending my home backup setup.
>
> Of course I understand the impact of bandwidth on the backup/restore
> times. :-)


OK. I recalculated it using the AWS calculator and it turns out that
even with the Glacier tier I can live with the cost of storing the data
(some $16 per month for 4 TB), but in case of a disaster I'd have to pay
something like $180 for transfer. That's definitely not worth it. I'd
rather buy another disk and keep it in rotation.

It turns out it's not very useful for me after all.
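
For what it's worth, the storage side of that estimate is easy to sanity-check with a couple of lines of Python; the per-GB rate is an assumption, and the AWS calculator remains authoritative for the transfer side:

# Quick sanity check of the storage side of the estimate above.
data_gb = 4 * 1024                      # ~4 TB of backups
glacier_storage_rate = 0.004            # assumed ~$/GB/month for Glacier-class storage

print(f"~${data_gb * glacier_storage_rate:.0f}/month")   # ~$16/month, matching the figure above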
