pier disk usage over time

Mike Gogulski

Dec 10, 2020, 10:47:19 AM
to urbit-dev
What should star and galaxy operators expect in terms of disk usage over time for their piers?

Prior to (perhaps unnecessarily) breaching to get v1.0 running yesterday, my galaxy's pier was just under 32 GB, and a star under it (with no active children) just under 2 GB. Both had been running since mid-March, and I did not interact over the network via Landscape on either ship.

Thanks for any insights y'all may have.

Mark

Dec 10, 2020, 11:12:16 AM
to Mike Gogulski, urbit-dev
The primary force behind disk space usage is event log growth. Whenever anything happens to/with your ship at all, the events get written to disk, permanently.

Right now, when ships can't talk to each other directly, their galaxies will forward their packets for them. Traditionally, these forwarded packets have been injected into ames as events: ames processes them statefully, so every one of them has to be stored in the event log.

As of urbit-v1.0, the runtime supports stateless forwarding. This means it recognizes and forwards packets from the above scenario, without ever injecting them into ames. Your urbit itself doesn't have to know the forwards are happening, so they don't end up in the event log. For galaxies, this means their event logs will likely grow much more slowly than before.
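
To make the difference concrete, here's a toy sketch in Python (purely illustrative; the real runtime is C, and all the names here are made up) of a stateful relay versus a stateless one:

    # Toy model only: a relay that persists every forwarded packet as an
    # event, versus one that forwards without touching the event log.
    class Ship:
        def __init__(self):
            self.event_log = []   # append-only; this is what grows on disk

        def inject(self, event):
            self.event_log.append(event)   # persisted forever

    def send(dest, packet):
        pass   # stand-in for the actual UDP send

    def forward_stateful(galaxy, packet, dest):
        galaxy.inject(("forward", packet))   # log grows with every relayed packet
        send(dest, packet)

    def forward_stateless(galaxy, packet, dest):
        send(dest, packet)   # ames never sees it, so nothing hits the disk

Either way the packet gets delivered; only the stateful version pays for it in disk.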

That said, event log size still only goes up. You might not reach those 32 GB as quickly as you did before, but you'll get there eventually. We'll probably work on event log rollover/truncation tools in the future, to let you reclaim some of that space.


~palfun-foslup
https://urbit.org

Mike Gogulski

Dec 10, 2020, 11:17:04 AM
to Mark, urbit-dev
Okay. Glad to hear about that change, and good to hear about the future direction. Thanks, Mark!

Basile Genève

Dec 11, 2020, 6:25:45 AM
to urbit-dev, ~dys, ~palfun-foslup
Yeah, event log pruning would be a really big deal for those of us who host multiple other planets. If it were in place, each Urbit power-user could probably onboard 5-30 friends/family at very low marginal $/time cost.

Christopher King

Dec 11, 2020, 9:23:04 AM
to urbit-dev
Actually, I'm glad someone brought this up, because I was about to do the same. Event log growth is bumping against size limits on some EC2 instances I'm running, and constantly upgrading the instance sizes is not sustainable from a business perspective. What would the process be for properly deleting/truncating event logs without causing major or noticeable problems for users? I'm guessing a cron job could handle it in the meantime, until an actual truncation tool comes out.
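
In the meantime I'll probably cron something like this watchdog (hypothetical paths and threshold; it only measures, it doesn't touch the log) so I hear about a full disk before a customer does:

    #!/usr/bin/env python3
    # Hypothetical watchdog: warn before a pier's volume fills up.
    import shutil
    import sys

    PIER = "/srv/piers/sampel-palnet"   # example pier path
    THRESHOLD = 0.85                    # page someone at 85% volume usage

    usage = shutil.disk_usage(PIER)
    frac = usage.used / usage.total
    print(f"{PIER}: volume {frac:.0%} full")
    if frac > THRESHOLD:
        sys.exit(1)   # nonzero exit so cron mail / monitoring picks it up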



Philip Monk

Dec 14, 2020, 8:56:26 PM
to Christopher King, urbit-dev
There's not really even a hacky way to do this right now, as far as I know.  Various parts of the system assume things like "number of events in the log == number of the latest event", so just backing up a checkpoint and deleting the logs before that doesn't work.
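
To spell that assumption out with toy numbers (nothing like the real on-disk format):

    # Toy version of the invariant: event count == number of the latest event.
    log = [f"event-{i}" for i in range(1, 101)]   # events 1..100
    latest = 100
    print(len(log) == latest)   # True for an intact log

    # Naive truncation: snapshot at event 90, delete everything earlier.
    log = log[90:]              # only events 91..100 remain
    print(len(log) == latest)   # False; anything relying on the equality breaks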



Christopher King

Dec 14, 2020, 9:13:55 PM
to urbit-dev
This is a pretty big deal, then. It's a major weakness for hosting providers to be unable to constrain ship size even in principle, or even to meaningfully predict its growth. So far I'm getting by with manually increasing EC2 instance sizes, but besides not being scalable, that usually means hearing about the problem from a customer first. Not the UX you want when it already feels so early-stage and experimental to most people.

Is there some way Tlon can prioritize a solution?


--

Best,
Chris

Philip Monk

Dec 14, 2020, 10:02:28 PM
to Christopher King, urbit-dev
Heh, that's been our solution too, as evidenced by the many times our galaxies/stars/urbit community planets have gone offline until someone happens to notice it.  It's obviously becoming a bigger deal for Tlon as a hosting provider as well, so we're feeling a righteous pressure to work on it.

It would also solve the upload/download your pier problem — right now you have to tar up your whole directory, which includes the event log and can be too large to reasonably transfer.  Ideally you should just have to send your checkpoint, which is never larger than 2GB completely uncompressed — and it compresses really well since much of that is box sizes, refcounting, and hash caches — all of which can be recomputed.
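
Purely to illustrate the size gap (and assuming the snapshot lives under <pier>/.urb/chk, which is an assumption about pier layout, and is not a supported migration path today), shipping just the checkpoint could look like:

    # Sketch only: package the checkpoint instead of the whole pier.
    # The .urb/chk location is an assumption; don't migrate ships this way yet.
    import tarfile

    PIER = "/srv/piers/sampel-palnet"        # example path
    OUT = "/tmp/sampel-palnet-chk.tar.gz"

    with tarfile.open(OUT, "w:gz") as tar:   # gzip does well on recomputable pages
        tar.add(f"{PIER}/.urb/chk", arcname=".urb/chk")
    print("wrote", OUT)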

We're actively planning our Q1 goals and that's definitely in the mix, but we haven't nailed anything down yet.



Christopher King

Dec 17, 2020, 8:57:33 AM
to urbit-dev
Honestly, if you can somehow do this and, as part of the solution, find a workaround for the manual swap-space configuration that's frequently needed, it's got to be one of the most bang-for-your-buck things you can do toward making Urbit scalable.
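
For reference, the manual step I mean is the usual one-off swap-file setup on a small instance, roughly this (size is just an example, run as root):

    # The swap dance I end up scripting on small EC2 instances.
    # Plain Linux commands driven from Python; nothing Urbit-specific.
    import subprocess

    SWAPFILE = "/swapfile"
    SIZE = "2G"   # example size; adjust per instance

    for cmd in (
        ["fallocate", "-l", SIZE, SWAPFILE],
        ["chmod", "600", SWAPFILE],
        ["mkswap", SWAPFILE],
        ["swapon", SWAPFILE],
    ):
        subprocess.run(cmd, check=True)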


--

Best,
Chris