Ready to say goodbye to "mojo:" urls?

Przemysław Pietrzkiewicz

Aug 28, 2015, 12:44:05 PM
to mojo...@chromium.org
Dear all,

As already discussed on various occasions[1][2], "mojo:" urls are a hack and are causing us problems:
  • on Linux they make the shell bypass the network stack and grab the binary directly from disk, making things that will be slow in production appear fast in development
  • they add a special case to the shell and tooling code
  • they are only useful for references between apps that all live in the same place, which does not scale as our world grows.
We already have an alternative that does scale: developers of mojo apps refer to their apps via proper urls (e.g. https://example.com/unicorn_spawner) and the shell offers a '--map-origin' switch, so that "https://example.com" can be mapped to anything - including a local build directory. We also taught mojo tools to read a per-repo config file, so that setting up any mappings can be made transparent and automatic.
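
To make the '--map-origin' idea concrete, here is a minimal Python sketch of the URL rewrite such a mapping implies. It is an illustration only, not the shell's actual implementation: the function name `apply_origin_mappings`, the dict-based mapping format and the local dev-server address are all assumptions made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def apply_origin_mappings(url, mappings):
    """Rewrite `url` if its origin (scheme://host[:port]) has a mapping.

    `mappings` is a hypothetical dict, e.g.
    {"https://example.com": "http://127.0.0.1:31839"}.
    """
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}"
    target = mappings.get(origin)
    if target is None:
        return url  # No mapping for this origin: leave the URL untouched.
    mapped = urlsplit(target)
    # Keep the app's path, prefixed by any path carried by the mapping target.
    path = mapped.path.rstrip("/") + parts.path
    return urlunsplit(
        (mapped.scheme, mapped.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    mappings = {"https://example.com": "http://127.0.0.1:31839"}
    print(apply_origin_mappings("https://example.com/unicorn_spawner", mappings))
```

The point of the sketch is that the app keeps its one real name (the https url) everywhere, and only the mapping table decides where the bytes come from.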

Now we can stop living in a special case.

I'd like to propose that we change references of the form "mojo:app" to read https://core.mojoapps.io/app.mojo and drop (once the dust settles) support for "mojo:" urls and "--origin" from the shell. The dev server is already configured so that every time you run `mojo_run` or `mojo_test` in a Mojo checkout, things will "just work".

What do you think?


Cheers,
Przemek

James Robinson

Aug 28, 2015, 1:46:55 PM
to Przemysław Pietrzkiewicz, mojo...@chromium.org
I know that on Linux I always run mojo_shell directly just to have a manageable command line and cycle time.  Compare:

$ time out/Debug/mojo_shell mojo:echo_client
[0828/104036:INFO:echo_client.cc(21)] ***** Response: hello world

real 0m0.025s

vs

$ time mojo/devtools/common/mojo_run --no-debugger https://core.mojoapps.io/echo_client.mojo
Configured https://core.mojoapps.io/ locally at http://127.0.0.1:31839/ to serve:
  /packages/ -> ['/ssd/mojo/src/out/Debug/gen/dart-pkg/packages']
  / -> ['/ssd/mojo/src/out/Debug', '.']
[INFO:network_fetcher.cc(85)] Caching mojo app http://127.0.0.1:31839/echo_client.mojo at /usr/local/google/home/jamesr/.mojo_url_response_disk_cache/http_3a//127.0.0.1_3a31839/echo_5fclient.mojo/.org.chromium.Chromium.xBELFo
[0828/103918:INFO:echo_client.cc(21)] ***** Response: hello world

real 0m2.040s

The mojo_run version takes roughly 80 times as long and the command line required is more than twice as long (if I omit --no-debugger the mojo_run command never terminates).  Is there a better way to use the mojo_* tools that has a comparable cycle time and ease of use to running mojo_shell directly?

A slightly more awkward but still reasonable way to run echo_client without using mojo: URLs is to do

cd out/Debug
./mojo_shell echo_client.mojo

but it seems like you're proposing to remove this shortcut as well, correct?

- James

Mitch Rudominer

Aug 28, 2015, 2:35:40 PM
to Przemysław Pietrzkiewicz, mojo...@chromium.org
sgtm

Przemysław Pietrzkiewicz

Aug 28, 2015, 3:26:54 PM
to James Robinson, mojo...@chromium.org
There's certainly room for improvement in the mojo_run flow.

As for the length of the command, I think we should flip the debugger flag around to run without the debugger by default, so that the simplest cases are supported through the simplest command lines. (Also, the behavior of not terminating is a puzzling trap for newcomers.)

As for loading apps directly from disk, I do not think we should disallow that (esp. when an explicit path to a local file is given, like in your example), but I do think that the default should be to hit the network stack (even if only to reach a dev server hosting locally built apps), as in production all app requests will go through this codepath.

Cheers,
Przemek

James Robinson

Aug 28, 2015, 9:30:51 PM
to Przemysław Pietrzkiewicz, mojo...@chromium.org

Would these changes provide a workflow that is as concise and fast as what I'm currently doing? I think enabling developer productivity should be the number one goal of our tooling. There are many aspects of our system that developers need to be able to iterate quickly on. Some of these are related to loading apps and some are not, but working on any of them requires having a fast and easy to remember workflow.

Jeff Brown

Aug 29, 2015, 2:28:38 AM
to James Robinson, Przemysław Pietrzkiewicz, mojo...@chromium.org

+1 workflow needs to be easy although we might just need some better shell scripts...

That said, I don't quite understand why these mojo URLs require any special case support in the shell or why that logic should create any special headaches. Can't they be handled generically by some kind of URL scheme handler mechanism?

I can easily envision scenarios where we will want to load executables off of fixed media (especially when bootstrapping) or embedded into packages (like an Android apk) that should be resolved through non-http schemes (and where we would like to bypass the extra copy into the network cache and such).

The logic for stuff like this shouldn't necessarily pollute the shell, assuming it is well factored.

Jeff.

Adam Barth

Aug 29, 2015, 2:39:42 AM
to Jeff Brown, James Robinson, Przemysław Pietrzkiewicz, mojo...@chromium.org
On Fri, Aug 28, 2015 at 11:28 PM 'Jeff Brown' via mojo-dev <mojo...@chromium.org> wrote:

> +1 workflow needs to be easy although we might just need some better shell scripts...
>
> That said, I don't quite understand why these mojo URLs require any special case support in the shell or why that logic should create any special headaches. Can't they be handled generically by some kind of URL scheme handler mechanism?

They're handled specially because the generic URL scheme handler mechanism returns a DataPipe whereas a mojo URL resolves to a FilePath that is fed directly to LoadLibrary.

> I can easily envision scenarios where we will want to load executables off of fixed media (especially when bootstrapping) or embedded into packages (like an Android apk) that should be resolved through non-http schemes (and where we would like to bypass the extra copy into the network cache and such).

I think you're saying that what is currently a special case is useful more generally and therefore shouldn't be removed.

> The logic for stuff like this shouldn't necessarily pollute the shell, assuming it is well factored.

Something needs to understand the difference between trying to inflate code from a DataPipe versus a FilePath.  That doesn't necessarily need to be the shell, but currently the shell is what inflates binary code.

Adam

Jeff Brown

Aug 29, 2015, 3:12:15 AM
to Adam Barth, James Robinson, Przemysław Pietrzkiewicz, mojo...@chromium.org

Perhaps the file path (or opened raw file descriptor) should be the least common denominator.  You'll want to mmap the contents anyways for efficiency.  A data pipe only makes sense as something you would stream into a file cache.  If the network service takes care of that then all we're left with is a file stored someplace.

We had an earlier conversation regarding the shell's special handling of ELF binaries.  I think in the end we'll want to shove all of that off into a content handler anyhow, even if the code ends up being linked into the shell executable for bootstrapping.  (We might not even need to do that if something else, like init, takes care of starting the content handler instead.)

Eventually the work of forking a process and setting up a sandbox to run the code is going to be more than we'll want to have in the shell anyhow, at which point it might not make much sense to have the shell calling load library itself at all!  ;)

Jeff.

Adam Barth

Aug 29, 2015, 11:14:20 AM
to Jeff Brown, James Robinson, Przemysław Pietrzkiewicz, mojo...@chromium.org
On Sat, Aug 29, 2015 at 12:12 AM Jeff Brown <jeff...@google.com> wrote:

> Perhaps the file path (or opened raw file descriptor) should be the least common denominator.  You'll want to mmap the contents anyways for efficiency.

Yes, that could make sense.  Ideally we'd be able to use a file descriptor.  One stumbling block is that LoadLibrary/dlopen take a path rather than a file descriptor as an argument.  There's also the subtle issue that gdb and stack printing utilities expect to be able to find dynamic libraries via paths.

> A data pipe only makes sense as something you would stream into a file cache.  If the network service takes care of that then all we're left with is a file stored someplace.

In practice, I don't believe monet uses a separate file descriptor for each cache entry.  Presumably we could change the structure of its cache if we wanted to.

> We had an earlier conversation regarding the shell's special handling of ELF binaries.  I think in the end we'll want to shove all of that off into a content handler anyhow, even if the code ends up being linked into the shell executable for bootstrapping.  (We might not even need to do that if something else, like init, takes care of starting the content handler instead.)

Conceptually that makes good sense.  Loading ELF binaries predates our inventing content handlers, which is probably why this part of the design isn't very rational.

> Eventually the work of forking a process and setting up a sandbox to run the code is going to be more than we'll want to have in the shell anyhow, at which point it might not make much sense to have the shell calling load library itself at all!  ;)

I'm not entirely sure what all is possible here given the current EDK design.  For example, I bet we need to call some EDK entry points when creating a child process, which means if we moved the code that creates new processes into a content handler, it would still need to run in the master process and have access to the EDK APIs, at which point it's unclear what we've gained by moving it out of the "shell".  We might need to talk with trung to learn more.

Przemysław Pietrzkiewicz

Aug 31, 2015, 12:11:22 PM
to Adam Barth, Jeff Brown, James Robinson, q...@google.com, mojo...@chromium.org
Thanks for the discussion so far! Let me try to summarize:
  • we want to keep the ability to load apps directly from disk binaries regardless of app url changes
  • we are concerned that switching to real urls will:
    • make apps load slower
    • make commands more verbose (compared with the workflow currently available)

Loading directly from disk
Sgtm. We probably should allow one to --map-origin a host to a directory referenced through a file:/// url, making all apps from this host be loaded directly from disk. (+Benjamin Lerman tells me this might already work). This would generalize the desirable feature and disentangle it from app naming.

Moving to real urls
We need to work out scripting support that makes working with real urls as simple as possible. If we want, we can go very far with that (imagine for instance that "mojo_run" detects "mojo:" urls on the command line, rewrites them transparently to real urls and applies the disk mapping described above to preserve current behavior - details can probably be discussed elsewhere).
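
As a rough illustration of that rewriting step, here is a small Python sketch. It is not how mojo_run works today: the helper names `rewrite_mojo_url` and `rewrite_args` are hypothetical, and the only thing taken from this thread is the proposed naming scheme (mojo:app becoming https://core.mojoapps.io/app.mojo). Query strings and other url details are deliberately ignored.

```python
# Origin proposed in this thread for apps that live in the Mojo repo.
CORE_ORIGIN = "https://core.mojoapps.io"


def rewrite_mojo_url(arg):
    """Map 'mojo:app' to 'https://core.mojoapps.io/app.mojo'.

    Anything that does not start with 'mojo:' (flags, real urls, paths)
    is returned unchanged.
    """
    prefix = "mojo:"
    if not arg.startswith(prefix):
        return arg
    name = arg[len(prefix):].lstrip("/")
    return f"{CORE_ORIGIN}/{name}.mojo"


def rewrite_args(argv):
    """Apply the rewrite to every command-line argument."""
    return [rewrite_mojo_url(arg) for arg in argv]


if __name__ == "__main__":
    print(rewrite_args(["--no-debugger", "mojo:echo_client"]))
```

With something like this in the wrapper script, the short "mojo:" spelling could survive as a command-line convenience without the shell itself having to know about it.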

Not sure, though, if getting the target workflow "as fast and concise" as the one we have should block the switch. "mojo:" urls have meaning only in our repo; "developers" of Mojo apps in general have to work, and already do work, with real urls. We have to make the general workflow good, which we have better chances at if we use it ourselves.

Wdyt?

Cheers,
Przemek

James Robinson

Aug 31, 2015, 8:00:19 PM
to Jeff Brown, Adam Barth, Przemysław Pietrzkiewicz, mojo...@chromium.org
Just mapping the whole file isn't enough either: ELF or NaCl loaders will probably want to map some parts as r--, some as r-x and some as rw- (and possibly even rwx for JITs), and to load them at certain addresses or offsets.  We have some notion of memory mapping a section using a shared memory handle, for example here: https://github.com/domokit/mojo/blob/master/mojo/services/files/public/interfaces/file.mojom#L80, but not enough to bootstrap our system yet.  I've been imagining that we'll want a way to perform mmap()-like operations from some sort of Mojo handle so that we can transmit a handle to the underlying thing (file or blob or whatnot), but I think we're a fair bit away from getting to that point.  For now, content handlers that want to mmap() cheat by streaming the data from the DataPipe into a temporary file and then mmap()ing that.
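
For readers unfamiliar with that workaround, a minimal Python sketch of the "spool to a temporary file, then mmap" pattern follows. It only illustrates the general technique: the `chunks` iterable stands in for data read from a DataPipe, the function name `mmap_from_chunks` is made up for this example, and the real content handlers use the Mojo APIs rather than anything shown here.

```python
import mmap
import tempfile


def mmap_from_chunks(chunks):
    """Spool an iterable of bytes chunks to a temp file and mmap it read-only.

    `chunks` is a stand-in for data streamed from a DataPipe.
    Returns (mapping, spool_file); the caller should close both.
    """
    spool = tempfile.TemporaryFile()
    for chunk in chunks:
        spool.write(chunk)
    spool.flush()  # Make sure all bytes reach the file before mapping it.
    if spool.tell() == 0:
        spool.close()
        raise ValueError("cannot mmap an empty payload")
    # Length 0 means "map the whole file".
    mapping = mmap.mmap(spool.fileno(), 0, access=mmap.ACCESS_READ)
    return mapping, spool


if __name__ == "__main__":
    mapping, spool = mmap_from_chunks([b"\x7fELF", b"...rest of the binary..."])
    print(mapping[:4])
    mapping.close()
    spool.close()
```

Note that this maps the whole file with a single protection, which is exactly the limitation described above: a real loader would need per-segment permissions and placement.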

- James

James Robinson

Aug 31, 2015, 8:06:36 PM
to Przemysław Pietrzkiewicz, Adam Barth, Jeff Brown, Benjamin Lerman, mojo...@chromium.org
On Mon, Aug 31, 2015 at 9:11 AM, Przemysław Pietrzkiewicz <p...@chromium.org> wrote:

> Not sure, though, if getting the target workflow "as fast and concise" as the one we have should block the switch. "mojo:" urls have meaning only in our repo; "developers" of Mojo apps in general have to work, and already do work, with real urls. We have to make the general workflow good, which we have better chances at if we use it ourselves.

I feel that having a fast and easy to use development workflow for people working on the system is essential to being able to improve the system.  I don't think this is necessarily a competing goal with having a good workflow for developers, it just requires some more care and attention to detail.

I think the current state of our tooling is not sufficiently good for the behavior switch you are proposing here.  I don't see any reason why the tools could not be improved to the point where the change you are proposing here does not slow down developers working on the system.  Do you think there is something blocking having a fast and easy workflow for Mojo developers without special handling for mojo: URLs?

- James

Przemysław Pietrzkiewicz

Sep 2, 2015, 10:54:05 AM
to James Robinson, Adam Barth, Jeff Brown, Benjamin Lerman, mojo...@chromium.org
On Tue, 1 Sep 2015 at 02:04 James Robinson <jam...@chromium.org> wrote:

> I feel that having a fast and easy to use development workflow for people working on the system is essential to being able to improve the system.  I don't think this is necessarily a competing goal with having a good workflow for developers, it just requires some more care and attention to detail.

Ack. We should be able to align these two closely - the requirements we enumerated on this thread (concise commands, running directly from disk on Linux) will likely apply to the general developer workflow just as well.

> I think the current state of our tooling is not sufficiently good for the behavior switch you are proposing here.  I don't see any reason why the tools could not be improved to the point where the change you are proposing here does not slow down developers working on the system.  Do you think there is something blocking having a fast and easy workflow for Mojo developers without special handling for mojo: URLs?

Nope, point taken, let's get there :). Filed https://github.com/domokit/mojo/issues/408 to track improving the workflow for real urls so that it meets our needs.

Cheers,
Przemek 
 