Nonvolatile application storage, part 1: application ID


Leon Schuermann

Apr 4, 2020, 11:37:35 AM
to Tock Embedded OS Development Discussion

Hello,

In the last weekly call, the topic of persistent application storage got
some attention. With this email I'd like to follow up on that.

The term "application" is used as defined in the threat model PR [1], so:
- persisting across reboots and updates (unlike processes)
- consisting of at least one application binary, possibly multiple.

As associating persistent storage with an application will always require
some kind of identifier for said application, I'll take the liberty of
starting the discussion by collecting and comparing approaches to that
issue.

I can imagine loosely collecting and discussing options in this thread
and then later molding those into an RFC PR for further discussion. I'm
open to suggestions regarding the process.


Some solutions for uniquely and persistently identifying an application
could include:

## An application ID in the TBF header

During board provisioning through the application loader, or
alternatively on first boot, Tock could assign each application a
unique ID. That ID could be numeric (a counter), but does not have to
be. It would be stored with the process, but must not be modifiable
by the process itself. A possible location for this identifier is the
TBF header.
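To make the TBF header option concrete, here is a minimal sketch of how such an ID could be parsed from a TLV-style header entry. The type tag (0x0f), field layout, and all names are invented for illustration and are not part of the actual TBF format:

```rust
// Hypothetical TBF header TLV entry carrying a loader-assigned
// application ID. Tag value and layout are illustrative only.
#[derive(Debug, PartialEq)]
struct AppIdEntry {
    app_id: u32, // unique ID assigned at provisioning / first boot
}

// Parse a little-endian T-L-V entry: u16 type, u16 length, u32 value.
fn parse_app_id(buf: &[u8]) -> Option<AppIdEntry> {
    if buf.len() < 8 {
        return None;
    }
    let tipe = u16::from_le_bytes([buf[0], buf[1]]);
    let length = u16::from_le_bytes([buf[2], buf[3]]);
    if tipe != 0x0f || length != 4 {
        return None;
    }
    Some(AppIdEntry {
        app_id: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
    })
}

fn main() {
    // 0x0f tag, length 4, ID 42, all little-endian.
    let raw = [0x0f, 0x00, 0x04, 0x00, 0x2a, 0x00, 0x00, 0x00];
    let entry = parse_app_id(&raw).unwrap();
    assert_eq!(entry.app_id, 42);
}
```

The point being: because the kernel parses the header, the process never gets a writable view of its own ID.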

Possible issues include:
- The application ID must not change, even when relocating the Tock
applications in memory through an application loader.
- It may or may not be desirable to keep the application ID identical
when reinstalling an application (think TPMs: Should an update clear
cryptographic key material?).
- The application ID must be unique across all applications.
Ideally, that would be checked on boot / during runtime.
- The application must not be able to change its application ID.
- Identification of a compound of multiple application binaries (if that
were to be called a "Tock application" as per the definition in [1]).

## A hash over the application's flash memory

Tock could generate a kernel-internal application identifier by
calculating a cryptographic hash of the entire application's flash
memory region, or parts of it. This way, the application itself is its
own persistent ID. No tooling update would be required. This would only
work if the persistent nonvolatile storage region was outside of the
hashed memory region, or on external storage (SD card, EEPROM).
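As a rough illustration of this approach, the sketch below derives an ID from an app's flash contents. FNV-1a is used purely as a dependency-free stand-in for a real cryptographic hash such as SHA-256:

```rust
// Derive a kernel-internal ID from the application's flash region.
// FNV-1a stands in for a cryptographic hash; a real kernel would use
// a proper hash implementation (hardware-backed where available).
fn flash_region_id(flash: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in flash {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

fn main() {
    let app_v1 = b"tock app binary v1";
    let app_v2 = b"tock app binary v2";
    // Identical flash contents yield the same ID across boots...
    assert_eq!(flash_region_id(app_v1), flash_region_id(app_v1));
    // ...while any change to the binary (e.g. an upgrade) yields a
    // different ID, which is exactly the property discussed below.
    assert_ne!(flash_region_id(app_v1), flash_region_id(app_v2));
}
```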

This is motivated by existing cryptographic devices such as TPMs, which
clear application data during a firmware update.

Possible issues include:
- Change of the application ID by writing to its own memory location.
- Persisting storage during application upgrades.
- Running multiple instances of an identical binary, which are not
considered to build an application compound.
- Hash function in software (#1728)
- Identification of a compound of multiple application binaries (if that
were to be called a "Tock application" as per definition from [1]).

Side note: For the specific use case of storing sensitive information
that must not survive an application upgrade, it may be better not to
use this approach for identifying the application, but instead to use
the cryptographic hash, along with user-provided secrets, for
transparent encryption of the associated nonvolatile storage region
(less snake oil).


I hope this is an adequate start and I'm looking forward to the
discussion. I'd love to get feedback.


Best regards

- Leon Schürmann

[1]: https://github.com/jrvanwhy/tock/blob/threat-model/doc/threat_model/README.md#what-is-an-application

--
This email was composed and sent using free software.
emacs, notmuch, msmtp, qmail, linux, nixos, gnupg, ...

Proudly brought to you without php, outlook, and win32.

Garret Kelly

Apr 6, 2020, 5:43:12 PM
to Tock Embedded OS Development Discussion
I think that this is probably close to ideal. If the kernel is responsible for identifying the application to the storage layer then it becomes a kernel decision about what constitutes an app's identity. This might just be a value in the app's TBF. It might be the hash of the key that signs the TBF in some future where TBFs are signed. Either way, it defers the question of identity to the kernel.

I believe in the vast majority of situations that an update to an application would still want the values it stored to be accessible to the updated application.
 

> ## A hash over the application's flash memory
>
> Tock could generate a kernel-internal application identifier by
> calculating a cryptographic hash of the entire application's flash
> memory region, or parts of it. This way, the application itself is its
> own persistent ID. No tooling update would be required. This would only
> work if the persistent nonvolatile storage region was outside of the
> hashed memory region, or on external storage (SD card, EEPROM).
>
> This is motivated by existing cryptographic devices such as TPMs, which
> clear application data during a firmware update.
>
> Possible issues include:
> - Change of the application ID by writing to its own memory location.
> - Persisting storage during application upgrades.
> - Running multiple instances of an identical binary, which are not
> considered to build an application compound.
> - Hash function in software (#1728)
> - Identification of a compound of multiple application binaries (if that
> were to be called a "Tock application" as per the definition in [1]).
>
> Side note: For the specific use case of storing sensitive information
> that must not survive an application upgrade, it may be better not to
> use this approach for identifying the application, but instead to use
> the cryptographic hash, along with user-provided secrets, for
> transparent encryption of the associated nonvolatile storage region
> (less snake oil).

Secrets that are bound only to the current version of the application seem strange, do you have a motivating use case for these? At startup do you have to now remove secrets that don't have a corresponding version running?

Pat Pannuto

Apr 7, 2020, 3:09:30 PM
to Garret Kelly, Tock Embedded OS Development Discussion
One question I have here is how (or whether) to support multiple copies of the same application.

We've put some effort into supporting that (i.e. the install directions explicitly call out this case https://github.com/tock/tockloader/#tockloader-install), though I'm not sure I have had occasion to install multiple copies of anything more interesting than blink.

If there are multiple instances of the same application, should they have the same ID? [Or, should we allow multiple instances of the same application, or require them to differ, at least in name [i.e. "blink1" "blink2"], at which point the application ID would naturally differ?]
 
--
You received this message because you are subscribed to the Google Groups "Tock Embedded OS Development Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tock-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tock-dev/7e8bad13-0e18-4919-a951-be429ae696fe%40googlegroups.com.

Leon Schuermann

Apr 7, 2020, 3:48:09 PM
to Tock Embedded OS Development Discussion

Hello,

"'Garret Kelly' via Tock Embedded OS Development Discussion"
<tock...@googlegroups.com> writes:
> I think that this is probably close to ideal. If the kernel is responsible
> for identifying the application to the storage layer then it becomes a
> kernel decision about what constitutes an app's identity. This might just
> be a value in the app's TBF. It might be the hash of the key that signs the
> TBF in some future where TBFs are signed. Either way, it defers the
> question of identity to the kernel.

Indeed. One way or another, the kernel must ultimately authenticate an
application and permit access to nonvolatile storage areas.

> I believe in the vast majority of situations that an update to an
> application would still want the values it stored to be accessible to the
> updated application.
> [...]
> Secrets that are bound only to the current version of the application seem
> strange, do you have a motivating use case for these? At startup do you
> have to now remove secrets that don't have a corresponding version running?

Yes, there is a quite prominent use case: HSMs (security keys, TPMs,
enterprise HSMs, etc.).

Some devices allow upgrades while retaining cryptographic keys and
information, as they trust their upgrade process with validating the
authenticity of the new firmware sufficiently.

Yet others (a common example being the YubiKey) do not offer any
firmware update functionality at all, so as to limit the attack surface
(to my knowledge).

A common middle ground I've seen with such devices is that they do
allow firmware upgrades, but clear their storage area beforehand.
However, that clearing process is very likely susceptible to fault
injection attacks.

For my scenario I'm assuming a chip with locked flash (and those
protections actually working), but having an updater Tock application as
described in the threat model PR. Using a cryptographic hash of the
application for symmetric encryption of the nonvolatile storage region -
best mixed with a key supplied by the user - will eliminate many attack
vectors, as the code calculating this hash and performing the crypto
(kernel) cannot be changed during the upgrade or faulted easily. One
would have to inject the entire old application's hash into kernel
memory somehow. Without the exact same application in flash, the
application cannot access sensitive data.
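A toy sketch of the derivation just described, with hypothetical inputs; the mixing function stands in for a real KDF (e.g. HKDF over SHA-256) and is not cryptographically sound:

```rust
// Derive a storage key from the application's flash hash mixed with a
// user-supplied secret, so that without the exact same binary in flash
// the key is unrecoverable. Toy mixer only -- a real design would use
// a proper KDF.
fn derive_storage_key(app_flash_hash: u64, user_secret: u64) -> u64 {
    let mut k = app_flash_hash ^ user_secret.rotate_left(17);
    // Murmur-style finalizer rounds; bijective, but not cryptographic.
    k ^= k >> 33;
    k = k.wrapping_mul(0xff51_afd7_ed55_8ccd);
    k ^= k >> 33;
    k
}

fn main() {
    let old_app = 0x1111_2222_3333_4444u64;
    let new_app = 0x5555_6666_7777_8888u64;
    let secret = 0xdead_beef_cafe_f00du64;
    // An upgraded (different) binary derives a different key, so it
    // cannot decrypt the old application's storage region.
    assert_ne!(
        derive_storage_key(old_app, secret),
        derive_storage_key(new_app, secret)
    );
}
```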


While an interesting concept, it deviates from the original goal of
deciding on an identifier for an application with regards to
_associating_ nonvolatile storage regions.

As with any regular OS, setting ownership & permissions on a file and
encrypting it can both be helpful, but ultimately solve different
issues. Hence I agree that calculating a hash over the app's flash
region to use as an identifier for said app is probably not the best
solution. For advanced levels of protection, other mechanisms are
required.


I would like to invite everyone to contribute other ideas regarding
persistent application identification in Tock. I'd give this about two
weeks before we look into implementing the most popular approach.


Best regards

- Leon Schuermann

Leon Schuermann

Apr 7, 2020, 4:19:07 PM
to Tock Embedded OS Development Discussion

Pat Pannuto <pat.p...@gmail.com> writes:
> One question I have here is how (or whether) to support multiple copies of
> the same application.
>
> We've put some effort into supporting that (i.e. the install directions
> explicitly call out this case
> https://github.com/tock/tockloader/#tockloader-install), though I'm not
> sure I have had occasion to install multiple copies of anything more
> interesting than blink.
>
> If there are multiple instances of the same application, should they have
> the same ID? [Or, should we allow multiple instances of the same
> application, or require them to differ, at least in name [i.e. "blink1"
> "blink2"], at which point the application ID would naturally differ?]

I actually did not expect this to be brought up (or to currently even
be a supported use case), but nevertheless listed it as a possible
issue with the hash-based approach.


Considering the TBF header ID approach, I do not think that this would
be problematic. As long as the app resides in flash twice, with the
individual TBF header of the second copy containing a different
(pseudo)random application identifier than the first, separating those
in terms of nonvolatile storage would work.

However, whether we _should_ do this is probably up to the threat
model. In its current state[1] it is deliberately vague regarding
compound applications in general, not to mention identical application
binaries.

With the TBF header ID approach I see no technical obstacle:

- If compound applications don't technically fit Tock's application
definition, we should enforce uniqueness across all application IDs in
the tooling / on boot.
- If compound applications (identical binary or not) are to be called
applications and should share nonvolatile storage, permit duplicate
application IDs.
- If two processes were to be spawned from a single binary, they would
share the TBF header and hence be recognized as the same
"application" (in case that's ever going to be possible).


>>> ## A hash over the application's flash memory

With the hash-based approach, the situation is a lot different. Compound
applications in general are an issue here and would need to be worked
around. A workaround would likely include fields in the TBF header and
thereby miss the goal of not requiring any additional value to be
stored.

Best regards,

- Leon Schuermann

[1]: https://github.com/tock/tock/blob/bb8c021833355bb69bd75e8ed652e9e88dcd07eb/doc/threat_model/README.md#what-is-an-application

Johnathan Van Why

Apr 7, 2020, 7:49:51 PM
to Leon Schuermann, Tock Embedded OS Development Discussion
Here are several forms an application ID could take (there may be more):
  1. Use the "package name" field that is already in the TBF headers.
  2. An ID provided by the binary (via TBF headers or runtime syscall).
  3. A hash of the binary.
  4. A hash of the binary and its location in flash (so that multiple copies of a binary receive distinct IDs).
  5. A public key, with the binary signed by the corresponding private key.
  6. The same as #5, but add a binary-provided salt to the key, so that multiple copies of the same binary have distinct IDs.
Here are several different times at which the ID can be computed/verified (particularly for the cryptographic IDs):
  1. Never checked.
  2. Checked by the application loader. This would require the application loader to check the ID with the user or to develop its own authentication system. The application loader could theoretically *set* the application ID as well.
  3. Checked by the kernel before starting the process, but still launch the process if the check fails.
  4. Checked at runtime by a shared "ID validation" driver once.
  5. Checked each time it is needed.
Next, let's start enumerating use cases for an application ID:
  1. Secure boot: only launch a binary if its hash/signature is valid (only applies to the cryptographic IDs) and on a supported list.
  2. Storage: Tie non-volatile storage to the application ID.
  3. Cryptography: We can derive encryption keys specific to an application ID, and allow applications to work with -- and only with -- keys corresponding to their ID.
There are 6 * 5 * 3 == 90 combinations of the above, not all of which are valid (e.g. "never checked" + "secure boot" don't go together). If we come up with other options, the number of possibilities will grow. Tock should support a variety of use cases, but probably shouldn't support every combination of the above.
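The combination space above can be sketched mechanically. The enum names are shorthand for the listed options, and the only validity rule encoded is the one example given ("never checked" + "secure boot" don't go together):

```rust
// Shorthand for the ID forms, check times, and use cases enumerated
// in the mail. Names are paraphrases, not an agreed-upon taxonomy.
#[derive(Clone, Copy, PartialEq)]
enum IdForm { PackageName, BinaryProvided, BinaryHash, HashAndLocation, PublicKey, SaltedKey }
#[derive(Clone, Copy, PartialEq)]
enum CheckTime { Never, Loader, KernelPreStart, ValidationDriver, EveryUse }
#[derive(Clone, Copy, PartialEq)]
enum UseCase { SecureBoot, Storage, Crypto }

// The single constraint stated as an example; a real design would
// encode many more such rules.
fn valid(_form: IdForm, time: CheckTime, use_case: UseCase) -> bool {
    !(time == CheckTime::Never && use_case == UseCase::SecureBoot)
}

fn main() {
    let forms = [IdForm::PackageName, IdForm::BinaryProvided, IdForm::BinaryHash,
                 IdForm::HashAndLocation, IdForm::PublicKey, IdForm::SaltedKey];
    let times = [CheckTime::Never, CheckTime::Loader, CheckTime::KernelPreStart,
                 CheckTime::ValidationDriver, CheckTime::EveryUse];
    let uses = [UseCase::SecureBoot, UseCase::Storage, UseCase::Crypto];
    let mut total = 0;
    let mut valid_count = 0;
    for &f in &forms {
        for &t in &times {
            for &u in &uses {
                total += 1;
                if valid(f, t, u) {
                    valid_count += 1;
                }
            }
        }
    }
    assert_eq!(total, 90); // the 6 * 5 * 3 from the mail
    // The one stated rule already removes 6 of the 90 combinations.
    assert_eq!(valid_count, 84);
}
```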

Pat's question is particularly interesting when the ID is cryptographic. Can you install two copies of a signed app and give them distinct IDs? If yes, does the signature skip the ID field? What if a single binary is launched multiple times as distinct processes, with distinct RAM regions, and you want each process to have a distinct application ID?

At a minimum, I expect OpenTitan to want the following:
  1. Signed binaries
  2. Secure boot
  3. Storage ACLs
  4. Application-specific cryptographic keys
Leon, do you have a use case for application IDs? If so, what capabilities does your use case require?

-Johnathan Van Why


Leon Schuermann

Apr 9, 2020, 4:14:45 PM
to Johnathan Van Why, Tock Embedded OS Development Discussion

Thank you very much for that extensive summary. I think you covered a
very broad variety of approaches, usages and use cases. Those will
provide good ground for discussion. Nonetheless, there might be
something we forgot, so other options are still welcome!

Johnathan Van Why <jrva...@google.com> writes:
> Leon, do you have a use case for application IDs? If so, what capabilities
> does your use case require?

Personally, I don't have a specific use case for application IDs other
than wanting application-local persistent storage. This discussion is
primarily motivated by that goal itself, independent of my specific
requirements; as I understood it, this issue is not only relevant to me.

> At a minimum, I expect OpenTitan to want the following:
>
> 1. Signed binaries
> 2. Secure boot
> 3. Storage ACLs
> 4. Application-specific cryptographic keys

> There are 6 * 5 * 3 == 90 combinations of the above, not all of which are
> valid (e.g. "never checked" + "secure boot" don't go together). If we come
> up with other options, the number of possibilities will grow. Tock should
> support a variety of use cases, but probably shouldn't support every
> combination of the above.

I agree. We have a large number of directions to go from here. I do not
think it is feasible to cover each and every use case possibly dependent
on some kind of application ID. In retrospect, I regret bringing up the
transparent storage encryption idea, as that shifted the discussion
away from its original intention: to find an application ID suitable
for mapping nonvolatile storage regions to applications, however that
term may be defined. I would recommend focusing on that first and
taking an iterative approach from there.

> Here are several forms an application ID could take (there may be more):
>
> 1. Use the "package name" field that is already in the TBF headers.
> 2. An ID provided by the binary (via TBF headers or runtime syscall).
> 3. A hash of the binary.
> 4. A hash of the binary and its location in flash (so that multiple
> copies of a binary receive distinct IDs).
> 5. A public key, with the binary signed by the corresponding private key.
> 6. The same as #5, but add a binary-provided salt to the key, so that
> multiple copies of the same binary have distinct IDs.

To reduce confusion around terminology, I would group these into three
types. Each type has distinct characteristics that apply to all members.
- artificial IDs
- symmetric cryptographic IDs (w/o salt of some kind)
- asymmetric cryptographic IDs (w/o salt of some kind)

> Here are several different times at which the ID can be computed/verified
> (particularly for the cryptographic IDs):
>
> 1. Never checked.
> 2. Checked by the application loader. This would require the application
> loader to check the ID with the user or to develop its own authentication
> system. The application loader could theoretically *set* the application ID
> as well.
> 3. Checked by the kernel before starting the process, but still launch
> the process if the check fails.
> 4. Checked at runtime by a shared "ID validation" driver once.
> 5. Checked each time it is needed.

I think the stages when and where an ID would be
respected/checked/verified depends on the ID type and the associated use
cases.

> Pat's question is particularly interesting when the ID is cryptographic.
> Can you install two copies of a signed app and give them distinct IDs? If
> yes, does the signature skip the ID field? What if a single binary is
> launched multiple times as distinct processes, with distinct RAM regions,
> and you want each process to have a distinct application ID?

Indeed it is, though having multiple instances of a single binary would
also break naive approaches of artificial IDs.


As feared this discussion appears to be of at least cubic complexity or
something along those lines. :)
The many options and interdependencies cannot be reasonably compared,
and especially not discussed one by one on a mailing list. Hence I'd
argue that we should limit ourselves to the nonvolatile application
storage use case first.

Even then we already have quite a few approaches, the validity of many
depending on the exact definition of an "application". I fear that the
right approach would be to define that first, considering the impact on
nonvolatile storage sharing amongst other things, and to use these new
conditions for deciding on an implementation approach.


@Johnathan: Looking at the threat model PR, the term application was
deliberately not well defined because there was ongoing discussion
there, right? I believe that we have already hit the point ("updated in
the future") where knowing the impacts of that concept is necessary to
evaluate technical implementations. Would it make sense to go a step
back and start there? What do you think?


Best regards

- Leon Schuermann


Johnathan Van Why

Apr 9, 2020, 5:45:49 PM
to Leon Schuermann, Tock Embedded OS Development Discussion
I am concerned that developing an application ID specifically for
persistent storage will result in an application ID design that is
poorly suited for the other use cases. That would either hurt those
use cases, or result in us having multiple types of application ID,
which would be confusing (which hurts security) and waste flash space.
I brought up the other use cases because I believe that considering
them now is important.

Do you have a rough idea of the timeline on which you want to have
persistent storage? I suspect that the OpenTitan project will want to
take a hard look at the security consequences of Tock's application ID
design.

> > Here are several forms an application ID could take (there may be more):
> >
> > 1. Use the "package name" field that is already in the TBF headers.
> > 2. An ID provided by the binary (via TBF headers or runtime syscall).
> > 3. A hash of the binary.
> > 4. A hash of the binary and its location in flash (so that multiple
> > copies of a binary receive distinct IDs).
> > 5. A public key, with the binary signed by the corresponding private key.
> > 6. The same as #5, but add a binary-provided salt to the key, so that
> > multiple copies of the same binary have distinct IDs.
>
> To reduce confusion around terminology, I would group these into three
> types. Each type has distinct characteristics that apply to all members.
> - artificial IDs
> - symmetric cryptographic IDs (w/o salt of some kind)
> - asymmetric cryptographic IDs (w/o salt of some kind)

If by "w/o", you mean "with or without", then I agree.

> > Here are several different times at which the ID can be computed/verified
> > (particularly for the cryptographic IDs):
> >
> > 1. Never checked.
> > 2. Checked by the application loader. This would require the application
> > loader to check the ID with the user or to develop its own authentication
> > system. The application loader could theoretically *set* the application ID
> > as well.
> > 3. Checked by the kernel before starting the process, but still launch
> > the process if the check fails.
> > 4. Checked at runtime by a shared "ID validation" driver once.
> > 5. Checked each time it is needed.
>
> I think the stages when and where an ID would be
> respected/checked/verified depends on the ID type and the associated use
> cases.
>
> > Pat's question is particularly interesting when the ID is cryptographic.
> > Can you install two copies of a signed app and give them distinct IDs? If
> > yes, does the signature skip the ID field? What if a single binary is
> > launched multiple times as distinct processes, with distinct RAM regions,
> > and you want each process to have a distinct application ID?
>
> Indeed it is, though having multiple instances of a single binary would
> also break naive approaches of artificial IDs.
>
>
> As feared this discussion appears to be of at least cubic complexity or
> something along those lines. :)

I don't expect this to be decided in two weeks due to the complexity.

> The many options and interdependencies cannot be reasonably compared,
> and especially not discussed one-for-one on a mailing list. Hence I'd
> argue that we should limit ourselves to the nonvolatile application
> storage use case first.
>
> Even then we have quite some approaches already, the validity of many
> depending on the definition of what's an "application" exactly. I fear
> that the right approach would be to define that first, considering the
> impact on nonvolatile storage sharing amongst other things, and using
> these new conditions for deciding on an implementation approach.
>
>
> @Johnathan: Looking at the threat model PR, the term application was
> deliberately not well defined as there was ongoing discussion there,
> right? I believe that we have already hit the point ("updated in the
> future") where knowing the impacts of that concept is necessary to
> evaluate technical implementations. Would it make sense to go a step
> back and start there? What do you think?

It was deliberately left undefined because it's a complex topic that
we cannot resolve quickly. I think the conceptual question of "what is
an application" and the technical question of "how do we implement
application IDs" are closely intertwined, and must be discussed
together.

-Johnathan

Leon Schuermann

Apr 19, 2020, 6:13:08 PM
to Johnathan Van Why, Tock Embedded OS Development Discussion

Johnathan Van Why <jrva...@google.com> writes:
> Do you have a rough idea of the timeline on which you want to have
> persistent storage? I suspect that the OpenTitan project will want to
> take a hard look at the security consequences of Tock's application ID
> design.

Having nonvolatile storage regions associated with applications is an
important goal and required for many use cases. I do not necessarily
have a deadline or hard requirements regarding this. Personally, Tock
and all my use cases are of a hobby nature, hence I'm open to delaying
this if deemed necessary. I started this thread as a result of the Core
WG call on 2020-03-04 to have a public discussion on application IDs
for a variety of application areas (a prominent one being nonvolatile
storage).

> I am concerned that developing an application ID specifically for
> persistent storage will result in an application ID design that is
> poorly suited for the other use cases. That would either hurt those
> use cases, or result in us having multiple types of application ID,
> which would be confusing (which hurts security) and waste flash space.
> I brought up the other use cases because I believe that considering
> them now is important.

That does make sense. I do not want to introduce several application
identifiers, exactly for the reasons stated. My intention was to focus
on a single application identifier in an iterative process, in a second
step analyzing how that would work for a second application area. I now
see how that strategy would generate a large overhead not fit for this
discussion medium.

Tock already has an application identifier, namely the package name as
part of the TBF header[1]. This certainly is a candidate for
associating resources with an application and must be analysed.
However, if that is not deemed appropriate for the required use cases,
we should introduce an identifier next to it. This way we would in a
sense still have two identifiers (I think that would be fine): a
human-readable and an "internal" application identifier.

>> To reduce confusion around terminology, I would group these into three
>> types. Each type has distinct characteristics that apply to all members.
>> - artificial IDs
>> - symmetric cryptographic IDs (w/o salt of some kind)
>> - asymmetric cryptographic IDs (w/o salt of some kind)
>
> If by "w/o", you mean "with or without", then I agree.

Indeed, sorry for that.

>> @Jonathan: Looking on the threat model PR the term application was
>> deliberately not well defined as there was ongoing discussion there,
>> right? I believe that we have already hit the point ("updated in the
>> future") where knowing the impacts of that concept is necessary to
>> evaluate technical implementations. Would it make sense to go a step
>> back and start there? What do you think?
>
> It was deliberately left undefined because it's a complex topic that
> we cannot resolve quickly. I think the conceptual question of "what is
> an application" and the technical question of "how do we implement
> application IDs" are closely intertwined, and must be discussed
> together.

That makes sense. Nonetheless, I am specifically concerned about two
questions:

- can two processes of different binaries be considered the same
application?
- are two processes of a single binary in flash considered the same
application?

I feel those questions have great influence on what is possible (and
what is deliberately not allowed) with Tock applications. I'm not
saying that this should be decided entirely independently from
identifying the application (however the term is defined). However, I
am concerned that attacking this problem primarily from the technical
angle of finding an appropriate application ID will not lead to the
greater goal of providing a sound application model, or will require
unnecessary compromises.

-

Taking a step back from the meta-discussion, regarding each application
identifier and the use-cases I am concerned about the following:

## symmetric cryptographic IDs

I am assuming that symmetric IDs are not stored explicitly in flash, but
rather calculated on demand (possibly cached in RAM). Symmetric IDs not
calculated / validated at runtime are arguably artificial IDs.

- This would require some kind of hashing support for the kernel at
runtime. On platforms without HW support, software would need to fill
the gap. There are efforts to upstream this, most notably #1728.
Having a software implementation of hash functions increases required
resources (flash, RAM, compute) and can be an issue on small
platforms. In security critical environments, hash functions in
software might not be feasible.
Possibly, multiple different hash functions would need to be
supported to enable HW support on many platforms.

- Associating persistent resources (e.g. nonvolatile storage) with
applications using symmetric IDs will significantly reduce flexibility
for application developers. Self-rewriting applications, upgrades and
other dynamic approaches will find it difficult at best to retain
access to associated resources.

- I do not see any possibility of combining multiple different binaries
into a single application.
On the other hand, multiple instances of the same binary can be
distinguished by including the instance id as a salt (counting from 0
up to n, where n would be included in the TBF) in the hash function.
This does not work when some instances of the same binary should be
considered the same application, while others should not.
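The salting idea from the last bullet could look roughly like this. FNV-1a again stands in for a real cryptographic hash, and the `instance` counter is the 0..n value that would live in the TBF header:

```rust
// Salt a hash-based ID with an instance counter so that multiple
// copies of the identical binary receive distinct IDs.
fn instance_id(binary: &[u8], instance: u32) -> u64 {
    let salt = instance.to_le_bytes();
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    // Hash the binary contents followed by the instance salt.
    for &b in binary.iter().chain(salt.iter()) {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

fn main() {
    let blink = b"blink binary";
    // Two instances of the identical binary get distinct IDs...
    assert_ne!(instance_id(blink, 0), instance_id(blink, 1));
    // ...while the same (binary, instance) pair is stable across boots.
    assert_eq!(instance_id(blink, 0), instance_id(blink, 0));
}
```

As noted above, this still cannot express "some instances share an application, others don't".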

## asymmetric cryptographic IDs

- _all concerns from above_
I can imagine the software implementations being much more problematic
due to high complexity and resource requirements, plus less hardware
support being available.

- What part of the signing process would be used as an ID exactly? I can
imagine multiple things:
- the hash over which the signature has been generated (effectively
going back to symmetric IDs, with the additional benefit of having a
signed binary)
- the cryptographic signature itself
- the fingerprint (hash) of some kind of (virtual) certificate,
containing the binary / hash, other information and the
cryptographic signature

## artificial IDs

Artificial IDs are very flexible, but cannot double as part of a
cryptographic verification process. Protecting sensitive information /
resources with artificial IDs likely requires extreme care and secure
tooling.

-

>> > At a minimum, I expect OpenTitan to want the following:
>> >
>> > 1. Signed binaries
>> > 2. Secure boot
>> > 3. Storage ACLs
>> > 4. Application-specific cryptographic keys

Can you elaborate on the difference between "Signed binaries" and
"Secure boot"? Coming from x86 UEFI, one seems like a tool to accomplish
the other.

-

To allow more fine-grained control with the artificial-IDs approach, one
could think about the concept of resource groups instead of a single ID.

A binary would include a variable-length TBF header field containing
multiple resource group entries it is designed to have access to. At
runtime, an application chooses the context from which it would like to
use a resource (such as a nonvolatile storage region) by supplying a
resource group.

This could allow multiple binaries to work as one application, sharing
some or all resources.

To have a single binary instantiate multiple applications, it could have
an "instances" TBF header field containing an array of further TBF
header sections, one per instance. Tock would spawn as many instances as
there are instance headers, allowing fine-grained control per instance,
not only for resource association.

This is just an idea and needs further development.
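A rough sketch of what the header side of this could look like (all types and names are hypothetical, nothing here exists in Tock or the TBF specification today):

```rust
// Hypothetical parsed form of a "resource groups" TBF header field.
#[derive(Clone, Copy, PartialEq, Eq)]
struct ResourceGroupId(u32);

/// Variable-length header listing the resource groups a binary is
/// designed to have access to.
struct ResourceGroupsHeader {
    groups: Vec<ResourceGroupId>,
}

impl ResourceGroupsHeader {
    /// At runtime the process picks which group context to use for a
    /// resource (e.g. a nonvolatile storage region); the kernel only
    /// needs to check membership against the header.
    fn may_use(&self, requested: ResourceGroupId) -> bool {
        self.groups.contains(&requested)
    }
}

fn main() {
    let header = ResourceGroupsHeader {
        groups: vec![ResourceGroupId(1), ResourceGroupId(7)],
    };
    assert!(header.may_use(ResourceGroupId(7)));
    assert!(!header.may_use(ResourceGroupId(2)));
}
```

Two binaries listing the same group ID would then share the resources associated with that group, which is what allows them to act as one application.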

-

Sorry for the delayed response.

- Leon Schuermann

[1]: https://github.com/tock/tock/blob/master/doc/TockBinaryFormat.md#3-package-name

Johnathan Van Why

unread,
Apr 24, 2020, 7:31:35 PM4/24/20
to Leon Schuermann, Tock Embedded OS Development Discussion
Note that the package name is unique per binary, which means that if we decide to use the package name as an application ID then we would be choosing to not support multi-binary applications.

I suspect we will end up supporting both human-readable IDs and cryptographic IDs because working with cryptographic IDs seems too high-effort for many use cases.


> >> To reduce confusion around terminology, I would group these into three
> >> types. Each type has distinct characteristics that apply to all members.
> >> - artificial IDs
> >> - symmetric cryptographic IDs (w/o salt of some kind)
> >> - asymmetric cryptographic IDs (w/o salt of some kind)
> >
> > If by "w/o", you mean "with or without", then I agree.
>
> Indeed, sorry for that.
>
> >> @Jonathan: Looking on the threat model PR the term application was
> >> deliberately not well defined as there was ongoing discussion there,
> >> right? I believe that we have already hit the point ("updated in the
> >> future") where knowing the impacts of that concept is necessary to
> >> evaluate technical implementations. Would it make sense to go a step
> >> back and start there? What do you think?
> >
> > It was deliberately left undefined because it's a complex topic that
> > we cannot resolve quickly. I think the conceptual question of "what is
> > an application" and the technical question of "how do we implement
> > application IDs" are closely intertwined, and must be discussed
> > together.
>
> That makes sense. Nonetheless, I am specifically concerned about two
> questions:
>
> - can two processes of different binaries be considered the same
>   application?

I think there are three possible answers to "can two processes of different binaries be considered the same application?":
  1. No. Two binaries can only be the same application if they are bitwise-identical.
  2. Yes, but not concurrently. This means that you can update an application binary to a new version and still consider it the same application, but you cannot simultaneously run processes of different binaries with the same application ID.
  3. Yes, including concurrently: You can have multiple processes of different binaries running concurrently with a single application ID.

> - are two processes of a single binary in flash considered the same
>   application?

A more general question is "do we want to support loading a single binary as multiple processes simultaneously"? If the answer to that is "no" then this question is moot.
We could let the mechanism that instantiates the binary specify the salt for each launched binary. Doing so would open up another question: do we include the list of applications in the app's signature?


> ## asymmetric cryptographic IDs
>
> - _all concerns from above_
>   I can imagine the software implementations to be much more problematic
>   due to high complexity and resource requirements, plus less HW
>   available.
>
> - What part of the signing process would be used as an ID exactly? I can
>   imagine multiple things:

The ID would be the public key for the signature.

>   - the hash over which the signature has been generated (effectively
>     going back to symmetric IDs, with the additional benefit of having a
>     signed binary)
>   - the cryptographic signature itself
>   - the fingerprint (hash) of some kind of (virtual) certificate,
>     containing the binary / hash, other information and the
>     cryptographic signature
>
> ## artificial IDs
>
> Artificial IDs are very flexible, but cannot double as a part of a
> cryptographic verification processes. Protecting sensitive information /
> resources with artifical IDs likely requires extreme care and secure
> tooling.
>
> -
>
> >> > At a minimum, I expect OpenTitan to want the following:
> >> >
> >> >    1. Signed binaries
> >> >    2. Secure boot
> >> >    3. Storage ACLs
> >> >    4. Application-specific cryptographic keys
>
> Can you elaborate on the difference between "Signed binaries" and
> "Secure boot"? Coming from x86 UEFI, one seems like a tool to accomplish
> the other.

By "signed binaries", I was referring to the form the application ID takes. In this case, it means the application ID is a public key and there is a corresponding signature checked by the kernel.

"Secure boot" refers to the kernel only booting apps from a predefined list of acceptable application IDs.

So yes, signed binaries would be a mechanism used to accomplish secure boot. Signed binaries would also be a tool used for storage ACLs and application-specific cryptographic keys.
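As a toy model of that distinction (the types are purely illustrative and the signature check is stubbed out, not real crypto):

```rust
// "Signed binaries": the application ID is a public key with a
// kernel-checked signature. "Secure boot": additionally, only IDs on a
// predefined list may boot.
type AppId = [u8; 32]; // stands in for the public key

struct AppBinary {
    id: AppId,
    signature_valid: bool, // placeholder for real signature verification
}

/// The signed-binaries check on its own.
fn signature_ok(app: &AppBinary) -> bool {
    app.signature_valid
}

/// Secure boot: the signature must verify AND the ID must appear on the
/// kernel's predefined allowlist.
fn may_boot(app: &AppBinary, allowlist: &[AppId]) -> bool {
    signature_ok(app) && allowlist.contains(&app.id)
}

fn main() {
    let allowlist = [[1u8; 32]];
    let trusted = AppBinary { id: [1; 32], signature_valid: true };
    let unknown = AppBinary { id: [2; 32], signature_valid: true };
    assert!(signature_ok(&unknown));          // validly signed, but...
    assert!(!may_boot(&unknown, &allowlist)); // ...not on the boot list
    assert!(may_boot(&trusted, &allowlist));
}
```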

Julien Cretin

unread,
May 20, 2020, 11:55:04 AM5/20/20
to Johnathan Van Why, Leon Schuermann, Tock Embedded OS Development Discussion
Hi all,

It looks to me that we don't need an application id to add persistent storage support in Tock. It would be a sufficient condition, but it's not a necessary condition. From a high-level point of view, what we need is the following relation:

    has_access: Application -> Permission -> Storage -> bool

    enum Permission {
        Read,
        Write,
        Erase,
    }


(The permission part may be dropped if we believe permissions should always be either none or all, but I'll keep it for the rest since it doesn't hurt.)

We have `has_access(application, permission, storage)` only if the owner of the storage agrees that the application may access it with that permission. How this is done is discussed later.

If we have `has_access(application, permission, storage)` then the application may access the storage with that permission. How this is done is discussed later.

What seems like the simplest way to implement this would be for each storage to come with a public key. This is how a storage is identified.
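A minimal runnable model of the `has_access` relation under this assumption (all identifiers are illustrative, and the signature check itself is elided):

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Permission {
    Read,
    Write,
    Erase,
}

/// A storage is identified by its public key (modelled as raw bytes).
struct Storage {
    public_key: [u8; 32],
}

/// One entry from an application's TBF header: which storage it may
/// use, with which permission. The owner's signature over (permission,
/// application binary) is assumed to have been verified already.
struct StorageGrant {
    storage_key: [u8; 32],
    permission: Permission,
}

struct Application {
    grants: Vec<StorageGrant>,
}

fn has_access(app: &Application, permission: Permission, storage: &Storage) -> bool {
    app.grants
        .iter()
        .any(|g| g.storage_key == storage.public_key && g.permission == permission)
}

fn main() {
    let storage = Storage { public_key: [7; 32] };
    let app = Application {
        grants: vec![StorageGrant {
            storage_key: [7; 32],
            permission: Permission::Read,
        }],
    };
    assert!(has_access(&app, Permission::Read, &storage));
    assert!(!has_access(&app, Permission::Write, &storage));
}
```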

We extend the Tock Binary Format Header (TBF header) with the following:

-   A new kind of header (either a different version or some other solution) that specifies a storage instead of an application. It contains the following fields:
    -   The storage identity (i.e. its public key).
    -   The storage location (e.g. start and end address).
-   For application headers (header version 2), we add a list of storages in a similar way as writable flash regions. Each element of this list would contain the following fields:
    -   The identity of the storage.
    -   (optional: The location of the storage if the application does not support a storage at arbitrary location.)
    -   A signature (with the storage private key) of the permission and the application binary. In particular, each application update needs a new approval from the storage owner. Different applications may access the same storage if the owner of the storage allows it.

When the kernel boots and goes through the linked list of TBF headers, it not only creates the processes, but also creates the storages metadata. This means that in addition to the list of processes, the kernel also stores the list of storages. A storage in the kernel would contain its identity and location. When the kernel creates a process, it checks for each storage that the storage exists and the permission is valid.

Access to a storage is done in 2 places. When the process is created, the MPU is configured to give read access to the storage location (do we know any boards where the flash is not mapped to memory? If yes, this needs to be parametrized). When the application accesses the flash syscall driver, the driver checks the permission. For that, the kernel exposes in `Process` a method to check if the process has a given permission for a given flash slice, essentially looping through the storages where the slice fits and checking the permission for that storage.
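The driver-side check could be sketched as follows (the `Storage` and `Permission` types and the function name are assumptions, not existing Tock kernel APIs):

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Permission {
    Read,
    Write,
    Erase,
}

/// Storage metadata the kernel keeps next to the process list.
struct Storage {
    start: usize,
    end: usize, // exclusive
    permissions: Vec<Permission>, // permissions this process holds here
}

/// Returns true if the slice [addr, addr + len) lies entirely within a
/// storage for which the process holds `perm` -- i.e. loop through the
/// storages, find one the slice fits in, and check the permission.
fn check_flash_access(storages: &[Storage], addr: usize, len: usize, perm: Permission) -> bool {
    storages.iter().any(|s| {
        addr >= s.start && addr + len <= s.end && s.permissions.contains(&perm)
    })
}

fn main() {
    let storages = vec![Storage {
        start: 0x1000,
        end: 0x1800,
        permissions: vec![Permission::Read, Permission::Write],
    }];
    assert!(check_flash_access(&storages, 0x1000, 0x100, Permission::Read));
    // Crosses the end of the storage region:
    assert!(!check_flash_access(&storages, 0x1700, 0x200, Permission::Read));
    assert!(!check_flash_access(&storages, 0x1000, 0x100, Permission::Erase));
}
```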

Note that if we don't want the kernel to iterate twice through the linked list of TBF headers (first to parse storages then to parse processes), storages need to be defined before they are used. This should be easy to check in tockloader.

This is very high-level, but I think agreeing on the high-level APIs would help parallelize the work on sub-components. Does this high-level picture fit into Tock design principles?

Thanks,
Julien

--
You received this message because you are subscribed to the Google Groups "Tock Embedded OS Development Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tock-dev+u...@googlegroups.com.

Johnathan Van Why

unread,
May 20, 2020, 3:35:19 PM5/20/20
to Julien Cretin, Leon Schuermann, Tock Embedded OS Development Discussion
On Wed, May 20, 2020 at 8:55 AM Julien Cretin <julien.cr...@polytechnique.org> wrote:
Hi all,

It looks to me that we don't need an application id to add persistent storage support in Tock. It would be a sufficient condition, but it's not a necessary condition.

If we keep this proposal scoped to persistent storage, then we will end up in a similar situation to "Tock has multiple distinct types of application ID". We just wouldn't call them "application IDs".
 
From a high-level point of view, what we need is the following relation:

    has_access: Application -> Permission -> Storage -> bool

    enum Permission {
        Read,
        Write,
        Erase,
    }


(The permission part may be dropped if we believe permissions should always be either none or all, but I'll keep it for the rest since it doesn't hurt.)

We have `has_access(application, permission, storage)` only if the owner of the storage agrees that the application may access it with that permission. How this is done is discussed later.

If we have `has_access(application, permission, storage)` then the application may access the storage with that permission. How this is done is discussed later.

What seems like the simplest way to implement this would be for each storage to come with a public key. This is how a storage is identified.

We extend the Tock Binary Format Header (TBF header) with the following:

-   A new kind of header (either a different version or some other solution) that specifies a storage instead of an application. It contains the following fields:
    -   The storage identity (i.e. its public key).
    -   The storage location (e.g. start and end address).

Defining storage locations as address ranges in flash is extremely coarse and fairly wasteful. These storage regions would need to be aligned to the nearest flash page, and flash pages are quite large (e.g. 2 KiB in the H1 chip). This design also prevents wear levelling: if one storage is accessed heavily, its flash pages will wear out much quicker than others. These concerns are some of the motivating reasons why OpenTitan is pursuing a key-value store API rather than raw flash access.
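As a toy illustration of why a key-value interface helps (this is not OpenTitan's actual API): writes go into an append-only log spread across pages, so no logical storage pins specific physical pages:

```rust
const PAGE_SIZE: usize = 8; // entries per flash page (tiny, for illustration)
const NUM_PAGES: usize = 4;

struct KvStore {
    log: Vec<(u32, u32)>, // append-only (key, value) log
}

impl KvStore {
    fn new() -> Self {
        KvStore { log: Vec::new() }
    }

    fn insert(&mut self, key: u32, value: u32) {
        self.log.push((key, value));
    }

    fn get(&self, key: u32) -> Option<u32> {
        // The latest write for a key wins; reclaiming stale entries
        // (compaction/erase) is elided here.
        self.log.iter().rev().find(|(k, _)| *k == key).map(|(_, v)| *v)
    }

    /// Physical page the n-th log entry lands on: successive writes walk
    /// across all pages instead of hammering one fixed region per client.
    fn page_of(entry: usize) -> usize {
        (entry / PAGE_SIZE) % NUM_PAGES
    }
}

fn main() {
    let mut kv = KvStore::new();
    kv.insert(1, 10);
    kv.insert(1, 20); // update: the old entry stays in the log
    assert_eq!(kv.get(1), Some(20));
    assert_eq!(kv.get(2), None);
    // With PAGE_SIZE = 8, the 9th entry (index 8) lands on page 1.
    assert_eq!(KvStore::page_of(8), 1);
}
```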
 
-   For application headers (header version 2), we add a list of storages in a similar way as writable flash regions. Each element of this list would contain the following fields:
    -   The identity of the storage.
    -   (optional: The location of the storage if the application does not support a storage at arbitrary location.)
    -   A signature (with the storage private key) of the permission and the application binary. In particular, each application update needs a new approval from the storage owner. Different applications may access the same storage if the owner of the storage allows it.

If I understand this correctly, this means that different processes may access a single storage concurrently. This causes race conditions that will need to be handled. Some of the possible choices for application ID don't have this issue.
 

When the kernel boots and goes through the linked list of TBF headers, it not only creates the processes, but also creates the storages metadata. This means that in addition to the list of processes, the kernel also stores the list of storages.

Where would the kernel store this list? The only dynamic memory allocation supported in the Tock kernel is grant regions, which is per-process. That said, I don't see the need for this list either. Unless I am missing something, the kernel could reference the original storage list when it needs it.
 
A storage in the kernel would contain its identity and location. When the kernel creates a process, it checks for each storage that the storage exists and the permission is valid.

Access to a storage is done in 2 places. When the process is created, the MPU is configured to give read access to the storage location (do we know any boards where the flash is not mapped to memory? If yes, this needs to be parametrized). When the application accesses the flash syscall driver, the driver checks the permission. For that, the kernel exposes in `Process` a method to check if the process has a given permission for a given flash slice, essentially looping through the storages where the slice fits and checking the permission for that storage.

Note that if we don't want the kernel to iterate twice through the linked list of TBF headers (first to parse storages then to parse processes), storages need to be defined before they are used. This should be easy to check in tockloader.

This is very high-level, but I think agreeing on the high-level APIs would help parallelize the work on sub-components. Does this high-level picture fit into Tock design principles?

Let me paint this design in a different light, using different terminology to refer to the same concepts. Instead of a "storage", you just have an "application". The global list of "storages" would be a global list of "applications". The "storage identity (public key)" would become the "application identity (public key)". A process binary can be signed by one or more "applications", indicating that process is part of that application and has access to the application's storage region.

With the renaming, we can then categorize this design based on the axes I outlined in my first message in this thread:
  • Form: 5, a public key, with the binary signed by the corresponding key
  • Verification time: 3 (app startup), plus maybe 5 (runtime -- this depends on how the runtime permissions check works)
  • Use cases supported: 1 (if the kernel only boots applications in the application list, and that list is signed with the kernel image), 2 (storage), and 3 (cryptography -- each application could have its own crypto realm based on the public key).
I have the following concerns with this design:
  1. It requires crypto. Some Tock users may not want to involve crypto in the application build/deployment process. Not all hardware Tock supports has hardware-accelerated crypto, so this could have a lot of runtime overhead.
  2. It seems overly complex for what it does. I think we could handle everything in per-app headers, rather than introducing a new type of header into Tock.
-Johnathan

Julien Cretin

unread,
May 21, 2020, 7:58:02 AM5/21/20
to Johnathan Van Why, Leon Schuermann, Tock Embedded OS Development Discussion
Hi Johnathan,

On Wed, May 20, 2020 at 9:35 PM 'Johnathan Van Why' via Tock Embedded OS Development Discussion <tock...@googlegroups.com> wrote:
On Wed, May 20, 2020 at 8:55 AM Julien Cretin <julien.cr...@polytechnique.org> wrote:
Hi all,

It looks to me that we don't need an application id to add persistent storage support in Tock. It would be a sufficient condition, but it's not a necessary condition.

If we keep this proposal scoped to persistent storage, then we will end up in a similar situation to "Tock has multiple distinct types of application ID". We just wouldn't call them "application IDs".

I guess the issue would be about naming, because if there are distinct types of application IDs it's probably because they have different semantics. I guess what you suggest is to use the finest application ID definition (the one that can differentiate at least as much as all others) so that all other application IDs can be computed from it. One issue with doing that is that modularity of updates is lost (updating a single application may require modifying its identity in different places even though the actual type of application ID needed would not have changed). So this is mostly a compromise between modularity of updates and multiplicity of application ID semantics. If the considered use-cases only ever update all applications simultaneously, then modularity of updates is not useful. In that case it's clearly preferable to have a single application ID definition (the finest one).

Another point to consider: if coming up with a single application ID definition takes much more time than coming up with some other use-cases independently, it may be preferable to do it in 2 steps: first experiment with other application ID use-cases, then unify them with the hindsight gained. I don't know how long it may take to come up with a unified application ID definition with Tock's approval stamp. If we believe it's less than 6 months, then it's not worth doing the 2-step scenario.

From a high-level point of view, what we need is the following relation:

    has_access: Application -> Permission -> Storage -> bool

    enum Permission {
        Read,
        Write,
        Erase,
    }


(The permission part may be dropped if we believe permissions should always be either none or all, but I'll keep it for the rest since it doesn't hurt.)

We have `has_access(application, permission, storage)` only if the owner of the storage agrees that the application may access it with that permission. How this is done is discussed later.

If we have `has_access(application, permission, storage)` then the application may access the storage with that permission. How this is done is discussed later.

What seems like the simplest way to implement this would be for each storage to come with a public key. This is how a storage is identified.

We extend the Tock Binary Format Header (TBF header) with the following:

-   A new kind of header (either a different version or some other solution) that specifies a storage instead of an application. It contains the following fields:
    -   The storage identity (i.e. its public key).
    -   The storage location (e.g. start and end address).

Defining storage locations as address ranges in flash is extremely coarse and fairly wasteful. These storage regions would need to be aligned to the nearest flash page, and flash pages are quite large (e.g. 2 KiB in the H1 chip). This design also prevents wear levelling: if one storage is accessed heavily its flash pages will wear out much quicker than other. These concerns are some of the motivating reasons why OpenTitan is pursuing a key value store API rather than raw flash access.

I guess you have in mind use-cases with multiple applications using persistent storage. If there is a single application running on top of Tock, it's actually as efficient and more flexible for the application to have access to raw flash locations and implement flash-efficient data structures on top of them: the persistent storage would then have a Rust API instead of a syscall API, so richer types could be used.

For multiple applications it's indeed better to share the flash-efficient data structures in Tock, to avoid each application carrying the library code in flash (this is actually a general issue of multi-application use-cases, which induce code duplication between applications, for example some libtock code or some basic language libraries). Note also that an alternative which avoids complicating the kernel code would be to have a single application handle the persistent storage (in that case it may have access to raw flash locations) and provide an IPC interface for other applications. However, there is still the issue that the persistent storage API would be described in terms of simple data (integers and slices) instead of arbitrary Rust data (enums, lists, nested structures, etc.).

Note that one problem with multiple applications accessing a single persistent storage is that some permissions need to be defined regarding capacity (how many words an application may store at a given time) and lifetime (how many words an application may write over the lifetime of the flash).
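Such capacity/lifetime permissions could be modelled as a small quota check (all names here are hypothetical; these structs do not exist in Tock):

```rust
/// Hypothetical per-application quotas for a shared persistent store.
struct StorageQuota {
    capacity_words: u32, // max words held at any one time
    lifetime_words: u32, // max words written over the flash's lifetime
}

/// Bookkeeping the kernel would maintain per application.
struct Usage {
    held_words: u32,    // words currently stored
    written_words: u32, // cumulative words ever written
}

/// A write of `n` words is allowed only if it stays within both limits.
fn may_write(quota: &StorageQuota, usage: &Usage, n: u32) -> bool {
    usage.held_words + n <= quota.capacity_words
        && usage.written_words + n <= quota.lifetime_words
}

fn main() {
    let quota = StorageQuota { capacity_words: 64, lifetime_words: 10_000 };
    let usage = Usage { held_words: 60, written_words: 500 };
    assert!(may_write(&quota, &usage, 4));
    assert!(!may_write(&quota, &usage, 5)); // would exceed capacity
}
```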

-   For application headers (header version 2), we add a list of storages in a similar way as writable flash regions. Each element of this list would contain the following fields:
    -   The identity of the storage.
    -   (optional: The location of the storage if the application does not support a storage at arbitrary location.)
    -   A signature (with the storage private key) of the permission and the application binary. In particular, each application update needs a new approval from the storage owner. Different applications may access the same storage if the owner of the storage allows it.

If I understand this correctly, this means that different processes may access a single storage concurrently. This causes race conditions that will need to be handled. Some of the possible choices for application ID don't have this issue.

This decision is up to the storage owner. The storage owner reviews the application code before signing it.

When the kernel boots and goes through the linked list of TBF headers, it not only creates the processes, but also creates the storages metadata. This means that in addition to the list of processes, the kernel also stores the list of storages.

Where would the kernel store this list? The only dynamic memory allocation supported in the Tock kernel is grant regions, which is per-process. That said, I don't see the need for this list either. Unless I am missing something, the kernel could reference the original storage list when it needs it.

The kernel would store this list next to the list of processes: https://github.com/tock/tock/blob/b2e4c162aa7e6764e8368dc64ceabadc4cef1d88/kernel/src/sched.rs#L34.
There is no need for dynamic allocation. The board would statically allocate the list of storages, as is currently the case for processes.
 
A storage in the kernel would contain its identity and location. When the kernel creates a process, it checks for each storage that the storage exists and the permission is valid.

Access to a storage is done in 2 places. When the process is created, the MPU is configured to give read access to the storage location (do we know any boards where the flash is not mapped to memory? If yes, this needs to be parametrized). When the application accesses the flash syscall driver, the driver checks the permission. For that, the kernel exposes in `Process` a method to check if the process has a given permission for a given flash slice, essentially looping through the storages where the slice fits and checking the permission for that storage.

Note that if we don't want the kernel to iterate twice through the linked list of TBF headers (first to parse storages then to parse processes), storages need to be defined before they are used. This should be easy to check in tockloader.

This is very high-level, but I think agreeing on the high-level APIs would help parallelize the work on sub-components. Does this high-level picture fit into Tock design principles?

Let me paint this design in a different light, using different terminology to refer to the same concepts. Instead of a "storage", you just have an "application". The global list of "storages" would be a global list of "applications". The "storage identity (public key)" would become the "application identity (public key)". A process binary can be signed by one or more "applications", indicating that process is part of that application and has access to the application's storage region.

That's an interesting way to see it :-)

With the renaming, we can then categorize this design based on the axes I outlined in my first message in this thread:
  • Form: 5, a public key, with the binary signed by the corresponding key
  • Verification time: 3 (app startup), plus maybe 5 (runtime -- this depends on how the runtime permissions check works)
  • Use cases supported: 1 (if the kernel only boots applications in the application list, and that list is signed with the kernel image), 2 (storage), and 3 (cryptography -- each application could have its own crypto realm based on the public key).
I have the following concerns with this design:
  1. It requires crypto. Some Tock users may not want to involve crypto in the application build/deployment process. Not all hardware Tock supports has hardware-accelerated crypto, so this could have a lot of runtime overhead.
The crypto part can be optional. But in that case there is no guarantee that the storage owner agrees that the application may access it. Is it possible to have this property without crypto? I guess you have in mind the use-case where all applications are bundled together and signed by a single person. That person would be responsible for checking all permissions before signing. But then I don't see how the signature would be checked if not by the chip during boot.
  2. It seems overly complex for what it does. I think we could handle everything in per-app headers, rather than introducing a new type of header into Tock.
I think the main argument against this proposal, as I see it, is the multiple-applications scenario. This is not something I considered, and I don't know how widespread those scenarios are (because they are quite wasteful in terms of flash, due to code duplication and MPU alignment constraints).

A simple way to modify this proposal to remove the issues you see would be to have a single set of flash locations defined in the board (together with an optional public key). The kernel would provide a syscall API for a flash-efficient data structure on top of those locations. This removes the need for the new storage header, since there is only one storage and it's defined in the board. Applications would still have a set of permissions in their header (how much capacity and lifetime they are allowed to use), which would be optionally signed (if crypto is desired). As I see it, we don't even need a notion of application ID, since the permissions are checked when the process is created and the process identity then stays the same until the next power-off. Does that make sense to you?

Johnathan Van Why

unread,
May 21, 2020, 2:12:51 PM5/21/20
to Julien Cretin, Leon Schuermann, Tock Embedded OS Development Discussion
On Thu, May 21, 2020 at 4:58 AM Julien Cretin <julien.cr...@polytechnique.org> wrote:
Hi Johnathan,

On Wed, May 20, 2020 at 9:35 PM 'Johnathan Van Why' via Tock Embedded OS Development Discussion <tock...@googlegroups.com> wrote:
On Wed, May 20, 2020 at 8:55 AM Julien Cretin <julien.cr...@polytechnique.org> wrote:
Hi all,

It looks to me that we don't need an application id to add persistent storage support in Tock. It would be a sufficient condition, but it's not a necessary condition.

If we keep this proposal scoped to persistent storage, then we will end up in a similar situation to "Tock has multiple distinct types of application ID". We just wouldn't call them "application IDs".

I guess the issue would be about naming, because if there are distinct types of application IDs it's probably because they have different semantics. I guess what you suggest is to use the finest application ID definition (the one that can differentiate at least as much as all others) so that all other application IDs can be computed from this one. One issue when doing that is that modularity of updates is lost (updating a single application may require to modify its identity in different places even though the actual type of application ID needed would not have changed). So this is mostly a compromise between modularity of updates and multiplicity of application ID semantics. If the considered use-cases only update all applications simultaneously, then modularity of updates is not useful. In that case it's clearly preferable to have a single application ID definition (the finest one).

Another point to consider would be that if coming up with a single application ID definition takes much more time than coming up with some other use-cases independently, it may be preferable to do it in 2 steps: first experiment with other application ID use-cases, then unify them with the gained hindsights. I don't know how long it may take to come up with a unified application ID definition with Tock approval stamp. If we believe it's less than 6 months, then it's not worth doing the 2 step scenario.

I don't expect it to take 6 months to come up with a unified application ID definition.
 
From a high-level point of view, what we need is the following relation:

    has_access: Application -> Permission -> Storage -> bool

    enum Permission {
        Read,
        Write,
        Erase,
    }


(The permission part may be dropped if we believe permissions should always be either none or all, but I'll keep it for the rest since it doesn't hurt.)

We have `has_access(application, permission, storage)` only if the owner of the storage agrees that the application may access it with that permission. How this is done is discussed later.

If we have `has_access(application, permission, storage)` then the application may access the storage with that permission. How this is done is discussed later.

What seems like the simplest way to implement this would be for each storage to come with a public key. This is how a storage is identified.

We extend the Tock Binary Format Header (TBF header) with the following:

-   A new kind of header (either a different version or some other solution) that specifies a storage instead of an application. It contains the following fields:
    -   The storage identity (i.e. its public key).
    -   The storage location (e.g. start and end address).

Defining storage locations as address ranges in flash is extremely coarse and fairly wasteful. These storage regions would need to be aligned to the nearest flash page, and flash pages are quite large (e.g. 2 KiB in the H1 chip). This design also prevents wear levelling: if one storage is accessed heavily its flash pages will wear out much quicker than other. These concerns are some of the motivating reasons why OpenTitan is pursuing a key value store API rather than raw flash access.

I guess you have in mind use-cases with multiple applications using persistent storage. Because if there is a single application running on top of Tock, it's actually as efficient and more flexible for the application to have access to raw flash locations and implement flash-efficient data-structures on top of it, because the persistent storage would have a Rust API instead of a syscall API, so richer types can be used.

For multiple applications, it is indeed better to share the flash-efficient data structures in Tock, to avoid each application carrying the library code in its own flash (this is a general issue with multi-application use cases: they induce code duplication between applications, for example some libtock code or basic language libraries). Note also that an alternative which avoids complicating the kernel code would be to have a single application handle the persistent storage (in that case it may have access to raw flash locations) and provide an IPC interface for the other applications. However, there is still the issue that the persistent storage API would be described in terms of simple data (integers and slices) instead of arbitrary Rust data (enums, lists, nested structures, etc.).

IPC is another use case for application IDs. If the persistent storage layer is implemented as an app, then it would use the same application ID concept as the IPC system.
 
Note that one problem with multiple applications accessing a single persistent storage, is that some permissions need to be defined regarding capacity (how many words an application may store at a given time) and lifetime (how many words an application can write in the lifetime of the flash).

-   For application headers (header version 2), we add a list of storages in a similar way as writable flash regions. Each element of this list would contain the following fields:
    -   The identity of the storage.
    -   (optional: The location of the storage if the application does not support a storage at arbitrary location.)
    -   A signature (with the storage private key) of the permission and the application binary. In particular, each application update needs a new approval from the storage owner. Different applications may access the same storage if the owner of the storage allows it.
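One element of this per-application list might look like the following. This mirrors the writable-flash-regions style mentioned above; the field sizes (32-byte key, 64-byte signature) and names are purely illustrative assumptions:

```rust
// Hypothetical shape of one per-application storage permission entry.
struct StoragePermissionEntry {
    /// Identity of the storage: its owner's public key (size assumed).
    storage_identity: [u8; 32],
    /// Optional fixed (start, end) location, if the application does not
    /// support a storage at an arbitrary location.
    location: Option<(u32, u32)>,
    /// Owner's signature over the permission and the application binary
    /// (signature scheme and size assumed).
    signature: [u8; 64],
}

impl StoragePermissionEntry {
    /// Whether this entry refers to the storage with the given identity.
    fn refers_to(&self, identity: &[u8; 32]) -> bool {
        self.storage_identity == *identity
    }
}
```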

If I understand this correctly, this means that different processes may access a single storage concurrently. This causes race conditions that will need to be handled. Some of the possible choices for application ID don't have this issue.

This decision is up to the storage owner. The storage owner reviews the application code before signing it.

My concern is data races at runtime, not ownership. It makes the design of the storage layer API more difficult.
 
When the kernel boots and goes through the linked list of TBF headers, it not only creates the processes but also creates the storage metadata. This means that in addition to the list of processes, the kernel also stores the list of storages.

Where would the kernel store this list? The only dynamic memory allocation supported in the Tock kernel is grant regions, which is per-process. That said, I don't see the need for this list either. Unless I am missing something, the kernel could reference the original storage list when it needs it.

The kernel would store this list next to the list of processes: https://github.com/tock/tock/blob/b2e4c162aa7e6764e8368dc64ceabadc4cef1d88/kernel/src/sched.rs#L34.
There is no need for dynamic allocation. The board would statically allocate the list of storages as this is currently the case for processes.
 
A storage in the kernel would contain its identity and location. When the kernel creates a process, it checks for each storage that the storage exists and the permission is valid.

Access to a storage is done in 2 places. When the process is created, the MPU is configured to give read access to the storage location (do we know any boards where the flash is not mapped to memory? If yes, this needs to be parametrized). When the application accesses the flash syscall driver, the driver checks the permission. For that, the kernel exposes in `Process` a method to check if the process has a given permission for a given flash slice, essentially looping through the storages where the slice fits and checking the permission for that storage.
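That per-process check could be sketched as follows: loop over the process's storages and test whether the requested flash slice fits entirely in one that grants the permission. All names here are hypothetical (the real `Process` type would not use a heap-allocated Vec):

```rust
// Sketch of the permission check described above.
#[derive(Clone, Copy, PartialEq, Eq)]
enum Permission {
    Read,
    Write,
    Erase,
}

struct StorageGrant {
    start: u32,
    end: u32, // exclusive
    permissions: [bool; 3], // indexed by Permission as usize
}

struct Process {
    storages: Vec<StorageGrant>,
}

impl Process {
    /// True if the process may access flash[addr..addr+len) with `perm`,
    /// i.e. the slice fits entirely inside some granted storage.
    fn has_flash_permission(&self, addr: u32, len: u32, perm: Permission) -> bool {
        self.storages.iter().any(|s| {
            addr >= s.start
                && addr.saturating_add(len) <= s.end
                && s.permissions[perm as usize]
        })
    }
}
```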

Note that if we don't want the kernel to iterate twice through the linked list of TBF headers (first to parse storages then to parse processes), storages need to be defined before they are used. This should be easy to check in tockloader.

This is very high-level, but I think agreeing on the high-level APIs would help parallelize the work on sub-components. Does this high-level picture fit into Tock design principles?

Let me paint this design in a different light, using different terminology to refer to the same concepts. Instead of a "storage", you just have an "application". The global list of "storages" would be a global list of "applications". The "storage identity (public key)" would become the "application identity (public key)". A process binary can be signed by one or more "applications", indicating that process is part of that application and has access to the application's storage region.

That's an interesting way to see it :-)

With the renaming, we can then categorize this design based on the axes I outlined in my first message in this thread:
  • Form: 5, a public key, with the binary signed by the corresponding key
  • Verification time: 3 (app startup), plus maybe 5 (runtime -- this depends on how the runtime permissions check works)
  • Use cases supported: 1 (if the kernel only boots applications in the application list, and that list is signed with the kernel image), 2 (storage), and 3 (cryptography -- each application could have its own crypto realm based on the public key).
I have the following concerns with this design:
  1. It requires crypto. Some Tock users may not want to involve crypto in the application build/deployment process. Not all hardware Tock supports has hardware-accelerated crypto, so this could have a lot of runtime overhead.
The crypto part can be optional. But in that case there is no guarantee that the storage owner agrees that the application may access it. Is it possible to have this property without crypto? I guess you have in mind the use-case where all applications are bundled together and signed by a single person. That person would be responsible for checking all permissions before signing. But then I don't see how the signature would be checked if not by the chip during boot.

I have two use cases in mind for application IDs that are not backed by signature verification:
  1. Workshop/hobbyist use cases where establishing a PKI is onerous and security is not a concern.
  2. Sensornet applications on SoCs lacking crypto acceleration, where energy usage is a primary concern and security may be less important (relatively).
  2. It seems overly complex for what it does. I think we could handle everything in per-app headers, rather than introducing a new type of header into Tock.
I think the main argument against this proposal, as I see it, is the multiple-applications scenario. This is not something I considered, and I don't know how widespread those scenarios are (they are quite wasteful in terms of flash due to code duplication and MPU alignment constraints).

In my opinion, one of Tock's distinguishing features is its ability to run multiple independent applications in parallel.
 
A simple way to modify this proposal to remove the issues you see would be to have a single set of flash locations defined in the board (together with an optional public key). The kernel would provide a syscall API for a flash-efficient data structure on top of those locations. This removes the need for the new storage header, since there is only one storage and it is defined in the board. Applications would still have a set of permissions in their header (how much capacity and lifetime they are allowed to use), which would be optionally signed (if crypto is desired). As I see it, we don't even need a notion of application ID, since the permissions are checked when the process is created and the process identity then stays the same until the next power off. Does that make sense to you?

I believe I understand you correctly.

Julien Cretin

unread,
May 22, 2020, 8:46:49 AM5/22/20
to Johnathan Van Why, Leon Schuermann, Tock Embedded OS Development Discussion
Hi Johnathan,

This makes a lot of sense. Thanks for the explanation.

A simple way to modify this proposal to remove the issues you see would be to have a single set of flash locations defined in the board (together with an optional public key). The kernel would provide a syscall API for a flash-efficient data structure on top of those locations. This removes the need for the new storage header, since there is only one storage and it is defined in the board. Applications would still have a set of permissions in their header (how much capacity and lifetime they are allowed to use), which would be optionally signed (if crypto is desired). As I see it, we don't even need a notion of application ID, since the permissions are checked when the process is created and the process identity then stays the same until the next power off. Does that make sense to you?

I believe I understand you correctly.

Cool. In that case I'm following up the discussion in https://github.com/tock/tock/issues/1692 since application IDs are not needed for this use-case.

However, I am still interested in your opinion on whether Tock wants to support the following use case: a single application that would prefer to implement its own flash-efficient storage (for many reasons, depending on the Tock key-value store implementation: code size, latency, rich API, code in direct style, stronger guarantees, etc.). Note that to support such a scenario, one only needs application IDs and a raw flash driver. The raw flash driver provides read slice (or, for supported chips, adding an MPU region so that the application may read the flash without syscalls), write word slice, and erase page. The platform would filter syscalls to this driver for authorized applications only (there is only one anyway, but filtering prevents flashing a malicious application that would read or modify the storage in an unauthorized way).

Thanks a lot for your replies,
Julien

Pat Pannuto

unread,
May 22, 2020, 10:28:15 AM5/22/20
to Julien Cretin, Johnathan Van Why, Leon Schuermann, Tock Embedded OS Development Discussion
Hi Julien,

Paraphrasing, "is Tock interested in a raw flash driver", I think the answer is yes, but with caveats.

Direct flash access will necessarily be a layer in the kernel, on which KV or filesystem APIs are built. Those APIs will be exposed through a stable syscall interface.

I think we would view raw flash access more like raw GPIO, SPI, or I2C -- as drivers that are useful to provide, but are also board-specific and not part of the stable, cross-platform API -- that is, a "Hardware Access" driver.

-Pat

Julien Cretin

unread,
May 22, 2020, 10:43:01 AM5/22/20
to Pat Pannuto, Johnathan Van Why, Leon Schuermann, Tock Embedded OS Development Discussion
Hi Pat,

Yes, a "hardware access" driver would fit the use-case very well. Actually there is https://github.com/tock/tock/pull/1467 to add such a driver for the nRF52 family of boards. But the problem with such drivers is that they should not be aware of applications as far as I understand the first reply to the PR. The driver may need to access the application information for 3 reasons:
1. Check whether the application is authorized. This can actually be fixed with the `filter_syscall` mechanism.
2. Have application-specific state (like an allowed slice to write to flash). This could be fixed by having only one state for all applications essentially limiting the usage to a single application.
3. Add an MPU region to permit direct read access to the flash for the application. This could be fixed by having a separate hardware independent syscall (or memop) to provide direct read access to a flash location. This call should obviously be authorized.

The most problematic points are (2) and (3) since (1) is essentially fixed once we have application IDs. If there are already solutions, I'm interested in using them.

Thanks,
Julien

Johnathan Van Why

unread,
Jun 5, 2020, 6:31:22 PM6/5/20
to Julien Cretin, Pat Pannuto, Leon Schuermann, Tock Embedded OS Development Discussion
Based on today's core working group call, here is my proposal for application IDs.

Summary:

Application IDs and the information required to verify them are encoded in a userspace binary's TBF headers. The format of the data in the header is board-specific. Boards that support application IDs are responsible for parsing and validating the application ID header themselves, the core kernel is only responsible for storing the computed ID. Each process may have at most 1 application ID, although each binary may have multiple TBF headers, each of which is supported by a single board. Application IDs are optional.

In-RAM data type:

The kernel will store the application ID in the Process struct as an Option<&'static [u8]>. Generally, the slice reference will point into the TBF header itself, although that is up to the board.
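For illustration, comparing two processes under this representation might look like the following sketch (hypothetical names; the IDs here would in practice point into the TBF headers in flash):

```rust
// Minimal illustration of the proposed in-RAM representation.
struct Process {
    app_id: Option<&'static [u8]>,
}

// Two processes belong to the same application iff both have IDs and the
// ID bytes compare equal. Comparison cost is bounded by the ID length.
fn same_application(a: &Process, b: &Process) -> bool {
    match (a.app_id, b.app_id) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}

// Stand-ins for ID bytes that would live in the TBF headers.
static ID_A: [u8; 4] = [1, 2, 3, 4];
static ID_B: [u8; 4] = [5, 6, 7, 8];
```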

TLV Element:

We introduce a new TLV element type to store application ID information. The TLV is variable-sized, and its contents are board-specific. It is expected that the TLV's data starts with the application's intended ID, and that additional data used to verify the ID (such as a signature, if the ID is a public key) follows the application ID.

Application ID TLV Parsing:

As the core kernel is parsing each userspace binary's TBF headers, when it encounters the application ID TLV element, it will call into a board-provided function to interpret the TLV element. To keep this explanation clear, let's call it decode_app_id(). decode_app_id() can return 2 possible values:
1. An application ID (&'static [u8]). When the kernel receives this it will store that ID and skip processing any additional application ID TLV elements it finds.
2. Invalid ID. This indicates the application ID TLV element is not recognized by the board file. The kernel will ignore that TLV element.
If the board does not recognize any application ID TLV elements (either because there are none or because they are all invalid), the application ID for the process will be set to None.
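A simplified sketch of this parsing flow follows. The TLV layout, the tag value, and the validation inside decode_app_id are all placeholder assumptions (a real board would do actual verification, e.g. a signature check):

```rust
// Hypothetical TLV element as parsed from a TBF header.
struct TlvElement<'a> {
    tag: u16,
    data: &'a [u8],
}

// Hypothetical tag value for the application ID TLV element.
const APP_ID_TAG: u16 = 0x0040;

/// Board-provided: accept the TLV if its data is non-empty (a stand-in
/// for real validation such as signature checking).
fn decode_app_id<'a>(tlv: &TlvElement<'a>) -> Option<&'a [u8]> {
    if tlv.data.is_empty() { None } else { Some(tlv.data) }
}

/// Kernel side: the first app ID TLV the board accepts wins; if none is
/// recognized, the process's application ID is None.
fn compute_app_id<'a>(tlvs: &[TlvElement<'a>]) -> Option<&'a [u8]> {
    tlvs.iter()
        .filter(|t| t.tag == APP_ID_TAG)
        .find_map(decode_app_id)
}
```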

Properties of this design:
  1. Does not specify what types of application ID are supported. This is a result of today's discussion. Boards can support literal IDs (IDs that are not verified), hash-based IDs (which cannot support app updates), and/or PKI-based IDs (which can support app updates but require a PKI). Caveat: there is potential confusion if an application ID header *appears* valid for one board even when it is meant for a different board.
    Question: Should we try to distinguish boards in the application ID header, such as by allocating each board an ID and storing that in the header?
  2. Leaves the complexity of dealing with potentially-complex application ID types to the board.
  3. Does not prevent applications with invalid IDs from loading. That can be done as part of a "secure boot" API that allows a board to decide whether an application ID is acceptable.
  4. Application IDs can vary in size.

Leon Schuermann

unread,
Jun 15, 2020, 4:25:50 AM6/15/20
to Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Hello,

Johnathan Van Why <jrva...@google.com> writes:
> Based on today's core working group call, here is my proposal for
> application IDs.

I think this is a good summary and emphasises the distinction between an
in-RAM identifier useful for allocating resources to apps, and the
(cryptographic) integrity and safety verifications - for whichever
purposes they may be used.

I do however see three issues:

> Application IDs are optional.

Is that really a good idea? I strongly believe once implemented
application IDs will be an integral part of the kernel and resource
assignment to apps. Think a partitioned storage: having _optional_
application IDs will require special handling of these cases in the
kernel. Furthermore, an app will have such drivers fail for a (from an
application code point of view) seemingly arbitrary reason - namely the
application ID not set in the TBF header.

Given the low overhead of such an identifier, as well as the ubiquitous
use cases, I'm in favor of making an application ID mandatory.

> The format of the data in the header is board-specific.

I believe this is a good idea, but in order to make the IDs mandatory
and to be able to explore the different concepts in the upstream Tock
boards, a default format should be provided. That default does not have
to serve any security or verification purposes, just sufficient for
allocating resources to individual apps. If a specific use case exists,
one can always swap the application ID out later.

> *In-RAM data type:*
>
> The kernel will store the application ID in the Process struct as a
> Option<&'static [u8]>. Generally, the slice reference will point into the
> TBF header itself, although that is up to the board.

I'm afraid of using a slice here, as that introduces potentially
unbounded complexity when comparing application IDs. More important,
persistently allocating resources across reboots would require the
kernel to store the once assigned identifier to the resource either in
NVM or with the resource itself. Given that your proposed identifier has
no size limitations, this could prove impossible or difficult at best.


From what I understand you'll want the application ID to be verifiable
itself, like a hash or public key. For the reasons outlined above I
would avoid that.

I am in favor of a clear distinction between

- application ID for resource allocation

This must be of fixed size and must not have a large overhead when
comparing frequently. Probably a u32 suits here (usize for most/all
platforms Tock currently supports).

- application ID itself providing information for application security /
integrity verification

This can be arbitrarily defined and does not make sense to standardize
between boards. It serves different purposes depending on the
individual boards' security requirements.

In an effort to avoid confusion, I'll call them verifiable IDs.


Essentially this would mean that

- each application must have a fixed ID in the process struct

This ID is bounded in size and must have a minimal storage and
comparison overhead.

A standard way of encoding this ID in the TBF/TLV must be defined and
implemented at least across the upstream board definitions in Tock.

A custom board may choose to derive this ID differently.

- a developer/board may choose to assign an app a verifiable ID

This can then be (cryptographically) verified along with other
information. The (literal) application ID _can_ be included in this
verification process to avoid manipulation of that.


I think this distinction is important to solve the issue of persistent
resource allocation - which is important for all boards, given that we
probably will provide capsules which rely on that - while not preventing
any developer from rolling out custom verifiable IDs or other mechanisms
to ensure security and integrity.


Best regards

- Leon Schuermann


Johnathan Van Why

unread,
Oct 26, 2020, 8:10:13 PM10/26/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
In case it isn't clear, I was specifying a mechanism for identifying and authenticating applications (giving them an ID and cryptographically verifying that ID, if necessary), not an authorization mechanism. Individual use cases should have their own authorization mechanism (e.g. ACLs) that grant access to applications by ID.

On Mon, Jun 15, 2020 at 1:25 AM Leon Schuermann <le...@is.currently.online> wrote:
Hello,

Johnathan Van Why <jrva...@google.com> writes:
> Based on today's core working group call, here is my proposal for
> application IDs.

I think this is a good summary and emphasises the distinction between an
in-RAM identifier useful for allocating resources to apps, and the
(cryptographic) integrity and safety verifications - for whichever
purposes they may be used.

I do however see three issues:

> Application IDs are optional.

Is that really a good idea? I strongly believe once implemented
application IDs will be an integral part of the kernel and resource
assignment to apps. Think a partitioned storage: having _optional_
application IDs will require special handling of these cases in the
kernel. Furthermore, an app will have such drivers fail for a (from an
application code point of view) seemingly arbitrary reason - namely the
application ID not set in the TBF header.

Given the low overhead of such an identifier, as well as the ubiquitous
use cases, I'm in favor of making an application ID mandatory.

If we make them mandatory, then we need to do one of the following:
  1. Make them a Tock 2.0 change, as adding more mandatory data to the TBF header is a backwards-incompatible change.
  2. Find a way to conjure up an app ID for processes that lack an app ID TLV element.
I can think of a few ways to do #2. Here, K is the size of the manufactured application ID, M is the maximum number of processes supported by the kernel, and N is the number of processes loaded by the kernel. Most of these require adding a reference to the Process struct; I'm ignoring that cost to avoid repetition:
  1. Add application ID storage to the Process struct (rather than storing references to the application ID as in my proposal). Copy all application IDs to the process struct. This would allow us to use a process binary's location in non-volatile storage as an application ID. RAM cost: O(K*M)
  2. Allocate storage for application IDs in the grant region, copy all application IDs there. Like #1, this would let us use a process' binary location as an application ID. RAM cost: O(K*N).
  3. Compile M distinct application IDs into the kernel image. When an application that is missing an ID is loaded, point the ID reference at one of those IDs. Cost: O(K*M) in non-volatile storage.
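Option 3 could be sketched as follows; the ID contents and the allocator shape are purely illustrative:

```rust
// Sketch of compiling M fixed application IDs into the kernel image and
// handing them out, in load order, to processes lacking a recognized ID.
const M: usize = 4;
static FIXED_IDS: [[u8; 8]; M] = [
    *b"APPID-00", *b"APPID-01", *b"APPID-02", *b"APPID-03",
];

struct IdAllocator {
    next: usize,
}

impl IdAllocator {
    /// Hand out the next compiled-in ID, or None once all M are used.
    fn invent_app_id(&mut self) -> Option<&'static [u8]> {
        let id = FIXED_IDS.get(self.next)?;
        self.next += 1;
        Some(id)
    }
}
```

As noted below, the drawback is that adding or removing a process binary shifts which fixed ID each subsequent binary receives.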

> The format of the data in the header is board-specific.

I believe this is a good idea, but in order to make the IDs mandatory
and to be able to explore the different concepts in the upstream Tock
boards, a default format should be provided. That default does not have
to serve any security or verification purposes, just sufficient for
allocating resources to individual apps. If a specific use case exists,
one can always swap the application ID out later.

> *In-RAM data type:*
>
> The kernel will store the application ID in the Process struct as a
> Option<&'static [u8]>. Generally, the slice reference will point into the
> TBF header itself, although that is up to the board.

I'm afraid of using a slice here, as that introduces potentially
unbounded complexity when comparing application IDs. More important,
persistently allocating resources across reboots would require the
kernel to store the once assigned identifier to the resource either in
NVM or with the resource itself. Given that your proposed identifier has
no size limitations, this could prove impossible or difficult at best.

The board can limit the maximum length of the slice, so it controls the worst-case complexity of comparisons.

We can do fixed-length IDs, but they're going to be somewhat expensive. Public keys for signature algorithms can be multiple kB, which is probably an unreasonable size penalty for many applications. We could rely on hashing to reduce key sizes at the expense of startup speed, but that's still going to result in an application ID of at least 32 bytes. We could alleviate the startup speed impact by verifying application IDs lazily at the expense of complexity, although that causes the same "strange errors at runtime" issue that having optional IDs causes.

Johnathan Van Why

unread,
Oct 27, 2020, 8:18:46 PM10/27/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Use Case Analysis

First, I want to perform some more analysis of the uses cases for application IDs that I have identified so far:
  1. Storage: Processes may want to store data into non-volatile storage that can be accessed by future processes that are part of the same application.
  2. System call filtering: A board's kernel may wish to restrict certain system calls to specific applications.
  3. Secure boot: I'll use "secure boot" to refer to a mechanism that refuses to launch a process binary if that process binary is not signed by a trusted public key. This is to distinguish it from "verified boot", which performs cryptographic measurements of what is being booted but still boots unverified process binaries.
  4. Key derivation: This use case refers to cryptographic key derivation processes that derive per-application keys without relying on per-application non-volatile storage. Note that a crypto subsystem that stores per-application keys in non-volatile storage would be considered a "storage" use case.
  5. IPC: An inter-process communication system should allow processes to securely send messages to a specific application, securely receive messages from specific applications, and securely identify who sent them a message.
I want to dig into the similarities and differences between these use cases. Here are a few factors that vary between the use cases that are relevant to application ID design:
  1. ID size sensitivity: Does this use case benefit from having application IDs that are a fixed size across all Tock kernels? How much does it benefit from having a small ID?
  2. Does trust span across boots: Security-wise, is every chip boot a clean slate, or must the system trust security-relevant data from past system boots? A "clean slate" boot is better for security, but not always possible.
  3. Requires cryptographic verification: Does this use case inherently require cryptographically verified application IDs, or is cryptographic verification an optional security improvement?
Here is my analysis of the use cases with respect to the three factors:

Storage
ID size sensitivity: Storage benefits from having a fixed-size ID, as a variable-size ID would require many of the storage layer's data structure elements to have variable size when it would otherwise be unnecessary. A storage layer with ACLs would benefit greatly from small IDs as application IDs will probably be stored repeatedly in the ACLs.
Does trust span across boots: Yes. The storage layer must trust data it wrote into non-volatile storage on a previous boot.
Requires cryptographic verification: No, cryptographic verification is an optional security improvement.

System call filtering
ID size sensitivity: I would expect system call filtering to benefit greatly from small IDs, as ACLs would contain repeated application IDs.
Does trust span across boots: No. The ACLs can be part of the kernel image or application image and can be cryptographically verified during the boot sequence.
Requires cryptographic verification: No, cryptographic verification is an optional security improvement.

Secure boot
ID size sensitivity: Secure boot is not size sensitive. Smaller IDs would improve the size of the kernel, but as the IDs must be cryptographic signatures of some sort I can't think of a way to do so that doesn't increase the size of the TBF headers by at least the same amount. Computing smaller IDs is slower than comparing large IDs.
Does trust span across boots: No, the cryptographic verification would occur at each boot.
Requires cryptographic verification: Yes. "Secure boot" without cryptographic verification would have identical security to just erasing unwanted apps.

Key derivation
ID size sensitivity: Smaller IDs would improve key derivation performance, but large IDs are unlikely to be a deal-breaker.
Does trust span across boots: No, if the key verification is performed by dedicated cryptographic hardware (e.g. H1 does this).
Requires cryptographic verification: No. Security will be as strong as the app loading mechanism if unverified IDs are used.

IPC
ID size sensitivity: Moderate. Smaller IDs would reduce the size of process binaries that contain application IDs and the RAM usage of processes that manipulate application IDs at runtime.
Does trust span across boots: No, IPC is only between concurrently-running processes.
Requires cryptographic verification: No, it is an optional security improvement.

Proposal v2 Differences

Based on Leon's concerns and the above analysis, I made a second proposal (below). Here are the differences between the second proposal and my first proposal from June 5:
  1. Application IDs are a fixed size rather than variable-size, to make them easier to store in a filesystem.
  2. Application IDs are no longer optional. I added a mechanism to allow boards to invent application IDs for processes that don't have a recognized ID.
  3. I expanded the proposal to discuss ACL storage to alleviate concerns about too-large application IDs for some use cases.

Proposal v2

Summary
Application IDs are arbitrary 48-byte sequences. Application IDs and the information required to verify them are encoded in a userspace binary's TBF headers. The format of the data in the header is board-specific. Boards are responsible for parsing and validating the application ID header themselves; the core kernel is responsible for storing the computed ID. Although each process binary may have multiple IDs in its TBF headers, each process has 1 application ID. If a process binary does not have a recognized ID in the TBF headers, a board must "invent" an ID for that process to load.

In-RAM data type
The kernel will store the application ID in the Process struct as an Option<&'static [u8; 48]>. Generally, the slice reference will point into the TBF header itself, although that is up to the board.

TLV Element
We introduce a new TLV element type to store application ID information. The TLV is variable-sized, and its contents are board-specific. In most cases, the TLV will contain the application ID, and in many cases it will also contain ID verification information.

TLV Element Parsing
The board must provide two functions to the core kernel:
  1. decode_app_id, which validates an application ID TLV entry. If the ID is valid it returns the ID, otherwise it returns None.
  2. invent_app_id, which either invents and returns an application ID for the process or returns None.
When the core kernel tries to load a process, it will use the first application ID that decode_app_id returns (evaluated in the same order as the TLV entries in the TBF header). If it does not find an ID TLV entry, or decode_app_id always returns None, it will call invent_app_id to invent an ID for the process. If invent_app_id returns None the kernel will skip loading that process.
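This loading rule fits in a few lines. The signatures here are hypothetical (and, as the design notes below say, real versions would likely return Result with error enums rather than Option):

```rust
// Application IDs in proposal v2 are fixed 48-byte sequences.
type AppId = [u8; 48];

/// First TLV entry the board's decode_app_id accepts wins; otherwise the
/// board may invent an ID; a final None means the kernel skips loading
/// this process.
fn load_app_id(
    tlv_entries: &[&[u8]],
    decode_app_id: impl Fn(&[u8]) -> Option<AppId>,
    invent_app_id: impl Fn() -> Option<AppId>,
) -> Option<AppId> {
    tlv_entries
        .iter()
        .find_map(|e| decode_app_id(e))
        .or_else(invent_app_id)
}
```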

Syscall Filter ACL Compression
The system call filter ACLs can contain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The ACLs can then refer to applications by their short ID rather than their application ID, in order to compress the list size. At runtime, the system call filter will need to search the map to find an application's short ID, then perform authorization checks using the short ID. For performance, the system call filter can cache the short ID in the process' grant region.

Storage ACL Compression
The storage system can maintain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The storage system can then use the short IDs internally, performing lookups as necessary to associate the short IDs with processes at runtime. It can use the grant region to cache the short ID lookup results for a process.
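Such a short-ID map could be sketched as below. This uses std's HashMap for brevity; a kernel implementation would use a fixed-capacity table, and the storage variant would persist the map in non-volatile storage:

```rust
use std::collections::HashMap;

// 48-byte (384-bit) application IDs, as in proposal v2.
type AppId = [u8; 48];

/// Map from full application IDs to small integers, allocating a new
/// short ID the first time an application ID is seen.
struct ShortIdMap {
    map: HashMap<AppId, u32>,
    next: u32,
}

impl ShortIdMap {
    fn new() -> Self {
        ShortIdMap { map: HashMap::new(), next: 0 }
    }

    /// Return the existing short ID for `id`, or allocate the next one.
    fn short_id(&mut self, id: &AppId) -> u32 {
        if let Some(&s) = self.map.get(id) {
            return s;
        }
        let s = self.next;
        self.next += 1;
        self.map.insert(*id, s);
        s
    }
}
```

The lookup result can then be cached in the process's grant region, as described above, so the map is only searched once per process.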

Design notes
  1. A 48 byte key is 384 bits. A 384 bit hash provides 128 bits of collision resistance against an adversary that can run Grover's algorithm.
  2. I made the board responsible for inventing application IDs so that boards that want to verify every application can refuse to invent application IDs. I'm concerned that if we invent IDs outside the board file we'll end up with issues where a board that wants to verify IDs accidentally launches apps with unverified IDs.
  3. A board can invent IDs by compiling a fixed list of app IDs into its image and returning those IDs in order of app (so the first unverified app would use the first fixed ID, the second the second, etc.). The drawback of this approach is that deploying a new process binary (or disabling an existing process binary) could cause an existing process binary's app ID to change and/or move to a different process binary.
  4. A board with access to an entropy source could allocate a fixed buffer for invented app IDs in RAM and generate the IDs on the fly. This would avoid confusion across reboots by changing every process binary's app ID at every boot.
  5. A board could hash a process binary and its location in flash to invent an ID.
  6. decode_app_id and invent_app_id will probably return Result with error enums, not Option, but I left other error cases out of the proposal to keep it understandable.
  7. The system call filter short IDs and storage short IDs cannot be the same, as the ID allocation is different between them. The system call filter ACLs do not need to dynamically allocate short IDs, as the ACLs themselves are static (presumably compiled into the kernel). On the other hand, the storage system will need to allocate short IDs when new applications are loaded.
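Design note 3's fixed-list strategy might look roughly like this in Rust (all names are invented for the example):

```rust
type AppId = [u8; 48];

struct FixedListInventor {
    /// IDs baked into the kernel image at build time.
    ids: &'static [AppId],
    /// Index of the next ID to hand out.
    next: usize,
}

impl FixedListInventor {
    /// Return the next fixed ID in process order, or None once the list
    /// is exhausted (the kernel would then skip loading the process).
    fn invent_app_id(&mut self) -> Option<AppId> {
        let id = self.ids.get(self.next).copied();
        if id.is_some() {
            self.next += 1;
        }
        id
    }
}
```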

Leon Schuermann

Oct 30, 2020, 9:27:51 AM10/30/20
to Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion

Johnathan,

Thanks for writing this up, sorry for the delayed reply. I actually
wrote many iterations of this email but ultimately, all remaining issues
are either solved by your proposal or I don't have a better solution.

The primary concern about the proposal is naturally the large
application ID size. However, assuming a board's RAM is proportional to
the number of applications that can run on it, I'm not too worried about
allocating 48 bytes of memory per process (which is not even required
when flash is mapped into the address space).

For use cases where the long ID is an issue, the compression to 32-bit
short IDs you introduced provides a viable alternative. As far as I can
see, we now have two constant-sized IDs: a primary 48-byte ID and a
secondary (derived) 4-byte ID.



The following may sound controversial given that I've actively pushed
towards fixed-size IDs in the past (and still think that this is a good
idea!). I'm not trying to circle back or mount an opposition, especially
now that we have a proposal for fixed-size IDs; rather, I'm just
thinking out loud.


Given that the critical use cases you identified that greatly benefit
from or require a fixed-size ID are likely to use the smaller 4-byte
(compressed) ID anyway, I'm actually not too concerned about the length
of the primary ID at all, and even think that we might be able to make
it generic in the code, with the default `load_processes` only
supporting a concrete type (e.g. `[u8; 48]`).

I'm not saying we should make IDs dynamic in length: having the ID size
fixed at compile time enables allocating memory statically, which fits
well into the design of the Tock kernel. If we introduce a standard
short (compressed) ID of fixed length for use in all persistent
allocation use cases anyway (which could be calculated through a trait
method), we might just be able to make the kernel generic over the
concrete primary (long) ID type. Maybe this could improve support for
different cryptographic primitives (i.e. introduce an `ApplicationId:
Sized` trait, which is implemented on e.g. an `Ed25519SigId` type
wrapping an array)? What do you think?
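A rough sketch of that idea follows; `ApplicationId` and `Ed25519SigId` come from the text above, while the trait's methods and the placeholder short-ID derivation are invented for illustration.

```rust
trait ApplicationId: Sized {
    /// Raw ID bytes; each implementer has a fixed size at compile time.
    fn bytes(&self) -> &[u8];
    /// Standard short (compressed) ID for the persistent use cases.
    fn short_id(&self) -> u32;
}

/// Concrete ID type wrapping an Ed25519-signature-sized array.
struct Ed25519SigId([u8; 64]);

impl ApplicationId for Ed25519SigId {
    fn bytes(&self) -> &[u8] {
        &self.0
    }
    fn short_id(&self) -> u32 {
        // Placeholder derivation (first four bytes, little-endian); a
        // real mapping would need to consider collisions.
        u32::from_le_bytes([self.0[0], self.0[1], self.0[2], self.0[3]])
    }
}

/// The kernel can then be generic over the concrete ID type, with no
/// vtable involved.
fn ids_equal<Id: ApplicationId>(a: &Id, b: &Id) -> bool {
    a.bytes() == b.bytes()
}
```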

Essentially (and after re-reading the thread I didn't seem to really
express this explicitly, sorry) I was concerned about fixed-size IDs for
the persistent use cases you mentioned. Deriving such an ID from a
long primary ID (as proposed) combines the best of both worlds. Having a
fixed-size type (at compile time) in the kernel enables compiler
optimisations and reduces dynamic storage requirements. Making it
generic, such that it's still fixed-size at compile time, might be
better suited for your purposes. There are probably also many arguments
against it; I don't have much stake in it.


Looking forward to your feedback. Thanks for the current proposal, which
I'm really happy with!


Leon

Johnathan Van Why

Oct 30, 2020, 11:59:44 AM10/30/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
The only argument against making the application ID length configurable via generics is that it may be difficult to do in code. Otherwise, I have no objection.

Conceptually, the board is responsible for deciding what security properties it wants application IDs to have, and letting it set the size of those IDs is in line with that security model.

Johnathan Van Why

Oct 30, 2020, 6:57:50 PM10/30/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
As promised, here is my third app ID proposal, a modification of my second proposal using input from today's core WG call.

Application ID Proposal v3

Summary
Application IDs are arbitrary K-byte sequences, where K is a compile-time constant defined by the board. Application IDs and the information required to verify them are encoded in a userspace binary's TBF headers. Different types of app IDs are stored in different TLV entry types. Boards must decide what app ID TLV entry types they support, and may be responsible for parsing and verifying application IDs. The core kernel is responsible for storing the computed application ID. Although each process binary may have multiple IDs in its TBF headers, each process has 1 application ID. If a process binary does not have a supported ID type in the TBF headers, a board must "invent" an ID for that process, or the process will fail to load.

In-RAM data type
We will introduce a new type AppId which is a #[repr(C,align(4))] wrapper around [u8; K]. The kernel will store the application ID in the Process struct as a &'static AppId. Generally, the reference will point into the TBF headers, although that is up to the board.
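A minimal sketch of that wrapper type, with K fixed at 48 purely for illustration (the proposal leaves K to the board):

```rust
const K: usize = 48;

/// repr(C) gives a defined layout with no padding (it is just a byte
/// array), so a reference can soundly point into the TBF headers;
/// align(4) speeds up copies and comparisons.
#[repr(C, align(4))]
#[derive(Clone, Copy, PartialEq, Eq)]
struct AppId([u8; K]);
```

Note that a `&'static AppId` pointing into the TBF headers is only sound if the ID bytes sit at a 4-byte-aligned offset, which the 4-byte-aligned TBF headers can provide.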

TLV Elements
We introduce a new TLV element for unverified IDs. The length field of the TLV element is fixed at 8. The data contained inside is the app ID with no verification data. For apps that are released publicly, the ID should be chosen in a manner that minimizes the chance of collision with other apps (e.g. generated randomly or by a truncated cryptographic hash of the app's name).

When a Tock board (either in-tree or out-of-tree) wants to implement cryptographically-verified app IDs, the board authors must introduce a new TLV element type to store their application ID and any verification information that may be necessary.
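A sketch of decoding the unverified-ID element described above; only the "length fixed at 8" rule comes from the proposal, while the tag value is made up and the little-endian u16 type / u16 length framing assumes the existing TBF TLV layout.

```rust
/// Hypothetical TLV entry type number for unverified application IDs.
const TLV_UNVERIFIED_ID: u16 = 0x0040;

/// Decode an 8-byte unverified app ID from a raw TLV entry.
fn decode_unverified_id(raw: &[u8]) -> Option<[u8; 8]> {
    if raw.len() < 12 {
        return None;
    }
    let tag = u16::from_le_bytes([raw[0], raw[1]]);
    let len = u16::from_le_bytes([raw[2], raw[3]]);
    if tag != TLV_UNVERIFIED_ID || len != 8 {
        return None; // not this element type, or malformed length
    }
    let mut id = [0u8; 8];
    id.copy_from_slice(&raw[4..12]);
    Some(id)
}
```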

TLV Element Parsing
The board must tell the core kernel what types of application ID it is willing to accept. Additionally, it must provide the invent_app_id function, which either invents and returns a new application ID or returns None.

When the core kernel tries to load a process, it will search for the first application ID TLV element of a type the board accepts, and verify that TLV element. If verification fails, then the process will fail to load. If the process binary's TBF headers do not contain a form of app ID that the board accepts, then the core kernel will call invent_app_id to create an app ID for the process. If invent_app_id returns None the process will fail to load.
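A sketch of the v3 loading decision, assuming hypothetical types and names; only the behavior (the first accepted entry decides, and a recognized but invalid ID fails the load rather than being skipped) comes from the proposal. Returning None here means "the process fails to load".

```rust
type AppId = [u8; 48];

struct TlvEntry<'a> {
    tag: u16,
    payload: &'a [u8],
}

trait BoardIdPolicy {
    /// Which TLV entry types does the board accept as application IDs?
    fn accepts(&self, tag: u16) -> bool;
    /// Verify an accepted entry; None means verification failed.
    fn verify(&self, entry: &TlvEntry) -> Option<AppId>;
    /// Invent an ID when no accepted entry exists; None refuses.
    fn invent_app_id(&self) -> Option<AppId>;
}

fn assign_app_id(board: &dyn BoardIdPolicy, entries: &[TlvEntry]) -> Option<AppId> {
    match entries.iter().find(|e| board.accepts(e.tag)) {
        // A recognized entry decides the outcome: verification failure
        // fails the load, with no falling through to invent_app_id.
        Some(entry) => board.verify(entry),
        None => board.invent_app_id(),
    }
}

/// Example policy: accept type-2 entries whose payload is exactly 48
/// bytes (a stand-in for real cryptographic verification), and never
/// invent IDs.
struct StrictPolicy;

impl BoardIdPolicy for StrictPolicy {
    fn accepts(&self, tag: u16) -> bool {
        tag == 2
    }
    fn verify(&self, entry: &TlvEntry) -> Option<AppId> {
        if entry.payload.len() != 48 {
            return None;
        }
        let mut id = [0u8; 48];
        id.copy_from_slice(entry.payload);
        Some(id)
    }
    fn invent_app_id(&self) -> Option<AppId> {
        None
    }
}
```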

Syscall Filter ACL Compression
The system call filter ACLs can contain a map from K-byte application IDs to short IDs (e.g. 32-bit integers). The ACLs can then refer to applications by their short ID rather than their application ID, in order to compress the list size. At runtime, the system call filter will need to search the map to find an application's short ID, then perform authorization checks using the short ID. For performance, the system call filter can cache the short ID in the process' grant region.

Storage ACL Compression
The storage system can maintain a map from K-byte application IDs to short IDs (e.g. 32-bit integers). The storage system can then use the short IDs internally, performing lookups as necessary to associate the short IDs with processes at runtime. It can use the grant region to cache the short ID lookup results for a process.

Differences from v2

  1. I allowed the board to set the size of the application IDs, rather than forcing them to all be 48 bytes.
  2. I removed the Option<> wrapper around the app ID in the Process struct. Leaving the Option in was a mistake in my v2 proposal.
  3. I raised the alignment of application IDs to 4 bytes by wrapping the ID in a type that is repr(align(4)). This is to improve the performance of copy and comparison operations on app IDs.
  4. I re-did the TLV entry section. I now specify that different types of app ID should have different TLV entry types. I went ahead and specified a TLV entry type for unverified IDs to satisfy most use cases, while intentionally leaving cryptographically-verified ID TLV entries unspecified for now. This change affected the process loading behavior: in the v2 proposal, the kernel would ignore a TLV entry with a correct format but a cryptographically invalid ID in it; now a recognized format with a cryptographically invalid ID will cause the process to fail to load.

Design notes
  1. It was pointed out in the core WG call that making app IDs word-aligned would make some operations more efficient (e.g. copying them into crypto hardware and performing comparisons). The alignment for application IDs cannot be larger than 4 bytes, as they'll frequently be stored in TBF headers which have a 4-byte alignment. Thus 4 bytes seems ideal.
  2. It is important to make boards explicitly state what app ID types they accept, as accepting some ID types by default could result in a board that wants cryptographically-verified IDs accepting unverified IDs by accident. That would be a security vulnerability.
  3. App IDs do not need to have a size that is a multiple of 4 bytes, although I expect sizes that are not a multiple of 4 bytes to be rare.
  4. The TBF header design already splits the namespace between TLV entry types that are defined by Tock (MSB is 0) and TLV entry types defined out-of-tree (MSB is 1). Therefore out-of-tree boards can introduce their own app ID types without coordinating upstream.
  5. AppId cannot be repr(Rust), as a repr(Rust) struct may have padding with validity constraints on that padding, and the AppId will frequently reside in the TBF headers. It cannot be repr(transparent) because repr(transparent) structs copy their alignment from the contained field, and we want to manually increase the alignment. repr(C) is the only remaining representation that can be soundly stored in a TBF header and has the alignment we want.
  6. For the unverified ID TLV element, 4 bytes didn't seem like enough to give global uniqueness, while 8 bytes does. I do not anticipate there being a billion publicly-released Tock apps, and if there are then we can add a new type of app ID that is larger. This number is very bikesheddable.
  7. invent_app_id would probably return Result or another enum rather than Option, but the other failure cases aren't important for this proposal so I used Option for simplicity.
  8. The core kernel is allowed to make the board verify some app ID types. This may even be the common case as it seems beneficial to make use of crypto hardware the board has when performing app ID verification.
  9. The name AppId is very bikesheddable.

Leon Schuermann

Oct 31, 2020, 8:08:50 AM10/31/20
to Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion

This is great.

I see you opted for an implementation using const generics over the
length of the byte-slice, which I didn't think about. It looks like a
good option though.


The proof of concept I promised to publish uses associated constants in
an `AppId` trait, and hence would only provide access to a slice of
dynamic length (&[u8]). Every (efficient) implementation would still use
slices of fixed length underneath though, and by not using the trait
implementers dynamically (not going through a vtable) the compiler might
be able to reason about the slice length anyways. I'd prefer not to rely
on this though and hence like const generics better.


The reason for using a trait over const generics was that in my
particular use case, I would generate the application ID at runtime and
would not have a location in flash to point the reference to (I'd need
to either at compile or runtime allocate 'static memory in the kernel to
write the ID to, then let the reference point to that memory).

I think both concepts can be combined by making a trait, which is
generic over a constant. The result should be similarly efficient as
using the slice-reference in the kernel directly, given that the
compiler should be able to inline all of the trait methods.

This probably wouldn't come without drawbacks. An _instance_ of `AppId`
would still be able to require or guarantee alignment of the underlying
data for efficiency reasons, the _trait_ could however not enforce
this. Also, accessing the app id bytes could (by being a method) have
unbounded complexity. I'm not sure how relevant these issues are, given
that a board decides over its AppId implementation in the end.


I've published the results as a Gist, along with the associated
constants-based implementation (which could be a viable implementation
strategy if `const_generics` were too unstable; one could switch later
for efficiency reasons). Let me know what you think:

https://gist.github.com/lschuermann/4a00ae0305a51a4a613b039e71f4e09f

Const generics playground:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=6d7960c6a50994dbc8dbb84ebd32c8fc

Associated constants playground:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=b2a5a3efe67d12c5ef34141adc3d803a

The slice reference in the kernel is great though. If there are reasons
why we should refrain from using a trait, I'm happy with the current
proposal.


Johnathan Van Why <jrva...@google.com> writes:
> 9. The name AppId is very bikesheddable.

Let the bikeshedding begin! :) More seriously, the collision with the
current `AppId` type is confusing. I'll propose renaming the current
`AppId` to a more fitting `ProcessInstanceId` (or similar) in a PR,
which better describes what it currently identifies.


Thanks!

Johnathan Van Why

Oct 31, 2020, 3:47:07 PM10/31/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
I don't know enough about the kernel's implementation to offer sound feedback on which approach to go with. I tried to word the proposal in a way that makes it clear that any of those options are fine. I didn't mean to imply a particular choice of implementation when I used a capital K for key length; I just used a capital K to be consistent with my complexity analysis a few posts back.

I do not think it is an issue for the kernel's performance to depend on choices the board file makes, as long as the choices are obvious. I think anyone implementing a board will recognize that performing operations on large app IDs will take longer than operations on small app IDs, and that if they implement a trait method the kernel calls in a slow manner then the kernel will be slow. I don't see why it would be difficult to efficiently implement any of the trait methods you offered.

I don't understand why you have `compressed` as part of `AppId` in your gists. Unfortunately, I don't think we can centrally allocate compressed IDs; instead, the subsystems that need them (e.g. syscall filtering and storage) will need to create them themselves. Is `compressed` something you wrote into your mockup before I sent my v3 proposal?

Leon Schuermann

Oct 31, 2020, 5:57:01 PM10/31/20
to Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion

Johnathan Van Why <jrva...@google.com> writes:
> I tried to word the proposal in a way that makes it clear that any of
> those options are fine. I didn't mean to imply a particular choice of
> implementation when I used a capital K for key length, I just used a
> capital K to be consistent with me complexity analysis a few posts
> back

That makes sense. Maybe I was misled by the 'in-RAM data type' section
of the proposal into thinking it would describe a precise implementation
strategy already. Handling the ID as an opaque fixed length byte
sequence makes sense however, and it is probably best to keep such
implementation details as discussed in my last email out of the
design. Thanks for the feedback.


> I don't understand why you have compressed as part of AppId in your
> gists. Unfortunately, I don't think we can centrally allocate
> compressed IDs, and instead the subsystems that need them
> (e.g. syscall filtering and storage) will need to create them
> themselves. Is compressed something you wrote into your mockup before
> I sent my v3 proposal?

The following is under the assumption that there is a function-based
mapping from application IDs to short IDs, regardless of whether it's
the capsule's responsibility or determined by the ID type. Rereading
your proposal, it might also be possible that you intend the subsystems
to have a static (persistent) mapping from application IDs to short
IDs. In that case, these comments don't apply.


The basic idea was indeed to centrally compute the different
"compressed" IDs (hence the name, as they provide an alternative to the
primary IDs). I see how this can become an issue.

The reason to do this centrally is that each application ID type should
ideally be able to define its own mapping from the application ID to the
short IDs used in the different subsystems. For example, a (complex)
hash function might not be viable on every board; on the other hand,
simple hash functions might not provide sufficient diffusion to
accommodate application IDs which have a small Hamming
distance. Essentially, the storage subsystem might want to choose the
short ID differently (using a different algorithm) depending on the
length and type of the application ID (random number, human-readable
identifier, domain name, etc.).

Using a trait-based type system, it might work to couple the
mapping-function from application IDs to short IDs to both the target
subsystem and the ID type, by introducing traits such as

    trait StorageId: AppId {
        fn storage_id(&self) -> [u8; 8];
    }

which can then be used to define the mapping for each individual
subsystem, on an ID type granularity.

Would this - under the assumption that you are talking about a
"dynamic" (non-persistent, `f(app_id) -> short_id` signature) app ID to
short ID mapping - solve the issue? Do I understand your proposal and
the issue correctly?

Johnathan Van Why

Oct 31, 2020, 9:37:47 PM10/31/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Sat, Oct 31, 2020 at 2:57 PM Leon Schuermann <le...@is.currently.online> wrote:

Johnathan Van Why <jrva...@google.com> writes:
> I tried to word the proposal in a way that makes it clear that any of
> those options are fine. I didn't mean to imply a particular choice of
> implementation when I used a capital K for key length, I just used a
> capital K to be consistent with me complexity analysis a few posts
> back

That makes sense. Maybe I was misled by the 'in-RAM data type' section
of the proposal into thinking it would describe a precise implementation
strategy already. Handling the ID as an opaque fixed length byte
sequence makes sense however, and it is probably best to keep such
implementation details as discussed in my last email out of the
design. Thanks for the feedback.

Oh, I used "in-RAM data type" to refer to the type that appears in the kernel's code, to distinguish it from the layout of the TLV entry.
Were you thinking the storage IDs would be computed by taking a 32-bit hash of the full app ID? That doesn't work from a security perspective: an adversary could produce their own app ID that hashes to the same short ID as a target app in order to impersonate that target app to subsystems using short IDs.

We need a different approach for producing 32-bit short IDs without the possibility of collisions. I could not think of a way to do so that works for both system call filtering and a storage subsystem. In the case of system call filtering, where the ACLs are compiled into the kernel, the build step that generates the ACLs would generate the short IDs. In the case of a storage subsystem, the storage subsystem would have to generate short IDs on the fly when newly-added apps make calls into the storage subsystem.

I suspect we're miscommunicating here, but I don't see the miscommunication yet.
 
Using a trait-based type system, it might work to couple the
mapping-function from application IDs to short IDs to both the target
subsystem and the ID type, by introducing traits such as

    trait StorageId: AppId {
        fn storage_id(&self) -> [u8; 8];
    }

which can then be used to define the mapping for each individual
subsystem, on an ID type granularity.

Would - under the assumption that you are talking about a "dynamic"
(non-persistent, `f(app_id) -> short_id` signature) app id to short ID
mapping - this solve the issue? Do I understand your proposal and the
issue correctly?

I don't think your understanding here is correct, because I don't see short IDs as being produced as a function of the full app ID. I see them as being produced by the subsystem that needs them and associated with the full app ID. E.g. I expect short IDs to generally be counters.
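The counter-based association described above could be sketched as a fixed-capacity (heap-free) table; all names and the table representation are invented for the example.

```rust
type AppId = [u8; 48];
type ShortId = u32;

/// Per-subsystem table associating full application IDs with short IDs.
struct ShortIdTable<const N: usize> {
    entries: [Option<(AppId, ShortId)>; N],
    next: ShortId,
}

impl<const N: usize> ShortIdTable<N> {
    fn new() -> Self {
        Self { entries: [None; N], next: 0 }
    }

    /// Return the app's existing short ID, or allocate the next counter
    /// value on first use. None means the table is full.
    fn get_or_alloc(&mut self, id: &AppId) -> Option<ShortId> {
        if let Some(&(_, short)) = self.entries.iter().flatten().find(|(a, _)| a == id) {
            return Some(short);
        }
        let slot = self.entries.iter_mut().find(|e| e.is_none())?;
        let short = self.next;
        self.next += 1;
        *slot = Some((*id, short));
        Some(short)
    }
}
```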

Alistair Francis

Nov 3, 2020, 11:18:28 AM11/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Oct 27, 2020 at 5:18 PM 'Johnathan Van Why' via Tock Embedded
OS Development Discussion <tock...@googlegroups.com> wrote:
>
> Use Case Analysis
>
> First, I want to perform some more analysis of the uses cases for application IDs that I have identified so far:
>
> Storage: Processes may want to store data into non-volatile storage that can be accessed by future processes that are part of the same application.

As Phil has pointed out we actually have some more use cases here. It
is possible that you want a second app to access the data. Phil gave
the example of migrating the data from app A to app B.

It's also possible that we would want to split or merge apps. So what
used to be two separate apps with their own storage will be merged
into 1 single app.

> System call filtering: A board's kernel may wish to restrict certain system calls to specific applications.

This I think is a key use case.

> Secure boot: I'll use "secure boot" to refer to a mechanism that refuses to launch a process binary if that process binary is not signed by a trusted public key. This is to distinguish it from "verified boot", which performs cryptographic measurements of what is being booted but still boots unverified process binaries.
> Key derivation: This use case refers to cryptographic key derivation processes that derive per-application keys without relying on per-application non-volatile storage. Note that a crypto subsystem that stores per-application keys in non-volatile storage would be considered a "storage" use case.
> IPC: An inter-process communication system should allow processes to securely send messages to a specific application, securely receive messages from specific applications, and securely identify who sent them a message.
>
> I want to dig into the similarities and differences between these use cases. Here are a few factors that vary between the use cases that are relevant to application ID design:
>
> ID size sensitivity: Does this use case benefit from having application IDs that are a fixed size across all Tock kernels? How much does it benefit from having a small ID?
> Does trust span across boots: Security-wise, is every chip boot a clean slate, or must the system trust security-relevant data from past system boots? A "clean slate" boot is better for security, but not always possible.
> Requires cryptographic verification: Does this use case inherently require cryptographically verified application IDs, or is cryptographic verification an optional security improvement?
>
> Here is my analysis of the use cases with respect to the three factors:
>
> Storage
>
> ID size sensitivity: Storage benefits from having a fixed-size ID, as a variable-size ID would require many of the storage layer's data structure elements to have variable size when it would otherwise be unnecessary. A storage layer with ACLs would benefit greatly from small IDs as application IDs will probably be stored repeatedly in the ACLs.
> Does trust span across boots: Yes. The storage layer must trust data it wrote into non-volatile storage on a previous boot.
> Requires cryptographic verification: No, cryptographic verification is an optional security improvement.

ACLs I think are a good idea for the storage problem. I don't see how
an appID fixes that though. Unless we end up also introducing some
manifest file that can be parsed by the kernel to determine ACLs for
each app.

>
> System call filtering
>
> ID size sensitivity: I would expect system call filtering to benefit greatly from small IDs, as ACLs would contain repeated application IDs.

The same problem as above. An appID doesn't fix the syscall filtering
problem. How do we then specify the filters for each app?

> Does trust span across boots: No. The ACLs can be part of the kernel image or application image and can be cryptographically verified during the boot sequence.
> Requires cryptographic verification: No, cryptographic verification is an optional security improvement.
>
> Secure boot
>
> ID size sensitivity: Secure boot is not size sensitive. Smaller IDs would improve the size of the kernel, but as the IDs must be cryptographic signatures of some sort I can't think of a way to do so that doesn't increase the size of the TBF headers by at least the same amount. Computing smaller IDs is slower than comparing large IDs.
> Does trust span across boots: No, the cryptographic verification would occur at each boot.
> Requires cryptographic verification: Yes. "Secure boot" without cryptographic verification would have identical security to just erasing unwanted apps.

How does an appID in an untrusted header give us secure boot?

>
> Key derivation
>
> ID size sensitivity: Smaller IDs would improve key derivation performance, but are unlikely to be a deal-breaker.
> Does trust span across boots: No, if the key verification is performed by dedicated cryptographic hardware (e.g. H1 does this).
> Requires cryptographic verification: No. Security will be as strong as the app loading mechanism if unverified IDs are used.
>
> IPC
>
> ID size sensitivity: Moderate. Smaller IDs would reduce the size of process binaries that contain application IDs and the RAM usage of processes that manipulate application IDs at runtime.
>
> Does trust span across boots: No, IPC is only between concurrently-running processes.
> Requires cryptographic verification: No, it is an optional security improvement.
>
>
> Proposal v2 Differences
>
> Based on Leon's concerns and the above analysis, I made a second proposal (below). Here are the differences between the second proposal and my first proposal from June 5:
>
> Application IDs are a fixed size rather than variable-size, to make them easier to store in a filesystem.
> Application IDs are no longer optional. I added a mechanism to allow boards to invent application IDs for processes that don't have a recognized ID.
> I expanded the proposal to discuss ACL storage to alleviate concerns about too-large application IDs for some use cases.
>
>
> Proposal v2
>
> Summary
>
> Application IDs are arbitrary 48-byte sequences. Application IDs and the information required to verify them are encoded in a userspace binary's TBF headers. The format of the data in the header is board-specific. Boards are responsible for parsing and validating the application ID header themselves; the core kernel is responsible for storing the computed ID. Although each process binary may have multiple IDs in its TBF headers, each process has 1 application ID. If a process binary does not have a recognized ID in the TBF headers, a board must "invent" an ID for that process to load.

48 bytes for every application is a lot of space. That will have to be
stored in flash and parsed into RAM at some point.

Also, this comes back to the same problem that we don't trust the TBF headers.

Is the idea here that each board will parse headers differently?
Doesn't that now mean that apps are board specific instead of being
architecture specific like they are now (at least on ARM)?

>
> In-RAM data type
>
> The kernel will store the application ID in the Process struct as a Option<&'static [u8; 48]>. Generally, the slice reference will point into the TBF header itself, although that is up to the board.
>
> TLV Element
>
> We introduce a new TLV element type to store application ID information. The TLV is variable-sized, and its contents are board-specific. In most cases, the TLV will contain the application ID, and in many cases it will also contain ID verification information.

I thought we were trying to avoid variable sized headers?

>
> TLV Element Parsing
>
> The board must provide two functions to the core kernel:
>
> decode_app_id, which validates an application ID TLV entry. If the ID is valid it returns the ID, otherwise it returns None.
> invent_app_id, which either invents and returns an application ID for the process or returns None.
>
> When the core kernel tries to load a process, it will use the first application ID that decode_app_id returns (evaluated in the same order as the TLV entries in the TBF header). If it does not find an ID TLV entry or decode_app_id always returns None, it will call invent_app_id to invent an ID for the process. If invent_app_id returns None the kernel will skip loading that process.
>
>
> Syscall Filter ACL Compression
>
> The system call filter ACLs can contain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The ACLs can then refer to applications by their short ID rather than their application ID, in order to compress the list size. At runtime, the system call filter will need to search the map to find an application's short ID, then perform authorization checks using the short ID. For performance, the system call filter can cache the short ID in the process' grant region.

Where do these ACLs come from? This is one of the major use cases for
the appID and I don't see how the filtering actually happens?

>
> Storage ACL Compression
>
> The storage system can maintain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The storage system can then use the short IDs internally, performing lookups as necessary to associate the short IDs with processes at runtime. It can use the grant region to cache the short ID lookup results for a process.

The same problem here, where does the storage ACL come from?

>
>
> Design notes
>
> A 48 byte key is 384 bits. A 384 bit hash provides 128 bits of collision resistance against an adversary that can run Grover's algorithm.
> I made the board responsible for inventing application IDs so that boards that want to verify every application can refuse to invent application IDs. I'm concerned that if we invent IDs outside the board file we'll end up with issues where a board that wants to verify IDs accidentally launches apps with unverified IDs.
> A board can invent IDs by compiling a fixed list of app IDs into its image and returning those IDs in order of app (so the first unverified app would use the first fixed ID, the second the second, etc.). The drawback of this approach is that deploying a new process binary (or disabling an existing process binary) could cause an existing process binary's app ID to change and/or move to a different process binary.

If we are just hard-coding IDs in order, what advantage do we get from
having the IDs in the first place?

> A board with access to an entropy source could allocate a fixed buffer for invented app IDs in RAM and generate the IDs on the fly. This would avoid confusion across reboots by changing every process binary's app ID at every boot.

Won't that slow down startup waiting for entropy?

> A board could hash a process binary and its location in flash to invent an ID.

That seems like a good option, but couldn't the loader do this instead
of the kernel? Generating a hash of flash could be slow on some
boards.
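For reference, inventing an ID this way is only a few lines. In this sketch a toy FNV-1a hash stands in for whatever hash the board actually has (hardware SHA on many chips); nothing here is part of the proposal:

```rust
// Sketch: invent an app ID from a process binary and its flash address,
// as suggested above. FNV-1a is a toy stand-in for a real hash (e.g.
// SHA-384); the function name and 8-byte output are illustrative only.
fn invent_app_id(binary: &[u8], flash_addr: u32) -> [u8; 8] {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in flash_addr.to_le_bytes().iter().chain(binary.iter()) {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    // The ID stays stable across reboots as long as neither the binary
    // nor its placement in flash changes.
    h.to_le_bytes()
}
```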

> decode_app_id and invent_app_id will probably return Result with error enums, not Option, but I left other error cases out of the proposal to keep it understandable.
> The system call filter short IDs and storage short IDs cannot be the same, as the ID allocation is different between them. The system call filter ACLs do not need to dynamically allocate short IDs, as the ACLs themselves are static (presumably compiled into the kernel). On the other hand, the storage system will need to allocate short IDs when new applications are loaded.

The more I think about appIDs the less of a reason I see for them.

Why can we not store the syscall and storage ACLs in the TBF header? A
secure system will do a crypto signature check of the app/TBF header
when loading it, so we know they haven't been tampered with. An
insecure system can just trust the headers. Apps can't change their
headers, so they can't give themselves more permissions. If every app
lists its syscalls, we can then always enforce syscall filtering, which
would be cool. I can auto-generate a list of all syscalls when
building an app, so it should be easy to keep track of.
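To make that concrete, the "app lists its syscalls" idea could be a TLV whose body is just the permitted driver numbers. A hypothetical check (this TLV type does not exist today; everything here is an assumption):

```rust
// Hypothetical "allowed drivers" TLV body: a sequence of little-endian
// u32 driver numbers. Not an existing TBF header type, just an
// illustration of storing a syscall ACL in the TBF header.
fn syscall_allowed(tlv_body: &[u8], driver_num: u32) -> bool {
    tlv_body
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .any(|allowed| allowed == driver_num)
}
```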

Secure boot needs to be separate from appIDs anyway, using crypto
signatures. For key derivation we could use the signature as an ID or
a hash of the app in flash. IPC could also have its own ACL
implementation (like the syscall/storage ones) or use a hash of the app in
flash as an ID.

Also, I think we might need to re-think the threat model and put more
trust in the TBF header. It looks like all approaches end up with at
least some trust in the TBF header.

Alistair

Leon Schuermann

Nov 3, 2020, 12:40:46 PM11/3/20
to Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion

"'Johnathan Van Why' via Tock Embedded OS Development Discussion"
<tock...@googlegroups.com> writes:
> Were you thinking the storage IDs would be computed by taking a 32-bit hash
> of the full app ID? That doesn't work from a security perspective: an
> adversary could produce their own app ID that hashes to the same short ID
> as a target app in order to impersonate that target app to subsystems using
> short IDs.

Indeed, that was my initial assumption. I would've hoped that something
along the lines of 64-bit short IDs provides sufficient collision
resistance, given that on security-critical systems, apps are verified
beforehand and thereby collisions would've been under the control of the
app's signer anyway. I do, however, understand the issues with this
approach, and hence don't want to pursue it.

This might make it harder or impossible to have dynamic resource
allocations in peripherals without writable persistent storage. To
circumvent this, those peripherals could either - under consideration of
the security implications (namely, collisions) - use a one-way
function to map from application IDs to resources on their own, or use a
static mapping introduced while compiling the kernel. Rereading the
proposal, the way it's written makes this sufficiently clear.

It's good that we have this discussion. Ideas such as this one tend to
get stuck in one's head from previous iterations of the discussion, so
actively talking about the implications can resolve those
misinterpretations. Thanks!

Leon Schuermann

Nov 3, 2020, 1:23:25 PM11/3/20
to Alistair Francis, Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Alistair Francis <alist...@gmail.com> writes:
> As Phil has pointed out we actually have some more use cases here. It
> is possible that you want a second app to access the data. Phil gave
> the example of migrating the data from app A to app B.
>
> It's also possible that we would want to split or merge apps. So what
> used to be two separate apps with their own storage will be merged
> into 1 single app.

This is true. For storage, this single, one-to-one mapping of
applications to respective ids is likely insufficient to cover all of
the persistent storage use cases we want to support.

I'm convinced that this application ID will nonetheless benefit Tock,
since (1) it will enable the cryptographic verification use cases
required by Johnathan and others, while being agnostic to the precise
mechanisms, and (2) it will be sufficient for resource allocations where
a one-to-one mapping is desirable.

Furthermore, this application ID can then be used by the individual
subsystems to grant access through more complex authorization
mechanisms, which can grant access at a

(principal, request, object)

granularity. An example of such a mechanism could be an access control
list. This proposal introduces the principal (app) identification
mechanism only.

> The same problem as above. An appID doesn't fix the syscall filtering
> problem. How do we then specify the filters for each app?

Using a separate table, which this proposal does not want to
introduce. This explains the thread subject, which was my initial idea
with this discussion: "Nonvolatile application storage, part 1:
application ID". In a subsequent thread we can think about the precise
ACL design.

> 48-bytes for every application is a lot of space. That will have to be
> stored in flash and parsed in RAM at some point.

This is referring to an outdated version of the proposal; the new
version changes this. To cite proposal v3:
> We will introduce a new type AppId which is a #[repr(C,align(4))]
> wrapper around [u8; K].

Refer to the message <87r1pek376.fsf@zirconium> for potential
implementation strategies of this _generic_ but not _dynamic_
application ID length. It should allow for efficient static
implementations, while still being flexible for different needs and
their respective ID width requirements.
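The quoted `[u8; K]` wrapper maps naturally onto Rust const generics. A sketch, where only the `#[repr(C, align(4))]` wrapper around `[u8; K]` comes from the proposal and the rest is my guess at how a board would pin the width:

```rust
// From proposal v3: AppId is a #[repr(C, align(4))] wrapper around
// [u8; K]. With a const generic, K is chosen once per kernel build, so
// the ID width is generic across boards but not dynamic at runtime.
#[repr(C, align(4))]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct AppId<const K: usize>(pub [u8; K]);

// A board picks its width once, e.g. (alias name is my assumption):
pub type BoardAppId = AppId<48>;
```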


Some of your points I'll have to think about a bit longer. You have some
valid concerns, especially about inventing the ID. I hope this email
clears up some confusion.


Leon

Johnathan Van Why

Nov 3, 2020, 1:30:08 PM11/3/20
to Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 9:40 AM Leon Schuermann <le...@is.currently.online> wrote:

"'Johnathan Van Why' via Tock Embedded OS Development Discussion"
<tock...@googlegroups.com> writes:
> Were you thinking the storage IDs would be computed by taking a 32-bit hash
> of the full app ID? That doesn't work from a security perspective: an
> adversary could produce their own app ID that hashes to the same short ID
> as a target app in order to impersonate that target app to subsystems using
> short IDs.

Indeed, that was my initial assumption. I would've hoped that something
along the lines of 64-bit short IDs provides sufficient collision
resistance, given that on security-critical systems, apps are verified
beforehand and thereby collisions would've been under the control of the
app's signer anyway. I do, however, understand the issues with this
approach, and hence don't want to pursue it.

This might make it harder or impossible to have dynamic resource
allocations in peripherals without writable persistent storage. To
circumvent this, those peripherals could either - under consideration of
the security implications (namely, collisions) - use a one-way
function to map from application IDs to resources on their own, or use a
static mapping introduced while compiling the kernel. Rereading the
proposal, the way it's written makes this sufficiently clear.

I'm having trouble thinking of a subsystem that needs to dynamically allocate short IDs but which lacks writable persistent storage. As far as I can tell, the only reason you would need to dynamically allocate short IDs is if you are going to write them into persistent storage.

As a result, I don't think this will ever be an issue. I still would like to avoid having multiple types of short ID, but I can't think of a way to do so.

Johnathan Van Why

Nov 3, 2020, 1:32:46 PM11/3/20
to Leon Schuermann, Alistair Francis, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 10:23 AM Leon Schuermann <le...@is.currently.online> wrote:
Alistair Francis <alist...@gmail.com> writes:
> As Phil has pointed out we actually have some more use cases here. It
> is possible that you want a second app to access the data. Phil gave
> the example of migrating the data from app A to app B.
>
> It's also possible that we would want to split or merge apps. So what
> used to be two separate apps with their own storage will be merged
> into 1 single app.

This is true. For storage, this single, one-to-one mapping of
applications to respective ids is likely insufficient to cover all of
the persistent storage use cases we want to support.

I'm convinced that this application ID will nonetheless benefit Tock,
since (1) it will enable the cryptographic verification use cases
required by Johnathan and others, while being agnostic to the precise
mechanisms, and (2) it will be sufficient for resource allocations where
a one-to-one mapping is desirable.

Furthermore, this application ID can then be used by the individual
subsystems to grant access through more complex authorization
mechanisms, which can grant access at a

    (principal, request, object)

granularity. An example of such a mechanism could be an access control
list. This proposal introduces the principal (app) identification
mechanism only.

To introduce security terminology, my application ID proposal provides "identification" and "authentication", but not "authorization". Authorization would be provided via ACLs. You can't have secure authorization without identification and authentication, so both mechanisms are needed.
 
> The same problem as above. An appID doesn't fix the syscall filtering
> problem. How do we then specify the filters for each app?

Using a separate table, which this proposal does not want to
introduce. This explains the thread subject, which was my initial idea
with this discussion: "Nonvolatile application storage, part 1:
application ID". In a subsequent thread we can think about the precise
ACL design.

> 48-bytes for every application is a lot of space. That will have to be
> stored in flash and parsed in RAM at some point.

This is referring to an outdated version of the proposal; the new
version changes this. To cite proposal v3:

For reference, here is a link to proposal v3: https://groups.google.com/g/tock-dev/c/aduN7fHWXdI/m/9oOx5kfRAgAJ.

Johnathan Van Why

Nov 3, 2020, 1:56:21 PM11/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 8:18 AM Alistair Francis <alist...@gmail.com> wrote:
On Tue, Oct 27, 2020 at 5:18 PM 'Johnathan Van Why' via Tock Embedded
OS Development Discussion <tock...@googlegroups.com> wrote:
>
> Use Case Analysis
>
> First, I want to perform some more analysis of the uses cases for application IDs that I have identified so far:
>
> Storage: Processes may want to store data into non-volatile storage that can be accessed by future processes that are part of the same application.

As Phil has pointed out we actually have some more use cases here. It
is possible that you want a second app to access the data. Phil gave
the example of migrating the data from app A to app B.

It's also possible that we would want to split or merge apps. So what
used to be two separate apps with their own storage will be merged
into 1 single app.

Merging apps is a bit tricky under my proposal. To merge app B into app A, you would need to deploy a new version of app B that adds app A to the ACLs for the storage being merged, then deploy the merged app (and remove app B).

We could make this easier by allowing processes to have multiple app IDs, at the expense of added complexity (including dynamic allocation for app ID references).
For secure boot, you need a cryptographically verified ID. You verify the app ID header before trusting it.

For example, a verified app ID can be a public key, with the associated private key stored securely by the app's developers. The app ID header would contain the app ID/public key, as well as a signature of the app. The kernel would parse the app ID header and verify the signature before it trusts it.

This allows you to trust the PKI and crypto hardware rather than trusting the bytes in flash memory.
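That flow could be sketched as below; the toy `verify` is a stand-in for real signature verification (e.g. offloaded to crypto hardware), and all names here are mine, not the proposal's:

```rust
// Sketch of "app ID = public key, trusted only after signature check".
// The struct, function names, and the toy checksum "signature" are
// illustrative assumptions only.
struct AppIdHeader<'a> {
    public_key: &'a [u8], // doubles as the application ID
    signature: &'a [u8],  // signature over the process binary
}

/// Stand-in for a real verifier (Ed25519, ECDSA, ...): here a one-byte
/// XOR checksum of key and binary. Do NOT use anything like this.
fn verify(public_key: &[u8], signature: &[u8], binary: &[u8]) -> bool {
    let sum = public_key
        .iter()
        .chain(binary.iter())
        .fold(0u8, |a, &b| a ^ b);
    signature.len() == 1 && signature[0] == sum
}

/// Return the app ID only once the header's signature checks out.
fn app_id_if_trusted<'a>(hdr: &AppIdHeader<'a>, binary: &[u8]) -> Option<&'a [u8]> {
    if verify(hdr.public_key, hdr.signature, binary) {
        Some(hdr.public_key)
    } else {
        None
    }
}
```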
 

>
> Key derivation
>
> ID size sensitivity: Smaller IDs would improve key derivation performance, but ID size is unlikely to be a deal-breaker.
> Does trust span across boots: No, if the key verification is performed by dedicated cryptographic hardware (e.g. H1 does this).
> Requires cryptographic verification: No. Security will be as strong as the app loading mechanism if unverified IDs are used.
>
> IPC
>
> ID size sensitivity: Moderate. Smaller IDs would reduce the size of process binaries that contain application IDs and the RAM usage of processes that manipulate application IDs at runtime.
>
> Does trust span across boots: No, IPC is only between concurrently-running processes.
> Requires cryptographic verification: No, it is an optional security improvement.
>
>
> Proposal v2 Differences
>
> Based on Leon's concerns and the above analysis, I made a second proposal (below). Here are the differences between the second proposal and my first proposal from June 5:
>
> Application IDs are a fixed size rather than variable-size, to make them easier to store in a filesystem.
> Application IDs are no longer optional. I added a mechanism to allow boards to invent application IDs for processes that don't have a recognized ID.
> I expanded the proposal to discuss ACL storage to alleviate concerns about too-large application IDs for some use cases.
>
>
> Proposal v2
>
> Summary
>
> Application IDs are arbitrary 48-byte sequences. Application IDs and the information required to verify them are encoded in a userspace binary's TBF headers. The format of the data in the header is board-specific. Boards are responsible for parsing and validating the application ID header themselves; the core kernel is responsible for storing the computed ID. Although each process binary may have multiple IDs in its TBF headers, each process has 1 application ID. If a process binary does not have a recognized ID in the TBF headers, a board must "invent" an ID for that process to load.

48-bytes for every application is a lot of space. That will have to be
stored in flash and parsed in RAM at some point.

Also, this comes back to the same problem that we don't trust the TBF headers.

Is the idea here that each board will parse headers differently?
Doesn't that now mean that apps are board specific instead of being
architecture specific like they are now (at least on ARM)?

Different boards may choose to parse different app ID headers, yes. Proposal v3 addresses this better, but it still won't be 100% uniform. I expect the unverified app ID type that I specified to cover most use cases of Tock. Apps that want to run on boards that use other app ID types can list multiple types in the TBF headers.

If an app is deployed to a board that doesn't understand its app ID headers, the board can "invent" an app ID in order to boot the app regardless. Of course, this doesn't work if the board needs secure boot, but by definition a board that needs secure boot does not support all Tock apps.
 
>
> In-RAM data type
>
> The kernel will store the application ID in the Process struct as a Option<&'static [u8; 48]>. Generally, the slice reference will point into the TBF header itself, although that is up to the board.
>
> TLV Element
>
> We introduce a new TLV element type to store application ID information. The TLV is variable-sized, and its contents are board-specific. In most cases, the TLV will contain the application ID, and in many cases it will also contain ID verification information.

I thought we were trying to avoid variable sized headers?

Proposal v3 avoids the variable-sized header by having distinct TLV entry types for different types of app ID.
 

>
> TLV Element Parsing
>
> The board must provide two functions to the core kernel:
>
> decode_app_id, which validates an application ID TLV entry. If the ID is valid it returns the ID, otherwise it returns None.
> invent_app_id, which either invents and returns an application ID for the process or returns None.
>
> When the core kernel tries to load a process, it will use the first application ID that decode_app_id returns (evaluated in the same order as the TLV entries in the TBF header). If it does not find an ID TLV entry or decode_app_id always returns None, it will call invent_app_id to invent an ID for the process. If invent_app_id returns None the kernel will skip loading that process.
>
>
> Syscall Filter ACL Compression
>
> The system call filter ACLs can contain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The ACLs can then refer to applications by their short ID rather than their application ID, in order to compress the list size. At runtime, the system call filter will need to search the map to find an application's short ID, then perform authorization checks using the short ID. For performance, the system call filter can cache the short ID in the process' grant region.

Where do these ACLs come from? This is one of the major use cases for
the appID, and I don't see how the filtering actually happens.

The ACLs could be compiled into the kernel image or deployed separately from the kernel with their own cryptographic verification method.

The filtering is performed by a board component called by the kernel, I think the hooks are already in place for that.
 

>
> Storage ACL Compression
>
> The storage system can maintain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The storage system can then use the short IDs internally, performing lookups as necessary to associate the short IDs with processes at runtime. It can use the grant region to cache the short ID lookup results for a process.

The same problem here: where does the storage ACL come from?

I am assuming that storage ACL are part of the storage system. I would expect processes to specify the permissions of data when they ask the storage system to write the data into storage.
 
>
>
> Design notes
>
> A 48 byte key is 384 bits. A 384 bit hash provides 128 bits of collision resistance against an adversary that can run Grover's algorithm.
> I made the board responsible for inventing application IDs so that boards that want to verify every application can refuse to invent application IDs. I'm concerned that if we invent IDs outside the board file we'll end up with issues where a board that wants to verify IDs accidentally launches apps with unverified IDs.
> A board can invent IDs by compiling a fixed list of app IDs into its image and returning those IDs in order of app (so the first unverified app would use the first fixed ID, the second the second, etc.). The drawback of this approach is that deploying a new process binary (or disabling an existing process binary) could cause an existing process binary's app ID to change and/or move to a different process binary.

If we are just hard-coding IDs in order, what advantage do we get from
having the IDs in the first place?

The advantage is we would be able to support apps that lack app IDs (backwards compatibility), or whose app IDs are not recognized by the board (cross-board compatibility).

The alternative is making app IDs optional, as I did in my v1 proposal, but that idea wasn't popular.
 
> A board with access to an entropy source could allocate a fixed buffer for invented app IDs in RAM and generate the IDs on the fly. This would avoid confusion across reboots by changing every process binary's app ID at every boot.

Won't that slow down startup waiting for entropy?

Yes, but a board is already responsible for its own performance.
 
> A board could hash a process binary and its location in flash to invent an ID.

That seems like a good option, but couldn't the loader do this instead
of the kernel? Generating a hash of flash could be slow on some
boards.

Hmm, I don't think we've ever had application loaders modify TBF data. I think that idea is somewhat awkward. It would require the application loader to understand *every* TBF header type in order to load an app, as it would need to change offsets into the app (as adding an app ID header would grow the TBF headers). It's also not a concept we can keep forever, as an application loader will not be able to modify the TBF headers of signed apps.

Ostensibly that seems okay to me, but I think it would be very awkward.
 
> decode_app_id and invent_app_id will probably return Result with error enums, not Option, but I left other error cases out of the proposal to keep it understandable.
> The system call filter short IDs and storage short IDs cannot be the same, as the ID allocation is different between them. The system call filter ACLs do not need to dynamically allocate short IDs, as the ACLs themselves are static (presumably compiled into the kernel). On the other hand, the storage system will need to allocate short IDs when new applications are loaded.

The more I think about appIDs the less of a reason I see for them.

Why can we not store the syscall and storage ACLs in the TBF header? A
secure system will do a crypto signature check of the app/TBF header
when loading it, so we know they haven't been tampered with.
 
Who would sign the header? It can't be the application author, because they might be malicious. For OpenTitan, we generally won't trust the application loader, or even fully trust the flash memory itself.

An insecure system can just trust the headers. Apps can't change their
headers, so they can't give themselves more permissions. If every app
lists its syscalls, we can then always enforce syscall filtering, which
would be cool. I can auto-generate a list of all syscalls when
building an app, so it should be easy to keep track of.

Secure boot needs to be separate from appIDs anyway, using crypto
signatures. For key derivation we could use the signature as an ID or
a hash of the app in flash. IPC could also have its own ACL
implementation (like the syscall/storage ones) or use a hash of the app in
flash as an ID.

Using a hash of a process binary as an app ID for IPC will make updating that app very difficult, as other apps would suddenly stop recognizing it.
 

Also, I think we might need to re-think the threat model and put more
trust in the TBF header. It looks like all approaches end up with at
least some trust in the TBF header.

I disagree, on the basis that:
  1. We don't need to trust the TBF headers stored in non-volatile storage.
  2. OpenTitan is unlikely to be willing to trust TBF headers stored in non-volatile storage until they have been cryptographically verified. Even after they are cryptographically verified, we'll only trust the TBF headers to the extent that that cryptographic verification applies. E.g. we won't let two apps signed with different public keys impersonate each other, even though both signatures are verified by the kernel.

Alistair Francis

Nov 3, 2020, 2:29:20 PM11/3/20
to Leon Schuermann, Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 10:23 AM Leon Schuermann
<le...@is.currently.online> wrote:
>
> Alistair Francis <alist...@gmail.com> writes:
> > As Phil has pointed out we actually have some more use cases here. It
> > is possible that you want a second app to access the data. Phil gave
> > the example of migrating the data from app A to app B.
> >
> > It's also possible that we would want to split or merge apps. So what
> > used to be two separate apps with their own storage will be merged
> > into 1 single app.
>
> This is true. For storage, this single, one-to-one mapping of
> applications to respective ids is likely insufficient to cover all of
> the persistent storage use cases we want to support.
>
> I'm convinced that this application ID will nonetheless benefit Tock,
> since (1) it will enable the cryptographic verification use cases
> required by Johnathan and others, while being agnostic to the precise

Do you have an example of how an appID would be used for cryptographic
verification?

> mechanisms and (2) it will be sufficient for resource allocations where
> a one-to-one mapping is desirable.

So we will support both a 1:1 mapping and a more full featured ACL?
Now there are two ways to allocate resources in the kernel.

>
> Furthermore, this application ID can then be used by the individual
> subsystems to grant access through more complex authorization
> mechanisms, which can grant access at a
>
> (principal, request, object)
>
> granularity. An example for such a mechanism could be an access control
> list. This proposal introduces the principal (app) identification
> mechanism only.
>
> > The same problem as above. An appID doesn't fix the syscall filtering
> > problem. How do we then specify the filters for each app?
>
> Using a separate table, which this proposal does not want to
> introduce. This explains the thread subject, which was my initial idea
> with this discussion: "Nonvolatile application storage, part 1:
> application ID". In a subsequent thread we can think about the precise
> ACL design.

How can we pick an appID mechanism without any idea of what the ACL
will look like? It seems like we are just picking the first part and
hoping the rest will match up later.

For example, say we go with a security manifest that specifies all apps
and permissions, so instead of a linked list of apps we have a
serialised JSON file and a list of apps (just an example). Why then do
we need unique appIDs?

--------------------------
| Security Manifest      |
|   App1: ...            |
|   App2: ...            |
--------------------------
| App1                   |
--------------------------
| App2                   |
--------------------------

If the entire bundle is signed and checked before loading what does an
appID give us? We could just use the order of the apps.
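To spell out the "order of the apps" idea: if the signed manifest lists permissions in app order, an app's index is its identity. A purely hypothetical layout:

```rust
// Hypothetical signed-manifest model: permissions are listed in the same
// order as the apps in flash, so an app's index is its identity and no
// separate appID is needed. All names here are illustrative.
struct Manifest<'a> {
    // perms[i] = driver numbers app i may call.
    perms: &'a [&'a [u32]],
}

fn allowed(m: &Manifest, app_index: usize, driver: u32) -> bool {
    m.perms
        .get(app_index)
        .map_or(false, |p| p.contains(&driver))
}
```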

If we put the ACLs in the TBF headers do we need appIDs either?

It seems like we have settled on appIDs without a clear use case of
what they let us accomplish that we couldn't do without them.

At first I was all on board with appIDs, but the more I think about it,
the fewer use cases I see. There are still fundamental
problems, like how do we ensure that they are unique? Do we really want
every board to do its own thing, so that app loading is even less
generic?

>
> > 48-bytes for every application is a lot of space. That will have to be
> > stored in flash and parsed in RAM at some point.
>
> This is referring to an outdated version of the proposal; the new
> version changes this. To cite proposal v3:
> > We will introduce a new type AppId which is a #[repr(C,align(4))]
> > wrapper around [u8; K].

Yep, sorry. I half wrote my reply on Friday, but then lost internet
access so only got around to sending it today.

>
> Refer to the message <87r1pek376.fsf@zirconium> for potential
> implementation strategies of this _generic_ but not _dynamic_
> application ID length. It should allow for efficient static
> implementations, while still being flexible for different needs and
> their respective ID width requirements.

Doesn't that make the elf2tab, tockloader and kernel implementations
more complex, though, to support all these different length options?

Won't these different length options break the ability to run an app
compiled for one architecture on any board of that architecture (at
least on ARM, and maybe one day RISC-V with PIC)?

Alistair Francis

Nov 3, 2020, 2:52:13 PM11/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 10:56 AM 'Johnathan Van Why' via Tock Embedded
OS Development Discussion <tock...@googlegroups.com> wrote:
>
> On Tue, Nov 3, 2020 at 8:18 AM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Tue, Oct 27, 2020 at 5:18 PM 'Johnathan Van Why' via Tock Embedded
>> OS Development Discussion <tock...@googlegroups.com> wrote:
>> >
>> > Use Case Analysis
>> >
>> > First, I want to perform some more analysis of the uses cases for application IDs that I have identified so far:
>> >
>> > Storage: Processes may want to store data into non-volatile storage that can be accessed by future processes that are part of the same application.
>>
>> As Phil has pointed out we actually have some more use cases here. It
>> is possible that you want a second app to access the data. Phil gave
>> the example of migrating the data from app A to app B.
>>
>> It's also possible that we would want to split or merge apps. So what
>> used to be two separate apps with their own storage will be merged
>> into 1 single app.
>
>
> Merging apps is a bit tricky under my proposal. To merge app B into app A, you would need to deploy a new version of app B that adds app A to the ACLs for the storage being merged, then deploy the merged app (and remove app B).

I don't see any mention of how this ACL is set up though. Am I missing
something?

>
> We could make this easier by allowing processes to have multiple app IDs, at the expense of added complexity (including dynamic allocation for app ID references).

Then it isn't really an appID and is more a list of permissions (which
I think is much more useful).
Why not just call this a signature TLV field? Then the kernel can be
configured to only load valid signed apps. I don't see why we need an
appID for this.

I also don't see any mention of verified IDs in the v3 proposal,
except to say that they are "unspecified for now".

I'm all for adding a signature TLV field for secure boot, but that
seems separate from an appID.

>
> This allows you to trust the PKI and crypto hardware rather than trusting the bytes in flash memory.
>
>>
>>
>> >
>> > Key derivation
>> >
>> > ID size sensitivity: Smaller IDs would improve key derivation performance, but ID size is unlikely to be a deal-breaker.
>> > Does trust span across boots: No, if the key verification is performed by dedicated cryptographic hardware (e.g. H1 does this).
>> > Requires cryptographic verification: No. Security will be as strong as the app loading mechanism if unverified IDs are used.
>> >
>> > IPC
>> >
>> > ID size sensitivity: Moderate. Smaller IDs would reduce the size of process binaries that contain application IDs and the RAM usage of processes that manipulate application IDs at runtime.
>> >
>> > Does trust span across boots: No, IPC is only between concurrently-running processes.
>> > Requires cryptographic verification: No, it is an optional security improvement.
>> >
>> >
>> > Proposal v2 Differences
>> >
>> > Based on Leon's concerns and the above analysis, I made a second proposal (below). Here are the differences between the second proposal and my first proposal from June 5:
>> >
>> > Application IDs are a fixed size rather than variable-size, to make them easier to store in a filesystem.
>> > Application IDs are no longer optional. I added a mechanism to allow boards to invent application IDs for processes that don't have a recognized ID.
>> > I expanded the proposal to discuss ACL storage to alleviate concerns about too-large application IDs for some use cases.
>> >
>> >
>> > Proposal v2
>> >
>> > Summary
>> >
>> > Application IDs are arbitrary 48-byte sequences. Application IDs and the information required to verify them are encoded in a userspace binary's TBF headers. The format of the data in the header is board-specific. Boards are responsible for parsing and validating the application ID header themselves; the core kernel is responsible for storing the computed ID. Although each process binary may have multiple IDs in its TBF headers, each process has 1 application ID. If a process binary does not have a recognized ID in the TBF headers, a board must "invent" an ID for that process to load.
>>
>> 48-bytes for every application is a lot of space. That will have to be
>> stored in flash and parsed in RAM at some point.
>>
>> Also, this comes back to the same problem that we don't trust the TBF headers.
>>
>> Is the idea here that each board will parse headers differently?
>> Doesn't that now mean that apps are board specific instead of being
>> architecture specific like they are now (at least on ARM)?
>
>
> Different boards may choose to parse different app ID headers, yes. Proposal v3 addresses this better, but it still won't be 100% uniform. I expect the unverified app ID type that I specified to cover most use cases of Tock. Apps that want to run on boards that use other app ID types can list multiple types in the TBF headers.

This seems like a loss for most Tock use cases. This makes the apps
much less generic or usable. I don't see a good justification of why
this is a benefit.

>
> If an app is deployed to a board that doesn't understand its app ID headers, the board can "invent" an app ID in order to boot the app regardless. Of course, this doesn't work if the board needs secure boot, but by definition a board that needs secure boot does not support all Tock apps.
>
>>
>> >
>> > In-RAM data type
>> >
>> > The kernel will store the application ID in the Process struct as a Option<&'static [u8; 48]>. Generally, the slice reference will point into the TBF header itself, although that is up to the board.
>> >
>> > TLV Element
>> >
>> > We introduce a new TLV element type to store application ID information. The TLV is variable-sized, and its contents are board-specific. In most cases, the TLV will contain the application ID, and in many cases it will also contain ID verification information.
>>
>> I thought we were trying to avoid variable sized headers?
>
>
> Proposal v3 avoids the variable-sized header by having distinct TLV entry types for different types of app ID.

That's good, but does that mean we have a range of different possible
TLV entries?
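Yes, that would presumably mean one TLV entry type per app ID kind. As an editorial sketch (the variant set and sizes are invented; v3 only specifies the unverified kind and leaves verified kinds unspecified), the family of fixed-size entries could look like:

```rust
/// Sketch of distinct fixed-size TLV entry kinds for app IDs, per
/// proposal v3's "distinct TLV entry types". Variants and sizes are
/// invented for illustration; verified kinds are unspecified in v3.
pub enum AppIdTlv {
    /// Plain ID with no cryptographic binding.
    Unverified([u8; 8]),
    /// ID bound to a signature the board can check (hypothetical kind).
    SignedSha256 { id: [u8; 32], signature: [u8; 64] },
}

impl AppIdTlv {
    /// Whether accepting this entry requires cryptographic verification.
    pub fn needs_verification(&self) -> bool {
        matches!(self, AppIdTlv::SignedSha256 { .. })
    }
}
```

A board would accept only the kinds it knows how to check, which is how apps can list multiple kinds for cross-board compatibility.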

>
>>
>>
>> >
>> > TLV Element Parsing
>> >
>> > The board must provide two functions to the core kernel:
>> >
>> > decode_app_id, which validates an application ID TLV entry. If the ID is valid it returns the ID, otherwise it returns None.
>> > invent_app_id, which either invents and returns an application ID for the process or returns None.
>> >
>> > When the core kernel tries to load a process, it will use the first application ID that decode_app_id returns (evaluated in the same order as the TLV entries in the TBF header). If it does not find an ID TLV entry or decode_app_id always returns None, it will call invent_app_id to invent an ID for the process. If invent_app_id returns None the kernel will skip loading that process.
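The two board hooks quoted above could look roughly like the following. This is an editorial sketch with guessed signatures, not the actual Tock kernel API:

```rust
/// Sketch of the two board hooks the quoted proposal names. The
/// signatures are guesses from the prose, not real Tock kernel types.
pub type AppId = [u8; 48];

pub trait AppIdPolicy {
    /// Validate one app-ID TLV entry; return the ID if it checks out.
    fn decode_app_id(&self, tlv: &[u8]) -> Option<AppId>;
    /// Optionally fabricate an ID for a process with no recognized entry.
    fn invent_app_id(&self) -> Option<AppId>;
}

/// A board that refuses unverified processes: it never invents IDs, so
/// processes without a recognized ID are simply skipped at load time.
pub struct VerifyOnly;

impl AppIdPolicy for VerifyOnly {
    fn decode_app_id(&self, _tlv: &[u8]) -> Option<AppId> {
        None // a real board would perform a signature check here
    }
    fn invent_app_id(&self) -> Option<AppId> {
        None
    }
}
```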
>> >
>> >
>> > Syscall Filter ACL Compression
>> >
>> > The system call filter ACLs can contain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The ACLs can then refer to applications by their short ID rather than their application ID, in order to compress the list size. At runtime, the system call filter will need to search the map to find an application's short ID, then perform authorization checks using the short ID. For performance, the system call filter can cache the short ID in the process' grant region.
>>
>> Where do these ACLs come from? This is one of the major use cases for
>> the appID and I don't see how the filtering actually happens?
>
>
> The ACLs could be compiled into the kernel image or deployed separately from the kernel with their own cryptographic verification method.

Where these come from seems pretty important. Like I mentioned in
another reply if we used a security manifest I don't see the need for
appIDs. Compiling ACLs into the kernel directly seems like a bad idea
as then we need to change the kernel just to change an ACL.
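Whatever their provenance, the short-ID compression scheme quoted above is mechanically simple. A minimal sketch (using std's HashMap for clarity; a kernel implementation would use a static table and no heap):

```rust
use std::collections::HashMap;

/// Sketch of the quoted short-ID compression: map 48-byte application
/// IDs to 32-bit short IDs so ACL entries stay small. std HashMap is
/// used for clarity only; Tock's kernel is no_std and would differ.
pub struct ShortIdMap {
    next: u32,
    map: HashMap<[u8; 48], u32>,
}

impl ShortIdMap {
    pub fn new() -> Self {
        ShortIdMap { next: 0, map: HashMap::new() }
    }

    /// Look up (or allocate) the short ID for a full application ID.
    /// An ACL can then be keyed on the u32 rather than the 384-bit ID,
    /// caching the result in the process' grant region for performance.
    pub fn short_id(&mut self, app_id: [u8; 48]) -> u32 {
        let next = &mut self.next;
        *self.map.entry(app_id).or_insert_with(|| {
            let id = *next;
            *next += 1;
            id
        })
    }
}
```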

>
> The filtering is performed by a board component called by the kernel, I think the hooks are already in place for that.
>
>>
>>
>> >
>> > Storage ACL Compression
>> >
>> > The storage system can maintain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The storage system can then use the short IDs internally, performing lookups as necessary to associate the short IDs with processes at runtime. It can use the grant region to cache the short ID lookup results for a process.
>>
>> The same problem here, where does the storage ACL come from?
>
>
> I am assuming that storage ACLs are part of the storage system. I would expect processes to specify the permissions of data when they ask the storage system to write the data into storage.

If an app can specify its storage settings then we expose ourselves
to a brute force attack. Imagine a compromised app can just keep
guessing 32-bit storage IDs until it can eventually read the secret
data. That doesn't seem like a good idea.

>
>>
>> >
>> >
>> > Design notes
>> >
>> > A 48 byte key is 384 bits. A 384 bit hash provides 128 bits of collision resistance against an adversary that can run Grover's algorithm.
>> > I made the board responsible for inventing application IDs so that boards that want to verify every application can refuse to invent application IDs. I'm concerned that if we invent IDs outside the board file we'll end up with issues where a board that wants to verify IDs accidentally launches apps with unverified IDs.
>> > A board can invent IDs by compiling a fixed list of app IDs into its image and returning those IDs in order of app (so the first unverified app would use the first fixed ID, the second the second, etc.). The drawback of this approach is that deploying a new process binary (or disabling an existing process binary) could cause an existing process binary's app ID to change and/or move to a different process binary.
>>
>> If we are just hard coding IDs in order what advantage do we get from
>> having the IDs in the first place?
>
>
> The advantage is we would be able to support apps that lack app IDs (backwards compatibility), or whose app IDs are not recognized by the board (cross-board compatibility).
>
> The alternative is making app IDs optional, as I did in my v1 proposal, but that idea wasn't popular.

That shouldn't be in the board though. Why does each board need a
different way of handling backwards compatible apps?

>
>>
>> > A board with access to an entropy source could allocate a fixed buffer for invented app IDs in RAM and generate the IDs on the fly. This would avoid confusion across reboots by changing every process binary's app ID at every boot.
>>
>> Won't that slow down startup waiting for entropy?
>
>
> Yes, but a board is already responsible for its own performance.
>
>>
>> > A board could hash a process binary and its location in flash to invent an ID.
>>
>> That seems like a good option, but couldn't the loader do this instead
>> of the kernel? Generating a hash of flash could be slow on some
>> boards.
>
>
> Hmm, I don't think we've ever had application loaders modify TBF data. I think that idea is somewhat awkward. It would require the application loader to understand *every* TBF header type in order to load an app, as it would need to change offsets into the app (as adding an app ID header would grow the TBF headers). It's also not a concept we can keep forever, as an application loader will not be able to modify the TBF headers of signed apps.

Tockloader can already do this: it can change the size of the TBF
header to modify the header contents.

>
> Ostensibly that seems okay to me, but I think it would be very awkward.

This is the conclusion we reached in the OpenTitan meeting about how
to handle appIDs. My original appID PR added the idea with elf2tab,
but that got a lot of push back. Instead an idea was to do it in the
loader (tockloader) when we create the apps.

As for signed apps it would work then as well as the bundle would be
signed after we have set the IDs for the apps.

>
>>
>> > decode_app_id and invent_app_id will probably return Result with error enums, not Option, but I left other error cases out of the proposal to keep it understandable.
>> > The system call filter short IDs and storage short IDs cannot be the same, as the ID allocation is different between them. The system call filter ACLs do not need to dynamically allocate short IDs, as the ACLs themselves are static (presumably compiled into the kernel). On the other hand, the storage system will need to allocate short IDs when new applications are loaded.
>>
>> The more I think about appIDs the less of a reason I see for them.
>>
>> Why can we not store the syscall and storage ACLs in the TBF header? A
>> secure system will do a crypto signature check of the app/TBF header
>> when loading it so we know they haven't been tampered with.
>
>
> Who would sign the header? It can't be the application author, because they might be malicious. For OpenTitan, we generally won't trust the application loader, or even fully trust the flash memory itself.

In the case of OpenTitan it would be whoever signs the kernel and the
apps, so I'm assuming the vendor. The header is just part of the app.

>
>> An insecure system can just trust the headers. Apps can't change their
>> headers so they can't give themselves more permissions. If every app
>> lists its syscalls we can then always enforce syscall filtering which
>> would be cool. I can auto-generate a list of all syscalls when
>> building an app so it should be easy to keep track of.
>>
>> Secure boot needs to be separate from appIDs anyway using crypto
>> signatures. For key derivation we could use the signature as an ID or
>> a hash of the app in flash. IPC could also have its own ACL
>> implementation (like the syscall/storage) or use a hash of the app in
>> flash as an ID.
>
>
> Using a hash of a process binary as an app ID for IPC will make updating that app very difficult, as other apps would suddenly stop recognizing it.

Ah good point. We can use the Java style name that Tock already uses
for IPC then.

>
>>
>>
>> Also, I think we might need to re-think the threat model and put more
>> trust in the TBF header. It looks like all approaches end up with at
>> least some trust in the TBF header.
>
>
> I disagree, on the basis that:
>
> We don't need to trust the TBF headers stored in non-volatile storage.

But the appID is stored in the TBF headers in non-volatile storage and
you plan on using that?

> OpenTitan is unlikely to be willing to trust TBF headers stored in non-volatile storage until they have been cryptographically verified. Even after they are cryptographically verified, we'll only trust the TBF headers to the extent that that cryptographic verification applies. E.g. we won't let two apps signed with different public keys impersonate each other, even though both signatures are verified by the kernel.

Agreed. The TBF headers would be checked when loaded by the kernel to
ensure they carry a valid signature from a trusted public key.

Alistair

Johnathan Van Why

unread,
Nov 3, 2020, 3:04:21 PM11/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 11:29 AM Alistair Francis <alist...@gmail.com> wrote:
On Tue, Nov 3, 2020 at 10:23 AM Leon Schuermann
<le...@is.currently.online> wrote:
>
> Alistair Francis <alist...@gmail.com> writes:
> > As Phil has pointed out we actually have some more use cases here. It
> > is possible that you want a second app to access the data. Phil gave
> > the example of migrating the data from app A to app B.
> >
> > It's also possible that we would want to split or merge apps. So what
> > used to be two separate apps with their own storage will be merged
> > into 1 single app.
>
> This is true. For storage, this single, one-to-one mapping of
> applications to respective ids is likely insufficient to cover all of
> the persistent storage use cases we want to support.
>
> I'm convinced that this application ID will nonetheless benefit Tock,
> since (1) it will enable the cryptographic verification use cases that
> are required by Johnathan and others, while being agnostic to the precise

Do you have an example of how an appID would be used for cryptographic
verification?

You have an OpenTitan chip in a server. The OpenTitan chip runs two apps:
  1. A root of trust app that verifies that the firmware on the server is signed correctly (i.e. not malicious) and can serve up cryptographic attestations (signatures) vouching for the machine's boot security. This app is certified under government certification models (FIPS, Common Criteria, etc).
  2. A second app that performs distinct, non-government-certified functionality. This app is written by a distinct set of authors.
You want to be able to update both apps independently. The government doesn't trust app 2 because it isn't certified, so instead it has to trust the kernel's ability to defend app 1 from app 2.

Before the kernel loads an app claiming to be app 1, it needs to verify that app is actually app 1 so that app 2 cannot lie and impersonate app 1. It does this by validating the app's signature against app 1's public key. System call filtering lists would then prevent app 2 from doing operations that only app 1 can do, such as controlling the machine's boot sequence.
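The load-time check described here can be sketched as follows, with the cryptography abstracted behind a trait (OpenTitan has not settled on a cryptosystem, so the verifier and all names here are editorial placeholders, and the MockVerifier is explicitly not real cryptography):

```rust
/// Sketch of the load-time identity check: the actual signature scheme
/// is abstracted behind a trait, since it is not yet decided.
pub trait SignatureVerifier {
    fn verify(&self, public_key: &[u8], message: &[u8], signature: &[u8]) -> bool;
}

/// Only accept a binary as "app 1" if its signature verifies against
/// app 1's public key; otherwise app 2 could impersonate it.
pub fn is_app1(
    verifier: &dyn SignatureVerifier,
    app1_public_key: &[u8],
    binary: &[u8],
    claimed_signature: &[u8],
) -> bool {
    verifier.verify(app1_public_key, binary, claimed_signature)
}

/// Toy stand-in for demonstration only: NOT cryptography. "Valid" means
/// the 2-byte signature echoes the first key byte and first message byte.
pub struct MockVerifier;

impl SignatureVerifier for MockVerifier {
    fn verify(&self, pk: &[u8], msg: &[u8], sig: &[u8]) -> bool {
        sig.len() == 2 && sig[0] == pk[0] && sig[1] == msg[0]
    }
}
```

Syscall-filtering ACLs keyed on the verified identity then enforce that only app 1 may perform privileged operations.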
 

> mechanisms and (2) it will be sufficient for resource allocations where
> a one-to-one mapping is desirable.

So we will support both a 1:1 mapping and a more full featured ACL?
Now there are two ways to allocate resources in the kernel.

These are identification, authentication, and authorization mechanisms, not resource allocation mechanisms.
 
>
> Furthermore, this application ID can then be used by the individual
> subsystems to grant access using more complex mechanisms by use of more
> complex authorization mechanisms, which can grant access based on a
>
>     (principal, request, object)
>
> granularity. An example for such a mechanism could be an access control
> list. This proposal introduces the principal (app) identification
> mechanism only.
>
> > The same problem as above. An appID doesn't fix the syscall filtering
> > problem. How do we then specify the filters for each app?
>
> Using a separate table, which this proposal does not want to
> introduce. This explains the thread subject, which was my initial idea
> with this discussion: "Nonvolatile application storage, part 1:
> application ID". In a subsequent thread we can think about the precise
> ACL design.

How can we pick an appID mechanism without any idea of what the ACL
will look like? It seems like we are just picking the first part and
hoping the rest will match up later.

ACLs are built on top of app IDs, not the other way around.
 
For example, suppose we go with a security manifest that specifies all
apps and permissions, so instead of a linked list of apps we have a
serialised JSON file and a list of apps (just an example). Why do we
need unique appIDs?

--------------------------
| Security Manifest      |
| App1: ...              |
| App2: ...              |
--------------------------
| App1                   |
--------------------------
| App2                   |
--------------------------

If the entire bundle is signed and checked before loading what does an
appID give us? We could just use the order of the apps.

Who signs the bundle? OpenTitan's cryptographic use cases of app IDs require a higher level of trust than we have in the application loader, so it can't be the application loader.
 
If we put the ACLs in the TBF headers do we need appIDs either?

That would require trusting the TBF headers, which we don't want to do, as they might be malicious.

We don't need cryptographically-verified app IDs if we trust TBF headers.
 
It seems like we have settled on appIDs without a clear use case of
what they let us accomplish that we couldn't do without them.

I listed 5, and I've yet to see another efficient way to accomplish that.
 
At first I was all on board with appIDs, but the more I think about it
the less and less use cases I see. There are still fundamental
problems like how do we ensure that they are unique? Do we really want
every board to do its own thing so that app loading is even less
generic?

I never said they need to be unique. Boards can't all be the same as they will use different cryptographic verification mechanisms. OpenTitan will likely perform cryptographic verification that other platforms can't afford.

Consider also that OpenTitan has not yet decided what cryptosystem it will use for app signing -- and there may not even be a unified cryptosystem across all use cases of OpenTitan.

There has to be some board-to-board variation. I did the best I could to avoid making apps depend on specific boards.
 
>
> > 48-bytes for every application is a lot of space. That will have to be
> > stored in flash and parsed in RAM at some point.
>
> This is referring to an outdated version of the proposal. To cite the
> new version, proposal v3:
> > We will introduce a new type AppId which is a #[repr(C,align(4))]
> > wrapper around [u8; K].
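With const generics, the quoted wrapper could be sketched like this. This is an editorial guess at the shape; the proposal's exact definition (and its accessor methods) may differ:

```rust
/// Sketch of the quoted v3 type: a #[repr(C, align(4))] wrapper around
/// [u8; K], generic over the ID width K but fixed at compile time
/// (generic, not dynamic). Method names are invented for illustration.
#[repr(C, align(4))]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct AppId<const K: usize>([u8; K]);

impl<const K: usize> AppId<K> {
    pub const fn new(bytes: [u8; K]) -> Self {
        AppId(bytes)
    }
    pub fn bytes(&self) -> &[u8; K] {
        &self.0
    }
}
```

Because K is a compile-time parameter, a board picks one ID width and everything is statically sized, which is what allows efficient static implementations while letting different boards choose different widths.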

Yep, sorry. I half wrote my reply on Friday, but then lost internet
access so only got around to sending it today.

>
> Refer to the message <87r1pek376.fsf@zirconium> for potential
> implementation strategies of this _generic_ but not _dynamic_
> application ID length. It should allow for efficient static
> implementations, while still being flexible for different needs and
> their respective ID width requirements.

Doesn't that make the elf2tab, tockloader and kernel implementations
more complex though? To support all these different length options?

elf2tab and the kernel, yes. tockloader, no -- tockloader shouldn't need to touch the app ID header.
 
Won't these different length options break the ability to run any
architecture-compiled app on any board (at least on ARM, and maybe one
day on RISC-V with PIC)?

No, I don't see why that would be the case. The exception is boards that have some form of secure boot, but again that is fine by definition.

Johnathan Van Why

unread,
Nov 3, 2020, 3:17:13 PM11/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 11:52 AM Alistair Francis <alist...@gmail.com> wrote:
On Tue, Nov 3, 2020 at 10:56 AM 'Johnathan Van Why' via Tock Embedded
OS Development Discussion <tock...@googlegroups.com> wrote:
>
> On Tue, Nov 3, 2020 at 8:18 AM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Tue, Oct 27, 2020 at 5:18 PM 'Johnathan Van Why' via Tock Embedded
>> OS Development Discussion <tock...@googlegroups.com> wrote:
>> >
>> > Use Case Analysis
>> >
>> > First, I want to perform some more analysis of the use cases for application IDs that I have identified so far:
>> >
>> > Storage: Processes may want to store data into non-volatile storage that can be accessed by future processes that are part of the same application.
>>
>> As Phil has pointed out we actually have some more use cases here. It
>> is possible that you want a second app to access the data. Phil gave
>> the example of migrating the data from app A to app B.
>>
>> It's also possible that we would want to split or merge apps. So what
>> used to be two separate apps with their own storage will be merged
>> into 1 single app.
>
>
> Merging apps is a bit tricky under my proposal. To merge app B into app A, you would need to deploy a new version of app B that adds app A to the ACLs for the storage being merged, then deploy the merged app (and remove app B).

I don't see any mention of how this ACL is set up though. Am I missing
something?

No, you're not missing something. That is the storage system's responsibility. I'm not trying to limit the choices that storage system authors have for designing permissions systems.
 

>
> We could make this easier by allowing processes to have multiple app IDs, at the expense of added complexity (including dynamic allocation for app ID references).

Then it isn't really an appID and is more a list of permissions (which
I think is much more useful).

I agree that we can extend that idea to a permissions system, but it would get expensive. You would need a signature check for each permission. That doesn't interact well with syscall filtering, and for storage I expect it would be used as a single app ID in practice.

A signature isn't useful if you don't use it. A signature gives you an association between the public key and the thing you signed. How would you anticipate we use the signature?

Yes.
 
>
>>
>>
>> >
>> > TLV Element Parsing
>> >
>> > The board must provide two functions to the core kernel:
>> >
>> > decode_app_id, which validates an application ID TLV entry. If the ID is valid it returns the ID, otherwise it returns None.
>> > invent_app_id, which either invents and returns an application ID for the process or returns None.
>> >
>> > When the core kernel tries to load a process, it will use the first application ID that decode_app_id returns (evaluated in the same order as the TLV entries in the TBF header). If it does not find an ID TLV entry or decode_app_id always returns None, it will call invent_app_id to invent an ID for the process. If invent_app_id returns None the kernel will skip loading that process.
>> >
>> >
>> > Syscall Filter ACL Compression
>> >
>> > The system call filter ACLs can contain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The ACLs can then refer to applications by their short ID rather than their application ID, in order to compress the list size. At runtime, the system call filter will need to search the map to find an application's short ID, then perform authorization checks using the short ID. For performance, the system call filter can cache the short ID in the process' grant region.
>>
>> Where do these ACLs come from? This is one of the major use cases for
>> the appID and I don't see how the filtering actually happens?
>
>
> The ACLs could be compiled into the kernel image or deployed separately from the kernel with their own cryptographic verification method.

Where these come from seems pretty important. Like I mentioned in
another reply if we used a security manifest I don't see the need for
appIDs. Compiling ACLs into the kernel directly seems like a bad idea
as then we need to change the kernel just to change an ACL.

>
> The filtering is performed by a board component called by the kernel, I think the hooks are already in place for that.
>
>>
>>
>> >
>> > Storage ACL Compression
>> >
>> > The storage system can maintain a map from 384-bit application IDs to short IDs (e.g. 32-bit integers). The storage system can then use the short IDs internally, performing lookups as necessary to associate the short IDs with processes at runtime. It can use the grant region to cache the short ID lookup results for a process.
>>
>> The same problem here, where does the storage ACL come from?
>
>
> I am assuming that storage ACLs are part of the storage system. I would expect processes to specify the permissions of data when they ask the storage system to write the data into storage.

If an app can specify its storage settings then we expose ourselves
to a brute force attack. Imagine a compromised app can just keep
guessing 32-bit storage IDs until it can eventually read the secret
data. That doesn't seem like a good idea.

No, that's why you tie it to the app ID, which can be cryptographically verified.
 

>
>>
>> >
>> >
>> > Design notes
>> >
>> > A 48 byte key is 384 bits. A 384 bit hash provides 128 bits of collision resistance against an adversary that can run Grover's algorithm.
>> > I made the board responsible for inventing application IDs so that boards that want to verify every application can refuse to invent application IDs. I'm concerned that if we invent IDs outside the board file we'll end up with issues where a board that wants to verify IDs accidentally launches apps with unverified IDs.
>> > A board can invent IDs by compiling a fixed list of app IDs into its image and returning those IDs in order of app (so the first unverified app would use the first fixed ID, the second the second, etc.). The drawback of this approach is that deploying a new process binary (or disabling an existing process binary) could cause an existing process binary's app ID to change and/or move to a different process binary.
>>
>> If we are just hard coding IDs in order what advantage do we get from
>> having the IDs in the first place?
>
>
> The advantage is we would be able to support apps that lack app IDs (backwards compatibility), or whose app IDs are not recognized by the board (cross-board compatibility).
>
> The alternative is making app IDs optional, as I did in my v1 proposal, but that idea wasn't popular.

That shouldn't be in the board though. Why does each board need a
different way of handling backwards compatible apps?

Some boards will want to handle those apps, and some won't. We could make the core kernel invent app IDs, but the boards would still need to configure that value. Note that I made a security argument against having the kernel auto-magically invent app IDs.
 

>
>>
>> > A board with access to an entropy source could allocate a fixed buffer for invented app IDs in RAM and generate the IDs on the fly. This would avoid confusion across reboots by changing every process binary's app ID at every boot.
>>
>> Won't that slow down startup waiting for entropy?
>
>
> Yes, but a board is already responsible for its own performance.
>
>>
>> > A board could hash a process binary and its location in flash to invent an ID.
>>
>> That seems like a good option, but couldn't the loader do this instead
>> of the kernel? Generating a hash of flash could be slow on some
>> boards.
>
>
> Hmm, I don't think we've ever had application loaders modify TBF data. I think that idea is somewhat awkward. It would require the application loader to understand *every* TBF header type in order to load an app, as it would need to change offsets into the app (as adding an app ID header would grow the TBF headers). It's also not a concept we can keep forever, as an application loader will not be able to modify the TBF headers of signed apps.

Tockloader can already do this: it can change the size of the TBF
header to modify the header contents.

Oh, okay. I'm not against keeping that as an option. I would be okay with removing invented app IDs from the proposal in exchange for making tockloader add an unverified app ID header.
 

>
> Ostensibly that seems okay to me, but I think it would be very awkward.

This is the conclusion we reached in the OpenTitan meeting about how
to handle appIDs. My original appID PR added the idea with elf2tab,
but that got a lot of push back. Instead an idea was to do it in the
loader (tockloader) when we create the apps.

I did not realize the OpenTitan WG meeting would be discussing this, or I would have attended.

For security-relevant discussions, we should have at least one member present who understands the Tock threat model.
 

As for signed apps it would work then as well as the bundle would be
signed after we have set the IDs for the apps.

That's a no-go for OpenTitan. Here's an example build and deployment flow:
  1. Apps are built in secure infrastructure, which we trust for integrity.
  2. Apps are signed using an offline CA, which we trust for integrity.
  3. Apps are transmitted using an untrusted deployment mechanism, which we don't trust.
  4. Apps are installed by an application loader. The only part of the application loader we trust is the part that makes sure the linked list remains intact (as required by the threat model), which we only trust for confidentiality and availability.
  5. The next time the chip boots, the kernel verifies the signature from the offline CA to ensure integrity.
We don't trust the application loader for integrity, so it can't produce the app ID signature.
 

>
>>
>> > decode_app_id and invent_app_id will probably return Result with error enums, not Option, but I left other error cases out of the proposal to keep it understandable.
>> > The system call filter short IDs and storage short IDs cannot be the same, as the ID allocation is different between them. The system call filter ACLs do not need to dynamically allocate short IDs, as the ACLs themselves are static (presumably compiled into the kernel). On the other hand, the storage system will need to allocate short IDs when new applications are loaded.
>>
>> The more I think about appIDs the less of a reason I see for them.
>>
>> Why can we not store the syscall and storage ACLs in the TBF header? A
>> secure system will do a crypto signature check of the app/TBF header
>> when loading it so we know they haven't been tampered with.
>
>
> Who would sign the header? It can't be the application author, because they might be malicious. For OpenTitan, we generally won't trust the application loader, or even fully trust the flash memory itself.

In the case of OpenTitan it would be whoever signs the kernel and the
apps, so I'm assuming the vendor. The header is just part of the app.

The kernel and apps may be signed by different groups.
 

>
>> An insecure system can just trust the headers. Apps can't change their
>> headers so they can't give themselves more permissions. If every app
>> lists its syscalls we can then always enforce syscall filtering which
>> would be cool. I can auto-generate a list of all syscalls when
>> building an app so it should be easy to keep track of.
>>
>> Secure boot needs to be separate from appIDs anyway using crypto
>> signatures. For key derivation we could use the signature as an ID or
>> a hash of the app in flash. IPC could also have its own ACL
>> implementation (like the syscall/storage) or use a hash of the app in
>> flash as an ID.
>
>
> Using a hash of a process binary as an app ID for IPC will make updating that app very difficult, as other apps would suddenly stop recognizing it.

Ah good point. We can use the Java style name that Tock already uses
for IPC then.

>
>>
>>
>> Also, I think we might need to re-think the threat model and put more
>> trust in the TBF header. It looks like all approaches end up with at
>> least some trust in the TBF header.
>
>
> I disagree, on the basis that:
>
> We don't need to trust the TBF headers stored in non-volatile storage.

But the appID is stored in the TBF headers in non-volatile storage and
you plan on using that?

We don't have to trust it because we cryptographically verify it.

Alistair Francis

unread,
Nov 3, 2020, 3:20:53 PM11/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 12:04 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> On Tue, Nov 3, 2020 at 11:29 AM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Tue, Nov 3, 2020 at 10:23 AM Leon Schuermann
>> <le...@is.currently.online> wrote:
>> >
>> > Alistair Francis <alist...@gmail.com> writes:
>> > > As Phil has pointed out we actually have some more use cases here. It
>> > > is possible that you want a second app to access the data. Phil gave
>> > > the example of migrating the data from app A to app B.
>> > >
>> > > It's also possible that we would want to split or merge apps. So what
>> > > used to be two separate apps with their own storage will be merged
>> > > into 1 single app.
>> >
>> > This is true. For storage, this single, one-to-one mapping of
>> > applications to respective ids is likely insufficient to cover all of
>> > the persistent storage use cases we want to support.
>> >
>> > I'm convinced that this application ID will nonetheless benefit Tock,
>> > since (1) it will enable the cryptographic verification use cases that are
>> > required by Johnathan and others, while being agnostic to the precise
>>
>> Do you have an example of how an appID would be used for cryptographic
>> verification?
>
>
> You have an OpenTitan chip in a server. The OpenTitan chip runs two apps:
>
> A root of trust app that verifies that the firmware on the server is signed correctly (i.e. not malicious) and can serve up cryptographic attestations (signatures) vouching for the machine's boot security. This app is certified under government certification models (FIPS, Common Criteria, etc).
> A second app that performs distinct, non-government-certified functionality. This app is written by a distinct set of authors.
>
> You want to be able to update both apps independently. The government doesn't trust app number 2 because it isn't certified, so instead it has to trust the kernel's ability to defend app 1 from app 2.

Ah, so for OpenTitan the apps can be updated individually and are not
all published from the same vendor? That makes more sense now.

But now what happens if the government wants to add a second app that
can also do these trusted operations?

>
> Before the kernel loads an app claiming to be app 1, it needs to verify that app is actually app 1 so that app 2 cannot lie and impersonate app 1. It does this by validating the app's signature against app 1's public key. System call filtering lists would then prevent app 2 from doing operations that only app 1 can do, such as controlling the machine's boot sequence.

Ok, but I'm still missing something. The appID is in the TBF header.
The kernel checks the signature to ensure that app1 is signed by the
government's public key then loads it. Why do we then need an appID?
Just for the ACL?

>
>>
>>
>> > mechanisms and (2) it will be sufficient for resource allocations where
>> > a one-to-one mapping is desirable.
>>
>> So we will support both a 1:1 mapping and a more full featured ACL?
>> Now there are two ways to allocate resources in the kernel.
>
>
> These are identification, authentication, and authorization mechanisms, not resource allocation mechanisms.
>
>>
>> >
>> > Furthermore, this application ID can then be used by the individual
>> > subsystems to grant access using more complex mechanisms by use of more
>> > complex authorization mechanisms, which can grant access based on a
>> >
>> > (principal, request, object)
>> >
>> > granularity. An example for such a mechanism could be an access control
>> > list. This proposal introduces the principal (app) identification
>> > mechanism only.
>> >
>> > > The same problem as above. An appID doesn't fix the syscall filtering
>> > > problem. How do we then specify the filters for each app?
>> >
>> > Using a separate table, which this proposal does not want to
>> > introduce. This explains the thread subject, which was my initial idea
>> > with this discussion: "Nonvolatile application storage, part 1:
>> > application ID". In a subsequent thread we can think about the precise
>> > ACL design.
>>
>> How can we pick an appID mechanism without any idea of what the ACL
>> will look like? It seems like we are just picking the first part and
>> hoping the rest will match up later.
>
>
> ACLs are built on top of app IDs, not the other way around.

Agreed, but they should at least be considered when we pick an appID mechanism.

>
>>
>> For example, if we go with a security manifest that specifies all apps
>> and permissions. So instead of a linked list of apps we have a
>> serialised json file and a list of apps (just an example) why do we
>> need unique appIDs?
>>
>> --------------------------
>> | Security Manifest |
>> | App1: ... |
>> | App2: ... |
>> --------------------------
>> | App1 |
>> --------------------------
>> | App2 |
>> --------------------------
>>
>> If the entire bundle is signed and checked before loading what does an
>> appID give us? We could just use the order of the apps.
>
>
> Who signs the bundle? OpenTitan's cryptographic use cases of app IDs require a higher level of trust than we have in the application loader, so it can't be the application loader.

I didn't realise that OT will support different vendors shipping apps.
That does make the problem harder.

>
>>
>> If we put the ACLs in the TBF headers do we need appIDs either?
>
>
> That would require trusting the TBF headers, which we don't want to do, as they might be malicious.

But the appID is in the same header.

>
> We don't need cryptographically-verified app IDs if we trust TBF headers.
>
>>
>> It seems like we have settled on appIDs without a clear use case of
>> what they let us accomplish that we couldn't do without them.
>
>
> I listed 5, and I've yet to see another efficient way to accomplish that.

I still don't fully see how appIDs link to all of them.

>
>>
>> At first I was all on board with appIDs, but the more I think about it
>> the fewer use cases I see. There are still fundamental
>> problems like how do we ensure that they are unique? Do we really want
>> every board to do its own thing so that app loading is even less
>> generic?
>
>
> I never said they need to be unique. Boards can't all be the same as they will use different cryptographic verification mechanisms. OpenTitan will likely perform cryptographic verification that other platforms can't afford.

That's fine, but why not just have a signed TLV entry that OT uses?

Alistair

Johnathan Van Why

unread,
Nov 3, 2020, 3:33:12 PM11/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
We want to leave that open as a possibility. It also lines up with Tock's threat model.
 

But now what happens if the government wants to add a second app that
can also do these trusted operations?

You can do two things:
  1. Update the syscall filter ACLs (either via a kernel update if they are in the kernel or an ACL update if the ACLs are deployed separately) to list the new app ID.
  2. Add a second process binary using the same app ID.
The "permissions" model (where we have a separate signature for each authorized action) supports this more nicely, but it may be too inefficient for Tock.
 

>
> Before the kernel loads an app claiming to be app 1, it needs to verify that app is actually app 1 so that app 2 cannot lie and impersonate app 1. It does this by validating the app's signature against app 1's public key. System call filtering lists would then prevent app 2 from doing operations that only app 1 can do, such as controlling the machine's boot sequence.

Ok, but I'm still missing something. The appID is in the TBF header.
The kernel checks the signature to ensure that app1 is signed by the
government's public key then loads it. Why do we then need an appID?
Just for the ACL?

Yes.
 

>
>>
>>
>> > mechanisms and (2) it will be sufficient for resource allocations where
>> > a one-to-one mapping is desirable.
>>
>> So we will support both a 1:1 mapping and a more full featured ACL?
>> Now there are two ways to allocate resources in the kernel.
>
>
> These are identification, authentication, and authorization mechanisms, not resource allocation mechanisms.
>
>>
>> >
>> > Furthermore, this application ID can then be used by the individual
>> > subsystems to grant access using more complex mechanisms by use of more
>> > complex authorization mechanisms, which can grant access based on a
>> >
>> >     (principal, request, object)
>> >
>> > granularity. An example for such a mechanism could be an access control
>> > list. This proposal introduces the principal (app) identification
>> > mechanism only.
>> >
>> > > The same problem as above. An appID doesn't fix the syscall filtering
>> > > problem. How do we then specify the filters for each app?
>> >
>> > Using a separate table, which this proposal does not want to
>> > introduce. This explains the thread subject, which was my initial idea
>> > with this discussion: "Nonvolatile application storage, part 1:
>> > application ID". In a subsequent thread we can think about the precise
>> > ACL design.
>>
>> How can we pick an appID mechanism without any idea of what the ACL
>> will look like? It seems like we are just picking the first part and
>> hoping the rest will match up later.
>
>
> ACLs are built on top of app IDs, not the other way around.

Agreed, but they should at least be considered when we pick an appID mechanism.

Agreed on considering them. Note that proposal v3 contains a design for compressing app IDs to make ACLs more compact. I just don't want to say "ACLs *will* be designed this way" because that constrains future development in a way that is unnecessary.
 

>
>>
>> For example, if we go with a security manifest that specifies all apps
>> and permissions. So instead of a linked list of apps we have a
>> serialised json file and a list of apps (just an example) why do we
>> need unique appIDs?
>>
>> --------------------------
>> | Security Manifest |
>> | App1: ...               |
>> | App2: ...               |
>> --------------------------
>> | App1                    |
>> --------------------------
>> |  App2                   |
>> --------------------------
>>
>> If the entire bundle is signed and checked before loading what does an
>> appID give us? We could just use the order of the apps.
>
>
> Who signs the bundle? OpenTitan's cryptographic use cases of app IDs require a higher level of trust than we have in the application loader, so it can't be the application loader.

I didn't realise that OT will support different vendors shipping apps.
That does make the problem harder.

I can't make that claim on behalf of the OT project, but I want to keep it open as a possibility. It is directly in line with Tock's threat model, which considers different apps to be potentially hostile to each other.
 

>
>>
>> If we put the ACLs in the TBF headers do we need appIDs either?
>
>
> That would require trusting the TBF headers, which we don't want to do, as they might be malicious.

But the appID is in the same header.

>
> We don't need cryptographically-verified app IDs if we trust TBF headers.
>
>>
>> It seems like we have settled on appIDs without a clear use case of
>> what they let us accomplish that we couldn't do without them.
>
>
> I listed 5, and I've yet to see another efficient way to accomplish that.

I still don't fully see how appIDs link to all of them.

  • Secure boot: The kernel has a list of app IDs it is willing to launch, and it only launches those app IDs.
  • Syscall filter: The kernel has ACLs giving app IDs the permission to call specific system calls.
  • Key derivation: Processes can ask for encryption keys that are only available to other processes with the same app ID (including across system restarts).
  • IPC: A process can confirm that a process it is exchanging messages with has a particular app ID, which e.g. can be tied to a particular author.
  • Storage: When a process writes data into the storage layer, it can tell the storage layer "only apps X, Y, and Z can access this", where X, Y, and Z are app IDs.
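The key-derivation bullet above can be sketched as follows. This is illustrative only: `toy_kdf` is a hypothetical name, and FNV-1a merely stands in for a real KDF such as HKDF; it is not cryptographically secure. The point is just that mixing a device secret with the app ID deterministically yields the same key for the same app ID, even across restarts.

```rust
// Illustrative only: NOT a secure KDF. FNV-1a stands in for e.g. HKDF.
fn toy_kdf(device_secret: &[u8], app_id: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &b in device_secret.iter().chain(app_id) {
        h ^= b as u64;
        h = h.wrapping_mul(0x100_0000_01b3); // FNV-1a 64-bit prime
    }
    h
}

fn main() {
    let k1 = toy_kdf(b"device-secret", b"app-1");
    let k2 = toy_kdf(b"device-secret", b"app-1");
    let k3 = toy_kdf(b"device-secret", b"app-2");
    assert_eq!(k1, k2); // same app ID: same key, even after a restart
    assert_ne!(k1, k3); // different app ID: different key
}
```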
 

>
>>
>> At first I was all on board with appIDs, but the more I think about it
>> the less and less use cases I see. There are still fundamental
>> problems like how do we ensure that they are unique? Do we really want
>> every board to do it's own thing so that app loading is even less
>> generic?
>
>
> I never said they need to be unique. Boards can't all be the same as they will use different cryptographic verification mechanisms. OpenTitan will likely perform cryptographic verification that other platforms can't afford.

That's fine, but why not just have a signed TLV entry that OT uses?

You mean a signed TLV entry that OT uses while other boards ignore it? That would have the same effect as my design. Apps that want to run on any board but have high security on an OT board can have two TLV entries: one that uses the OT crypto verification scheme and one that is unverified. They could even list the same ID.

Leon Schuermann

unread,
Nov 3, 2020, 4:44:03 PM11/3/20
to Johnathan Van Why, Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion

Alistair, this is to explain my (and I believe, to an extent, also
Johnathan's) reasoning behind the design with regards to nonvolatile
storage.

"'Johnathan Van Why' via Tock Embedded OS Development Discussion"
<tock...@googlegroups.com> writes:
> Merging apps is a bit tricky under my proposal. To merge app B into app A,
> you would need to deploy a new version of app B that adds app A to the ACLs
> for the storage being merged, then deploy the merged app (and remove app B).
>
> We could make this easier by allowing processes to have multiple app IDs,
> at the expense of added complexity (including dynamic allocation for app ID
> references).

It doesn't have to be. Just because we have an application ID does not
necessarily mean that we are going to have to use it in every scenario
that comes up. It might be entirely reasonable to design a
storage-resource allocation system which uses resource group identifiers
stored in the TBF header, and then provides access based on those groups
(to stay in the terminology, "ticket"- or "token"-based authorization,
i.e. authorization performed with or without prior authentication, bound
not to a specific principal but to the holder's allowed (request, object)
tuples).

What I'm getting at: the application ID that this proposal would
introduce is first and foremost a technical and abstract interface in
the kernel which associates each application binary with a (persistent)
ID. This is important to allow different subsystems or even
authorization mechanisms themselves to be entirely independent of the
board and of the ID storage, loading and potentially verification
mechanism. More precisely: the origin and the nature of the ID do not
influence the enforcement of whatever properties should be enforced
using this ID.



To "solve" the storage resource allocation issue: each board - for
different use cases - will use a different mechanism for enforcing the
storage subsystem permissions. We can define a standard method used in
upstream boards, but from the discussions it's apparent that there is no
fundamental "one size fits all" approach to this issue. Instead, we
should provide a pluggable infrastructure (traits) for granting access
to objects (as part of subsystems) to principals (apps).

The application ID can, but does not have to, be one factor in the
authorization consideration, i.e. in deciding whether to give an
application access to resources. Its usefulness need not be justified
solely by the nonvolatile storage resource allocation goal, but it is
certainly suited for this in many scenarios.

Leon

Alistair Francis

unread,
Nov 3, 2020, 5:11:19 PM11/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
If you can add a second process, can't a malicious app add itself?

I'm assuming that the public key used for the signature check is
looked up based on the appID, so the app fails to load if it's malicious?

What if the new app needs the old syscalls and some new ones?

This process seems a little convoluted and complex for most Tock uses.

>
> The "permissions" model (where we have a separate signature for each authorized action) supports this more nicely, but it may be too inefficient for Tock.

I don't understand this. Each permission is signed with what?

>
>>
>>
>> >
>> > Before the kernel loads an app claiming to be app 1, it needs to verify that app is actually app 1 so that app 2 cannot lie and impersonate app 1. It does this by validating the app's signature against app 1's public key. System call filtering lists would then prevent app 2 from doing operations that only app 1 can do, such as controlling the machine's boot sequence.
>>
>> Ok, but I'm still missing something. The appID is in the TBF header.
>> The kernel checks the signature to ensure that app1 is signed by the
>> government's public key then loads it. Why do we then need an appID?
>> Just for the ACL?
>
>
> Yes.

Ok, so the appID is just to say that app X can do operations Y.
It can't be in the TBF header as we will allow possibly malicious apps
to be loaded and we can't trust that they will list the correct
operations they are allowed to do. Am I understanding that correctly?

In that case I still think a better solution overall is to list
permissions in the TBF headers. For OT or some other security-critical
application there can be a loader in the Tock board that checks the
permissions against an approved list. The appID to check can then come
from the public key used to sign.

I guess what I'm trying to say is that I see some OT specific use
cases where appIDs are useful, but for lots of Tock boards/uses they
aren't really required and I don't think they contribute much.

For example, having two apps each with their own persistent storage
now seems very complicated to implement.

Agreed. I don't think we need to have a complete solution here, but I
would like to see more consideration on what the ACLs look like, as
that is one of the major reasons for an appID.

>
>>
>>
>> >
>> >>
>> >> For example, if we go with a security manifest that specifies all apps
>> >> and permissions. So instead of a linked list of apps we have a
>> >> serialised json file and a list of apps (just an example) why do we
>> >> need unique appIDs?
>> >>
>> >> --------------------------
>> >> | Security Manifest |
>> >> | App1: ... |
>> >> | App2: ... |
>> >> --------------------------
>> >> | App1 |
>> >> --------------------------
>> >> | App2 |
>> >> --------------------------
>> >>
>> >> If the entire bundle is signed and checked before loading what does an
>> >> appID give us? We could just use the order of the apps.
>> >
>> >
>> > Who signs the bundle? OpenTitan's cryptographic use cases of app IDs require a higher level of trust than we have in the application loader, so it can't be the application loader.
>>
>> I didn't realise that OT will support different vendors shipping apps.
>> That does make the problem harder.
>
>
> I can't make that claim on behalf of the OT project, but I want to keep it open as a possibility. It is directly in line with Tock's threat model, which considers different apps to be potentially hostile to each other.

True.

>
>>
>>
>> >
>> >>
>> >> If we put the ACLs in the TBF headers do we need appIDs either?
>> >
>> >
>> > That would require trusting the TBF headers, which we don't want to do, as they might be malicious.
>>
>> But the appID is in the same header.
>>
>> >
>> > We don't need cryptographically-verified app IDs if we trust TBF headers.
>> >
>> >>
>> >> It seems like we have settled on appIDs without a clear use case of
>> >> what they let us accomplish that we couldn't do without them.
>> >
>> >
>> > I listed 5, and I've yet to see another efficient way to accomplish that.
>>
>> I still don't fully see how appIDs link to all of them.
>
>
> Secure boot: The kernel has a list of app IDs it is willing to launch, and it only launches those app IDs.
> Syscall filter: The kernel has ACLs giving app IDs the permission to call specific system calls.
> Key derivation: Processes can ask for encryption keys that are only available to other processes with the same app ID (including across system restarts).
> IPC: A process can confirm that a process it is exchanging messages with has a particular app ID, which e.g. can be tied to a particular author.
> Storage: When a process writes data into the storage layer, it can tell the storage layer "only apps X, Y, and Z can access this", where X, Y, and Z are app IDs.

This is very helpful, thanks.

>
>
>>
>>
>> >
>> >>
>> >> At first I was all on board with appIDs, but the more I think about it
>> >> the less and less use cases I see. There are still fundamental
>> >> problems like how do we ensure that they are unique? Do we really want
>> >> every board to do it's own thing so that app loading is even less
>> >> generic?
>> >
>> >
>> > I never said they need to be unique. Boards can't all be the same as they will use different cryptographic verification mechanisms. OpenTitan will likely perform cryptographic verification that other platforms can't afford.
>>
>> That's fine, but why not just have a signed TLV entry that OT uses?
>
>
> You mean a signed TLV entry that OT uses while other boards ignore it? That would have the same effect as my design. Apps that want to run on any board but have high security on an OT board can have two TLV entries: one that uses the OT crypto verification scheme and one that is unverified. They could even list the same ID.

Yes. A signed TLV entry that is used for secure boards (OT and whoever
else) to load apps. In which case after the app is loaded by the
kernel we don't need to worry about the appIDs. This takes the burden
off other apps/boards having to have an appID.

Alistair

Alistair Francis

unread,
Nov 3, 2020, 5:20:56 PM11/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Tue, Nov 3, 2020 at 12:17 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> On Tue, Nov 3, 2020 at 11:52 AM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Tue, Nov 3, 2020 at 10:56 AM 'Johnathan Van Why' via Tock Embedded
>> OS Development Discussion <tock...@googlegroups.com> wrote:
>> >
>> > On Tue, Nov 3, 2020 at 8:18 AM Alistair Francis <alist...@gmail.com> wrote:
>> >>
>> >> On Tue, Oct 27, 2020 at 5:18 PM 'Johnathan Van Why' via Tock Embedded
>> >> OS Development Discussion <tock...@googlegroups.com> wrote:
>> >> >
>> >> > Use Case Analysis
>> >> >
>> >> > First, I want to perform some more analysis of the uses cases for application IDs that I have identified so far:
>> >> >
>> >> > Storage: Processes may want to store data into non-volatile storage that can be accessed by future processes that are part of the same application.
>> >>
>> >> As Phil has pointed out we actually have some more use cases here. It
>> >> is possible that you want a second app to access the data. Phil gave
>> >> the example of migrating the data from app A to app B.
>> >>
>> >> It's also possible that we would want to split or merge apps. So what
>> >> used to be two separate apps with their own storage will be merged
>> >> into 1 single app.
>> >
>> >
>> > Merging apps is a bit tricky under my proposal. To merge app B into app A, you would need to deploy a new version of app B that adds app A to the ACLs for the storage being merged, then deploy the merged app (and remove app B).
>>
>> I don't see any mention of how this ACL is set up though. Am I missing
>> something?
>
>
> No, you're not missing something. That is the storage system's responsibility. I'm not trying to limit the choices that storage system authors have for designing permissions systems.

The problem is that I don't see a mapping from appIDs to a storage
system that is easy to use for most boards.

>
>>
>>
>> >
>> > We could make this easier by allowing processes to have multiple app IDs, at the expense of added complexity (including dynamic allocation for app ID references).
>>
>> Then it isn't really an appID and is more a list of permissions (which
>> I think is much more useful).
>
>
> I agree that we can extend that idea to a permissions system, but it would get expensive. You would need a signature check for each permission. That doesn't interact well with syscall filtering, and for storage I expect it would be used as a single app ID in practice.

What signature would we check with each permission?
I was picturing a single public key; you pointed out, though, that OT
might expect multiple different public keys from multiple vendors.
I like this a lot more.

>
>>
>>
>> >
>> > Ostensibly that seems okay to me, but I think it would be very awkward.
>>
>> This is the conclusion we reached in the OpenTitan meeting about how
>> to handle appIDs. My original appID PR added the idea with elf2tab,
>> but that got a lot of push back. Instead an idea was to do it in the
>> loader (tockloader) when we create the apps.
>
>
> I did not realize the OpenTitan WG meeting would be discussing this, or I would have attended.
>
> For security-relevant discussions, we should have at least one member present who understands the Tock threat model.
>
>>
>>
>> As for signed apps it would work then as well as the bundle would be
>> signed after we have set the IDs for the apps.
>
>
> That's a no-go for OpenTitan. Here's an example build and deployment flow:
>
> Apps are built in secure infrastructure, which we trust for integrity.
> Apps are signed using an offline CA, which we trust for integrity.

At this point the loader application (tockloader, a python script,
whatever it is) will set the appID and then sign the app.

When I say loader application I mean the thing that takes all of the
apps we want to run and stitches them together. I was assuming a
single application binary (with all the apps) would be signed and
deployed.

For RISC-V this is in reality pretty much what will have to happen, as
apps can't be run from an arbitrary address, so we need to have some
understanding of where they will go when built.

> Apps are transmitted using an untrusted deployment mechanism, which we don't trust.
> Apps are installed by an application loader. The only part of the application loader we trust is the part that makes sure the linked list remains intact (as required by the threat model), which we only trust for confidentiality and availability.
> The next time the chip boots, the kernel verifies the signature from the offline CA to ensure integrity.
>
> We don't trust the application loader for integrity, so it can't produce the app ID signature.
>
>>
>>
>> >
>> >>
>> >> > decode_app_id and invent_app_id will probably return Result with error enums, not Option, but I left other error cases out of the proposal to keep it understandable.
>> >> > The system call filter short IDs and storage short IDs cannot be the same, as the ID allocation is different between them. The system call filter ACLs do not need to dynamically allocate short IDs, as the ACLs themselves are static (presumably compiled into the kernel). On the other hand, the storage system will need to allocate short IDs when new applications are loaded.
>> >>
>> >> The more I think about appIDs the less of a reason I see for them.
>> >>
>> >> Why can we not store the syscall and storage ACLs in the TBF header? A
>> >> secure system will do a crypto signature check of the app/TBF header
>> >> when loading it so we know they haven't been tampered with.
>> >
>> >
>> > Who would sign the header? It can't be the application author, because they might be malicious. For OpenTitan, we generally won't trust the application loader, or even fully trust the flash memory itelf.
>>
>> In the case of OpenTitan it would be whoever signs the kernel and the
>> apps, so I'm assuming the vendor. The header is just part of the app.
>
>
> The kernel and apps may be signed by different groups.

Yep, I got that from your other email. This does make things more complicated.

Alistair

Alistair Francis

unread,
Nov 3, 2020, 5:39:01 PM11/3/20
to Leon Schuermann, Johnathan Van Why, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
I'm all on board with this. What I'm concerned about is Johnathan's 5
use cases. It is described that an appID is the solution to all of
these, and I'm not convinced that it's the only or the best solution to
each one. I kind of think that we are designing an appID without
designing the complete solution, and we will end up with appIDs that
don't really help with the use cases we thought they would and add
extra complexity. For something like OT the extra complexity isn't an
issue, as there is a team of people setting up the IDs, build systems
and ACLs. But the appID implementation will apply to all boards/apps,
and it seems overly complex for some simple use cases such as
persistent storage.

>
>
>
> To "solve" the storage resource allocation issue: each board - for
> different use cases - will use a different mechanism for enforcing the
> storage subsystem permissions. We can define a standard method used in
> upstream boards, but from the discussions it's apparent that there is no
> fundamental "one size fits all" approach to this issue. Instead, we
> should provide a pluggable infrastructure (traits) for granting access
> to objects (as part of subsystems) to principals (apps).

My worry here is that each board does its own thing and we have
competing versions, all with their own bugs and problems. I agree we
can't please everyone; OT wants security while a hobby board wants
ease of use. What I'm hoping to see, though, is more of the
implementation in the kernel instead of in custom boards, so that we
don't have all these board-specific options that are hard to test and
maintain.

Ideally it would be great if the only difference between OT and other
boards is that OT does crypto signature checks and we can otherwise
re-use the existing infrastructure. I am worried that the appID
proposal will split into OT doing all its own things while everyone
else has to support appIDs but never uses them.

>
> The application ID can, but does not have to be one factor in the
> authorization consideration, i.e. whether to give an application access
> to resources. It's usability must not solely be justified with a
> nonvolatile storage resource allocation goal, but is certainly suited
> for this in many scenarios.

I'm not convinced an appID is a good fit for non-volatile storage. As
Phil pointed out to me, I think using just an appID for access reduces
flexibility. So then we need an ACL, but how do we do that
generically? Do we tell all users to rebuild their kernel every time
they want to change app permissions?

I see Johnathan's point about using an appID to securely load
applications from multiple different vendors (with different
public/private keys for signing). I think that could be done with a
signature TLV instead though, which only secure boards like OT need to
worry about.

Alistair

Johnathan Van Why

unread,
Nov 20, 2020, 3:31:39 PM11/20/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Alistair requested that I explain how my application ID proposal can be used in solutions to each of the five use cases I identified. Here are examples for how to solve each of the use cases (except for Secure Boot, where I instead argue the use case doesn't exist).

Proposal v3 Appendix A

Storage
Note: In this example I omit the short ID mechanism I described earlier to keep the description simpler.

For each piece of data stored in storage, the storage layer would store the following metadata:

enum Permissions { ReadOnly, ReadWrite }

Owner: [u8]
Name: [u8]
Other users with access: [(app: [u8], Permissions)]

When processes try to access a piece of data in storage, they specify both the owner of the data and the name of the data (different owners can own data with the same name -- this is to avoid the denial-of-service issues I brought up at the core WG call a week or two ago). If the process is the owner of the data (its app ID matches the "owner" byte sequence), then it is automatically granted access. If not, the system scans through the "other users with access" list to see if the app ID matches any of the entries in that list. If so, the permissions in that list are checked against the requested operation (i.e. read or write). If the process' app ID is not found the access is denied.
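As a rough sketch, the access check described above could look like the following (the type and function names here are illustrative, not actual Tock APIs):

```rust
#[derive(Clone, Copy, PartialEq)]
enum Permissions { ReadOnly, ReadWrite }

#[derive(Clone, Copy, PartialEq)]
enum Operation { Read, Write }

/// Per-object metadata as described above (names illustrative).
struct Metadata<'a> {
    owner: &'a [u8],
    name: &'a [u8],
    other_users: &'a [(&'a [u8], Permissions)],
}

/// Returns true if `app_id` may perform `op` on the object.
fn check_access(meta: &Metadata, app_id: &[u8], op: Operation) -> bool {
    // The owner is automatically granted access.
    if app_id == meta.owner {
        return true;
    }
    // Otherwise, scan the "other users with access" list.
    for (user, perm) in meta.other_users {
        if *user == app_id {
            return match (*perm, op) {
                (Permissions::ReadWrite, _) => true,
                (Permissions::ReadOnly, Operation::Read) => true,
                (Permissions::ReadOnly, Operation::Write) => false,
            };
        }
    }
    // App ID not found anywhere: access denied.
    false
}
```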

System call filtering

The kernel image contains the following data structure:

enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
static ACLs: &[(driver: usize, Filter)] = ...;

When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.
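A minimal sketch of that lookup (the driver numbers and app IDs below are made up, and this is not the actual kernel filtering interface):

```rust
enum Filter { AllowAll, AllowOnly(&'static [&'static [u8]]) }

// Hypothetical ACL table compiled into the kernel image.
static ACLS: &[(usize, Filter)] = &[
    (0x1, Filter::AllowAll),                  // open to all apps
    (0x5000, Filter::AllowOnly(&[b"app_a"])), // restricted driver
];

fn syscall_allowed(driver: usize, app_id: &[u8]) -> bool {
    for (num, filter) in ACLS {
        if *num == driver {
            return match filter {
                Filter::AllowAll => true,
                Filter::AllowOnly(ids) => ids.iter().any(|id| *id == app_id),
            };
        }
    }
    // Driver number not in the table: deny the request.
    false
}
```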

Secure boot

I no longer think this use case is important. For secure boot to be useful, you would need a kernel that is cryptographically verified by a bootloader but that does not verify the apps with the same bootloader. I would expect anyone wanting secure boot in Tock to just verify the entire image (kernel + apps) via the bootloader.

Key derivation

When a process asks for the encryption key named "key1", the kernel feeds the following data into the key derivation function: the hardware's secret key, "key1", and the process' app ID. This produces an encryption key that is unique to that hardware, the name "key1" (so the app can ask for multiple different encryption keys), and the application.
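A toy sketch of that derivation, using the standard library's (non-cryptographic!) hasher purely as a stand-in for a real KDF such as HKDF:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Mixes the hardware secret, the requested key name, and the app ID so
// that any change in one of the three yields a different derived key.
// A real implementation would use a proper cryptographic KDF.
fn derive_key(hw_secret: &[u8], key_name: &str, app_id: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    hw_secret.hash(&mut h);
    key_name.hash(&mut h);
    app_id.hash(&mut h);
    h.finish()
}
```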

IPC

When a process receives a message via the IPC capsule, the IPC capsule writes the application ID of the process that sent the message into an allow-ed buffer. The app can then use that information as it wishes, such as to separate data between apps (e.g. an app that implements UDP can route packets to other apps based on the port number) or only accept messages from another app it trusts.

When a process sends a message via the IPC capsule, it specifies the app ID of the process that it wants to send the message to. The IPC capsule routes the message appropriately.

IPC is the one use case where having distinct app IDs (no two processes concurrently executing on a Tock system may have the same app ID) is beneficial. If we allow duplicate app IDs, here are a few ways the IPC capsule could route requests:
  1. It could broadcast the message to all processes with that app ID
  2. It could return an error, forcing processes that want to use IPC to have their own app ID
  3. It could identify processes by a combination of app ID and a second, non-cryptographically-verified identifier encoded in the TBF header.
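The second option above (reject duplicate IDs, route by app ID) could be sketched along these lines; the `IpcRouter` type is hypothetical, not the existing IPC capsule:

```rust
use std::collections::HashMap;

struct IpcRouter {
    // Maps an app ID to the index of the (unique) process running it.
    processes: HashMap<Vec<u8>, usize>,
}

impl IpcRouter {
    // Enforce unique app IDs: registering a second process with the
    // same ID is an error.
    fn register(&mut self, app_id: &[u8], proc_index: usize) -> Result<(), ()> {
        if self.processes.contains_key(app_id) {
            return Err(());
        }
        self.processes.insert(app_id.to_vec(), proc_index);
        Ok(())
    }

    // Look up the destination process for an outgoing message.
    fn route(&self, dest_app_id: &[u8]) -> Option<usize> {
        self.processes.get(dest_app_id).copied()
    }
}
```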

Johnathan Van Why

Nov 20, 2020, 3:33:55 PM11/20/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
I spent some time trying to design a secure permission list mechanism that avoids the per-app signatures that I have in my scheme, and I failed to come up with a workable option. Anyone else is welcome to come up with a concrete design we can discuss, but for now I will stand behind my existing proposal.

-Johnathan

Alistair Francis

Dec 3, 2020, 2:39:24 PM12/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Fri, Nov 20, 2020 at 12:31 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> Alistair requested that I explain how my application ID proposal can be used in solutions to each of the five use cases I identified. Here are examples for how to solve each of the use cases (except for Secure Boot, where I instead argue the use case doesn't exist).

Thanks for writing this up.

>
> Proposal v3 Appendix A
>
> Storage
> Note: In this example I omit the short ID mechanism I described earlier to keep the description simpler.
>
> For each piece of data stored in storage, the storage layer would store the following metadata:
>
> enum Permissions { ReadOnly, ReadWrite }
>
>
> Owner: [u8]
>
> Name: [u8]
>
> Other users with access: [(app: [u8], Permissions)]
>
>
> When processes try to access a piece of data in storage, they specify both the owner of the data and the name of the data (different owners can own data with the same name -- this is to avoid the denial-of-service issues I brought up at the core WG call a week or two ago). If the process is the owner of the data (its app ID matches the "owner" byte sequence), then it is automatically granted access. If not, the system scans through the "other users with access" list to see if the app ID matches any of the entries in that list. If so, the permissions in that list are checked against the requested operation (i.e. read or write). If the process' app ID is not found the access is denied.

Where does this permission list come from though? Is it specified by
the app that creates the region? Is it hard coded in the kernel?

So for each piece of data we store an owner, a name, and a list of
users with access? That seems like a lot of overhead for small flash
storage.

>
> System call filtering
>
> The kernel image contains the following data structure:
>
> enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
> static ACLs: &[(driver: usize, Filter)] = ...;
>
>
> When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.

The same question. I like this list, but I don't understand where it
comes from. Does it have to be hard coded in the kernel?

I also don't see why this wouldn't work with the method I mentioned.

Take syscall filtering, for example. Each TBF header lists the
syscalls the app is allowed to make. The kernel can then use this list
exactly as you mention above. We now have an easy way for apps to
specify permissions. Although note, this is not secure enough for
OpenTitan.

For OpenTitan and other secure use cases the kernel can also have its
own list, exactly like you mention above. Then the kernel can compare
the app TBF header permissions with the one it already has. This
provides the same security you are mentioning here, but also allows
boards/users who don't need the extra complexity to just use apps.

>
> Secure boot
>
> I no longer think this use case is important. For secure boot to be useful, you would need a kernel that is cryptographically verified by a bootloader, but not verify the apps with the same bootloader. I would expect anyone wanting secure boot in Tock to just verify the entire image (kernel + apps) via the bootloader.
>
> Key derivation
>
> When a process asks for the encryption key named "key1", the kernel feeds the following data into the key derivation function: the hardware's secret key, "key1", and the process' app ID. This produces an encryption key that is unique to that hardware, the name "key1" (so the app can ask for multiple different encryption keys), and the application.

Agreed.

>
> IPC
>
> When a process receives a message via the IPC capsule, the IPC capsule writes the application ID of the process that sent the message into an allow-ed buffer. The app can then use that information as it wishes, such as to separate data between apps (e.g. an app that implements UDP can route packets to other apps based on the port number) or only accept messages from another app it trusts.
>
> When a process sends a message via the IPC capsule, it specifies the app ID of the process that it wants to send the message to. The IPC capsule routes the message appropriately.
>
> IPC is the one use case where having distinct app IDs (no two processes concurrently executing on a Tock system may have the same app ID) is beneficial. If we allow duplicate app IDs, here is a few ways the IPC capsule can route requests:
>
> It could broadcast the message to all processes with that app ID

This seems like a way for a malicious app to eavesdrop on messages.

> It could return an error, forcing processes that want to use IPC to have their own app ID

The app that is sending data can't really handle that error though.
This would allow a DoS from a malicious appID.

> It could identify processes by a combination of app ID and a second, non-cryptographically-verified identified encoded in the TBF header.

I think IPC is hard to do securely without unique IDs enforced on the
board.

Alistair

Johnathan Van Why

Dec 3, 2020, 3:40:31 PM12/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Thu, Dec 3, 2020 at 11:39 AM Alistair Francis <alist...@gmail.com> wrote:
On Fri, Nov 20, 2020 at 12:31 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> Alistair requested that I explain how my application ID proposal can be used in solutions to each of the five use cases I identified. Here are examples for how to solve each of the use cases (except for Secure Boot, where I instead argue the use case doesn't exist).

Thanks for writing this up.

>
> Proposal v3 Appendix A
>
> Storage
> Note: In this example I omit the short ID mechanism I described earlier to keep the description simpler.
>
> For each piece of data stored in storage, the storage layer would store the following metadata:
>
> enum Permissions { ReadOnly, ReadWrite }
>
>
> Owner: [u8]
>
> Name: [u8]
>
> Other users with access: [(app: [u8], Permissions)]
>
>
> When processes try to access a piece of data in storage, they specify both the owner of the data and the name of the data (different owners can own data with the same name -- this is to avoid the denial-of-service issues I brought up at the core WG call a week or two ago). If the process is the owner of the data (its app ID matches the "owner" byte sequence), then it is automatically granted access. If not, the system scans through the "other users with access" list to see if the app ID matches any of the entries in that list. If so, the permissions in that list are checked against the requested operation (i.e. read or write). If the process' app ID is not found the access is denied.

Where does this permission list come from though? Is it specified by
the app that creates the region? Is it hard coded in the kernel?

It is specified by the app that creates the region.
 
So for each piece of data we store an owner, name and list of users
with access? That seems like a lot of overhead for a small flash
storage.

Yes.
 
>
> System call filtering
>
> The kernel image contains the following data structure:
>
> enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
> static ACLs: &[(driver: usize, Filter)] = ...;
>
>
> When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.

The same question. I like this list, but I don't understand where it
comes from. Does it have to be hard coded in the kernel?

Yes, it would be hardcoded in the kernel.
 
I also don't see why this wouldn't work with the method I mentioned.

For syscall filtering for example. Each TBF header lists the syscalls
the app is allowed to do. The kernel can then use this list exactly as
you mention above. We now have an easy way for apps to specify
permissions. Although note, this is not secure enough for OpenTitain.

For OpenTitan and other secure use cases the kernel can also have it's
own list, exactly like you mention above. Then the kernel can compare
the app TBF header permissions with the one it already has. This
provides the same security you are mentioning here, but also allows
boards/users who don't need the extra complexity to just use apps.

That doesn't work if certification constraints prevent you from deploying a new kernel for each version of your app (how does the kernel know which app corresponds to each entry in its internal permissions list?).
 
>
> Secure boot
>
> I no longer think this use case is important. For secure boot to be useful, you would need a kernel that is cryptographically verified by a bootloader, but not verify the apps with the same bootloader. I would expect anyone wanting secure boot in Tock to just verify the entire image (kernel + apps) via the bootloader.
>
> Key derivation
>
> When a process asks for the encryption key named "key1", the kernel feeds the following data into the key derivation function: the hardware's secret key, "key1", and the process' app ID. This produces an encryption key that is unique to that hardware, the name "key1" (so the app can ask for multiple different encryption keys), and the application.

Agreed.

>
> IPC
>
> When a process receives a message via the IPC capsule, the IPC capsule writes the application ID of the process that sent the message into an allow-ed buffer. The app can then use that information as it wishes, such as to separate data between apps (e.g. an app that implements UDP can route packets to other apps based on the port number) or only accept messages from another app it trusts.
>
> When a process sends a message via the IPC capsule, it specifies the app ID of the process that it wants to send the message to. The IPC capsule routes the message appropriately.
>
> IPC is the one use case where having distinct app IDs (no two processes concurrently executing on a Tock system may have the same app ID) is beneficial. If we allow duplicate app IDs, here is a few ways the IPC capsule can route requests:
>
> It could broadcast the message to all processes with that app ID

This seems like a way for a malicious app to eavesdrop on messages.

That's not possible with verified app IDs.
 
> It could return an error, forcing processes that want to use IPC to have their own app ID

The app that is sending data can't really handle that error though.
This will allow a DOS from a malicious appID.

It would not allow denial of service with verified app IDs.
 
> It could identify processes by a combination of app ID and a second, non-cryptographically-verified identified encoded in the TBF header.

I think IPC is hard to do without unique IDs enforced on the board, at
least to do securley.

I am fine with enforcing unique IDs.

Alistair Francis

Dec 3, 2020, 4:00:49 PM12/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Thu, Dec 3, 2020 at 12:40 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> On Thu, Dec 3, 2020 at 11:39 AM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Fri, Nov 20, 2020 at 12:31 PM Johnathan Van Why <jrva...@google.com> wrote:
>> >
>> > Alistair requested that I explain how my application ID proposal can be used in solutions to each of the five use cases I identified. Here are examples for how to solve each of the use cases (except for Secure Boot, where I instead argue the use case doesn't exist).
>>
>> Thanks for writing this up.
>>
>> >
>> > Proposal v3 Appendix A
>> >
>> > Storage
>> > Note: In this example I omit the short ID mechanism I described earlier to keep the description simpler.
>> >
>> > For each piece of data stored in storage, the storage layer would store the following metadata:
>> >
>> > enum Permissions { ReadOnly, ReadWrite }
>> >
>> >
>> > Owner: [u8]
>> >
>> > Name: [u8]
>> >
>> > Other users with access: [(app: [u8], Permissions)]
>> >
>> >
>> > When processes try to access a piece of data in storage, they specify both the owner of the data and the name of the data (different owners can own data with the same name -- this is to avoid the denial-of-service issues I brought up at the core WG call a week or two ago). If the process is the owner of the data (its app ID matches the "owner" byte sequence), then it is automatically granted access. If not, the system scans through the "other users with access" list to see if the app ID matches any of the entries in that list. If so, the permissions in that list are checked against the requested operation (i.e. read or write). If the process' app ID is not found the access is denied.
>>
>> Where does this permission list come from though? Is it specified by
>> the app that creates the region? Is it hard coded in the kernel?
>
>
> It is specified by the app that creates the region.

Ok, this seems like a good idea. Can the list of users be changed?

>
>>
>> So for each piece of data we store an owner, name and list of users
>> with access? That seems like a lot of overhead for a small flash
>> storage.
>
>
> Yes.

That is a large amount of overhead. Assuming a 64-bit appID, a 64-bit
name, and one other user, we are at around 25 bytes of overhead for
every object, not including the key (for a KV store), CRCs, or
anything else.

>
>>
>> >
>> > System call filtering
>> >
>> > The kernel image contains the following data structure:
>> >
>> > enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
>> > static ACLs: &[(driver: usize, Filter)] = ...;
>> >
>> >
>> > When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.
>>
>> The same question. I like this list, but I don't understand where it
>> comes from. Does it have to be hard coded in the kernel?
>
>
> Yes, it would be hardcoded in the kernel.

So now apps cannot be updated independently from the kernel. That
seems like a large downside.

>
>>
>> I also don't see why this wouldn't work with the method I mentioned.
>>
>> For syscall filtering for example. Each TBF header lists the syscalls
>> the app is allowed to do. The kernel can then use this list exactly as
>> you mention above. We now have an easy way for apps to specify
>> permissions. Although note, this is not secure enough for OpenTitain.
>>
>> For OpenTitan and other secure use cases the kernel can also have it's
>> own list, exactly like you mention above. Then the kernel can compare
>> the app TBF header permissions with the one it already has. This
>> provides the same security you are mentioning here, but also allows
>> boards/users who don't need the extra complexity to just use apps.
>
>
> That doesn't work if certification constraints prevent you from deploying a new kernel for each version of your app (how does the kernel know which app corresponds to each entry in its internal permissions list?).

I don't understand. What certification constraints allow you to
hardcode the list in the kernel (as described above), but not check
that list against an app?

The kernel would know exactly the same way you describe above, using the appID.

>
>>
>> >
>> > Secure boot
>> >
>> > I no longer think this use case is important. For secure boot to be useful, you would need a kernel that is cryptographically verified by a bootloader, but not verify the apps with the same bootloader. I would expect anyone wanting secure boot in Tock to just verify the entire image (kernel + apps) via the bootloader.
>> >
>> > Key derivation
>> >
>> > When a process asks for the encryption key named "key1", the kernel feeds the following data into the key derivation function: the hardware's secret key, "key1", and the process' app ID. This produces an encryption key that is unique to that hardware, the name "key1" (so the app can ask for multiple different encryption keys), and the application.
>>
>> Agreed.
>>
>> >
>> > IPC
>> >
>> > When a process receives a message via the IPC capsule, the IPC capsule writes the application ID of the process that sent the message into an allow-ed buffer. The app can then use that information as it wishes, such as to separate data between apps (e.g. an app that implements UDP can route packets to other apps based on the port number) or only accept messages from another app it trusts.
>> >
>> > When a process sends a message via the IPC capsule, it specifies the app ID of the process that it wants to send the message to. The IPC capsule routes the message appropriately.
>> >
>> > IPC is the one use case where having distinct app IDs (no two processes concurrently executing on a Tock system may have the same app ID) is beneficial. If we allow duplicate app IDs, here is a few ways the IPC capsule can route requests:
>> >
>> > It could broadcast the message to all processes with that app ID
>>
>> This seems like a way for a malicious app to eavesdrop on messages.
>
>
> That's not possible with verified app IDs.

Great!

>
>>
>> > It could return an error, forcing processes that want to use IPC to have their own app ID
>>
>> The app that is sending data can't really handle that error though.
>> This will allow a DOS from a malicious appID.
>
>
> It would not allow denial of service with verified app IDs.
>
>>
>> > It could identify processes by a combination of app ID and a second, non-cryptographically-verified identified encoded in the TBF header.
>>
>> I think IPC is hard to do without unique IDs enforced on the board, at
>> least to do securley.
>
>
> I am fine with enforcing unique IDs.

Me too :)

Alistair

Johnathan Van Why

Dec 3, 2020, 4:13:49 PM12/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Thu, Dec 3, 2020 at 1:00 PM Alistair Francis <alist...@gmail.com> wrote:
On Thu, Dec 3, 2020 at 12:40 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> On Thu, Dec 3, 2020 at 11:39 AM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Fri, Nov 20, 2020 at 12:31 PM Johnathan Van Why <jrva...@google.com> wrote:
>> >
>> > Alistair requested that I explain how my application ID proposal can be used in solutions to each of the five use cases I identified. Here are examples for how to solve each of the use cases (except for Secure Boot, where I instead argue the use case doesn't exist).
>>
>> Thanks for writing this up.
>>
>> >
>> > Proposal v3 Appendix A
>> >
>> > Storage
>> > Note: In this example I omit the short ID mechanism I described earlier to keep the description simpler.
>> >
>> > For each piece of data stored in storage, the storage layer would store the following metadata:
>> >
>> > enum Permissions { ReadOnly, ReadWrite }
>> >
>> >
>> > Owner: [u8]
>> >
>> > Name: [u8]
>> >
>> > Other users with access: [(app: [u8], Permissions)]
>> >
>> >
>> > When processes try to access a piece of data in storage, they specify both the owner of the data and the name of the data (different owners can own data with the same name -- this is to avoid the denial-of-service issues I brought up at the core WG call a week or two ago). If the process is the owner of the data (its app ID matches the "owner" byte sequence), then it is automatically granted access. If not, the system scans through the "other users with access" list to see if the app ID matches any of the entries in that list. If so, the permissions in that list are checked against the requested operation (i.e. read or write). If the process' app ID is not found the access is denied.
>>
>> Where does this permission list come from though? Is it specified by
>> the app that creates the region? Is it hard coded in the kernel?
>
>
> It is specified by the app that creates the region.

Ok, this seems like a good idea. Can the list of users be changed?

That's really up to the filesystem, but "yes" is a reasonable answer.
 
>
>>
>> So for each piece of data we store an owner, name and list of users
>> with access? That seems like a lot of overhead for a small flash
>> storage.
>
>
> Yes.

That is a large amount of overhead. Assuming a 64-bit appID, 64-bit
name and 1 other user we are at around 25 bytes of overhead for every
object. Not including the key (for a KV store), CRCs or anything else.

>
>>
>> >
>> > System call filtering
>> >
>> > The kernel image contains the following data structure:
>> >
>> > enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
>> > static ACLs: &[(driver: usize, Filter)] = ...;
>> >
>> >
>> > When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.
>>
>> The same question. I like this list, but I don't understand where it
>> comes from. Does it have to be hard coded in the kernel?
>
>
> Yes, it would be hardcoded in the kernel.

So now apps can not be updated independently from the kernel. That
seems like a large downside.

The system call filtering use case only makes sense if some apps (that the kernel knows about up front) are privileged relative to other apps (which the kernel may not know about up front).

Even if the app list is hardcoded in the kernel, privileged apps can be updated independently from the kernel as long as their app ID remains the same.
 
>
>>
>> I also don't see why this wouldn't work with the method I mentioned.
>>
>> For syscall filtering for example. Each TBF header lists the syscalls
>> the app is allowed to do. The kernel can then use this list exactly as
>> you mention above. We now have an easy way for apps to specify
>> permissions. Although note, this is not secure enough for OpenTitain.
>>
>> For OpenTitan and other secure use cases the kernel can also have it's
>> own list, exactly like you mention above. Then the kernel can compare
>> the app TBF header permissions with the one it already has. This
>> provides the same security you are mentioning here, but also allows
>> boards/users who don't need the extra complexity to just use apps.
>
>
> That doesn't work if certification constraints prevent you from deploying a new kernel for each version of your app (how does the kernel know which app corresponds to each entry in its internal permissions list?).

I don't understand. What certification constraints allow you to
hardcode the list in the kernel (like described above), but not check
that list against an app?

The kernel would know exactly the same way you describe above, using the appID.

I clearly misunderstood your suggestion. My understanding was that you were suggesting apps would have no app ID and would just have the permission list in the TBF headers.

If an app ID is available, why does the app need permissions in the TBF headers? It seems like the kernel list and the TBF header list would be redundant. The exception is if we want to allow apps to limit their own privileges as a defense-in-depth mechanism against app compromise.

Alistair Francis

Dec 3, 2020, 4:34:00 PM12/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
On Thu, Dec 3, 2020 at 1:13 PM Johnathan Van Why <jrva...@google.com> wrote:
>
> On Thu, Dec 3, 2020 at 1:00 PM Alistair Francis <alist...@gmail.com> wrote:
>>
>> On Thu, Dec 3, 2020 at 12:40 PM Johnathan Van Why <jrva...@google.com> wrote:
>> >
>> > On Thu, Dec 3, 2020 at 11:39 AM Alistair Francis <alist...@gmail.com> wrote:
>> >>
>> >> On Fri, Nov 20, 2020 at 12:31 PM Johnathan Van Why <jrva...@google.com> wrote:
>> >> >
>> >> > Alistair requested that I explain how my application ID proposal can be used in solutions to each of the five use cases I identified. Here are examples for how to solve each of the use cases (except for Secure Boot, where I instead argue the use case doesn't exist).
>> >>
>> >> Thanks for writing this up.
>> >>
>> >> >
>> >> > Proposal v3 Appendix A
>> >> >
>> >> > Storage
>> >> > Note: In this example I omit the short ID mechanism I described earlier to keep the description simpler.
>> >> >
>> >> > For each piece of data stored in storage, the storage layer would store the following metadata:
>> >> >
>> >> > enum Permissions { ReadOnly, ReadWrite }
>> >> >
>> >> >
>> >> > Owner: [u8]
>> >> >
>> >> > Name: [u8]
>> >> >
>> >> > Other users with access: [(app: [u8], Permissions)]
>> >> >
>> >> >
>> >> > When processes try to access a piece of data in storage, they specify both the owner of the data and the name of the data (different owners can own data with the same name -- this is to avoid the denial-of-service issues I brought up at the core WG call a week or two ago). If the process is the owner of the data (its app ID matches the "owner" byte sequence), then it is automatically granted access. If not, the system scans through the "other users with access" list to see if the app ID matches any of the entries in that list. If so, the permissions in that list are checked against the requested operation (i.e. read or write). If the process' app ID is not found the access is denied.
>> >>
>> >> Where does this permission list come from though? Is it specified by
>> >> the app that creates the region? Is it hard coded in the kernel?
>> >
>> >
>> > It is specified by the app that creates the region.
>>
>> Ok, this seems like a good idea. Can the list of users be changed?
>
>
> That's really up to the filesystem, but "yes" is a reasonable answer.

Ok, this looks good then.

I'm worried about the large overhead though. I would prefer an owner
and users to be region-based instead of object-based. Each app would
only have a handful of regions (most would have 1), which is much less
overhead.

For example, App A uses flash region 1: App A is the owner of region 1
and controls permissions for the entire region. This is less granular,
but I think the trade-off is worth it for the space savings.

This is then pretty similar to my proposal:
https://github.com/tock/tock/pull/2177. I like your naming better
though :)

Any thoughts on the risk of an app being compromised and then changing
the permissions to allow access to everyone? This does seem like a
risk for apps storing sensitive data.

For example a CTAP app is exposed over USB and has access to sensitive
secrets. What if a compromise over USB allows the app to change the
permissions, then a malicious app can read the secrets.

Actually, on top of that, what if a malicious attacker just changes
the app code to allow more permissions in the offline flash? I'm
assuming apps will be signed, which should prevent that, but at which
point why not move it to the TBF header so it isn't run-time chanable?

>
>>
>> >
>> >>
>> >> So for each piece of data we store an owner, name and list of users
>> >> with access? That seems like a lot of overhead for a small flash
>> >> storage.
>> >
>> >
>> > Yes.
>>
>> That is a large amount of overhead. Assuming a 64-bit appID, 64-bit
>> name and 1 other user we are at around 25 bytes of overhead for every
>> object. Not including the key (for a KV store), CRCs or anything else.
>>
>> >
>> >>
>> >> >
>> >> > System call filtering
>> >> >
>> >> > The kernel image contains the following data structure:
>> >> >
>> >> > enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
>> >> > static ACLs: &[(driver: usize, Filter)] = ...;
>> >> >
>> >> >
>> >> > When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.
>> >>
>> >> The same question. I like this list, but I don't understand where it
>> >> comes from. Does it have to be hard coded in the kernel?
>> >
>> >
>> > Yes, it would be hardcoded in the kernel.
>>
>> So now apps can not be updated independently from the kernel. That
>> seems like a large downside.
>
>
> The system call filtering use case only makes sense if some apps (that the kernel knows about up front) are privileged relative to other apps (which the kernel may not know about up front).
>
> Even if the app list is hardcoded in the kernel, privileged apps can be updated independently from the kernel as long as their app ID remains the same.

True, but also only if their syscalls don't change.

>
>>
>> >
>> >>
>> >> I also don't see why this wouldn't work with the method I mentioned.
>> >>
>> >> For syscall filtering for example. Each TBF header lists the syscalls
>> >> the app is allowed to do. The kernel can then use this list exactly as
>> >> you mention above. We now have an easy way for apps to specify
>> >> permissions. Although note, this is not secure enough for OpenTitan.
>> >>
>> >> For OpenTitan and other secure use cases the kernel can also have its
>> >> own list, exactly like you mention above. Then the kernel can compare
>> >> the app TBF header permissions with the one it already has. This
>> >> provides the same security you are mentioning here, but also allows
>> >> boards/users who don't need the extra complexity to just use apps.
>> >
>> >
>> > That doesn't work if certification constraints prevent you from deploying a new kernel for each version of your app (how does the kernel know which app corresponds to each entry in its internal permissions list?).
>>
>> I don't understand. What certification constraints allow you to
>> hardcode the list in the kernel (like described above), but not check
>> that list against an app?
>>
>> The kernel would know exactly the same way you describe above, using the appID.
>
>
> I clearly misunderstood your suggestion. My understanding was that you were suggesting apps would have no app ID and would just have the permission list in the TBF headers.

Ah ok. Yes, I meant appIDs as well.

My worry is that the appID system and hardcoded syscall filtering are
too complex for most users, so they will just never be used. I think
even without hardcoding it in the kernel it still makes sense.

>
> If an app ID is available, why does the app need permissions in the TBF headers? It seems like the kernel list and the TBF header list would be redundant. The exception is if we want to allow apps to limit their own privileges as a defense-in-depth mechanism against app compromise.

That is my thinking. I think there will be use cases where security is
not the most critical part (like it is in OT), but it is still useful.
In this case apps can limit their power.

An example of this would be a Bluetooth app. It's at high risk of
takeover, so maybe the app is limited from making other syscalls. It's
only in the TBF header, which we don't trust enough for it to be really
secure, but it's still something.

The other benefit that I mentioned the other day is then the Tock
kernel can use the list in the TBF header for all apps. Then the
OpenTitan code only needs to verify the TBF header list matches its
hardcoded list. After that we are reusing code for every platform.

Alistair

Johnathan Van Why

unread,
Dec 3, 2020, 5:16:05 PM12/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
If a filesystem allows data to be read by anyone other than that data's owner, then yes the filesystem is a possible exfiltration tool for adversaries. For that to be useful, the adversary would need to have some way to change the permissions through app compromise that doesn't also give them another exfiltration means (i.e. isn't remote code execution). As a result, my suspicion is this isn't a big deal, but there may be some threat model where it makes a difference.

Note that I'm trying to give suggestions on how someone could use app IDs to build a filesystem that satisfies Tock's threat model, not dictating a specific implementation. I am 100% okay with a filesystem that only allows a file's owner access to that file. It would be simpler, and would avoid the exfiltration caveat entirely. I suggested a design that allows other apps to access a file because I saw that other Tock contributors want to build storage drivers with that functionality.
 
For example a CTAP app is exposed over USB and has access to sensitive
secrets. What if a compromise over USB allows the app to change the
permissions, then a malicious app can read the secrets.

Actually, on top of that, what if a malicious attacker just changes
the app code to allow more permissions in the offline flash? I'm
assuming apps will be signed, which should prevent that, but at that
point why not move it to the TBF header so it isn't run-time changeable?

I think you are suggesting the following design (let me know if I am misunderstanding you):

The filesystem metadata stores the app ID of the application that owns each file. The list of additional users with read/write access to that file is stored in the owning application's TBF headers, rather than in the read-write storage area.

If so, that seems fine to me. It would require enforcing unique app IDs, but I am fine with that anyway.
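That split could be sketched roughly as follows, assuming the persistent metadata holds only the owner's app ID while the ACL is parsed from the owner's (signed, integrity-protected) TBF headers; every structure and function here is hypothetical:

```rust
// Hypothetical split: persistent file metadata stores only the
// owner's app ID; the list of other apps with access comes from the
// owner's TBF headers, which app signing protects from tampering.
type AppId = u32;

pub struct FileMeta {
    pub owner: AppId,
    pub name: u64,
}

/// Stand-in for parsing the ACL out of the owner's TBF headers.
/// Entries are (app ID, may read, may write). In a real system this
/// would be parsed from flash; it is hardcoded here purely so the
/// sketch is self-contained.
pub fn tbf_acl_for(owner: AppId) -> Vec<(AppId, bool, bool)> {
    match owner {
        1 => vec![(2, true, false)],
        _ => vec![],
    }
}

pub fn may_read(meta: &FileMeta, app: AppId) -> bool {
    app == meta.owner
        || tbf_acl_for(meta.owner)
            .iter()
            .any(|&(id, r, _)| id == app && r)
}
```

Because the ACL lives in the signed binary rather than in read-write storage, a compromised process cannot widen it at run time, which is the property being argued for above.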
 
>
>>
>> >
>> >>
>> >> So for each piece of data we store an owner, name and list of users
>> >> with access? That seems like a lot of overhead for a small flash
>> >> storage.
>> >
>> >
>> > Yes.
>>
>> That is a large amount of overhead. Assuming a 64-bit appID, 64-bit
>> name and 1 other user we are at around 25 bytes of overhead for every
>> object. Not including the key (for a KV store), CRCs or anything else.
>>
>> >
>> >>
>> >> >
>> >> > System call filtering
>> >> >
>> >> > The kernel image contains the following data structure:
>> >> >
>> >> > enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
>> >> > static ACLs: &[(driver: usize, Filter)] = ...;
>> >> >
>> >> >
>> >> > When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.
>> >>
>> >> The same question. I like this list, but I don't understand where it
>> >> comes from. Does it have to be hard coded in the kernel?
>> >
>> >
>> > Yes, it would be hardcoded in the kernel.
>>
>> So now apps can not be updated independently from the kernel. That
>> seems like a large downside.
>
>
> The system call filtering use case only makes sense if some apps (that the kernel knows about up front) are privileged relative to other apps (which the kernel may not know about up front).
>
> Even if the app list is hardcoded in the kernel, privileged apps can be updated independently from the kernel as long as their app ID remains the same.

True, but also only if their syscalls don't change.

It's fine as long as the permissions the app has don't change. You would need the foresight to give applications all the permissions that a future version of that application might need.
Okay, let me make sure I understand your suggestion (to make things *more* clear):
  1. The kernel has a hardcoded system call filter list, that limits certain privileged operations to known apps, while leaving other operations available to unknown apps. This list may give apps *more* permissions than they need, as is necessary to allow for application updates without updating the kernel as well.
  2. Each process binary's TBF headers contain a system call filter list that applies to that process. This list is updated with the process, and as a result can be changed during process updates. This list would only give processes permissions that they need (i.e. it is more restrictive), but is not as trusted as the list in the kernel.
  3. During startup, at process load time, the kernel verifies that the filter list in the TBF headers for each process is no looser than the kernel's hardcoded permission list (i.e. the TBF header list does NOT allow any action the kernel list does not allow).
  4. At runtime, the TBF header permission list is used.
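Concretely, the load-time check in step 3 reduces to a subset test: every driver the TBF header permits must also be permitted by the kernel's hardcoded list. A sketch, with all names and the flat list layout purely illustrative:

```rust
// Illustrative check that a process's TBF-header syscall allow list
// is no looser than the kernel's hardcoded allow list for that app.
// Types and layout are placeholders, not actual Tock kernel types.
pub type DriverNum = usize;

/// Returns true if every driver the TBF header allows is also
/// allowed by the kernel's hardcoded list for this app.
pub fn header_no_looser(
    header_allowed: &[DriverNum],
    kernel_allowed: &[DriverNum],
) -> bool {
    header_allowed.iter().all(|d| kernel_allowed.contains(d))
}
```

A board that doesn't care about step 3 simply skips this check and trusts the TBF header list directly at runtime (step 4).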
If so, I don't disagree with this design. However, I am confused why you would suggest this design when you are concerned about the complexity of my application ID design proposal. As far as I can tell, this solution is strictly more complex than the design I suggested.

What am I missing?

Alistair Francis

unread,
Dec 3, 2020, 6:00:28 PM12/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
I agree that it's unlikely. Just something to think about.

>
> Note that I'm trying to give suggestions on how someone could use app IDs to build a filesystem that satisfies Tock's threat model, not dictating a specific implementation. I am 100% okay with a filesystem that only allows a file's owner access to that file. It would be simpler, and would avoid the exfiltration caveat entirely. I suggested a design that allows other apps to access a file because I saw that other Tock contributors want to build storage drivers with that functionality.

Thanks for clarifying that. Even if we only allow a single app/owner
to access the data, I think at least thinking this through makes us
more future-proof.

>
>>
>> For example a CTAP app is exposed over USB and has access to sensitive
>> secrets. What if a compromise over USB allows the app to change the
>> permissions, then a malicious app can read the secrets.
>>
>> Actually, on top of that, what if a malicious attacker just changes
>> the app code to allow more permissions in the offline flash? I'm
>> assuming apps will be signed, which should prevent that, but at which
>> point why not move it to the TBF header so it isn't run-time chanable?
>
>
> I think you are suggesting the following design (let me know if I am misunderstanding you):
>
> The filesystem metadata stores the app ID of the application that owns each file. The list of additional users with read/write access to that file is stored in the owning application's TBF headers, rather than in the read-write storage area.
>
> If so, that seems fine to me. It would require enforcing unique app IDs, but I am fine with that anyway.

Yes. That is basically my proposal. At one point I would have preferred
to use "storage IDs" instead of the appIDs (see
https://github.com/tock/tock/pull/2177), which wouldn't have to be
unique, but your approach sounds good to me.

For storage regions I would prefer the permissions to apply to a
region instead of files/kv stores though.

>
>>
>> >
>> >>
>> >> >
>> >> >>
>> >> >> So for each piece of data we store an owner, name and list of users
>> >> >> with access? That seems like a lot of overhead for a small flash
>> >> >> storage.
>> >> >
>> >> >
>> >> > Yes.
>> >>
>> >> That is a large amount of overhead. Assuming a 64-bit appID, 64-bit
>> >> name and 1 other user we are at around 25 bytes of overhead for every
>> >> object. Not including the key (for a KV store), CRCs or anything else.
>> >>
>> >> >
>> >> >>
>> >> >> >
>> >> >> > System call filtering
>> >> >> >
>> >> >> > The kernel image contains the following data structure:
>> >> >> >
>> >> >> > enum Filter { AllowAll, AllowOnly(&[&[u8]]) }
>> >> >> > static ACLs: &[(driver: usize, Filter)] = ...;
>> >> >> >
>> >> >> >
>> >> >> > When a process makes a system call, the system call filter scans through ACLs searching for the driver number. If it does not find the driver number, it denies the request. If it finds the driver number, along with Filter::AllowAll, it allows the syscall. If it finds the driver number along with Filter::AllowOnly, it scans through the list inside AllowOnly checking for the app ID. If it finds the app ID, the syscall is allowed, otherwise it denies the syscall.
>> >> >>
>> >> >> The same question. I like this list, but I don't understand where it
>> >> >> comes from. Does it have to be hard coded in the kernel?
>> >> >
>> >> >
>> >> > Yes, it would be hardcoded in the kernel.
>> >>
>> >> So now apps can not be updated independently from the kernel. That
>> >> seems like a large downside.
>> >
>> >
>> > The system call filtering use case only makes sense if some apps (that the kernel knows about up front) are privileged relative to other apps (which the kernel may not know about up front).
>> >
>> > Even if the app list is hardcoded in the kernel, privileged apps can be updated independently from the kernel as long as their app ID remains the same.
>>
>> True, but also only if their syscalls don't change.
>
>
> It's fine as long as the permissions the app has don't change. You would need the foresight to give applications all the permissions that a future version of that application might need.

Good point. That sounds good to me then.
Yes. Although this would only apply to OpenTitan and other security
conscious boards/setups. I suspect most Tock users will skip this part
and just allow everything at this level.

> Each process binary's TBF headers contain a system call filter list that applies to that process. This list is updated with the process, and as a result can be changed during process updates. This list would only give processes permissions that they need (i.e. it is more restrictive), but is not as trusted as the list in the kernel.

Yes, exactly. Again for non-secure boards we just trust the list in
the TBF header.

I did some experiments the other day and this can be auto-generated
when building TBF headers, at least for libtock-rs.

I am expecting it to look something like this:
https://github.com/tock/tock/pull/2172

> During startup, at process load time, the kernel verifies that the filter list in the TBF headers for each process is no looser than the kernel's hardcoded permission list (i.e. the TBF header list does NOT allow any action the kernel list does not allow).

Exactly! Again some boards might skip this step if they aren't worried
about security.

> At runtime, the TBF header permission list is used.

Yep :)

>
> If so, I don't disagree with this design. However, I am confused why you would suggest this design when you are concerned about the complexity of my application ID design proposal. As far as I can tell, this solution is strictly more complex than the design I suggested.
>
> What am I missing?

So I don't disagree that it's more complex inside the kernel and that
it makes the OpenTitan use case more involved. I think that it does
make the setup easier for standard users who aren't worried about
security to the same degree OT is.

My worry with your initial proposal (which I might have
misunderstood) is that users who wanted to write an app to store data
in region 1 would have to modify this hard coded list in the kernel to
specify appIDs and regions. I was worried we would have a complex
permissions model in each board that users have to understand. To me
that makes it more complex for users to run apps. With this process
the idea is that apps will already contain the syscall filtering list
and a storage ID, for persistent storage. We don't need any
permissions inside the kernel or any changes to boards, everything
will just work for users, at the expense of some security though. In
the case of a secure situation (like OT) we can add the extra checks
to the kernel/board and still get the level of security you are
looking for.

Alistair

Johnathan Van Why

unread,
Dec 3, 2020, 7:09:02 PM12/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
That's totally fine with me. I'm arguing that my application ID proposal can be used to build a secure permissions system for a filesystem, not dictating what that permissions system looks like. If anything, the "what should the permissions model look like" question is dictated by Tock's threat model. Applying permissions on a per-region basis rather than a per-file basis (for some definition of "region" that I don't feel the need to inquire about) is fine in that respect.
The threat model is currently a bit vague about storage, because it currently speaks almost exclusively about processes. That is because we hadn't defined the concept of application identity. The obvious way to extend the threat model to storage is to say that the kernel provides applications confidentiality, integrity, and availability guarantees against other applications (I'm ignoring capsules for now). I've been operating under the assumption that we extend the threat model in this manner, but I realize now that I never stated it.

It seems to me implementing that isolation in a storage layer requires the storage layer to understand application IDs. The process-based isolation that we've used so far does not work for storage because storage must enforce isolation across reboots. I think any definition of a "storage ID" that attempts to provide that isolation would in fact be an application ID in disguise.

As for system call filtering, whether the filter needs to know about app IDs depends on the use case, and I see two separate use cases here:
  1. Allowing a Tock system to run arbitrary applications but only allowing a fixed list of known applications to call certain sensitive system calls (the OpenTitan use case I've been arguing for).
  2. Providing defense-in-depth against process compromise, preventing a compromised process from issuing system calls it was not supposed to issue (a use case I believe you have been arguing for).
Use case 1 provides isolation against malicious apps. That requires application IDs, because the list of known applications is part of the kernel and application IDs are the cryptographically-verifiable mechanism for authenticating apps to the kernel.

Use case 2 does not provide isolation against malicious apps. It provides extra security to an app under the assumption that app trusts its own TBF headers, which does not need application IDs.

I'm completely fine with building a system call filtering mechanism that supports both use cases. The mechanism you suggested is to use a permissions list in the application's TBF headers at runtime (which is good enough for use case 2), plus optionally verify the permissions list against a kernel-provided master list during startup (which supports use case 1). That design seems fine to me.

-Johnathan

Alistair Francis

unread,
Dec 3, 2020, 9:19:38 PM12/3/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Great!
Sounds good to me. It would be interesting to include capsules in
there as well though.

>
> It seems to me implementing that isolation in a storage layer requires the storage layer to understand application IDs. The process-based isolation that we've used so far does not work for storage because storage must enforce isolation across reboots. I think any definition of a "storage ID" that attempts to provide that isolation would in fact be an application ID in disguise.

I agree that they end up being similar. The only possible differences
between a "storage ID" and an application ID would be uniqueness
constraints, length, and the ability to change owners. A storage ID
could change (to change the app that owns the data) while maintaining
the same appID.

I don't think that's very important though, as long as we have the
ability to provide permissions to others I think we are mostly ok
here.

I'm guessing there will be a translation between the full appID and a
short ID used inside the board?

>
> As for system call filtering, whether the filter needs to know about app IDs depends on the use case, and I see two separate use cases here:
>
> Allowing a Tock system to run arbitrary applications but only allowing a fixed list of known applications to call certain sensitive system calls (the OpenTitan use case I've been arguing for).
> Providing defense-in-depth against process compromise, preventing a compromised process from issuing system calls it was not supposed to issue (a use case I believe you have been arguing for).
>
> Use case 1 provides isolation against malicious apps. That requires application IDs, because the list of known applications is part of the kernel and application IDs are the cryptographically-verifiable mechanism for authenticating apps to the kernel.
>
> Use case 2 does not provide isolation against malicious apps. It provides extra security to an app under the assumption that app trusts its own TBF headers, which does not need application IDs.
>
> I'm completely fine with building a system call filtering mechanism that supports both use cases. The mechanism you suggested is to use a permissions list in the application's TBF headers at runtime (which is good enough for use case 2), plus optionally verify the permissions list against a kernel-provided master list during startup (which supports use case 1). That design seems fine to me.

Great! That hits the nail on the head. I think this will allow us to
use the same code for both use cases, with just some more checks and
verification for OT.

Sorry that this has been more work than you probably originally
intended. I was just worried that we would end up with a complex
situation for most users who just want to run their app on Tock.

Alistair

Johnathan Van Why

unread,
Dec 3, 2020, 9:47:10 PM12/3/20
to Alistair Francis, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Actually, in retrospect, there may not need to be a specific statement about storage or isolation provided to applications. I would probably update the threat model to define an application as something similar to "all processes -- potentially across multiple reboots of the system -- with the same application ID". All application data is also process data, and processes are already allowed to explicitly share data with other entities. Saving data into nonvolatile storage would then be considered explicitly sharing data with the rest of the application.

>
> It seems to me implementing that isolation in a storage layer requires the storage layer to understand application IDs. The process-based isolation that we've used so far does not work for storage because storage must enforce isolation across reboots. I think any definition of a "storage ID" that attempts to provide that isolation would in fact be an application ID in disguise.

I agree that they end up being similar. The only possible differences
between a "storage ID" and an application ID would be uniqueness
constraints, length, and the ability to change owners. A storage ID
could change (to change the app that owns the data) while maintaining
the same appID.

I don't think that's very important though, as long as we have the
ability to provide permissions to others I think we are mostly ok
here.

I'm guessing there will be a translation between the full appID and a
short ID used inside the board?

That translation has to be specific to each use case -- i.e. the translation will be different for storage versus kernel-ACL'd syscall filtering. The reason is that storage needs to be able to extend its translation map when a new application is deployed, whereas kernel-ACL'd syscall filtering cannot.

I would also be okay with shortening the basic app ID type (the one I specified in my proposal) to 32 bits, as a collision is really only a problem if two colliding apps are deployed to a single board. Then building short IDs wouldn't be terribly important for use cases that don't use cryptographically-verified IDs.

Alistair Francis

unread,
Dec 4, 2020, 9:21:36 PM12/4/20
to Johnathan Van Why, Leon Schuermann, Pat Pannuto, Julien Cretin, Tock Embedded OS Development Discussion
Ah ok. So there will effectively be a "storage ID" and an "ACL ID",
both generated from the appID.

>
> I would also be okay with shortening the basic app ID type (the one I specified in my proposal) to 32 bits, as a collision is really only a problem if two colliding apps are deployed to a single board. Then building short IDs wouldn't be terribly important for use cases that don't use cryptographically-verified IDs.

Well inside the kernel we could probably just use 8-bit IDs. It seems
unlikely we would have more than 256 apps. The original conversion
from the verified ID to the short ID would avoid duplicates and
handle crypto requirements.
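A first-come-first-served translation table is one way such a conversion could work; the sketch below assumes 32-byte full IDs and heap allocation purely for brevity (a real kernel-internal table would more likely be a fixed-size array), and all names are hypothetical:

```rust
// Hypothetical translation from full (e.g. crypto-derived) app IDs to
// small kernel-internal short IDs, assigned first-come-first-served.
// A real embedded kernel would likely use a fixed-capacity table
// rather than a Vec; Vec keeps this sketch short.
pub struct ShortIdTable {
    entries: Vec<[u8; 32]>, // full IDs, indexed by their short ID
}

impl ShortIdTable {
    pub fn new() -> Self {
        ShortIdTable { entries: Vec::new() }
    }

    /// Returns the existing short ID for this full ID, or assigns the
    /// next free 8-bit slot. Fails once 256 distinct apps are seen.
    pub fn short_id(&mut self, full: [u8; 32]) -> Option<u8> {
        if let Some(i) = self.entries.iter().position(|e| *e == full) {
            return Some(i as u8); // already assigned: reuse, no duplicates
        }
        if self.entries.len() >= 256 {
            return None; // 8-bit ID space exhausted
        }
        self.entries.push(full);
        Some((self.entries.len() - 1) as u8)
    }
}
```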

Alistair