Nice work. I have some comments :-)
default:
UNREACHABLE();Unnecessary, please drop.
Background: we strongly prefer not to have `default:` cases at all, so that the compiler warns us when we forget to handle a case. When a few cases don't need to be handled in a particular `switch`, we prefer to list them explicitly, and make them `UNREACHABLE()`.
One exception to this is when we decode wire bytes (or read possibly-corrupted in-sandbox data) and hence have to handle any possible value; an example is the `default:` in line 1146.
Another exception is when an `enum` has a very large number of cases (say, a hundred), and we only need to handle a few of them, so listing all the others would be overly verbose. That should be rare though.
switch (static_cast<CompactImportCode>(kind_or_compact_flag)) {I don't feel strongly about it, but I think I'd simplify the control flow here. How about this (with renaming `kind_or_compact_flag` to `kind` for brevity):
```
if (kind == kCompactImportByModule && field_name.length() == 0) {
if (!enabled_features_.has_compact_imports()) {
errorf(pc_ - 1,
"Invalid import kind %d, enable with "
"--experimental-wasm-compact-imports", kind);
break;
}
uint32_t compact_import_table_count = /* lines 1173-1184 */
} else if (kind == kCompactImportByType && field_name.length() == 0) {
if (!enabled_features_...) ...
ImportExportKindCode kind_compact = /* lines 1188-1191 */
} else {
DecodeSingleOrGroupImport(..., static_cast<ImportExportKindCode>(kind), false);
}
```
Also, seeing this implementation leads me to the following spec feedback: the requirement that `nm == ""` is an unnecessary (though minimal) burden on decoders; we could just as well state that the field name is ignored, and the `0x7F`/`0x7E` byte is the only indicator that a compact encoding is being used.
// Check if this import is using a compact import formatnit: All comments should end with appropriate punctuation. (Same in line 1198.)
In this particular case one could also argue that the comment doesn't really say anything that the next two lines of code don't also say.
if (tracer_) tracer_->ImportOffset(pc_offset());That's not good enough any more. This needs to be called once per `module_->import_table.push_back()`, and each time report the wire bytes offset that's the best definition for "where this import starts": in case of classic ungrouped imports, that's the starting position of the module name (i.e. where this call is right now); in case of import groups, it's the starting position of the field name.
This is a bit tricky to test, but here's one way:
(1) Define a module that uses import groups (you have one in your test, so this step is already done).
(2) Run `d8` over that test with `--dump-wasm-module`. That will put some file `<hash>.ok.wasm` into the current directory, with `<hash>` computed from the wire bytes, so find it with `ls *.wasm` or similar. (It'll dump more than one file if the test defines multiple modules).
(3) Compile our Wasm Module Inspector: `gm x64.release wami`.
(4) Test the regular disassembler (which is also used by Chrome DevTools): `out/x64.release/wami --full-wat --offsets file.wasm`. Pay attention to the offsets column for the import section.
(5) Test the hexdump mode (which is a debugging tool for us): `out/x64.release/wami --full-hexdump file.wasm`. Pay attention to the `// import #123` comments.
Oh, and if testing these two disassembly modes uncovers other deficiencies, then we'll have to fix those too :-)
It's fine to do that in a separate CL. What's clear is that the current disassembler will not reproduce the grouping; if we want to preserve that, we'll have to build support for that.
If you want to write disassembler tests, see `test/unittests/wasm/wasm-disassembler-unittest-*` for inspiration.
module_->import_table.reserve(import_table_count);Bummer that this isn't a reliable prediction any more.
We should probably add a replacement in `GetImportGroupCount`, especially when a large group is encountered: the primary intended use case for this feature is importing several thousand string constants as globals, whereas the count we just read might well be just a handful (if all imports are grouped) or in the ballpark of 100 or so (if functions are not grouped).
Perhaps some logic like:
```
// We don't know the cumulative size of import groups up front. Usually,
// a std::vector's exponential growth behavior should do a good job;
// in case of very large import groups we can help it with an explicit
// reservation.
std::vector<WasmImport>& table = module_->import_table;
if (count > table.capacity()) {
table.reserve(table.capacity() + count);
}
```
for (uint32_t ic = 0; ok() && ic < count; ++ic) {nit: why not just `i`?
uint32_t getImportGroupCount(bool is_compact_encoding_2) {Please give this a more descriptive name, e.g. `by_type`.
uint32_t getImportGroupCount(bool is_compact_encoding_2) {`G`, nontrivial C++ functions have `UpperCamelCase` names.
if (tracer_) {
tracer_->Description(": ");
tracer_->Description(ExternalKindName(kind));
}Please preserve this, it's very useful for `wami`'s hexdump mode (see other comment). The beginning of `DecodeSingleOrGroupImport` would probably be a good place for it, since `kind` has always just been decoded when we call that.
kCompactImportEncoding1 = 0x7F,If you name this `kCompactImportByModule` (and `...ByModuleAndType` below, or perhaps just `...ByType` for brevity), you don't need the comment, and use sites become more descriptive (less likely to mix up "encoding1" and "encoding2" when forgetting which is which).
/* V8 side owner: thibaudm */ \`ryandiaz` perhaps? :-)
// Flags: --experimental-wasm-compact-imports --experimental-wasm-multi-memoryThat flag doesn't even exist any more.
d8.file.execute('test/mjsunit/mjsunit.js');You don't need this, our test driver loads that file automatically.
// Mixed types -> Encoding 1 (0x7F)`Group by module name only`
And then you don't really need the next line (14).
// 1. FunctionThis comment (and 2. through 5. below) don't really say anything that isn't immediately obvious from the code they describe. I'd drop them.
If you want, you can document intention here by saying `// Add one import of each kind.`
builder.addImport("mod", "f", sig, kExternalFunction, kCompactImportEncoding1);nit: please respect 80-column lines. Again several times below.
Consider running a formatter over this file. (For this particular test it should work pretty well. For many other Wasm tests that contain function bodies, the formatter doesn't handle those very well, so it's generally preferable to format function body array literals by hand -- but formatters can still be used for other ranges of the file.)
if (buffer[i] === 0x7F && buffer[i-1] === 0x00) { // empty name before itnit: 2-space indentation please
if (buffer[i] === 0x7F && buffer[i-1] === 0x00) { // empty name before itWhy use a raw numeric constant and a comment (line 34), when you could just use the named constant here?
found = true;If we're going to test the module builder at all, how about counting occurrences and checking that there's exactly one?
// Scan for 0x7E (kCompactImportEncoding2)nit: indentation is off.
You can just drop this line entirely if you use the named constant below.
(function TestNormalImportsAlsoWork() {
let builder = new WasmModuleBuilder();
// Not setting compact imports -> normal behavior
let sig = builder.addType(kSig_v_v);
builder.addImport("mod", "f1", sig);
builder.addImport("mod", "f2", sig);
let buffer = builder.toBuffer();
let module = new WebAssembly.Module(buffer);
let imports = WebAssembly.Module.imports(module);
assertEquals(2, imports.length);
})();What is this testing?
let kCompactImportEncoding1 = 0x7F;Descriptive names here too please (same as in C++)
this.compact_imports = kIndividualImportEncoding;I don't think this global state flag is worth having. See also larger comment below.
const imported_memories = this.num_imported_memories;nit: just inline this in the next line. (Same in 1519/1520.)
_addImport(module, name, kind, extra, importEncodingType) {nit: here and below: we prefer style consistency _within the project_, and most of V8 is C++, so we apply the C++ style rules to our JS testing code (as far as possible/reasonable). In particular, that means `snake_case` for local variables and parameters. (`kCamelCase` for constants is right.)
(Admittedly the status quo isn't entirely consistent, e.g. the lowercase `a` in `addFunctions` doesn't match our C++ style, but I'd like to keep it as consistent as possible when adding new things.)
} else {Pro tip: if you use an early `return` instead of this `else`:
```
if (...) {
...
return;
}
```
you can save one level of indentation for everything below. Avoiding indentation isn't generally important, but when you have as much as below (up to 7 levels!), perhaps avoiding one level is worth it.
let can_append = last && last.module === module && last.encoding === importEncodingType;I think this is a bit too much magic, considering the job of this module builder: we want it to be convenient, yes; but most importantly we want it to give us very detailed control over the modules we build. Suppose we had a bug that only triggered when there are two consecutive by-module import groups for the name module name right after each other; the current machinery here wouldn't let us express the regression test for that bug AFAICS. Automatically un-grouping imports that the caller erroneously requested to be grouped also has the potential to cause time-consuming confusion when debugging something.
I would suggest to move control over the grouping to callers. Instead of adding an `importEncodingType` parameter to the various `AddImport...` functions below, how about an interface along the lines of:
```
class ImportGroupBuilder {
constructor(builder, encoding, module, kind=null, fields=null) {
this.builder = builder;
this.encoding = encoding;
this.module = module;
this.kind = kind;
Object.assign(this, fields);
this.imports = [];
}
add(name) {
if (this.kind === null) {
throw new Error("must specify import kind/type in import group");
}
this.imports.push(name);
}
addFunction(name, type) {
if (this.kind !== null) {
throw new Error("use simple 'add(name)' for typed import groups");
}
this.imports.push({name, kind: kExternalFunction, type});
return this.builder.num_imported_funcs++;
}
addGlobal(name, type, mutable, shared) {
if (this.kind !== null) {
throw new Error("use simple 'add(name)' for typed import groups");
}
this.imports.push({name, kind: kExternalGlobal, type, mutable, shared});
return this.builder.num_imported_globals++;
}
}
// in WasmModuleBuilder:
addImportGroup(module) {
let group = new ImportGroupBuilder(this, kCompactImportByModule, module);
this.imports.push(group);
return group;
}
addImportedFunctionGroup(module, type) {
let group = new ImportGroupBuilder(
this, kCompactImportByType, module, kExternalFunction, {type});
this.imports.push(group);
return group;
}
addImportedGlobalsGroup(module, type, mutable, shared) {
let group = new ImportGroupBuilder(
this, kCompactImportByType, module, kExternalGlobal,
{type, mutable, shared});
this.imports.push(group);
return group;
}
```
There'd be a bit of duplication between the `ImportGroupBuilder`'s methods and the main builder's `addImportedFoo`, but in the interest of simplicity (of the implementation) and flexibility (of the usage) I think that'd be a price worth paying.
(As an unrelated follow-up, I'd actually like to move to options bags where the number of parameters is getting out of hands, so that callers can e.g. write
```
addImportedGlobal({type: kWasmExternRef, shared: true, mutable: false});
```
but that'd be a huge change of its own and probably shouldn't be folded into the new features in this CL.)
section.emit_string(""); // nmNot a helpful comment. Spell it out or drop it.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// Check if this import is using a compact import formatnit: All comments should end with appropriate punctuation. (Same in line 1198.)
In this particular case one could also argue that the comment doesn't really say anything that the next two lines of code don't also say.
Done
for (uint32_t ic = 0; ok() && ic < count; ++ic) {nit: why not just `i`?
Done
uint32_t getImportGroupCount(bool is_compact_encoding_2) {`G`, nontrivial C++ functions have `UpperCamelCase` names.
Acknowledged
uint32_t getImportGroupCount(bool is_compact_encoding_2) {Please give this a more descriptive name, e.g. `by_type`.
Done
if (tracer_) {
tracer_->Description(": ");
tracer_->Description(ExternalKindName(kind));
}Please preserve this, it's very useful for `wami`'s hexdump mode (see other comment). The beginning of `DecodeSingleOrGroupImport` would probably be a good place for it, since `kind` has always just been decoded when we call that.
Done
kCompactImportEncoding1 = 0x7F,If you name this `kCompactImportByModule` (and `...ByModuleAndType` below, or perhaps just `...ByType` for brevity), you don't need the comment, and use sites become more descriptive (less likely to mix up "encoding1" and "encoding2" when forgetting which is which).
Done
/* V8 side owner: thibaudm */ \Ryan Diaz`ryandiaz` perhaps? :-)
Done
// Flags: --experimental-wasm-compact-imports --experimental-wasm-multi-memoryThat flag doesn't even exist any more.
Done
d8.file.execute('test/mjsunit/mjsunit.js');You don't need this, our test driver loads that file automatically.
Done
// Mixed types -> Encoding 1 (0x7F)`Group by module name only`
And then you don't really need the next line (14).
Done
// 1. FunctionThis comment (and 2. through 5. below) don't really say anything that isn't immediately obvious from the code they describe. I'd drop them.
If you want, you can document intention here by saying `// Add one import of each kind.`
Done
builder.addImport("mod", "f", sig, kExternalFunction, kCompactImportEncoding1);nit: please respect 80-column lines. Again several times below.
Consider running a formatter over this file. (For this particular test it should work pretty well. For many other Wasm tests that contain function bodies, the formatter doesn't handle those very well, so it's generally preferable to format function body array literals by hand -- but formatters can still be used for other ranges of the file.)
Done
if (buffer[i] === 0x7F && buffer[i-1] === 0x00) { // empty name before itWhy use a raw numeric constant and a comment (line 34), when you could just use the named constant here?
Done
if (buffer[i] === 0x7F && buffer[i-1] === 0x00) { // empty name before itRyan Diaznit: 2-space indentation please
Done
found = true;If we're going to test the module builder at all, how about counting occurrences and checking that there's exactly one?
Done
// Scan for 0x7E (kCompactImportEncoding2)nit: indentation is off.
You can just drop this line entirely if you use the named constant below.
Done
(function TestNormalImportsAlsoWork() {
let builder = new WasmModuleBuilder();
// Not setting compact imports -> normal behavior
let sig = builder.addType(kSig_v_v);
builder.addImport("mod", "f1", sig);
builder.addImport("mod", "f2", sig);
let buffer = builder.toBuffer();
let module = new WebAssembly.Module(buffer);
let imports = WebAssembly.Module.imports(module);
assertEquals(2, imports.length);
})();What is this testing?
This in intending to test the the module builder addImport method, without the optional compact encoding flag set, still works to generate valid Wasm modules with the correct number of imports, I'll make this more explicit and likely change it as I update the module builder based on your below comment about how we group the resulting imports.
let kCompactImportEncoding1 = 0x7F;Descriptive names here too please (same as in C++)
Done
const imported_memories = this.num_imported_memories;nit: just inline this in the next line. (Same in 1519/1520.)
Done
section.emit_string(""); // nmNot a helpful comment. Spell it out or drop it.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
for (uint32_t ic = 0; ok() && ic < count; ++ic) {Why not just `i`?
for (uint32_t ic = 0; ok() && ic < count; ++ic) {Why not just `i`?
for (uint32_t ic = 0; ok() && ic < count; ++ic) {Why not just `i`?
for (uint32_t ic = 0; ok() && ic < count; ++ic) {Why not just `i`?
// Groups imports by module.This comment and the next are now unneeded.
builder.addImportedGlobal(
"mod",
"g",
kWasmI32,
false,
false,
kCompactImportByModule,
);FWIW, another good option would be
```
builder.addImportedGlobal(
"mod", "g", kWasmI32, false, false, kCompactImportEncoding1);
```
but this is fine too.
(function TestNormalImportsAlsoWork() {
let builder = new WasmModuleBuilder();
// Not setting compact imports -> normal behavior
let sig = builder.addType(kSig_v_v);
builder.addImport("mod", "f1", sig);
builder.addImport("mod", "f2", sig);
let buffer = builder.toBuffer();
let module = new WebAssembly.Module(buffer);
let imports = WebAssembly.Module.imports(module);
assertEquals(2, imports.length);
})();Ryan DiazWhat is this testing?
This in intending to test the the module builder addImport method, without the optional compact encoding flag set, still works to generate valid Wasm modules with the correct number of imports, I'll make this more explicit and likely change it as I update the module builder based on your below comment about how we group the resulting imports.
Well, we probably have a thousand tests already that would fail if old-style imports didn't work any more, so I'd be fine with not adding one more.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Why not just `i`?
Done
Why not just `i`?
Done
Why not just `i`?
Done
Why not just `i`?
Done
Unnecessary, please drop.
Background: we strongly prefer not to have `default:` cases at all, so that the compiler warns us when we forget to handle a case. When a few cases don't need to be handled in a particular `switch`, we prefer to list them explicitly, and make them `UNREACHABLE()`.
One exception to this is when we decode wire bytes (or read possibly-corrupted in-sandbox data) and hence have to handle any possible value; an example is the `default:` in line 1146.
Another exception is when an `enum` has a very large number of cases (say, a hundred), and we only need to handle a few of them, so listing all the others would be overly verbose. That should be rare though.
Done
switch (static_cast<CompactImportCode>(kind_or_compact_flag)) {I don't feel strongly about it, but I think I'd simplify the control flow here. How about this (with renaming `kind_or_compact_flag` to `kind` for brevity):
```
if (kind == kCompactImportByModule && field_name.length() == 0) {
if (!enabled_features_.has_compact_imports()) {
errorf(pc_ - 1,
"Invalid import kind %d, enable with "
"--experimental-wasm-compact-imports", kind);
break;
}
uint32_t compact_import_table_count = /* lines 1173-1184 */
} else if (kind == kCompactImportByType && field_name.length() == 0) {
if (!enabled_features_...) ...
ImportExportKindCode kind_compact = /* lines 1188-1191 */
} else {
DecodeSingleOrGroupImport(..., static_cast<ImportExportKindCode>(kind), false);
}
```Also, seeing this implementation leads me to the following spec feedback: the requirement that `nm == ""` is an unnecessary (though minimal) burden on decoders; we could just as well state that the field name is ignored, and the `0x7F`/`0x7E` byte is the only indicator that a compact encoding is being used.
I agree we could ignore the field name when the compact import indicator is here, but since it is included in the current state of the proposal do we need to keep the field_name.length() == 0 check here? I assume yes?
if (tracer_) tracer_->ImportOffset(pc_offset());That's not good enough any more. This needs to be called once per `module_->import_table.push_back()`, and each time report the wire bytes offset that's the best definition for "where this import starts": in case of classic ungrouped imports, that's the starting position of the module name (i.e. where this call is right now); in case of import groups, it's the starting position of the field name.
This is a bit tricky to test, but here's one way:
(1) Define a module that uses import groups (you have one in your test, so this step is already done).
(2) Run `d8` over that test with `--dump-wasm-module`. That will put some file `<hash>.ok.wasm` into the current directory, with `<hash>` computed from the wire bytes, so find it with `ls *.wasm` or similar. (It'll dump more than one file if the test defines multiple modules).
(3) Compile our Wasm Module Inspector: `gm x64.release wami`.
(4) Test the regular disassembler (which is also used by Chrome DevTools): `out/x64.release/wami --full-wat --offsets file.wasm`. Pay attention to the offsets column for the import section.
(5) Test the hexdump mode (which is a debugging tool for us): `out/x64.release/wami --full-hexdump file.wasm`. Pay attention to the `// import #123` comments.Oh, and if testing these two disassembly modes uncovers other deficiencies, then we'll have to fix those too :-)
It's fine to do that in a separate CL. What's clear is that the current disassembler will not reproduce the grouping; if we want to preserve that, we'll have to build support for that.If you want to write disassembler tests, see `test/unittests/wasm/wasm-disassembler-unittest-*` for inspiration.
Thank you for this detailed comment.
I update the ImportOffset call using a helper AddImport should handle the kCompactImportByModuleAndType case correctly and reduce some code duplication regarding adding to the import table.
I think my tests look ok, but would like to add a disassembler test so going to to leave this comment unresolved for now.
Bummer that this isn't a reliable prediction any more.
We should probably add a replacement in `GetImportGroupCount`, especially when a large group is encountered: the primary intended use case for this feature is importing several thousand string constants as globals, whereas the count we just read might well be just a handful (if all imports are grouped) or in the ballpark of 100 or so (if functions are not grouped).Perhaps some logic like:
```
// We don't know the cumulative size of import groups up front. Usually,
// a std::vector's exponential growth behavior should do a good job;
// in case of very large import groups we can help it with an explicit
// reservation.
std::vector<WasmImport>& table = module_->import_table;
if (count > table.capacity()) {
table.reserve(table.capacity() + count);
}
```
Done
Bummer that this isn't a reliable prediction any more.
We should probably add a replacement in `GetImportGroupCount`, especially when a large group is encountered: the primary intended use case for this feature is importing several thousand string constants as globals, whereas the count we just read might well be just a handful (if all imports are grouped) or in the ballpark of 100 or so (if functions are not grouped).Perhaps some logic like:
```
// We don't know the cumulative size of import groups up front. Usually,
// a std::vector's exponential growth behavior should do a good job;
// in case of very large import groups we can help it with an explicit
// reservation.
std::vector<WasmImport>& table = module_->import_table;
if (count > table.capacity()) {
table.reserve(table.capacity() + count);
}
```
Done
This comment and the next are now unneeded.
Done
(function TestNormalImportsAlsoWork() {
let builder = new WasmModuleBuilder();
// Not setting compact imports -> normal behavior
let sig = builder.addType(kSig_v_v);
builder.addImport("mod", "f1", sig);
builder.addImport("mod", "f2", sig);
let buffer = builder.toBuffer();
let module = new WebAssembly.Module(buffer);
let imports = WebAssembly.Module.imports(module);
assertEquals(2, imports.length);
})();Ryan DiazWhat is this testing?
Jakob KummerowThis in intending to test the the module builder addImport method, without the optional compact encoding flag set, still works to generate valid Wasm modules with the correct number of imports, I'll make this more explicit and likely change it as I update the module builder based on your below comment about how we group the resulting imports.
Well, we probably have a thousand tests already that would fail if old-style imports didn't work any more, so I'd be fine with not adding one more.
Done
this.compact_imports = kIndividualImportEncoding;Ryan DiazI don't think this global state flag is worth having. See also larger comment below.
Done
_addImport(module, name, kind, extra, importEncodingType) {nit: here and below: we prefer style consistency _within the project_, and most of V8 is C++, so we apply the C++ style rules to our JS testing code (as far as possible/reasonable). In particular, that means `snake_case` for local variables and parameters. (`kCamelCase` for constants is right.)
(Admittedly the status quo isn't entirely consistent, e.g. the lowercase `a` in `addFunctions` doesn't match our C++ style, but I'd like to keep it as consistent as possible when adding new things.)
Done
Pro tip: if you use an early `return` instead of this `else`:
```
if (...) {
...
return;
}
```
you can save one level of indentation for everything below. Avoiding indentation isn't generally important, but when you have as much as below (up to 7 levels!), perhaps avoiding one level is worth it.
Done
let can_append = last && last.module === module && last.encoding === importEncodingType;This makes sense, I've updated the module builder and tests to use this pattern.
I wasn't sure if it was necessary to do the checks related to imported values being declared before local ones like exist in the other addMethods. e.g
```
addImportedGlobal(module, name, type, mutable = false, shared = false) {
if (this.globals.length != 0) {
throw new Error('Imported globals must be declared before local ones');
}
...
```
It seems like this is more relevant to the actual wasm encoding step correct? Why do these check exist at the add* method?
And for the optional arg refactor. Where do we usually track tasks like this for V8? create an issue in https://g-issues.chromium.org I assume?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// TODO(ryandiaz) this is still not correct in the hexdump output.Hmmm... this might be sort of unfixable with the current design: both `consume_utf8_string` and `tracer_->ImportOffset` immediately generate output; there's no back-patching support to insert the `// import #123` comment in the position that matches `start_offset`.
We may have to split `ImportOffset` into two callbacks: one for the import section entry, one for the actual import. Or we live with reduced accuracy of the annotations until we have a better idea.
This is not super important as it's not production code; but it can save our team quite some time when it comes in handy for debugging. This also implies that we don't want to pay a measurable performance price for the hexdump mode, nor do we want to introduce more code complexity than absolutely necessary, especially not here in the main decoder (if complex logic is unavoidable, it's better to put that into `module-inspector.cc`).
So, overall, I think the short-term conclusion is that figuring out the best solution for that hexdump mode is best left to a follow-up CL.
WasmImport* AddImport(WireBytesRef module_name, WireBytesRef field_name,Nice!
switch (static_cast<CompactImportCode>(kind_or_compact_flag)) {Ryan DiazI don't feel strongly about it, but I think I'd simplify the control flow here. How about this (with renaming `kind_or_compact_flag` to `kind` for brevity):
```
if (kind == kCompactImportByModule && field_name.length() == 0) {
if (!enabled_features_.has_compact_imports()) {
errorf(pc_ - 1,
"Invalid import kind %d, enable with "
"--experimental-wasm-compact-imports", kind);
break;
}
uint32_t compact_import_table_count = /* lines 1173-1184 */
} else if (kind == kCompactImportByType && field_name.length() == 0) {
if (!enabled_features_...) ...
ImportExportKindCode kind_compact = /* lines 1188-1191 */
} else {
DecodeSingleOrGroupImport(..., static_cast<ImportExportKindCode>(kind), false);
}
```Also, seeing this implementation leads me to the following spec feedback: the requirement that `nm == ""` is an unnecessary (though minimal) burden on decoders; we could just as well state that the field name is ignored, and the `0x7F`/`0x7E` byte is the only indicator that a compact encoding is being used.
I agree we could ignore the field name when the compact import indicator is here, but since it is included in the current state of the proposal do we need to keep the field_name.length() == 0 check here? I assume yes?
Yes, we should implement the proposal as it's currently written down. We may have to update the implementation if the upstream proposal changes.
if (tracer_) tracer_->ImportOffset(pc_offset());Ryan DiazThat's not good enough any more. This needs to be called once per `module_->import_table.push_back()`, and each time report the wire bytes offset that's the best definition for "where this import starts": in case of classic ungrouped imports, that's the starting position of the module name (i.e. where this call is right now); in case of import groups, it's the starting position of the field name.
This is a bit tricky to test, but here's one way:
(1) Define a module that uses import groups (you have one in your test, so this step is already done).
(2) Run `d8` over that test with `--dump-wasm-module`. That will put some file `<hash>.ok.wasm` into the current directory, with `<hash>` computed from the wire bytes, so find it with `ls *.wasm` or similar. (It'll dump more than one file if the test defines multiple modules).
(3) Compile our Wasm Module Inspector: `gm x64.release wami`.
(4) Test the regular disassembler (which is also used by Chrome DevTools): `out/x64.release/wami --full-wat --offsets file.wasm`. Pay attention to the offsets column for the import section.
(5) Test the hexdump mode (which is a debugging tool for us): `out/x64.release/wami --full-hexdump file.wasm`. Pay attention to the `// import #123` comments.Oh, and if testing these two disassembly modes uncovers other deficiencies, then we'll have to fix those too :-)
It's fine to do that in a separate CL. What's clear is that the current disassembler will not reproduce the grouping; if we want to preserve that, we'll have to build support for that.If you want to write disassembler tests, see `test/unittests/wasm/wasm-disassembler-unittest-*` for inspiration.
Thank you for this detailed comment.
I update the ImportOffset call using a helper AddImport should handle the kCompactImportByModuleAndType case correctly and reduce some code duplication regarding adding to the import table.
I think my tests look ok, but would like to add a disassembler test so going to to leave this comment unresolved for now.
It's up to you, but I'd recommend to land this CL soon (it's both old enough and big enough), and split out as much follow-up work as possible. Smaller CLs tend to lead to a better overall experience for both authors and reviewers (and, if necessary, later bisections and reverts and so on). Our feature flags allow us to work incrementally, and we should fully lean into that ability; a new feature should not be one massive code drop.
As a _very rough_ rule of thumb, I'd say the optimal CL size is in the ballpark of 100 lines; if you make them much smaller you need too many and they lack too much context; if you make them much bigger they become unwieldy (for rebasing, and reviewing, and bisecting/debugging). For new features, such a small size is often utopian, and it can be difficult to cut the work into meaningful pieces that are smaller than 300-500 lines; that's okay. And of course bug fixes or small cleanups can also be much smaller.
// (function TestCompactImportsByModule_AllKinds() {Editing artifact from playing with `--dump-wasm-module`?
let can_append = last && last.module === module && last.encoding === importEncodingType;Why do these check exist at the add* method?
It's a tradeoff between convenience and simplicity. It would be possible to implement a module builder where you can define stuff in any order you want, and the builder internally figures out how it has to assign final indices to make everything work out in the generated binary encoding. But such a module builder would be quite a bit more complicated than the one we have here, and implementing a super-convenient module builder isn't our main deliverable, and also the builder is somewhat performance sensitive because our test suite runs it thousands of times (including in Debug/simulator/sanitizer/... configurations where executing JavaScript is pretty slow) and we don't want to make that slower than necessary. So, for simplicity's sake, the builder assumes that it can assign the final index of every function immediately, and since imported functions must have indices before locally-defined functions, the builder requires us to define them first. The checks are just for us, to help us realize our mistakes as quickly as possible when we "hold it wrong" when writing a new test. We can drop them if we think it's unlikely that anyone would ever make a given mistake; in this case I think it's probably better to have them but I could be convinced otherwise.
Where do we usually track tasks like this for V8?
You can file a bug, sure; that's very appropriate for important tasks. You can also keep a personal list of "things I'll do if I ever have time for them". Or just keep it in the back of your head if you prefer.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// TODO(ryandiaz) this is still not correct in the hexdump output.Hmmm... this might be sort of unfixable with the current design: both `consume_utf8_string` and `tracer_->ImportOffset` immediately generate output; there's no back-patching support to insert the `// import #123` comment in the position that matches `start_offset`.
We may have to split `ImportOffset` into two callbacks: one for the import section entry, one for the actual import. Or we live with reduced accuracy of the annotations until we have a better idea.
This is not super important as it's not production code; but it can save our team quite some time when it comes in handy for debugging. This also implies that we don't want to pay a measurable performance price for the hexdump mode, nor do we want to introduce more code complexity than absolutely necessary, especially not here in the main decoder (if complex logic is unavoidable, it's better to put that into `module-inspector.cc`).
So, overall, I think the short-term conclusion is that figuring out the best solution for that hexdump mode is best left to a follow-up CL.
Yeah, not 100% sure of the correct way to fix this, but will follow up separately.
Currently I updated the behavior so that for the old import encoding, the `// import #123` comment ends up showing up after the import, not before. Definitely not ideal, will work on fixing this soon though.
if (tracer_) tracer_->ImportOffset(pc_offset());Ryan DiazThat's not good enough any more. This needs to be called once per `module_->import_table.push_back()`, and each time report the wire bytes offset that's the best definition for "where this import starts": in case of classic ungrouped imports, that's the starting position of the module name (i.e. where this call is right now); in case of import groups, it's the starting position of the field name.
This is a bit tricky to test, but here's one way:
(1) Define a module that uses import groups (you have one in your test, so this step is already done).
(2) Run `d8` over that test with `--dump-wasm-module`. That will put some file `<hash>.ok.wasm` into the current directory, with `<hash>` computed from the wire bytes, so find it with `ls *.wasm` or similar. (It'll dump more than one file if the test defines multiple modules).
(3) Compile our Wasm Module Inspector: `gm x64.release wami`.
(4) Test the regular disassembler (which is also used by Chrome DevTools): `out/x64.release/wami --full-wat --offsets file.wasm`. Pay attention to the offsets column for the import section.
(5) Test the hexdump mode (which is a debugging tool for us): `out/x64.release/wami --full-hexdump file.wasm`. Pay attention to the `// import #123` comments.Oh, and if testing these two disassembly modes uncovers other deficiencies, then we'll have to fix those too :-)
It's fine to do that in a separate CL. What's clear is that the current disassembler will not reproduce the grouping; if we want to preserve that, we'll have to build support for that.If you want to write disassembler tests, see `test/unittests/wasm/wasm-disassembler-unittest-*` for inspiration.
Jakob KummerowThank you for this detailed comment.
I update the ImportOffset call using a helper AddImport should handle the kCompactImportByModuleAndType case correctly and reduce some code duplication regarding adding to the import table.
I think my tests look ok, but would like to add a disassembler test so going to to leave this comment unresolved for now.
Ryan DiazIt's up to you, but I'd recommend to land this CL soon (it's both old enough and big enough), and split out as much follow-up work as possible. Smaller CLs tend to lead to a better overall experience for both authors and reviewers (and, if necessary, later bisections and reverts and so on). Our feature flags allow us to work incrementally, and we should fully lean into that ability; a new feature should not be one massive code drop.
As a _very rough_ rule of thumb, I'd say the optimal CL size is in the ballpark of 100 lines; if you make them much smaller you need too many and they lack too much context; if you make them much bigger they become unwieldy (for rebasing, and reviewing, and bisecting/debugging). For new features, such a small size is often utopian, and it can be difficult to cut the work into meaningful pieces that are smaller than 300-500 lines; that's okay. And of course bug fixes or small cleanups can also be much smaller.
Thank you for the guidance!
Editing artifact from playing with `--dump-wasm-module`?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Nice! LGTM with a few final optional nits.
// TODO(b/491514514) this is still not really correct in the hexdumpnit: we don't need `b/`, the bug number is enough.
// TODO(ryandiaz) this is still not correct in the hexdump output.Ryan DiazHmmm... this might be sort of unfixable with the current design: both `consume_utf8_string` and `tracer_->ImportOffset` immediately generate output; there's no back-patching support to insert the `// import #123` comment in the position that matches `start_offset`.
We may have to split `ImportOffset` into two callbacks: one for the import section entry, one for the actual import. Or we live with reduced accuracy of the annotations until we have a better idea.
This is not super important as it's not production code; but it can save our team quite some time when it comes in handy for debugging. This also implies that we don't want to pay a measurable performance price for the hexdump mode, nor do we want to introduce more code complexity than absolutely necessary, especially not here in the main decoder (if complex logic is unavoidable, it's better to put that into `module-inspector.cc`).
So, overall, I think the short-term conclusion is that figuring out the best solution for that hexdump mode is best left to a follow-up CL.
Yeah, not 100% sure of the correct way to fix this, but will follow up separately.
Currently I updated the behavior so that for the old import encoding, the `// import #123` comment ends up showing up after the import, not before. Definitely not ideal, will work on fixing this soon though.
That might be a good-enough hack: add `↑` to the printout (so it's `// ↑ import #123`) and leave it at that 😊
(Doing that separately is fine nevertheless.)
// (function TestCompactImportsByModule_AllKinds() {Ryan DiazEditing artifact from playing with `--dump-wasm-module`?
Yes :)
An alternative trick that I find handy every now and then: add
```
console.log(JSON.stringify(builder.toArray()));
```
after you're done adding stuff to the builder (i.e. usually right before calling `builder.instantiate()`). `wami` understands both array literals and stdin, so you can then run `out/x64.release/d8 my-test.js | out/x64.release/wami --full-hexdump` to debug the generated module, and quickly switch to debugging V8's behavior by dropping the `| ...` part.
(Corollary: the output of `wami --full-hexdump` is also valid input to `wami` -- although that's probably never useful 😊
What is sometimes useful though is putting that output as an array literal in a test case that doesn't have access to wasm-module-builder.js, e.g. if it's in some external repository. Random example: https://github.com/bmeurer/wasm-dbg-stories/blob/main/src/gc.html was partially generated by `wami`.)
"Should contain 1 compact import by module grouping",I don't mind, but FWIW, we usually skip explicit messages, unless it's somehow non-obvious why a particular assertion should hold. In this case, the message doesn't say anything that `assertEquals(1, compactImportCount)` doesn't also say just as clearly.
(function TestCompactImportsByModuleAndType_Globals() {I would have merged all of these into one test, mostly because it would save a little bit of overhead, and I think it would be just as easy to understand/debug if it ever fails. You can also keep what you have though, it doesn't matter much either way.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
// TODO(b/491514514) this is still not really correct in the hexdumpnit: we don't need `b/`, the bug number is enough.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
if (!type.is_shared()) {Is mixing of shared/unshared tabled allowed by the spec? I'm wondering if this check needs to be done for every iteration of the loop.
module_->import_table.push_back({module_name, name, kind});Keep the designated initializers to be consistent with the rest of the file.
`module_->import_table.push_back(WasmImport{.module_name = module_name,.field_name = name,.kind = kind});`
}Is there a downside to unconditionally reserving? If not, this can just be `table.reserve(table.size() + count);`
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
if (!type.is_shared()) {Is mixing of shared/unshared tabled allowed by the spec? I'm wondering if this check needs to be done for every iteration of the loop.
Good point, this could be moved up to line 1022 (with some minor modifications to operate on `table_template`).
Is there a downside to unconditionally reserving? If not, this can just be `table.reserve(table.size() + count);`
https://en.cppreference.com/w/cpp/container/vector/reserve.html says:
Correctly using reserve() can prevent unnecessary reallocations, but inappropriate uses of reserve() (for instance, calling it before every push_back() call) may actually increase the number of reallocations (by causing the capacity to grow linearly rather than exponentially) and result in increased computational complexity and decreased performance.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Jakob KummerowIs mixing of shared/unshared tabled allowed by the spec? I'm wondering if this check needs to be done for every iteration of the loop.
Good point, this could be moved up to line 1022 (with some minor modifications to operate on `table_template`).
Done
module_->import_table.push_back({module_name, name, kind});Keep the designated initializers to be consistent with the rest of the file.
`module_->import_table.push_back(WasmImport{.module_name = module_name,.field_name = name,.kind = kind});`
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
[wasm] Implement Compact Import encoding
https://github.com/WebAssembly/compact-import-section/blob/main/proposals/compact-import-section/Overview.md
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |