Performing inference locally can:
- Preserve privacy by not shipping user data across the network
- Improve performance by eliminating network latency, running models natively, and taking advantage of hardware acceleration, including specialized hardware that isn't exposed through WebGL, WebGPU, or WASM
- Provide a fallback when network access is unavailable, possibly using a smaller, lower-quality model (see the sketch after this list)
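
To make the fallback concrete, here is a minimal sketch of that pattern: prefer a server-side model when the network is available, and fall back to a smaller on-device model otherwise. The '/api/infer' endpoint and the classifyOnDevice helper (sketched further below) are hypothetical.

// Hypothetical sketch: large server-side model when online, smaller
// on-device model as the offline fallback.
async function classify(input) {
  if (navigator.onLine) {
    try {
      const response = await fetch('/api/infer', {  // hypothetical endpoint
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ input }),
      });
      if (response.ok) return response.json();
    } catch (e) {
      // Request failed despite being "online"; fall through to the local model.
    }
  }
  // Offline (or failed) path: run a smaller, lower-quality model on-device.
  return classifyOnDevice(input);
}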
The graph-based WebNN API and the Model Loader API are complementary approaches. We'll need to do some benchmarking to understand whether there are performance differences, and to get feedback from developers on whether it's valuable to offer both types of API.
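
For context, here is a rough sketch of what using the Model Loader API could look like, loosely following the shape in the explainer. Since the specification is tentative, every name here (navigator.ml.createContext, MLModelLoader, load, compute) should be read as provisional.

// Tentative sketch; the API surface may change as the spec evolves.
async function classifyOnDevice(input) {
  // Create an ML context, sharing the entry point with the WebNN API.
  const context = await navigator.ml.createContext({ devicePreference: 'gpu' });
  const loader = new MLModelLoader(context);

  // Fetch a pre-trained model (e.g. a TFLite flatbuffer) as an ArrayBuffer.
  const modelBuffer = await fetch('model.tflite')
      .then((response) => response.arrayBuffer());
  const model = await loader.load(modelBuffer);

  // Run inference with named inputs; data plus dimensions describe a tensor.
  return model.compute({
    input: { data: new Float32Array(input), dimensions: [1, input.length] },
  });
}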
The Model Loader API is currently included as a Tentative Specification in the Web Machine Learning Working Group Charter.
Gecko: No signals
WebKit: No signals
Intel: Collaborating in Working Group
Microsoft/ONNX: Collaborating in Working Group
PyTorch: No signals
Salesforce: Collaborating in Working Group
No milestones specified