Hi Loki,
I came across your profile and, given your recent contributions, thought you might be interested in what my team and I are building. We're working on an open-source Rust SDK that runs LLMs (as well as voice and vision models) directly on-device. We're hitting ~40 tok/s on iPhone and Android devices, with bindings for Swift, Kotlin, Flutter, and Unity.
We're actively looking for contributors. If on-device inference would be useful for anything you're building, I'm happy to give you early access to the full toolkit.
Best,
Sam