Offline SmartSpeaker with BeagleBone. My approach, comments and suggestions?


Apoorv Gupta

Mar 12, 2018, 1:44:02 PM
to BeagleBoard GSoC
Hi everyone!

I became instantly interested in the project idea "Offline SmartSpeaker with BeagleBone". Coincidentally, I am in the process of implementing a very similar idea with a Raspberry Pi. I have made significant progress so far: I have already configured an external audio card with alsamixer and made a few changes in the kernel, and I am now working on CMUsphinx. I plan to switch to the BeagleBoard. This would be an extensive project; I am planning to aim for the following:
  • I would use CMUsphinx for voice recognition and the Espeak voice synthesizer for the smart speaker's feedback and talkback functionality.
  • All of this will be offline.
  • The smart speaker will include features such as trigger words and the ability to understand and carry out tasks like controlling music, alarms, calendars, etc.
  • I would also like to create a home automation API that lets users easily attach their own controllable electrical appliances, with different operation states, to the BeagleBone and control them by voice.
  • Yes, all Beagle A8 platforms.
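
As a rough illustration of the home automation API idea, here is a minimal Python sketch. The names `Appliance` and `ApplianceRegistry`, and the "<appliance> <state>" phrase format, are assumptions made for illustration only and not an existing API. A real version would drive a GPIO or relay where the comment indicates.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    """A controllable appliance with a fixed set of operation states (hypothetical)."""
    name: str
    states: tuple            # e.g. ("off", "on") or ("off", "low", "high")
    current: str = ""

    def __post_init__(self):
        # Start in the first listed state unless one was given explicitly.
        if not self.current:
            self.current = self.states[0]

    def set_state(self, state):
        if state not in self.states:
            raise ValueError(f"{self.name} has no state {state!r}")
        # A real driver would toggle a GPIO pin or relay here.
        self.current = state


class ApplianceRegistry:
    """Maps recognized phrases such as 'fan high' onto appliance states."""

    def __init__(self):
        self._appliances = {}

    def register(self, appliance):
        self._appliances[appliance.name] = appliance

    def state_of(self, name):
        return self._appliances[name].current

    def handle(self, phrase):
        """Apply an '<appliance> <state>' phrase; return True if it matched."""
        words = phrase.lower().split()
        if len(words) != 2 or words[0] not in self._appliances:
            return False
        try:
            self._appliances[words[0]].set_state(words[1])
        except ValueError:
            return False
        return True
```

With this shape, a user would register something like `Appliance("fan", ("off", "low", "high"))` once, and the recognizer's text output ("fan high") could be passed straight to `handle`.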

I would like the mentors and the community to comment on the relevance of my approach and give suggestions.


Thanks

Apoorv Gupta

IRC-apoorvtintin

Hunyue Yau

Mar 14, 2018, 5:31:23 PM
to beaglebo...@googlegroups.com, Apoorv Gupta
On Monday, March 12, 2018 10:44:01 Apoorv Gupta wrote:
> Hi everyone!
>
> I became instantly interested in the project idea Offline SmartSpeaker with
> BeagleBone, it co-incidentally happens that I am in process of implementing
> a very similar idea with a raspberry pi. I have made a significant advances
> till now i have already configured an external audio card with alsamixer
> and a few changes in the kernel working on CMUsphinx now, I plan to switch
> to beagle board. this would be an extensive project, I am planning to aim
> for
>
> - I would be using CMUsphinx for voice recognition and Espeak voice
> synthesizer platform for feedback and talkback functionality for the
> smartspeaker.
> - all of this will be offline.
> - The smart speaker will include features like trigger words, ability to
> understand and do tasks like control music, alarms, calendars, etc.
> - I would also like to aim for creating a home automation API which
> would allow users to attach there own controllable electrical appliances
> with different operation states to the beagle bone with ease and thus
> control them with voice
> - yes all beagle A8 platforms
>
> I would like the mentors and the community to comment on the relevance of
> my approach and give suggestions.

There are a few more things to consider for this:
- Have you looked into how well CMUsphinx works on a processor like the one
in the Beagle family (Cortex-A8)?
- Have you considered pocketsphinx?
- Does the combination of CMUsphinx/Espeak leave enough free CPU to do other
things?
- How do you plan to implement trigger words in an efficient/useable fashion?

The goal here is not necessarily to have a "product" but to have a framework
showing it is possible to have an open-ecosystem smart speaker.
>
>
> Thanks
>
> Apoorv Gupta
>
> IRC-apoorvtintin

--
Hunyue Yau
http://www.hy-research.com/

Apoorv Gupta

Mar 15, 2018, 4:30:33 AM
to BeagleBoard GSoC

Thanks for the comments, Hunyue Yau. I would be glad to answer them.

  1. pocketsphinx is my choice, since it is best suited to embedded applications.
  2. From my experience working with pocketsphinx on similar embedded systems so far, I think that for an offline speaker with some limitations, pocketsphinx would work just fine with the A8 and the RAM available on the BeagleBone.
  3. According to figures I have come across, pocketsphinx uses much less CPU (single core, multithreading enabled) when idly listening (approx. 10%) than when it is actually processing input audio (approx. 60-70%), which happens in bursts. Since voice/audio will be processed only at specific moments, yes, the A8 can be used to perform other tasks as well.
  4. pocketsphinx already has some support for keywords. It is always listening for voice; when it detects audio, it looks it up in the keyword list, dictionary, and language model that I will provide. This is not very accurate unless tuned properly, which will be one of my biggest tasks. After the keyword is successfully detected, it will invoke the complete voice recognition and feedback part.
  5. Yes, I have exactly the same thing in mind. The goal will be to create a framework showing it is possible to have an open-ecosystem smart speaker, as well as a robust, flexible framework that other hobbyists and the Beagle community can build upon and customize for various voice-controlled applications. That would involve clear documentation and readable code as well.
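
The flow in point 4 (cheap keyword spotting first, full recognition only after the trigger word) can be sketched as a small state machine. This is only an illustration under stated assumptions: `WakeWordGate` and its three callables are hypothetical names, the recognizers are passed in as plain functions rather than real pocketsphinx calls, and the usage at the bottom uses strings as stand-ins for audio.

```python
from enum import Enum


class Mode(Enum):
    IDLE = "idle"          # only the cheap keyword spotter runs
    COMMAND = "command"    # the next utterance gets full recognition


class WakeWordGate:
    """Runs full recognition only on the utterance that follows a trigger word."""

    def __init__(self, spot_keyword, recognize, on_command):
        self.spot_keyword = spot_keyword   # utterance -> bool (hypothetical hook)
        self.recognize = recognize         # utterance -> text (hypothetical hook)
        self.on_command = on_command       # text -> None, e.g. speak feedback
        self.mode = Mode.IDLE

    def feed(self, utterance):
        """Process one utterance; return recognized text or None."""
        if self.mode is Mode.IDLE:
            if self.spot_keyword(utterance):
                self.mode = Mode.COMMAND
            return None
        # COMMAND mode: do the expensive work once, then go back to idle.
        text = self.recognize(utterance)
        self.mode = Mode.IDLE
        if text:
            self.on_command(text)
        return text


# Usage with string stand-ins for audio.
heard = []
gate = WakeWordGate(
    spot_keyword=lambda u: "beagle" in u,
    recognize=lambda u: u.strip(),
    on_command=heard.append,
)
for utterance in ["some chatter", "hey beagle", "play music", "more chatter"]:
    gate.feed(utterance)
# heard is now ["play music"]
```

Keeping the spotter and the full recognizer behind separate hooks means the expensive path runs only in bursts, which is what keeps the CPU free for other tasks the rest of the time.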