Presenting... Speech-to-Text plugin for TW5!!!

Finn Lancaster

Aug 9, 2021, 9:25:34 AM
to TiddlyWiki
Have you ever wanted a plugin in TW5 to make tiddler-creation easier?

Well... look no further! With the Speech-to-Text-in-TW5 plugin, you can click the Record button in the sidebar, speak into your computer, and a new tiddler with the transcript will be created for you!

The project was my idea (flancast90), and it was the combination of BurningTreeC's TW knowledge and my JS API calls that really made it what it is!

The plugin can be found at https://github.com/flancast90/Speech-To-Text-in-TW5, where documentation is in progress. Please feel free to drop a star if you appreciate our work. 

Lastly, I want to thank everyone involved in this forum and on GitHub. Besides BurningTreeC and me, TW Tones and Joshua Fontay also supported our project early on with their knowledge and ideas.

Enjoy!!

ludwa6

Aug 9, 2021, 9:56:32 AM
to TiddlyWiki
Sounds awesome, @flanc... But how to install? 
The github repo you've linked has a bunch of files; is there one we can just drag&drop into a TW instance, like most other plugins?  

/walt

Finn Lancaster

Aug 9, 2021, 10:09:53 AM
to TiddlyWiki
@ludwa6,

Because the plugin is still (sort of) in its Beta stage, you must load it yourself. The page at https://tiddlywiki.com/dev/static/How%2520to%2520create%2520plugins%2520in%2520the%2520browser.html seems to document how to do this well. 

Another way to load the plugin from the GitHub repo is to set up a Node.js TW instance. Then you can just follow BurningTreeC's instructions at https://github.com/flancast90/Speech-To-Text-in-TW5/issues/1#issuecomment-894827645 and start the Node.js server to get going.

I will mention the drag-and-drop to BurningTreeC. I am extremely inexperienced with TW plugin-creation, which is why I just handled the vanilla JS side of things. I am sure it is something simple, and I'll tell you when I know.

Thanks for the valuable feedback!

ludwa6

Aug 9, 2021, 10:17:38 AM
to TiddlyWiki
Thanks for the quick help, @flanc... But that's a few more steps than i have time for on this lunch break :-) 
Will definitely give this a whirl come weekend, if not sooner. 
Amazing accomplishment, Finn, presuming it actually works -congrats! 

/walt

BurningTreeC

Aug 9, 2021, 10:56:02 AM
to TiddlyWiki

It's probably worth mentioning that the plugin doesn't work in Firefox, because Firefox is missing the required speech recognition API
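
For anyone who wants to check a browser before installing, here is a minimal sketch of a feature test. It assumes the API in question is the Web Speech API's `SpeechRecognition` interface, which Chrome exposes under the prefixed name `webkitSpeechRecognition`. The helper name `getSpeechRecognition` is made up for illustration and is not taken from the plugin.

```javascript
// Return the speech recognition constructor exposed by the given global
// object (normally `window`), or null when the browser offers none.
// `getSpeechRecognition` is a hypothetical helper, not part of the plugin.
function getSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// Stand-in objects so the sketch also runs outside a browser.
class FakeRecognition {}
const chromeLike = { webkitSpeechRecognition: FakeRecognition };
const firefoxLike = {};

console.log(getSpeechRecognition(chromeLike) === FakeRecognition); // true
console.log(getSpeechRecognition(firefoxLike) === null); // true
```

In a real page you would pass `window` and show a "not supported" message when the result is `null`.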

best wishes, BTC

PMario

Aug 9, 2021, 11:21:37 AM
to TiddlyWiki
On Monday, August 9, 2021 at 4:09:53 PM UTC+2 flanc...@gmail.com wrote:
@ludwa6,

Because the plugin is still (sort of) in its Beta stage, you must load it yourself.

Beta with a 1.0.x version number? That's new. You should probably have a closer look at semantic versioning: https://semver.org/
For me it looks more like alpha.
-m

BurningTreeC

Aug 9, 2021, 12:57:32 PM
to TiddlyWiki
Drag&Drop installation can now be done from https://flancast90.github.io/Speech-To-Text-in-TW5/

arun babu

Aug 9, 2021, 1:45:39 PM
to TiddlyWiki
Hi Finn Lancaster and BurningTreeC,

Great work. I tried the plug-in in the Chrome browser and it works fine. One thing I noticed is that if I pause or stop talking for a few seconds, it automatically stops recording. It would be nice if the button could also be used to "stop recording", in addition to its "start recording" function.

ludwa6

Aug 9, 2021, 2:02:30 PM
to TiddlyWiki
Ditto @arunn : works in Chrome for me too, tho not in TiddlyDesktop nor Quine2 -my primary working tools for desktop and mobile, alas- and it would indeed be good to have recording stop by button-push instead of timeout.

In fact it is in mobile UseCase that i am most keen to use this, so if there's any way to get it working on iOS, i will be all over it!

Anyway: that's quite a neat trick you have pulled off here, Finn. Top marks for out-of-the-box thinking!

/walt

Finn Lancaster

Aug 9, 2021, 3:26:07 PM
to TiddlyWiki
ludwa6: I believe it should work on iOS, although I have not done the necessary testing myself to say so. The required API, whose absence keeps the plugin from working in Firefox, is present in Safari, so it should function normally!

Let me know how it goes!

David Gifford

Aug 9, 2021, 3:50:05 PM
to TiddlyWiki
Looking forward to the demo site!

Finn Lancaster

Aug 9, 2021, 5:15:00 PM
to tiddl...@googlegroups.com
@David Gifford,

You should be able to run a test, right in the browser, at the demo site 

From there, you can also drag-and-drop the plug-in, etc. 

Enjoy! 

--
You received this message because you are subscribed to the Google Groups "TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tiddlywiki+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tiddlywiki/27cf09e7-a90a-4cba-9ac9-6ed0633d55b4n%40googlegroups.com.

TW Tones

Aug 9, 2021, 6:06:11 PM
to TiddlyWiki
flanc...

Lovely, works well in Chrome on Windows, as it does natively on Android. This is great.

iOS users should look at Apple's native transcription on iPhone or Mac, or make use of this in Chrome.

I do think the current method (hit the mic and, once you stop talking, get a transcript tiddler) is great for quick notes, and is a required feature.
For me, however, I would like to keep talking indefinitely into a text field. Perhaps a separate microphone button in the editor toolbar could allow this different method,
such that when clicked it listens and transcribes either immediately or after a pause, but re-enters listening mode right away so we can continue.

This plugin and Google's current voice system are so good, even with my Australian accent, that it is a game changer. Finally voice dictation has come of age.

Futures
Now we need to locate the methods by which to issue commands to TiddlyWiki, such as "done" to save and close a tiddler, etc., or issue shortcuts to TiddlyWiki from a voice command. I think this is a separate and distinct function from the transcription. I am thinking of "control s" or "OK tiddlywiki" to accept voice commands. I imagine Google has developed something within the transcription to support this?

Thanks so much for being part of this game changer.
Tones

BurningTreeC

Aug 10, 2021, 1:18:35 AM
to TiddlyWiki
We haven't tested it on iOS, but it should work there when Siri is enabled

BurningTreeC

Aug 10, 2021, 2:40:35 AM
to TiddlyWiki
Hi @Tones

We're currently thinking about how to make such an Editor Toolbar button a reality, one which then updates the text field with the spoken text
It should be possible somehow

Meanwhile, I've released v1.0.2 of the plugin (now that we've already released v1.0.0 we stick with the versioning)
It no longer stops listening after the first result ... it continues listening until either a long timeout has passed or the record button is clicked again
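
A rough sketch of how this kind of behaviour can be set up with the Web Speech API, in case it helps others follow along. The `continuous` and `interimResults` properties, `start()`, `stop()` and the `onend` handler are part of the standard `SpeechRecognition` interface; the helper names and the stand-in object are made up for illustration and are not the plugin's actual code.

```javascript
// Ask the recognizer to keep listening after the first result.
function configureContinuous(recognition) {
  recognition.continuous = true;      // do not stop after the first result
  recognition.interimResults = false; // deliver final results only
  return recognition;
}

// Build a click handler: the first click starts listening, the next stops it.
// Returns true while listening, false otherwise.
function makeRecordToggle(recognition) {
  let listening = false;
  // `end` fires whenever listening stops, including the browser's own timeout.
  recognition.onend = () => {
    listening = false;
  };
  return function onRecordClick() {
    if (listening) {
      recognition.stop();
      listening = false;
    } else {
      recognition.start();
      listening = true;
    }
    return listening;
  };
}

// Stand-in recognizer so the sketch runs outside a browser.
const fake = {
  calls: [],
  start() { this.calls.push("start"); },
  stop() { this.calls.push("stop"); },
};
const toggle = makeRecordToggle(configureContinuous(fake));
toggle(); // starts listening
toggle(); // stops listening
console.log(fake.calls.join(",")); // start,stop
```

Resetting the flag in `onend` matters because the browser can end a session by itself after a long silence; without it the next click would try to stop a recognizer that is no longer running.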

What we could already do is make voice commands a reality
Do you have some ideas for useful commands?

best wishes, BTC

TiddlyTweeter

Aug 10, 2021, 2:54:01 AM
to TiddlyWiki
BurningTreeC wrote...

Meanwhile, I've released v1.0.2 of the plugin (now that we've already released v1.0.0 we stick with the versioning)
It no longer stops listening after the first result ... it continues listening until either a long timeout has passed or the record button is clicked again

Brilliant! Tx BTC. I was about to complain that WAS the only major offput. 
Changing that to continuous until long time-out is really ACE!

I will re-test with this new version and comment more later.

HATS OFF to both you and le Flanc.

Best wishes
TT

TiddlyTweeter

Aug 10, 2021, 4:49:25 AM
to TiddlyWiki
BTC & Flanc

Very timely! 

FYI I been messing around with "speech-to-text" forever, and always ended up Waiting For The Miracle To Come

The miracle is here, now, speech recognition that works. 
That API does the biz.
Your implementation  is neat!

ONE thing is "vocabulary". My acid test on speech recog. is Jabberwocky . The system does well with it ... I get back from  it ...

 >> ... twas brillig and the slithy toves did gyre and gimble in the wabe all mimsy were the borogoves and The Mome Raths outgrabe

That is fantastically accurate!

I'll comment more a bit later, after some more usage.

Very best
TT

TiddlyTweeter

Aug 10, 2021, 5:22:39 AM
to TiddlyWiki
Ciao Flank & BTC

These issues are maybe more API than the plugin? Dunno. But  I thought  I should ask about them ...

1 - PUNCTUATION. Say you're dictating a long diatribe: is there A WAY TO INSERT PUNCTUATION? Like "punc-stop, punc-comma, punc-colon, punc-question"

2 - NEW SENTENCE. Say you're dictating a long diatribe: is there A WAY TO CAPITALIZE THE FIRST LETTER AFTER A STOP? (see 1, above)
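
Both points can be handled as post-processing on the finished transcript, whatever the API does. The following is only a sketch under assumptions: the spoken tokens are the ones suggested above, the recognizer is assumed to return them either hyphenated ("punc-stop") or as two words ("punc stop"), and `applySpokenPunctuation` is a hypothetical name, not a function from the plugin.

```javascript
// Spoken token -> punctuation mark (tokens taken from the suggestion above).
const PUNCTUATION = { stop: ".", comma: ",", colon: ":", question: "?" };

function applySpokenPunctuation(transcript) {
  // Replace "punc-stop" / "punc stop" etc. with the mark, dropping the
  // space in front of it so the mark attaches to the previous word.
  const punctuated = transcript.replace(
    /\s*\bpunc[- ](stop|comma|colon|question)\b/gi,
    (match, name) => PUNCTUATION[name.toLowerCase()]
  );
  // Capitalize the first letter of the text and of each new sentence.
  // Only plain a-z is handled here; accented letters would need more work.
  return punctuated.replace(
    /(^\s*|[.?]\s+)([a-z])/g,
    (match, lead, letter) => lead + letter.toUpperCase()
  );
}

console.log(
  applySpokenPunctuation("hello world punc-stop how are you punc-question")
); // Hello world. How are you?
```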

Minor points, but I think relevant.

GREAT work by you two! This will change my TW usage significantly. 
Kudos!

TT

TiddlyTweeter

Aug 10, 2021, 5:33:25 AM
to TiddlyWiki
Ciao Again BTC & Flank

My one remaining saga is LANGUAGE.

I just can't figure out how to switch from  English to ITALIAN.
I'm UNCLEAR if that is an OS, BROWSER or PLUGIN thing?
I could not get it to work.
I did look at the docs for the API but it's too technical for my brain to work it out.

SO: Quick question, will we be able to change Language Recognized at will in a full-on version?

Just wondering. 
Marvelous stuff, a Wondrous Machine.

Best wishes
TT

ludwa6

Aug 10, 2021, 6:27:45 AM
to TiddlyWiki
I'm not sure @BurningTreeC / @Flanc if the misunderstanding about iOS is on my side or yours, but the problem w/ using this plugin on that platform is not about voice (Siri works fine), but rather about file saving & sync; in that respect, neither Safari nor Chrome on iOS can do the job, for reasons i don't entirely understand, but will tell you what i know.

There are many cloud-based apps on iOS that use Siri, which you can then access from your desktop, and always be working on the same copy... But this is not possible with TW, so what's needed is a files-based solution that will sync to desktop via some cloud service, like iCloud or Dropbox.

To that end: the only way i know how to do this is via the "Quine 2" app (a free download from the AppStore), which is purpose-built to do exactly this -and it does so beautifully.  Indeed: to enable mobile access to the full range of different TW instances i am managing -each with its own unique config of plugins/ theme/ templates- i've got the whole lot in a Quine folder on my iOS desktop volume, and they all work in Quine 2 just as nicely as they do in TiddlyDesktop (my power-tool for administrating these instances) -with the exception of this Speech-to-Text plugin, alas, which fails in a different way on each, i.e.:
  • In TiddlyDesktop, the mic icon lights up red and says "Recording started," but then goes dark and says "Stopped recording" in <1sec, w/o creating a new tiddler.
  • In Quine 2, the mic icon turns from grey to black when i click it, but nothing else happens after that.
Now i can't begin to guess why this doesn't "just work" the way all these other plugins do in Quine 2, but i can tell you that this app has a sizeable and quite active user community, in which developer Chris Hunt does a great job of responding to any issues that arise -and i would guess that he'd be as interested as any of us here in the idea of bringing dictation capability to Quine 2.

Tho i guess Android probably has more "pocket-share" than iOS in this TW community, let's face it: there's a whole lotta iPhone users out there for whom a mobile NoteTaking/ NoteMaking app WITH voice dictation capability would be a game changer indeed...So i really hope there's an easy way to get this wonderful plugin working in Quine 2! 

/walt

Aug 10, 2021, 6:32:13 AM
to TiddlyWiki
Hi all and thanks Flank & BTC for this extremely promising plugin!

Another few ideas and comments:

(1) It would be nice to have a record button directly in the editor and not only have to rely on creating new transcript tiddlers.
(2) Automatic punctuation should be easy to switch on and off as per https://cloud.google.com/speech-to-text/docs/automatic-punctuation.
(3) There are quite a number of languages and accents available on the Web Speech API demo. It would be nice to be able to set the languageCode parameter manually (full list at: https://cloud.google.com/speech-to-text/docs/languages). When this happens, could you make sure that custom record buttons can be added using only wikitext to allow multilingual users to have several record buttons based on their own needs?
(5) It could be worth mentioning in the readme that Google's Web Speech API demo at https://www.google.com/intl/en/chrome/demos/speech.html is a great way of checking for browser compatibility. I haven't found anything that works on my Linux system by the way. Any ideas as Firefox is unsupported and even Chromium strangely doesn't seem to be able to run the API on my system (V. 91.0.4472.114 on Linux Mint)? On my phone, Quinoid V1.0 doesn't work either :(
(6) Bouncing on TT's idea of using speech-driven commands while recording, this would make terrific sense not only for punctuation, but also to enunciate proper nouns for instance, or to switch languages on the fly within a given recording or prior to a recording ("switchtoItalian"…).
(7) I'm still running into problems regarding the short hearing span on the plugin's TW demo despite the version being seemingly v. 1.0.3.

Thanks :)

Finn Lancaster

Aug 10, 2021, 8:57:36 AM
to tiddl...@googlegroups.com
Thanks everyone for the great advice! 

My main takeaways for what I will start working on are commands: punctuation, language switch, etc. 

As this is on the side of the things I develop for the plug-in, I will set to work immediately reading documentation and example code. 

To get more technical, I believe I can just attach a function that checks the last word of the transcription and, if it matches one of a set array of words, executes a command.
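
A minimal sketch of that last-word check, for discussion. Everything named here is hypothetical: the command words, the action names and the `extractTrailingCommand` helper are placeholders, and an object lookup is used in place of a plain array so each word can carry its own action.

```javascript
// Hypothetical command table: spoken word -> action name.
const COMMANDS = { save: "save-tiddler", close: "close-tiddler" };

// If the last word of the transcript is a known command, return the
// command together with the transcript minus that word.
function extractTrailingCommand(transcript, commands) {
  const words = transcript.trim().split(/\s+/).filter(Boolean);
  const lastWord =
    words.length > 0 ? words[words.length - 1].toLowerCase() : "";
  if (Object.prototype.hasOwnProperty.call(commands, lastWord)) {
    return { text: words.slice(0, -1).join(" "), command: commands[lastWord] };
  }
  return { text: words.join(" "), command: null };
}

const result = extractTrailingCommand("buy milk tomorrow save", COMMANDS);
console.log(result.text);    // buy milk tomorrow
console.log(result.command); // save-tiddler
```

Checking only the final word keeps ordinary dictation safe: "please save the whales" contains a command word but does not end with one, so it passes through as plain text.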

BTC and I continue to talk on the Issues page of the repo. If any other devs have ideas or implementation suggestions, feel free to chime in there.

Thanks again! 

TiddlyTweeter

Aug 10, 2021, 8:57:54 AM
to TiddlyWiki
Ciao Flank & BTC

Underlying my previous on ITALIANO issues ... 

It would be great if I could sing in the lyrics for Adriano Celentano's MA PERKE ... starting ...

  Su confessa amore mio
  Io non sono più il solo, l'unico
  Hai nascosto nel cuore tuo
  Una storia irrinunciabile
  Io non sono più il tuo pensiero
  Non sono più il tuo amore vero
  Sono il dolce con fondo amaro
  Che non mangi più ...

Confesso di richieste irragionevoli. (I confess to unreasonable requests.)
TT


TiddlyTweeter

Aug 10, 2021, 9:21:18 AM
to TiddlyWiki
flanc ...

I'm NOT a dev. 

The thing you done already is brilliant!!
In practice PUNCTUATION matters most to me, to notch it up. After that, being recognized in the Italian LANGUAGE.
Anything more is more icing on the cake. 

NB: Having it in editor is low priority for me. 
It is Good Enough Already transcribing the lingo per me (for me) to a simple Tiddler.

Just a comment.

Best wishes
TT

Télumire

Aug 10, 2021, 1:24:18 PM
to TiddlyWiki
Hello @Flanc,

I just tested the demo and it works really well with French (in Chrome; it didn't work in Firefox). Being able to see the live transcript would be even better (to check that the recording is working properly), but this is already awesome. Great work! :)

TW Tones

Aug 10, 2021, 6:49:32 PM
to TiddlyWiki
Flanc

I think a design consideration is how we trigger a more involved "command" in TiddlyWiki. Perhaps we should leverage the existing keyboard shortcuts, such that one can speak, e.g., "alt-space" and the user can assign whatever they wish in response to that, i.e. trigger a set of actions. Of course, in time "close tiddler" etc. would be nice, but I think the key is the escape process. Let's say one can happily dictate anything except "OK TiddlyWiki", in which case, rather than dictating into the current text field, a dialogue opens in which the same dictation can search for any TiddlyWiki-designed terms, e.g. "OK TiddlyWiki, new tiddler" (which you see on screen).

We have a way to detect what I call the focused tiddler eg;  {{$:/HistoryList!!current-tiddler}}, for which such "ok tiddlywiki" can be designed to use as currentTiddler.

The TiddlyWiki command plugin and others may be a short cut for us, to jump into, to move from dictation to commands. One could imagine where the user can assign what action or keyboard shortcut could be used as a result of "OK tiddlywiki". Only later we could see how to get the commands recognised in the middle of dictation, but I expect we still need this escape method.

Regards
Tony

Finn Lancaster

Aug 10, 2021, 6:53:42 PM
to TiddlyWiki
@Tones, way ahead of you :) Already added command functionality for language switching in new release. Updating Github Pages so drag and drop is on correct version right now!

TW Tones

Aug 10, 2021, 6:57:15 PM
to TiddlyWiki
Flanc,

I was not after language switching, but are you suggesting we change from text dictation to a TiddlyWiki command using the same mechanism?

Tones

Finn Lancaster

Aug 10, 2021, 7:10:24 PM
to TiddlyWiki
Something along those lines. My next goal, as I've shared with BTC, is to allow users to assign their own commands via tiddler inputs; when they speak these commands out loud via the plugin, the plugin then executes them. I've started a minimal implementation of this with the new release (GH Pages is acting up, so drag and drop may not be available for a while), which allows users to say "change the language to X", and the language will change if available. All that's still needed is the custom-input commands :)
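
For readers curious how a spoken "change the language to X" could be turned into a setting, here is a rough sketch. It is not the code from the release: the language table, `parseLanguageCommand` and `applyLanguageCommand` are hypothetical. The one real piece is the `lang` property of the Web Speech API's `SpeechRecognition` object, which takes a BCP 47 language tag.

```javascript
// Hypothetical table: spoken language name -> BCP 47 language tag.
const LANGUAGE_TAGS = { english: "en-US", italian: "it-IT", french: "fr-FR" };

// Return the language tag requested by the transcript, or null when the
// transcript is not a language command or the language is not in the table.
function parseLanguageCommand(transcript) {
  const match = /change the language to\s+([a-z]+)\s*$/i.exec(
    transcript.trim()
  );
  if (!match) {
    return null;
  }
  const name = match[1].toLowerCase();
  return Object.prototype.hasOwnProperty.call(LANGUAGE_TAGS, name)
    ? LANGUAGE_TAGS[name]
    : null;
}

// Apply the command to a recognition object; returns true if it changed.
function applyLanguageCommand(recognition, transcript) {
  const tag = parseLanguageCommand(transcript);
  if (tag === null) {
    return false;
  }
  recognition.lang = tag;
  return true;
}

console.log(parseLanguageCommand("Change the language to Italian")); // it-IT
```

Languages missing from the table are ignored, which matches the "if available" behaviour described above.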

Frédéric Demers

Aug 10, 2021, 9:57:31 PM
to TiddlyWiki
Thanks for the speech-to-text plugin, quite impressive from my early trials (Chrome). As a couple of people have suggested, I think an editor toolbar button/keyboard shortcut to insert the transcript text into the current tiddler's text field (where the cursor is?) would be neat and would be best for me. I'm happy to try modding it myself unless you've already got plans.

Finn Lancaster

Aug 10, 2021, 10:01:54 PM
to TiddlyWiki
Fred: Already in-progress. Also, I just made a new thread on oral commands with the plugin in the new update. I am interested to hear what you have to say about it, and would love someone to get a conversation going there so others see and give more feedback.
