Working on an HTML to TW5 text converter...


Mister Mochi

Mar 26, 2021, 2:50:05 AM
to TiddlyWiki
Hello everyone! Another newbie posting.

I have been working on an HTML to TW5 wikitext converter. I use TW5 as a digital notebook for my ever-increasing pile of notes from my studies. I have been trying to convert my old MS Word-based notes into TW5.

I chose wikitext over directly embedding HTML code because TW5 wikitext is much more lightweight and can greatly reduce the size of the notebook.

I don't want to use markdown either because TW5 allows for much more flexible formats, e.g. merged table cells and nested ordered and unordered lists.

I have taken inspiration from a project called Turndown (an HTML > MD converter) and am adapting it to the TiddlyWiki format. I have also thrown in a WYSIWYG editor. (Yes, I know many people have issues with WYSIWYG... but I really need a helping hand with copying and pasting from MS Word.)

So far I'm making some progress and I have been breathing fresh air into my stale pile of MS Word notes.

I need some advice on:
  1. Is there a similar project out there? I have tried to search for a similar tool but have found none so far.
  2. Are there any edge cases in wikitext (e.g. nesting blockquotes in bullets... that kind of stuff) that you think are important for such a tool to be able to handle?
Thank you! :) I hope I can continue to improve this tool and hopefully post it on Github within a few months' time.

Screenshot 2021-03-26 at 2.13.32 PM.png

Saq Imtiaz

Mar 26, 2021, 4:44:36 AM
to TiddlyWiki
I posted a demo a while back exploring TW5 -> html -> TW5 syntax conversion:

It may provide some ideas or inspiration.

IIRC I also used a very quick hack of Turndown for the HTML to wiki syntax conversion step. From what I remember it works reasonably well for most text formatting. Widgets and macros are of course another story, but that won't matter if you are only interested in converting HTML to wiki syntax and not attempting a round-trip conversion.

PMario

Mar 26, 2021, 6:21:09 AM
to TiddlyWiki
On Friday, March 26, 2021 at 7:50:05 AM UTC+1 mochit...@gmail.com wrote:
Hello everyone! Another newbie posting.

Hi and Welcome!
 
I have been working on an HTML to TW5 wikitext converter. I use TW5 as a digital notebook for my ever-increasing pile of notes from my studies. I have been trying to convert my old MS Word-based notes into TW5.

Great!
If you post a link to your project, I think you can become a community hero ;)

I chose wikitext over directly embedding HTML code because TW5 wikitext is much more lightweight and can greatly reduce the size of the notebook.

+1 on this
 
I don't want to use markdown either because TW5 allows for much more flexible formats, e.g. merged table cells and nested ordered and unordered lists.

Nice to hear that!
 
So far I'm making some progress and I have been breathing fresh air into my stale pile of MS Word notes.

Good!
 
I need some advice on:
  1. Is there a similar project out there? I have tried to search for a similar tool but have found none so far.
  2. Are there any edge cases in wikitext (e.g. nesting blockquotes in bullets... that kind of stuff) that you think are important for such a tool to be able to handle?
I think you can deal with edge cases once they come up. ... Release early and often ;)
 
Thank you! :) I hope I can continue to improve this tool and hopefully post it on Github within a few months' time.

A few months ... Come on. ... You can't launch a teaser with an image like that and let us wait for months. ...
 

Screenshot 2021-03-26 at 2.13.32 PM.png

It looks good. ... I want to get my hands on it q:-)

-mario

Saq Imtiaz

Mar 26, 2021, 6:35:36 AM
to TiddlyWiki
@pmario I could be wrong, but I am guessing this is a standalone tool and not something integrated into TiddlyWiki and the editor mechanism.

See the original Turndown demo: https://domchristie.github.io/turndown/

I'll be very pleased if I am wrong about that though :)

Regards,
Saq

Mark S.

Mar 26, 2021, 10:01:24 AM
to TiddlyWiki
Thanks to working on the aggregator, I've become aware of a lot of semi-forgotten projects. There have been a fair number of slideshow presentations, for one.

For HTML conversion, there has been this:


I wrote a converter for use with Tiddlyclip, but I don't know if it was ever incorporated.

What I like about markdown is that it is a true standard that is supported by pandoc and other utilities and is used throughout the internet. It will likely be here in 30 years. Wikitext, on the other hand, changed between TWC and TW5 and might change again with TWx.

The new markdown plugin lets you use some wikitext, so you can get the best of both worlds.

Those discussions about which markup is better remind me of video tape enthusiasts clinging to their Beta tapes because the format was so much better than VHS. Of course I'm dating myself just by mentioning tape.

PMario

Mar 27, 2021, 8:03:31 AM
to TiddlyWiki
On Friday, March 26, 2021 at 11:35:36 AM UTC+1 Saq Imtiaz wrote:
@pmario I could be wrong, but I am guessing this is a standalone tool and not something integrated into TiddlyWiki and the editor mechanism.

Yeah, ... Nonetheless it would be killer to throw some HTML content into the left side and get TW text on the right side. ... right?
 
See the original Turndown demo: https://domchristie.github.io/turndown/

I'll be very pleased if I am wrong about that though :)

Saq, as far as I can see, your link to the GitHub repo only contains HTML and JavaScript. ... I'm interested in the code that does the conversion. ... As long as it's JavaScript it can be included into TW.

I hope the code in the OP-repo is JS too. ... If not, it stays in a SPA ... that's also OK.

Just my thoughts.
-mario

PMario

Mar 27, 2021, 8:04:20 AM
to TiddlyWiki
On Friday, March 26, 2021 at 3:01:24 PM UTC+1 Mark S. wrote:
Thanks to working on the aggregator, I've become aware of a lot of semi-forgotten projects. There have been a fair number of slideshow presentations, for one.

For HTML conversion, there has been this:


Cool. I may have missed that. ... Or I forgot about it ;)
-m

Saq Imtiaz

Mar 27, 2021, 9:37:59 AM
to TiddlyWiki
@pmario OP uses the same JavaScript library (turndown) to achieve the HTML to markdown/wikitext conversion as my demo posted above.

Turndown by default converts to Markdown, but it is relatively simple to override or add new rules for wikitext. It would indeed be nice to collaborate on getting a complete set of rules for all supported wikitext syntax.
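
As a rough sketch of what such rule overrides might look like: Turndown's `addRule(name, rule)` takes an object with a `filter` and a `replacement(content, node, options)` function. The `wikitextRules` object and the rule names below are made up for illustration and are not taken from either project. The replacements are plain functions, so they can be tried without loading Turndown.

```javascript
// Hypothetical rule set: the names and grouping are illustrative only.
// Each entry has the { filter, replacement } shape that Turndown's addRule() accepts.
const wikitextRules = {
  bold: {
    filter: ['strong', 'b'],
    replacement: (content) => "''" + content + "''",
  },
  italic: {
    filter: ['em', 'i'],
    replacement: (content) => '//' + content + '//',
  },
  underline: {
    filter: ['u'],
    replacement: (content) => '__' + content + '__',
  },
  strikethrough: {
    filter: ['del', 's', 'strike'],
    replacement: (content) => '~~' + content + '~~',
  },
  heading: {
    filter: ['h1', 'h2', 'h3', 'h4', 'h5', 'h6'],
    // node.nodeName is "H1".."H6"; the digit gives the number of "!" markers.
    replacement: (content, node) =>
      '\n\n' + '!'.repeat(Number(node.nodeName.charAt(1))) + ' ' + content + '\n\n',
  },
};

// With Turndown loaded, registration would look roughly like:
//   const service = new TurndownService();
//   Object.entries(wikitextRules).forEach(([name, rule]) => service.addRule(name, rule));
```

Rules added this way take precedence over Turndown's built-in Markdown rules, so the Markdown output for the same elements is replaced rather than duplicated.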

This particular demo escapes HTML entered into the editor as part of a round trip conversion process. If that escaping was turned off, it would save HTML pasted into the editor as wikitext.

Hope this helps while we wait to hear back from the OP :)

Mister Mochi

Mar 28, 2021, 1:17:18 AM
to TiddlyWiki
Thank you all so much for the overwhelming response!!!

1. Yes, I am using JavaScript for my codebase, as it extends the behavior of Turndown.

2. Thank you very much for the suggestions and references. The project by @saqimtiaz certainly showcases what my project could turn out to be. Sadly I am not planning on implementing a full-featured round-trip converter yet, because there would be more TiddlyWiki guts on my hands (e.g. macros, widgets...) than I could handle.

3. I have started a code base on GitHub https://github.com/mistermochi/HTML2TW5 for all of your curious eyes and eager hands (LOL). Join in the conversation if you have any suggestions, ideas, or code to offer!

I will continue to post updates in this thread should I make notable progress on the project.

Mister Mochi

Apr 2, 2021, 3:38:05 PM
to TiddlyWiki
Hello everyone, here are some updates to the converter project:

Most of the standard TiddlyWiki syntax has been implemented! This includes major elements like inline styles, lists, headings, images, links, block quotes, code blocks, definitions, tables...

Based on my brief tests, most of the output follows the syntax in the TiddlyWiki documentation.

One major problem is dealing with "merged" table cells, where "merging into a merged cell" results in unexpected behavior (as illustrated in this issue: https://github.com/mistermochi/HTML2TW5/issues/10).

Still figuring out how TiddlyWiki handles these edge cases...
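
For reference, TW5 marks an absorbed cell with `<` (merge with the cell to its left), `>` (merge with the cell to its right) or `~` (merge with the cell above). Below is a minimal sketch of mapping `colspan` and `rowspan` onto those markers. It assumes a simplified input (rows of `{ text, colspan, rowspan }` objects) and a hypothetical `tableToWikitext` helper, and is not the converter's actual code. Cells that span both rows and columns are deliberately left out, since how the markers should combine there is not settled in this thread.

```javascript
// Hypothetical helper: rows is an array of rows, each an array of
// { text, colspan, rowspan } cells (a simplified stand-in for <td> elements).
function tableToWikitext(rows) {
  const grid = []; // grid[r][c] holds the final text of each TW5 cell
  const cellsOf = (r) => (grid[r] = grid[r] || []);

  rows.forEach((row, r) => {
    let c = 0;
    row.forEach(({ text, colspan = 1, rowspan = 1 }) => {
      if (colspan > 1 && rowspan > 1) {
        // The combined case is the open question; this sketch does not guess.
        throw new Error('cells spanning both rows and columns are not handled');
      }
      // Skip slots already claimed by a rowspan from an earlier row.
      while (cellsOf(r)[c] !== undefined) c++;
      cellsOf(r)[c] = text;
      for (let k = 1; k < colspan; k++) cellsOf(r)[c + k] = '<'; // merge with left
      for (let j = 1; j < rowspan; j++) cellsOf(r + j)[c] = '~'; // merge with above
      c += colspan;
    });
  });

  return grid.map((cells) => '|' + cells.join('|') + '|').join('\n');
}

// Usage: a 3x3 table with one colspan and one rowspan.
const wikitextTable = tableToWikitext([
  [{ text: 'A', colspan: 2 }, { text: 'B' }],
  [{ text: 'C', rowspan: 2 }, { text: 'D' }, { text: 'E' }],
  [{ text: 'F' }, { text: 'G' }],
]);
```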

Otherwise please feel free to test out the converter and let me know if you run into any issues!

Mark S.

Apr 2, 2021, 4:05:19 PM
to TiddlyWiki
In the case where you have a link around an external image, it tries to do something like this:

[ext[[img class="circle-img" [https ...

Unfortunately, AFAIK this is not possible. In any event, it didn't render for me. I think you would either have to leave the items as HTML, or put the external link either just before or just after the image.
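
A minimal sketch of the second option (emit the image, then put the link right after it), assuming a hypothetical `linkedImageToWikitext` helper that receives values already read from the `<a>` and `<img>` elements. The `[img class="..." [src]]` and `[ext[label|url]]` forms are regular TW5 syntax; the helper name, its parameters and the fallback label are assumptions.

```javascript
// Hypothetical helper: href comes from the <a>, src/className/alt from the nested <img>.
function linkedImageToWikitext({ href, src, className = '', alt = '' }) {
  const classAttr = className ? ' class="' + className + '" ' : '';
  const image = '[img' + classAttr + '[' + src + ']]';
  // Fall back to the URL as the link label when the image has no alt text.
  const link = '[ext[' + (alt || href) + '|' + href + ']]';
  return image + ' ' + link;
}

// Usage with placeholder URLs:
const linkedImage = linkedImageToWikitext({
  href: 'https://example.com',
  src: 'https://example.com/a.png',
  className: 'circle-img',
});
```

The image is no longer clickable this way, but both pieces stay as wikitext instead of falling back to raw HTML.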