How about the searching speed on 50,000 articles?

Kathir J

Jul 30, 2014, 4:47:54 AM
to tiddl...@googlegroups.com
Hi,

I have already posted this question at http://stackoverflow.com/questions/25030572/50-000-articles-parsing-html-text-xml-faster-search-results. This is not meant as a double post; I want to check whether TiddlyWiki is feasible and fast enough for this.

I have a huge collection of articles that could be converted into TiddlyWiki pages.

Currently, for example:

1947 - 100 1948 - 200 ... 2014....

Each page holds a large amount of text, for example 250 to 500 KB.

I tried the Zim portable edition and added all the pages and sub-pages, but the search was really slow to return results, even though it was reading from a local directory and indexing was enabled.

I need to search and display the results in less than 5 seconds.

Example search keywords:

  1. going
  2. going away
  3. passing over

Clarifications needed:

1. How fast is TiddlyWiki?
2. How can I get the results within the expected time frame?

Thanks.

stefan infp

Jul 30, 2014, 12:40:32 PM
to tiddl...@googlegroups.com

Interesting question. For such a large amount of information I think MediaWiki would be good, but it is not portable and the local server version is slow, so that option fails. (I saw an older portable version of MediaWiki here but didn't really test it: http://framakey.org/WebApp/MediaWikiPortable). I'm also looking for the kind of software you want.

I tried OneNote, which is very powerful and supports managing very large databases, but its tag system is not as flexible and practical as TiddlyWiki's (it gives me tons of articles in search results and cannot categorize them as I want). Evernote has multiple useful viewing modes and an advanced search feature, but it is limited in note size, attachments and other respects. Online note-taking apps are not good for a database as big as yours (slow, lacking features, limited). I tried TagSpaces, which is a great document organizer, viewer, editor and tagger, but it cannot yet search inside documents (it also has a non-free Android version).

I found TiddlyWiki a very good solution (other personal wiki apps are too complex, not customizable enough, slow, or dead in development), but TW5 doesn't support the include plugin for searching the content of multiple files and can't manage such a large amount of information in a single file. The Node version is great but not flexible enough for me.

I put my data in multiple TiddlyWiki files by category and use a single HTML file containing links to each TiddlyWiki file (these are embedded with the iframe tag). This way I can organize my data well, avoid overloading a single TiddlyWiki file, copy the entire database to mobiles and tablets easily, and use a file indexer to search my whole database for the desired article. I'm still searching for better solutions. So far this is the best I know.
Stefan

Jeremy Ruston

Jul 30, 2014, 12:52:06 PM
to TiddlyWiki
Hi Kathir

I think it's safe to say that TiddlyWiki is probably not presently a good choice for a scenario with 50,000 tiddlers of 250 to 500 KB each. That volume of content would strain browsers to breaking point.
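
For a rough sense of scale, the arithmetic behind that volume can be sketched as follows. The article count and per-article sizes come from the original question; treating 1 KB as 1,000 bytes and the helper name `total_gb` are assumptions made only for this illustration.

```python
# Back-of-the-envelope size estimate for the collection described in this thread.
# Assumption: 1 KB is treated as 1,000 bytes and 1 GB as 10**9 bytes.
ARTICLES = 50_000
KB = 1_000


def total_gb(size_kb: int, count: int = ARTICLES) -> float:
    """Return the total collection size in gigabytes for `count` articles."""
    return size_kb * KB * count / 1e9


low = total_gb(250)   # every article at the lower bound
high = total_gb(500)  # every article at the upper bound
print(f"Estimated collection size: {low:.1f} GB to {high:.1f} GB")
```

Under these assumptions the collection is on the order of tens of gigabytes of text, which is far more than a browser would normally be expected to hold in a single page.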

One could imagine some of the improvements that would be necessary to work well in that kind of scenario: we'd need external tiddler storage, and would probably need to precompute search indices to make full-text searching work properly. Those kinds of changes are certainly on the roadmap but probably won't happen soon.
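
To illustrate what a precomputed search index means in general, here is a minimal sketch of a positional inverted index in Python. It is not TiddlyWiki code and not how TiddlyWiki searches; the function names (`tokenize`, `build_index`, `search`), the article titles and the sample texts are all hypothetical. The index is built once, and each query then only looks up the query words instead of scanning every article.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Precompute a mapping: word -> {title: [word positions in that article]}."""
    index = defaultdict(dict)
    for title, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(title, []).append(pos)
    return index


def search(index, query):
    """Return sorted titles that contain the query words as a consecutive phrase."""
    words = tokenize(query)
    if not words or any(w not in index for w in words):
        return []
    # Only articles containing every query word can match.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])
    hits = []
    for title in candidates:
        later = [set(index[w][title]) for w in words[1:]]
        # A phrase matches if each following word sits right after the previous one.
        if any(
            all(start + i + 1 in positions for i, positions in enumerate(later))
            for start in index[words[0]][title]
        ):
            hits.append(title)
    return sorted(hits)


# Hypothetical sample articles, keyed by made-up titles.
docs = {
    "1947/001": "They were going away before the storm passed.",
    "1948/017": "Going home is not the same as passing over the bridge.",
    "2014/003": "He kept going, and the years kept passing.",
}
index = build_index(docs)

print(search(index, "going"))
print(search(index, "going away"))
print(search(index, "passing over"))
```

In a real setup the index would be built ahead of time and stored alongside the articles, so that only the index needs to be consulted at query time. Dedicated engines such as Lucene or Xapian do this far more efficiently than this sketch.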

Best wishes

Jeremy

--
You received this message because you are subscribed to the Google Groups "TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tiddlywiki+...@googlegroups.com.
To post to this group, send email to tiddl...@googlegroups.com.
Visit this group at http://groups.google.com/group/tiddlywiki.
For more options, visit https://groups.google.com/d/optout.



--
Jeremy Ruston
mailto:jeremy...@gmail.com

Matabele

Jul 30, 2014, 3:21:12 PM
to tiddl...@googlegroups.com
Hi

Perhaps try putting your documents into CouchDB with an assist from a full text search engine.

A project of this sort is here: https://github.com/rnewson/couchdb-lucene

regards

Kathir J

Jul 31, 2014, 8:34:25 AM
to tiddl...@googlegroups.com

How do I put the documents into CouchDB and access them? Does it support huge files? What is the client?

Kathir J

Jul 31, 2014, 8:36:09 AM
to tiddl...@googlegroups.com, jeremy...@gmail.com

How many tiddlers can TiddlyWiki support, and how large can each file be?

Kathir J

Jul 31, 2014, 8:37:09 AM
to tiddl...@googlegroups.com

Can MediaWiki handle huge files as described above? How long will it take to return the search results? How quick will it be?

Matabele

Jul 31, 2014, 9:18:55 AM
to tiddl...@googlegroups.com
Hi

If you are looking for something fast with a GUI, have a look at Recoll (it uses the Xapian index database). This is my preference as it is actively maintained and comes with a decent manual (the Beagle project appears to have been abandoned).

If you want something portable with a GUI, have a look at DocFetcher (it uses Lucene).

http://alternativeto.net/software/recoll/

There are numerous server side solutions -- a few listed here.

http://alternativeto.net/software/elasticsearch/


regards


Kathir J

Jul 31, 2014, 10:31:51 AM
to tiddl...@googlegroups.com
I am using Windows and am looking for a tool that works regardless of the OS, at least on Linux and Windows.

Thanks for the details.


Matabele

Jul 31, 2014, 11:35:45 AM
to tiddl...@googlegroups.com
Hi

Then go with DocFetcher -- it's Java and entirely portable -- the portable version can be run off the same portable storage as your files.

http://docfetcher.sourceforge.net/en/index.html

regards

Danielo Rodríguez

Aug 1, 2014, 6:18:22 AM
to tiddl...@googlegroups.com
Thank you, Matabele, for your software recommendations. I always take something interesting from your posts.


Stephan Hradek

Aug 2, 2014, 2:35:44 AM
to tiddl...@googlegroups.com, jeremy...@gmail.com


On Thursday, July 31, 2014 14:36:09 UTC+2, Kathir J wrote:

How many tiddlers can TiddlyWiki support, and how large can each file be?

It depends on the browser, OS and computer.

cangaroo joe

Aug 2, 2014, 4:37:36 AM
to tiddl...@googlegroups.com
Hi,
You can try the portable version of WordPress (http://www.instantwp.com/), which has tons of plugins. You can embed your documents into the WordPress database and use it as a portable application from your USB stick, external hard disk, etc.

PMario

Aug 2, 2014, 8:16:16 AM
to tiddl...@googlegroups.com, jeremy...@gmail.com
On Thursday, July 31, 2014 2:36:09 PM UTC+2, Kathir J wrote:

How many tiddlers can TiddlyWiki support, and how large can each file be?

I ran several tests with large numbers of tiddlers: 20'000 and 100'000.

See:
https://groups.google.com/d/msg/tiddlywiki/E9DMg4ZTccw/P1qjS_-HOjAJ ... initial test code
https://groups.google.com/d/msg/tiddlywiki/-82VBw5GRvk/AE2aYh--o_IJ ... 20'000 tiddler test
https://groups.google.com/d/msg/tiddlywiki/sH4EZukozdM/PM4P3Iy_iWoJ ... 100'000 tiddler test -> will probably break every browser

-mario

Kathir J

Aug 2, 2014, 2:26:39 PM
to tiddl...@googlegroups.com
Thanks a lot for the info, cangaroo joe.



Kathir J

Aug 2, 2014, 2:29:33 PM
to tiddl...@googlegroups.com

If TiddlyWiki becomes able to handle 50,000 articles of 800 KB each in the future, please let me know. I like TiddlyWiki, as it also makes taking backups easy.


