An Offline Desktop Wiktionary


arunt...@gmail.com

Nov 22, 2011, 10:29:14 AM
to freetamil...@googlegroups.com
Hello All,

I have moved the ownership of my tawiktionary-offline project to the thamizha organisation on GitHub. It's built using the wxPython GUI library; if anyone wishes to improve it, feel free to do so. Kindly clone the repo and make a new branch for yourself, named after the library you are working on. If you want to build upon the existing code, fork it as a personal copy and send a pull request. This way we will always stay in sync and not lose commits and changes.

The open issues in the present code are listed here: https://github.com/thamizha/tawiktionary-offline/issues/

Do whatever you wish with the code. As a personal request, kindly keep the name of the final piece of software as Karthika :)
 

--
Regards
P.Arunmozhi
Twitter: @tecoholic
Website: http://arunmozhi.in

Muguntharaj Subramanian

Nov 23, 2011, 11:41:12 PM
to freetamil...@googlegroups.com
Thanks, Arunmozhi, for making this available under thamizha.

This is going to be a very useful project for the Tamil community if delivered the right way.

If anyone wishes to work on this, please give a shout.

1. As a first step, I hope we can make this available in its present form to Windows/Linux/Mac users. The following tasks may be involved:
 - Create an installer package for all three platforms.
 - Improve the GUI. Colours and logos need to be designed; designers among us can help with this.
 - Create a facility to update the database with the latest Wiktionary content whenever the user wishes.
 - Find volunteers to test this and to produce documentation wikis, help videos, etc.

2. Later we can focus on making it available on mobile platforms. For this we might need to port it to some other technology, preferably one using HTML5/JavaScript, like PhoneGap.

Regarding the name: yes, we will keep the name Karthika (in future we may prefix it with Thamizha to be consistent with other thamizha projects; Arunmozhi, I hope you don't mind that).

Friends,
Have

--
You received this message because you are subscribed to the Google Groups "ThamiZha! - Free Tamil Computing(FTC)" group.
To post to this group, send an email to freetamil...@googlegroups.com.
To unsubscribe from this group, send email to freetamilcomput...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/freetamilcomputing?hl=en-GB.



--
Blog: http://mugunth.blogspot.com
Follow me @ http://twitter.com/mugunth

Srikanth Lakshmanan

Nov 24, 2011, 5:20:03 AM
to freetamil...@googlegroups.com
On Thu, Nov 24, 2011 at 10:11, Muguntharaj Subramanian <mug...@gmail.com> wrote:

> 2. Later we can focus on making it available in mobile platforms - for this we might need to port it to some other technology preferably those using html5/javascript like phonegap.

There is no harm in doing the above, but the WMF mobile team would give us that support anyway. Anyone who is interested should probably help the base effort to improve Wikimedia sites on mobile, which would do a lot more good while achieving the same goal.
 
--
Regards
Srikanth.L

Muguntharaj Subramanian

Nov 24, 2011, 6:25:11 AM
to freetamil...@googlegroups.com

Hi Srikanth,
What we are attempting is to create an offline dictionary whose data is taken from Wiktionary (not online access to Wiktionary). Does the WMF mobile team work on an offline dictionary app, or on mobile access to Wikimedia sites?

Please send links where I can find more info on the WMF mobile team.

Regards,
Mugunth



 
 

த*உழவன்

Nov 24, 2011, 2:00:30 PM
to ThamiZha! - Free Tamil Computing(FTC)
The Tamil Wiktionary can be downloaded from the link below and used offline.
http://dumps.wikimedia.org/tawiktionary/20110429/
Further notes about it can be found on this other web page:
http://en.wikipedia.org/wiki/Wikipedia:Database_download

I would like to know how your effort differs from the above.
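
For readers wondering what using such a dump involves, here is a minimal Python 3 sketch that walks a MediaWiki XML export and pulls out each page's title and wikitext using only the standard library. The `iter_pages` and `local_name` helpers, the namespace version and the two sample pages are assumptions made up for illustration; this is not the code of any project discussed in this thread.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_file):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export."""
    title, text = None, ""
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the children of pages already processed


# Tiny hand-written stand-in for a real dump file (hypothetical content).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/">
  <page><title>அம்மா</title><revision><text>mother</text></revision></page>
  <page><title>book</title><revision><text>புத்தகம்</text></revision></page>
</mediawiki>""".encode("utf-8")

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for title, text in pages:
    print(title, "->", text)
```

For a real dump, an opened file (for example via `bz2.open` on the downloaded archive) would take the place of the in-memory sample. Reading the file incrementally with `iterparse` avoids building the whole document as one tree before any page can be used.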

Yours,
--த.உ.


Srikanth Lakshmanan

Nov 24, 2011, 10:47:16 PM
to freetamil...@googlegroups.com
> Hi Srikanth,
> What we are attempting is to create an offline dictionary for which the data is taken from wiktionary (not online access to wiktionary).  Does WMF mobile team works on offline dictionary app or works on mobile access wikimedia sites ?
>
> Please send links where I can find more info on WMF mobile team.

Hi Mugunth,

My comment was specific to the HTML5/JS solution using PhoneGap; that can't be offline, right? The WMF mobile team works on mobile access across Wikimedia sites[1].

Another feature that should go into the offline dictionary app is font support. The positive side of a desktop app is that search would be efficient, but it is difficult to add font support, so the app is useful only to those who have the fonts installed. A browser-based offline site, on the other hand, can help through webfonts, but offline searching is a challenge, especially for Indic scripts.
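
One concrete reason search can be tricky for Indic text is that the same word may be stored as different code point sequences. As a minimal sketch (Python 3, standard library only), normalising both the indexed headwords and the query to NFC makes such variants match. The `search_key` helper, the sample word and its gloss are assumptions chosen for illustration, and this is only one of several issues a real search would need to handle.

```python
import unicodedata


def search_key(word):
    """Return a canonical form of `word` for use as a dictionary key."""
    # NFC composes canonically equivalent sequences into one form;
    # casefold() makes Latin-script headwords case-insensitive.
    return unicodedata.normalize("NFC", word).casefold()


# The same Tamil word (கொடி) typed two ways: with the single vowel sign
# U+0BCA, and with the canonically equivalent pair U+0BC6 + U+0BBE.
composed = "\u0b95\u0bca\u0b9f\u0bbf"
decomposed = "\u0b95\u0bc6\u0bbe\u0b9f\u0bbf"

# Illustrative one-entry index keyed by the normalised headword.
index = {search_key(composed): "flag; creeper"}

print(composed == decomposed)                # the raw strings differ
print(index.get(search_key(decomposed)))     # the normalised lookup succeeds
```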

 
--
Regards
Srikanth.L

arunt...@gmail.com

Nov 25, 2011, 7:53:36 AM
to freetamil...@googlegroups.com
Mr. Tha. Uzhavan,
On Fri, Nov 25, 2011 at 12:30 AM, த*உழவன் <tha.u...@gmail.com> wrote:
> I would like to know how your effort differs from the above.

There is no big difference between the information at the link you gave and this effort. What is there is meant to be used through a browser; what we are attempting is a way to use it as a desktop application, that is all.

Please visit
http://www.arunmozhi.in/2011/07/karthika-building-a-wiktionary-completely-offline/
for the full details of this effort.

--
Regards

Arunmozhi
Twitter: @tecoholic
Website: http://arunmozhi.in
IRC Nick: teco

த*உழவன்

Nov 25, 2011, 1:48:14 PM
to ThamiZha! - Free Tamil Computing(FTC)
Arul Mozhi!
I followed the link you gave and read everything. The explanations are
very detailed. It makes me happy to think that many people will benefit
from this. There are roughly five or six kinds of page layouts in
Wiktionary, and they need to be standardised. When that happens, please
share your experience as well. I am making attempts to run a bot in
Python. I close this note happy in the thought that I will get your
guidance on that too.
Best wishes.
Greetings.

Looking forward eagerly,
Yours,
தகவலுழவன்.


Ashok

Jun 3, 2012, 4:47:27 PM
to freetamil...@googlegroups.com, arun...@ieee.org
Hi Mugunth and ThamiZha! team,

While searching for a way to build on Arunmozhi's code base, I found the following project on Launchpad by Benjamin Thyreau - An application to easily read Wikipedia's downloaded dump files:
https://launchpad.net/wikipediadumpreader

As you can see from the above link, this is open source and dual-licensed under the Simplified BSD Licence and GNU GPL v2.

By combining features from Benjamin's code and Arunmozhi's code, I have built an application that seems to handle the basic functions OK. Please see the enclosed screenshot.

Benjamin's code base had two things that we are looking for: a PyQt4 user interface and more usable (though not complete) parsing of the wiki markup. On top of that, he had built the ability to follow links: a user can click on hyperlinked words in the results to look those words up in turn. However, on additional testing I discovered that his code base had one big limitation for our use. It can only be used as an English to Tamil dictionary; Tamil words cannot be looked up. This is because his indexing was not in Unicode, as he himself noted in comments in his code.

Arunmozhi uses the Python Whoosh module for indexing and searching. Whoosh is natively built to handle Unicode. He also split the larger Wiktionary dump file into smaller chunks for faster look-ups. And he went a step further and built a Windows exe as well.
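
To illustrate the general idea of splitting a large dump into smaller chunks so that a look-up reads only one small file, here is a minimal Python 3 sketch using only the standard library. It is not Arunmozhi's code and it does not use Whoosh; the chunk count, the file names, the JSON format and the two sample entries are all assumptions made up for the example.

```python
import json
import os
import tempfile
import zlib

NUM_CHUNKS = 8  # hypothetical; a real dump would likely need more


def chunk_id(word):
    """Map a headword to a stable chunk number via a checksum of its UTF-8 bytes."""
    return zlib.crc32(word.encode("utf-8")) % NUM_CHUNKS


def write_chunks(entries, directory):
    """Group (headword, definition) pairs by chunk and write one JSON file each."""
    chunks = {}
    for word, definition in entries:
        chunks.setdefault(chunk_id(word), {})[word] = definition
    for number, mapping in chunks.items():
        path = os.path.join(directory, f"chunk-{number}.json")
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(mapping, handle, ensure_ascii=False)


def lookup(word, directory):
    """Load only the chunk that could hold `word` and return its definition."""
    path = os.path.join(directory, f"chunk-{chunk_id(word)}.json")
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as handle:
        return json.load(handle).get(word)


# Illustrative entries: one Tamil headword and one English headword.
workdir = tempfile.mkdtemp()
write_chunks([("அம்மா", "mother"), ("book", "புத்தகம்")], workdir)

print(lookup("அம்மா", workdir))
print(lookup("book", workdir))
```

Because the headwords are handled as Unicode strings and only encoded to UTF-8 at the file boundary, Tamil and English words can be looked up the same way.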

"Wouldn't it be great if we could combine Benjamin's PyQt4 user interface and wiki parsing with Arunmozhi's indexing/searching, and then build a Windows exe following Arunmozhi's steps?", I thought. However, it turned out to be harder than it initially appeared (isn't it always ;-) ), especially because I am new to Python! But, long story short, I now have the modified code as well as the Windows exe. Let me know how you want me to send it to the GitHub (thamizha / tawiktionary-offline) site.

With best regards,

இரா. அசோகன்
Karthika.png

Mugunth

Jun 5, 2012, 9:37:04 AM
to ThamiZha! - Free Tamil Computing(FTC)


On Jun 4, 6:47 am, Ashok <ashokram...@gmail.com> wrote:

> "Wouldn't it be great if we can combine Benjamin's PyQT4 user interface and
> wiki parsing with Arunmozhi's indexing/searching and then build a Windows
> exe following Arunmozhi's steps", I thought. However, it turned out to be
> harder than it initially appeared ( Isn't it always ;-) ). Especially
> because I am new to Python! But, long story short, I have the modified code
> as well as the Windows exe now. Let me know how you want me to send it to
> the Github (thamizha / tawiktionary-offline) site.

Great work, Ashok.
Let's take it forward. I like working with the Qt framework, so I can
contribute to this project.
You can upload the code to our thamizha GitHub,
https://github.com/thamizha/tawiktionary-offline, if you have access
to it. If you don't have access, please send me your GitHub ID and I will
add you.

Regards,
Mugunth

Ashok Ramachandran

Jun 8, 2012, 4:22:49 PM
to freetamil...@googlegroups.com
Hi Mugunth,

Here is my Github ID: AshokR
Email: ashok...@gmail.com
Please grant me access. I will add the code files as well as the exe.

With best regards,
Ashok


Muguntharaj Subramanian

Jun 27, 2012, 11:18:23 PM
to freetamil...@googlegroups.com, ashok...@gmail.com
Dear All,
Ashok has updated the PyQt-based code for our offline Wiktionary project. Please try the code, give your feedback, and contribute to the project if you find it interesting.

Thanks Ashok for taking this project forward.

Regards,
Mugunth

---------- Forwarded message ----------
From: Ashok Ramachandran <ashok...@gmail.com>
Date: Thu, Jun 28, 2012 at 12:52 PM
Subject: Re: [FTC] Re: A Offline Desktop Wiktionary
To: Muguntharaj Subramanian <mug...@gmail.com>


Hi Mugunth,

I have added the steps under "How to set up for development". Please go through and let me know if you run into any problems.

Regards,
Ashok


Shrinivasan T

Jun 27, 2012, 11:20:34 PM
to freetamil...@googlegroups.com, ashok...@gmail.com

Please share the URL to get the code.

Muguntharaj Subramanian

Jun 27, 2012, 11:22:00 PM
to freetamil...@googlegroups.com
It's here: https://github.com/thamizha/tawiktionary-offline

On Thu, Jun 28, 2012 at 1:20 PM, Shrinivasan T <tshrin...@gmail.com> wrote:

> share the url to get the code.
