Scraped listing of Malaysian Electoral Districts needed


Khairil Yusof

Jul 21, 2012, 10:36:30 AM
to sinar-...@googlegroups.com
Need an updated scraped list of Malaysian Electoral Districts for use by developers.

Anybody interested in writing one?

http://en.wikipedia.org/wiki/List_of_Malaysian_electoral_districts


kiawin

Jul 21, 2012, 10:41:34 AM
to sinar-...@googlegroups.com
I'm OK with this. When does it need to be done?

Can we use Python?
--
to be or not to be? http://blog.kiawin.com

kiawin

Jul 21, 2012, 10:46:12 AM
to sinar-...@googlegroups.com
After a quick check on Google: would it be better to scrape from http://undi.info?

Khairil Yusof

Jul 21, 2012, 11:26:04 AM
to sinar-...@googlegroups.com
https://scraperwiki.com supports Python, and PHP or Ruby too, I think.

Not sure how easy it is to scrape undi.info, but it would be great to have a link to the respective undi.info pages/details. If they can connect to sinarproject and grab info on the various candidates/representatives, then people would have fairly complete data and a nice app to use it with.

Khairil Yusof

Jul 21, 2012, 11:28:02 AM
to sinar-...@googlegroups.com
It would be nice to have it in a week or so for my use, but it's up to the time you have available. Not a deal breaker, as I have a free-text field that users can fill in for now. With this list, they can at least have auto-complete to make filling in MP profiles easier.


sweemeng ng

Jul 21, 2012, 11:33:28 AM
to sinar-...@googlegroups.com
On scraping:
Yes, you can use Python. To make things easy, you can use ScraperWiki to run your scraper; it supports Python, Ruby and PHP. Point us to the URL of your scraper so we can grab the data.

If you want to scrape from undi.info, be careful: a JS-heavy page is not easily scraped with a plain scripting language.
--
Just a random living organic computer code generator

kiawin

Jul 21, 2012, 11:41:55 AM
to sinar-...@googlegroups.com
From a quick glance at their HTML source, all the data is in one page. I can use mechanize + lxml to ease the process.
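
A minimal sketch of the single-page idea, under loud assumptions: the markup below is made up for illustration and is not undi.info's real HTML, and it uses the standard library's html.parser rather than mechanize + lxml so that it runs without extra packages. The `data-code` attribute and the `DistrictParser` name are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup: NOT the real undi.info structure.
SAMPLE_HTML = """
<ul id="districts">
  <li data-code="P122">Seputeh</li>
  <li data-code="P000">Example District</li>
</ul>
"""


class DistrictParser(HTMLParser):
    """Collect (code, name) pairs from <li data-code="..."> elements."""

    def __init__(self):
        super().__init__()
        self.districts = []
        self._code = None  # code of the <li> currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._code = dict(attrs).get("data-code")

    def handle_data(self, data):
        # Only keep text that sits inside an <li> carrying a code.
        if self._code and data.strip():
            self.districts.append((self._code, data.strip()))

    def handle_endtag(self, tag):
        if tag == "li":
            self._code = None


parser = DistrictParser()
parser.feed(SAMPLE_HTML)
print(parser.districts)
```

With the real page, the tag and attribute names would need to be replaced with whatever the HTML source actually uses.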

Let's hope Maxis gives me back my internet line, and I will sort it out within the coming week.

Thanks :)

sweemeng ng

Jul 21, 2012, 11:52:02 AM
to sinar-...@googlegroups.com
One thing I want to try is using PhantomJS with Python. But good luck!

kiawin

Jul 21, 2012, 1:55:53 PM
to sinar-...@googlegroups.com
Somehow I couldn't sleep, so I did some simple scraping of undi.info.


Is this what you need?

Khairil Yusof

Jul 21, 2012, 11:23:38 PM
to sinar-...@googlegroups.com
Yes. Awesomeness.

Thanks!

kiawin

Jul 21, 2012, 11:46:26 PM
to sinar-...@googlegroups.com

Any format needed? How about CSV?

Khairil Yusof

Jul 21, 2012, 11:50:50 PM
to sinar-...@googlegroups.com
What's best for nested data? JSON? XML? Normalized CSVs would also be easy for folks to use, I think.
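
To make the trade-off concrete, here is a rough sketch of the same record in both shapes. Only the P122 Seputeh code/name pair comes from this thread; the field names, state-seat codes and state-seat names are placeholders, not real data.

```python
import csv
import io
import json

# Placeholder record: the state seats below are invented for illustration.
nested = {
    "code": "P122",
    "name": "Seputeh",
    "state_seats": [
        {"code": "N00", "name": "Example State Seat A"},
        {"code": "N99", "name": "Example State Seat B"},
    ],
}

# Nested form: one JSON document per parliament seat.
as_json = json.dumps(nested, indent=2)

# Normalized form: one CSV row per state seat, repeating the parent key.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["parliament-code", "state-code", "name"])
for seat in nested["state_seats"]:
    writer.writerow([nested["code"], seat["code"], seat["name"]])
as_csv = buf.getvalue()

print(as_json)
print(as_csv)
```

JSON keeps the parent/child relationship in one place; the normalized CSV is flatter and opens in a spreadsheet, at the cost of repeating the parliament code on every row.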

Khairil Yusof

Jul 22, 2012, 9:18:21 AM
to sinar-...@googlegroups.com
I just had time to double-check while working on my autocomplete fields. Undi.info is missing some values, e.g. P122 Seputeh, or their data is incomplete.

http://en.wikipedia.org/wiki/List_of_Malaysian_electoral_districts as reference for complete list.

Not much of a problem, I'll just add missing ones to my vocab list.
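
This kind of completeness check can be sketched as a set difference. It assumes parliamentary codes run from P001 to P222 with zero padding (222 Dewan Rakyat seats); adjust `total` or the padding if the reference list says otherwise. `scraped_codes` is a stand-in for the real scraped list, and `missing_codes` is a hypothetical helper name.

```python
def missing_codes(scraped, total=222, prefix="P"):
    """Return expected codes (e.g. P001..P222) absent from the scraped list."""
    expected = {f"{prefix}{n:03d}" for n in range(1, total + 1)}
    return sorted(expected - set(scraped))


# Stand-in for the scraped list: every code except P122.
scraped_codes = [f"P{n:03d}" for n in range(1, 223) if n != 122]
print(missing_codes(scraped_codes))
```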


kiawin

Jul 22, 2012, 12:30:43 PM
to sinar-...@googlegroups.com
A thousand apologies, my mistake. I used the actual state names instead of slugs, which caused WP and NS to go missing.
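
For anyone hitting the same thing, a sketch of one plausible way names and slugs can diverge: multi-word state names need converting before they match a URL-style slug. The slug format here (lowercase, hyphen-separated) is a guess, not undi.info's actual scheme, and `slugify` is a hypothetical helper.

```python
import re


def slugify(state_name):
    """Lowercase and hyphenate a state name (assumed slug format)."""
    return re.sub(r"[^a-z0-9]+", "-", state_name.lower()).strip("-")


for name in ["Negeri Sembilan", "Wilayah Persekutuan", "Selangor"]:
    print(name, "->", slugify(name))
```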

updated list here :)

kiawin

Jul 23, 2012, 1:26:49 PM
to sinar-...@googlegroups.com
Dear all,

Electoral District results (2004, 2008) in csv format.

Parliament Electoral District results
parliament-code,election-year,name,party,votes

State Electoral District results
parliament-code,state-code,election-year,name,party,votes

Thanks :)

regards,
Sian Lerk

kiawin

Jul 23, 2012, 1:27:53 PM
to sinar-...@googlegroups.com
Sorry, wrong link in previous email :D

Parliament Electoral District results
parliament-code,election-year,name,party,votes

State Electoral District results
parliament-code,state-code,election-year,name,party,votes
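
For anyone consuming these files, a sketch of reading the parliament results with csv.DictReader, assuming the header row is exactly as listed above. The sample rows are made-up placeholders, not real results, and `winners` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows: NOT real election results.
SAMPLE = """parliament-code,election-year,name,party,votes
P000,2008,Candidate A,Party X,100
P000,2008,Candidate B,Party Y,250
P000,2004,Candidate A,Party X,300
"""


def winners(csv_text):
    """Map (parliament-code, election-year) to the row with the most votes."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["parliament-code"], int(row["election-year"]))
        if key not in best or int(row["votes"]) > int(best[key]["votes"]):
            best[key] = row
    return best


result = winners(SAMPLE)
for (code, year), row in sorted(result.items()):
    print(code, year, row["name"], row["party"], row["votes"])
```

The state-level file should work the same way with `state-code` added to the key.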

Khairil Yusof

Jul 23, 2012, 6:06:09 PM
to sinar-...@googlegroups.com
I wonder what the legal implications are of scraping public information from an "all rights reserved" site.

This is quite detailed info. I guess it's OK and really useful for private use, but I don't think we can reuse it on a public site without permission from undi.info (Malaysiakini), even though it's awesome work from Kiawin.

Coder KK

Jul 24, 2012, 12:33:52 AM
to sinar-...@googlegroups.com
I think this information is public, and we are just using a program to copy and paste it into a file (instead of copying and pasting manually), then saving it to a database. This is a one-time copy (we don't connect to the site every time), perhaps until the next election. By then, maybe we can provide an interface for people to key in the data themselves.

sweemeng ng

Jul 24, 2012, 12:45:57 AM
to sinar-...@googlegroups.com
Even copying by hand, by rights, needs attribution.

Maulviridha Abu Bakar

Jul 24, 2012, 12:58:55 AM
to sinar-...@googlegroups.com

I believe election results are public info. They are published by various media outlets, as well as by SPR themselves.

I'd say it's in the public domain.

sweemeng ng

Jul 24, 2012, 1:01:18 AM
to sinar-...@googlegroups.com
Please move any new copyright discussion about scraped data to a new thread. Thanks.

This thread is for scraped data on electoral districts.