I'm not sure how easy it is to scrape undi.info, but it would be great to have links to the respective undi.info pages/details. If they can connect to sinarproject and grab info for the various candidates/representatives, then people would have fairly complete data and a nice app to use it with.
Khairil Yusof
Jul 21, 2012, 11:28:02 AM
to sinar-...@googlegroups.com
It would be nice to have this in a week or so for my use, but it's up to the time you have available. It's not a deal breaker, as I have a free text field users can fill in first. With this, they can at least have auto-complete to make it easier to fill in MP profiles.
On Sat, Jul 21, 2012 at 10:41 PM, kiawin <kia...@gmail.com> wrote:
sweemeng ng
Jul 21, 2012, 11:33:28 AM
On scraping:
Yes, you can use Python. To make things easy, you can use ScraperWiki to run your scraper; it supports Python, Ruby and PHP. Point us to the URL of your scraper so we can grab the data.
If you want to scrape undi.info, be careful: a JS-heavy page is not easily scrapable with a scripting language alone.
--
Just a random living organic computer code generator
kiawin
Jul 21, 2012, 11:41:55 AM
From a quick glance at their HTML source, all the data is in one page. I can use mechanize+lxml to smooth the process.
Let's hope Maxis gives me back my internet line, and I will sort it out in the coming week.
Thanks :)
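For anyone following along, here is a minimal sketch of pulling rows out of a single HTML page. It swaps in Python's standard-library `html.parser` for mechanize+lxml so it runs without extra installs, and it works on a made-up table: the real undi.info markup is not shown in this thread, so `SAMPLE_HTML`, `TableRows` and `parse_seats` are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real undi.info page structure is not shown in this thread.
SAMPLE_HTML = """
<table id="seats">
  <tr><th>Code</th><th>Name</th></tr>
  <tr><td>P001</td><td>Example Seat A</td></tr>
  <tr><td>P002</td><td>Example Seat B</td></tr>
</table>
"""


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_seats(html):
    """Return one dict per two-column data row of the sample table."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [{"code": code, "name": name} for code, name in parser.rows]


seats = parse_seats(SAMPLE_HTML)
```

With lxml the same idea would likely be shorter, but the selectors would still have to be worked out against the actual page source.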
sweemeng ng
Jul 21, 2012, 11:52:02 AM
One thing I want to try is using PhantomJS with Python. But good luck.
kiawin
Jul 21, 2012, 1:55:53 PM
Somehow I couldn't sleep, so I did some simple scraping of undi.info.
to sinar-...@googlegroups.com
Yes. Awesomeness.
Thanks!
kiawin
Jul 21, 2012, 11:46:26 PM
Any format needed? How about CSV?
Khairil Yusof
Jul 21, 2012, 11:50:50 PM
What's best for nested data? JSON? XML? Normalized CSVs would also be easy for folks to use, I think.
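To make the trade-off concrete, here is a rough sketch of the same records written both ways: as nested JSON, and as two normalized CSVs joined on the seat code. The records, field names and the helpers `to_json`, `to_csv` and `normalize` are placeholders for illustration, not the scraped data or an agreed schema.

```python
import csv
import io
import json

# Hypothetical records: names and parties are placeholders, not scraped data.
seats = [
    {"code": "P001", "name": "Example Seat A",
     "candidates": [{"name": "Candidate 1", "party": "Party X"},
                    {"name": "Candidate 2", "party": "Party Y"}]},
    {"code": "P002", "name": "Example Seat B",
     "candidates": [{"name": "Candidate 3", "party": "Party X"}]},
]


def to_json(records):
    """Nested form: each seat carries its own list of candidates."""
    return json.dumps(records, indent=2)


def to_csv(rows, fieldnames):
    """Write a list of flat dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def normalize(records):
    """Split nested records into two flat tables linked by the seat code."""
    seat_rows = [{"code": s["code"], "name": s["name"]} for s in records]
    candidate_rows = [
        {"seat_code": s["code"], "name": c["name"], "party": c["party"]}
        for s in records
        for c in s["candidates"]
    ]
    return seat_rows, candidate_rows


seat_rows, candidate_rows = normalize(seats)
seats_json = to_json(seats)
seats_csv = to_csv(seat_rows, ["code", "name"])
candidates_csv = to_csv(candidate_rows, ["seat_code", "name", "party"])
```

JSON keeps the nesting in one file, while the two CSVs open directly in a spreadsheet but need a join on `seat_code` to rebuild the nesting.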
Khairil Yusof
Jul 22, 2012, 9:18:21 AM
I just had time to double-check while working on my autocomplete fields. Undi.info is missing some values, e.g. P122 Seputeh, or their data is incomplete.
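A quick way to spot gaps like this is to compare the scraped codes against the full expected range. The sketch below assumes parliamentary codes run from P001 to P222; that seat count and the helper name `missing_codes` are assumptions for illustration, and the `scraped` list is simulated rather than real output.

```python
def missing_codes(scraped_codes, total_seats=222):
    """Return expected codes (P001..P<total_seats>) absent from the scraped list."""
    expected = [f"P{n:03d}" for n in range(1, total_seats + 1)]
    seen = set(scraped_codes)
    return [code for code in expected if code not in seen]


# Simulated scrape result with one seat left out, mirroring the gap noted above.
scraped = [f"P{n:03d}" for n in range(1, 223) if n != 122]
gaps = missing_codes(scraped)
```

Any code that shows up in `gaps` would need to be filled in by hand or from another source.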
to sinar-...@googlegroups.com
I wonder what the legal implications are of scraping public information from an "all rights reserved" site.
This is quite detailed info. I guess it's OK and really useful for private use, but I don't think we can reuse this on a public site without permission from undi.info (Malaysiakini) even though it's awesome work from Kiawin.
Coder KK
Jul 24, 2012, 12:33:52 AM
I think that information is public, and we are just using a program to copy and paste it (instead of copying and pasting manually) into a file, then saving it to a database. This is a one-time copy (we don't connect to the site every time), maybe good until the next election. By then, maybe we can provide an interface for people to key in the data themselves.
sweemeng ng
Jul 24, 2012, 12:45:57 AM
Even copying by hand needs attribution, by rights.
Maulviridha Abu Bakar
Jul 24, 2012, 12:58:55 AM
I believe election results are public info. They are published by various media publications, including SPR themselves.
I'd say it's in the public domain.
sweemeng ng
Jul 24, 2012, 1:01:18 AM
Please move any new copyright discussion about scraped data to a new thread. Thanks.
This thread is for scraped data on electoral districts.