PhantomJS job: hack up an API for ACRA


Meng Weng Wong

Jun 15, 2016, 10:26:01 AM
to hacker...@googlegroups.com
For JS coders who either know or want to learn PhantomJS:

I was about to post a task on Upwork, but I thought I'd circulate it here first because the task might require a Singpass account.

Brief: write a PhantomJS-based library that provides an API to ACRA.gov.sg.

Background: In the UK, Companies House offers an impressive API:

In Singapore, ACRA does not. Its website assumes a human operating a browser.

So, use PhantomJS to pretend to be a human operating a browser.

It is not necessary for the API to support ALL the functionality that ACRA provides.

For the first version, the API should provide an interface to:
- incorporation of a Pte Ltd company
- CRUD directors
- CRUD shareholders
- submission of new constitution

Your library should expose its glue API as a set of methods.
It may also expose a REST API. However, ACRA's authentication model may make a REST API difficult.

Alternatives: Use Selenium or some other headless browser stack.

If you'd like to have a go at this work, please write back describing how you'd like to charge for and deliver the work.

This is a fairly chunky project which will probably include repeated episodes of significant frustration with ACRA's website, so factor that in accordingly.

Please also indicate whether you would prefer this to be work-for-hire (in which case you would just hand over the code and all rights) or an open-source project on GitHub.
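The "set of methods" in the brief could be sketched roughly as below. This is only an illustration in Python, not the thread's JavaScript stack, and every name in it (`AcraClient`, `Person`, `InMemoryAcraClient`, the fake UEN format) is an assumption rather than anything ACRA or the poster defined. The in-memory class stands in for a backend that would really drive a headless browser; shareholder CRUD and constitution submission would presumably follow the same pattern.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from itertools import count


@dataclass
class Person:
    name: str
    id_number: str  # hypothetical identifier field, for illustration only


class AcraClient(ABC):
    """Hypothetical method-level API; callers never see the browser automation."""

    @abstractmethod
    def incorporate(self, company_name: str, directors: list[Person]) -> str:
        """Incorporate a Pte Ltd company and return its UEN."""

    @abstractmethod
    def list_directors(self, uen: str) -> list[Person]:
        """Read the current directors of a company."""

    @abstractmethod
    def add_director(self, uen: str, person: Person) -> None:
        """Create a director record."""

    @abstractmethod
    def remove_director(self, uen: str, id_number: str) -> None:
        """Delete a director record."""


class InMemoryAcraClient(AcraClient):
    """Test double: keeps everything in a dict instead of talking to a website."""

    def __init__(self) -> None:
        self._directors: dict[str, list[Person]] = {}
        self._serial = count(1)

    def incorporate(self, company_name: str, directors: list[Person]) -> str:
        uen = f"FAKE{next(self._serial):04d}"  # placeholder, not a real UEN format
        self._directors[uen] = list(directors)
        return uen

    def list_directors(self, uen: str) -> list[Person]:
        return list(self._directors[uen])

    def add_director(self, uen: str, person: Person) -> None:
        self._directors[uen].append(person)

    def remove_director(self, uen: str, id_number: str) -> None:
        self._directors[uen] = [
            p for p in self._directors[uen] if p.id_number != id_number
        ]
```

Keeping the interface separate from the backend means the PhantomJS, CasperJS, or Selenium choice stays an implementation detail, and client code can be tested against the in-memory version without a Singpass login.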

Gibson Tang

Jun 15, 2016, 10:30:22 AM
to hacker...@googlegroups.com
I would look at CasperJS. PhantomJS is good for scraping, but it is not good when it comes to websites that require authentication. CasperJS is built on top of PhantomJS and is optimized for websites that require authentication.

Btw, will such a project fall under the Computer Misuse Act? The act is quite vague and can be used on anything IT-related.





--
--
Chat: http://hackerspace.sg/chat

---
You received this message because you are subscribed to the Google Groups "HackerspaceSG" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hackerspaces...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Meng Weng Wong

Jun 15, 2016, 1:48:31 PM
to hacker...@googlegroups.com
Thanks! So, job spec =~ s/PhantomJS/CasperJS or PhantomJS/g

I will take the risk on the CMA … there is no intent to do anything malicious. Besides, we're just trying to do our part for "Smart Nation".

Gibson Tang

Jun 15, 2016, 2:01:20 PM
to hacker...@googlegroups.com
Well, whether there was any malicious intent depends on the strength of the lawyer. Is there a budget in mind?


Stefan van der Bijl

Jun 16, 2016, 5:04:22 AM
to hacker...@googlegroups.com

How are you planning on using this after? Will you charge for it? License model?

Meng Weng Wong

Jun 16, 2016, 8:06:00 AM
to hacker...@googlegroups.com
I haven't gone to get budget approval yet but thinking it through over a cup of tea …

Asynchronous poll architecture: some ACRA operations are slow, some are fast. Slow operations will require an asynchronous polling model. I assume websockets solve the notification side of this: when the middle tier discovers that a slow call against ACRA has returned, it can notify the client.
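A minimal sketch of that middle tier, in Python and under stated assumptions: slow ACRA calls are modelled as plain callables run on a thread pool, and `JobTracker`, `submit`, `status`, and `slow_lookup` are hypothetical names. The client here polls `status()`; a websocket push would hook in at the same point, for example through `Future.add_done_callback`.

```python
from __future__ import annotations

import threading
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class JobTracker:
    """Runs slow operations in the background and lets callers poll by job id."""

    def __init__(self, max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, Future] = {}
        self._lock = threading.Lock()

    def submit(self, operation: Callable[[], Any]) -> str:
        """Start the operation and return immediately with a job id."""
        job_id = uuid.uuid4().hex
        future = self._pool.submit(operation)
        with self._lock:
            self._jobs[job_id] = future
        return job_id

    def status(self, job_id: str) -> dict:
        """Report pending, done (with result), or failed (with error text)."""
        with self._lock:
            future = self._jobs[job_id]
        if not future.done():
            return {"state": "pending"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}

    def shutdown(self) -> None:
        """Wait for outstanding jobs and release the worker threads."""
        self._pool.shutdown(wait=True)


def slow_lookup() -> dict:
    """Stand-in for a slow page flow against the website."""
    time.sleep(0.2)
    return {"uen": "FAKE0001"}  # placeholder value
```

A caller would `submit(slow_lookup)`, keep the returned id, and call `status(job_id)` until the state is no longer `"pending"`.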

As a point of reference, the latest back-end has a human interface at https://www.tis.bizfile.gov.sg/ngbtisinternet/faces/tismainpage.jsp
However, it is currently in a "by left" state.

Querying bizfile's "buy information" requires a Singpass login with 2FA. This may be a showstopper: it might be tricky to take humans out of the loop if 2FA is involved. My 2FA login to Singpass is via a little OneKey device. If 2FA can be fulfilled using SMS, we may still be able to automate humans out of the picture: the software would need to get access to SMS somehow.

initial feature requirements:

- web UI to autocomplete a company name. Resolve the name to a UEN. ACRA is supposed to offer, for $363, a list of all entities (of a given type) that are live as of a given month. I bought a copy of this listing. The listing turned out to contain all entities that were registered during a given month – not all entities that were live in a given month. I have written to ACRA about this. So, no action at the moment.

- given a UEN, list all shareholders and directors. This needs to go through the bizfile "buy information" interface. Is the interface able to return a CSV, or only a PDF? If it is only able to retrieve a PDF, the work needed to extract data from a deliberately obfuscated, secured, password-protected PDF is probably not worth the project budget.

- (optional) update the list of shareholders. In the initial version, this can be referred to a traditional human corporate secretary. Eventually we will want to do this programmatically.
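For the first requirement, name autocomplete resolving to a UEN, a rough Python sketch could look like the following. It assumes a usable listing of (name, UEN) pairs is already in hand, which the thread says is not yet the case; `CompanyIndex` and the sample rows are hypothetical. The lookup is a case-insensitive prefix search over a sorted list.

```python
from __future__ import annotations

from bisect import bisect_left
from typing import Iterable


class CompanyIndex:
    """Prefix lookup from company name to UEN over a sorted in-memory list."""

    def __init__(self, entries: Iterable[tuple[str, str]]) -> None:
        # Sort on the case-folded name so the search ignores capitalisation.
        self._rows = sorted((name.casefold(), name, uen) for name, uen in entries)
        self._keys = [row[0] for row in self._rows]

    def autocomplete(self, prefix: str, limit: int = 10) -> list[tuple[str, str]]:
        """Return up to `limit` (name, UEN) pairs whose name starts with prefix."""
        key = prefix.casefold()
        start = bisect_left(self._keys, key)  # first name >= the prefix
        matches: list[tuple[str, str]] = []
        for folded, name, uen in self._rows[start:]:
            if len(matches) >= limit or not folded.startswith(key):
                break
            matches.append((name, uen))
        return matches


# Made-up rows; real data would come from whatever listing ACRA provides.
index = CompanyIndex(
    [
        ("Acorn Holdings Pte. Ltd.", "UEN-0002"),
        ("Acme Trading Pte. Ltd.", "UEN-0001"),
        ("Beta Labs Pte. Ltd.", "UEN-0003"),
    ]
)
```

A web UI would call `index.autocomplete(typed_text)` on each keystroke and show the returned names, keeping the UEN of whichever entry the user picks.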

Meng Weng Wong

Jun 16, 2016, 8:13:29 AM
to hacker...@googlegroups.com
On Thu, Jun 16, 2016 at 10:04 AM, Stefan van der Bijl <ste...@valdebrain.com> wrote:

> How are you planning on using this after? Will you charge for it? License model?


By sometime in 2017 this entire suite of code should be obsoleted by ACRA's actual official API, which I am told is Coming Soon. So it will be for internal use only, and/or stuck on GitHub to rot, since I assume nobody else will want to use this monstrosity.

I wrote to ACRA asking how to get access to the API documented at



Stefan van der Bijl

Jun 16, 2016, 9:08:54 AM
to hacker...@googlegroups.com

That's funny. Government efficiency.

Tay Ray Chuan

Jun 19, 2016, 1:10:48 PM
to hacker...@googlegroups.com
On Thu, Jun 16, 2016 at 8:05 PM, Meng Weng Wong <meng...@gmail.com> wrote:
(snip)
> querying bizfile's "buy information" requires singpass login. With 2FA. This
> may be a showstopper: it might be tricky to take humans out of the loop if
> 2FA is involved. my 2FA login to Singpass is via a little OneKey device. If
> 2FA can be fulfilled using SMS, we may still be able to automate humans out
> of the picture: the software would need to get access to SMS somehow.
(snip)

Random off-topic after too many shots - I have been thinking about
those cloud-based password managers where you can have them on all
your devices, desktop and mobile; what if we sync our browser's
'cookie jar' instead of just the passwords themselves? Maybe on a
per-domain basis.

Then we could solve this by allowing the real person to do the heavy
lifting (singpass login with 2FA), then transmit the cookie to
phantom/selenium to do the rest.
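The handoff described above might be sketched like this in Python. It assumes cookies are plain dicts with `name`, `value`, `domain`, and an optional `expiry` timestamp, which is the shape Selenium's `driver.get_cookies()` returns and `driver.add_cookie()` accepts; `export_cookies` and `import_cookies` are hypothetical helper names, and how the cookies get out of the human's browser is left open.

```python
from __future__ import annotations

import json
import time
from typing import Callable, Optional


def export_cookies(cookies: list[dict], domain: str, now: Optional[float] = None) -> str:
    """Serialise the unexpired cookies that belong to one domain."""
    now = time.time() if now is None else now
    kept = []
    for cookie in cookies:
        host = cookie.get("domain", "").lstrip(".")
        in_domain = host == domain or host.endswith("." + domain)
        expired = "expiry" in cookie and cookie["expiry"] <= now
        if in_domain and not expired:
            kept.append(cookie)
    return json.dumps(kept)


def import_cookies(payload: str, add_cookie: Callable[[dict], None]) -> int:
    """Feed each exported cookie to the automation side; returns how many."""
    cookies = json.loads(payload)
    for cookie in cookies:
        # With Selenium, add_cookie would be driver.add_cookie, called after
        # the driver has already navigated to a page on the same domain.
        add_cookie(cookie)
    return len(cookies)
```

The exported payload is as sensitive as the login itself, so it would need the same care in transit and storage as a password.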

--
Cheers,
Ray Chuan