>>>>> Ivan Shmakov <iv...@siamics.net> wrote:
[Cross-posting to news:comp.infosystems.www.misc, as the issue
at hand is arguably more related to WWW than to Unix Shell.]
>>> The remote appears to filter by User-Agent:.
>> And what is 'xnyL' ?
> 'Lynx' backwards. But I'm also interested in the rationale behind it.
The rationale behind filtering by User-Agent:, or how I found
it out?
Per my observations, sites attempt to filter by User-Agent:
to mitigate certain kinds of "abuse," such as unsanctioned
mirroring, or recursive retrieval in general (which is part of
the operation of, say, email harvesters.)  As such, disallowing
"Wget" -- a popular recursive downloading and mirroring tool --
is not uncommon; I've seen it done at such domains as arxiv.org,
classiccmp.org, and datasheetcatalog.org.  The proper solution
is, of course, to use the /robots.txt control file instead.
(Granted, GNU Wget can be configured to ignore that file -- but
it can just as easily be configured to send an arbitrary
User-Agent: string; my long-time preference there is, and I'm
not trying to surprise anyone, "tegW".)
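Both knobs are stock GNU Wget options; a sketch of such a
(mis)configured invocation, with example.org merely standing in
for the actual target:

    $ wget --mirror -e robots=off --user-agent='tegW' \
          http://example.org/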
Personally, I consider it a far worse issue when recursive
retrieval software misidentifies itself as a common Web user
agent.  In my experience, a number of such requests originate
from 202.46.48.0/20.  For example:
  202.46.54.133 - - 2016-10-15 21:27:23 +0000 "GET / HTTP/1.1"
      200 2546 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64)
      AppleWebKit/537.36 (KHTML, like Gecko)
      Chrome/45.0.2454.93 Safari/537.36"
Worse still, even the requests from that same network that
identify as "Baiduspider/2.0" in my logs never seem to request
/robots.txt.  I've therefore decided to deny access to certain
sections of my Web sites based on combinations of User-Agent:
and request source IP.
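A sketch of one such combined rule, assuming Apache httpd 2.4
with mod_authz_core (the directory path and the pattern below
are placeholders, not my actual configuration):

    <Directory "/srv/www/restricted">
        <RequireAll>
            Require all granted
            # Deny only when BOTH the User-Agent: matches
            # AND the request comes from the suspect network.
            Require not expr "%{HTTP_USER_AGENT} =~ /Baiduspider/ && -R '202.46.48.0/20'"
        </RequireAll>
    </Directory>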
... Another popular option for ad-hoc crawlers is the Net::HTTP
library for Perl, commonly identified by "libwww-perl" in the
User-Agent: header.  Incidentally, Lynx has the very same
"libwww" substring in its own default User-Agent: value, leading
to what I presume are occasional "false positives."
Which is one of the reasons I tend to use somewhat random
User-Agent: strings for my long-running Lynx sessions.  Thus,
when I could access the site in question perfectly well from one
such configured Lynx instance, yet was refused access when
running $ lynx --dump from the command line, User-Agent:
filtering was my guess right away.
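Overriding the default is a one-liner; -useragent= is a stock
Lynx option, and example.org again merely stands in for the site
in question:

    $ lynx -dump -useragent='xnyL' http://example.org/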