A quest for a more accurate net.isIP() for IPv6

A quest for a more accurate net.isIP() for IPv6 snoj 10/3/12 10:47 PM
Apologies ahead of time if I'm out of line.

I've been working on a Node and IPv6 project for my own curiosity, and I've found what is, for me, a big problem with the current IPv6 portion of net.isIP. Basically, anything starting with :: is reported as a valid IPv6 address. I'm not even at regex-newbie level, but I felt that revamping the regex would just create one that was way too complex. Perhaps it could be done, but I don't have the skills. So I ended up doing what I know best: brute-forcing a solution.
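
For anyone wondering how a bare :: prefix can slip through a regex check: alternation binds loosest, so in a pattern shaped like `^::|...$` the first alternative has no end anchor and matches anything that merely starts with "::". The two patterns below are simplified stand-ins written for this illustration; they are assumptions, not a copy of node's source.

```javascript
// Hypothetical, simplified patterns (not taken from node's source).
// Alternation has the lowest precedence, so `^::` is a complete
// alternative with no `$`: it matches any input that starts with "::".
var loose = /^::|^([a-fA-F0-9]{1,4}:){7}[a-fA-F0-9]{1,4}$/;

// Grouping the alternatives lets one `^` and one `$` apply to all of them.
var strict = /^(?:::|([a-fA-F0-9]{1,4}:){7}[a-fA-F0-9]{1,4})$/;

var samples = ['::', '::t', '::not an address', '1:2:3:4:5:6:7:8'];
samples.forEach(function (s) {
  console.log(JSON.stringify(s), 'loose:', loose.test(s), 'strict:', strict.test(s));
});
```

Neither pattern is a full IPv6 validator; the point is only the difference the grouping and the end anchor make.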

What I settled on was hand-parsing the address and making that as fast as possible. With the tests below, the speed of my method is comparable to the old net.isIP: within a few milliseconds over 10,000 iterations. I feel this is an acceptable trade-off for the gain in accuracy.

At the moment I'm trying to learn how to build node and run the tests against it, so getting this into the pull request queue may take some time. In the meantime, could I get some scathing commentary on my approach and any pointers on how this could be done better or faster?

Thanks,
Josh Erickson

//Begin Code

var new_isIP = function(input) {
  if (!input) {
    return 0;
  } else if (/^(\d?\d?\d)\.(\d?\d?\d)\.(\d?\d?\d)\.(\d?\d?\d)$/.test(input)) {
    var parts = input.split('.');
    for (var i = 0; i < parts.length; i++) {
      var part = parseInt(parts[i], 10); // explicit radix so "010" is never read as octal
      if (part < 0 || 255 < part) {
        return 0;
      }
    }
    return 4;
//Changes start here
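  } else if(/^[a-fA-F0-9:]{2,39}$/.test(input)) { // anchored: the whole input must be hex digits and colons
    var parts = input.split(":");
    var colons = 0;
    
    for(var i=0; i<parts.length; i++) {
      // each non-empty group must be 1-4 hex digits (anchored so a partial match can't pass)
      if(parts[i].length > 4 || (parts[i] != "" && !(/^[a-fA-F0-9]{1,4}$/.test(parts[i])))) {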
        return  0;
      }
      if(parts[i].length == 0 && i%(parts.length-1)!=0) {
        colons++;
      }
    }
    return (colons>1)?0:6;
//changes end here
  } else {
    return 0;
  }
};

//speed tests
var net = require('net');
var t = ["::t", "::", "a::b::c", "::122:1:1", "1::", "::a11:abcd", "2001:aaa::41"];

console.log("old net.isIP");
for (var ti in t) {
  console.time(t[ti]);
  for (var i = 0; i < 10000; i++) {
    net.isIP(t[ti]);
  }
  console.timeEnd(t[ti]);
}

console.log("new isIP");
for (var ti in t) {
  console.time(t[ti]);
  for (var i = 0; i < 10000; i++) {
    new_isIP(t[ti]); // the function defined above is named new_isIP
  }
  console.timeEnd(t[ti]);
}

Re: A quest for a more accurate net.isIP() for IPv6 Bradley Meck 10/4/12 11:58 AM
split is a fairly expensive operation. For the most part, I would guess the regex compiler would do a better job and avoid GC fluff. Ugly though.
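
One way to avoid both split and a large regex is a single pass over the character codes, which allocates no intermediate arrays. The following is a minimal sketch under stated assumptions: `scanIPv6` is a made-up name, it only handles plain hex groups with at most one "::", and it deliberately ignores IPv4 dotted tails and zone ids. Whether it is actually faster than the regex would need measuring.

```javascript
// Sketch only: validates plain hex-group IPv6 text in one pass, without
// split(). Embedded IPv4 tails (e.g. ::ffff:1.2.3.4) and zone ids are
// deliberately not handled. `scanIPv6` is a hypothetical helper name.
function scanIPv6(input) {
  if (typeof input !== 'string' || input.length < 2) return false;
  var groups = 0;         // hex groups seen so far
  var digits = 0;         // hex digits in the current group
  var compressed = false; // whether "::" has been seen
  for (var i = 0; i < input.length; i++) {
    var c = input.charCodeAt(i);
    var isHex = (c >= 48 && c <= 57) ||  // 0-9
                (c >= 65 && c <= 70) ||  // A-F
                (c >= 97 && c <= 102);   // a-f
    if (isHex) {
      if (digits === 0) groups++;        // first digit starts a new group
      if (++digits > 4) return false;    // group longer than 4 digits
    } else if (c === 58) {               // ':'
      if (input.charCodeAt(i + 1) === 58) {
        if (compressed) return false;    // a second "::"
        compressed = true;
        i++;                             // consume both colons
      } else if (digits === 0 || i === input.length - 1) {
        return false;                    // stray leading, trailing, or extra ':'
      }
      digits = 0;                        // the next hex digit starts a new group
    } else {
      return false;                      // neither a hex digit nor ':'
    }
  }
  // "::" must stand in for at least one group; otherwise exactly 8 are needed.
  return compressed ? groups < 8 : groups === 8;
}

["::t", "::", "a::b::c", "::122:1:1", "1::", "a:b:c", ":a:b:c:"].forEach(function (s) {
  console.log(JSON.stringify(s), '->', scanIPv6(s));
});
```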

Re: [nodejs] Re: A quest for a more accurate net.isIP() for IPv6 Jonathan Buchanan 10/4/12 12:36 PM
I ported Django's IPv6 module as I needed it for my port of django.forms, if that's any good to you. It's on npm as "validators":

https://github.com/insin/validators/blob/master/lib/ipv6.js

---
Jonny

Re: [nodejs] Re: A quest for a more accurate net.isIP() for IPv6 snoj 10/4/12 9:14 PM
Thanks guys! I didn't know that about split operations.

Jonny, do you have benchmarks handy for your Django port? If not, I'll hack something together and get them myself. If it is faster, I'd rather it be brought into node itself than have me stumble around in the dark any longer.

This past day I've done some more thinking on how the regex might be done differently, so as to avoid splits and use fewer lines. The result is the code below. The additions to the regex currently found in node fix the :: prefix issue as well as the a::b::c and a:b:c situations. The trade-off is that in some cases the tests can be 3-5 times slower than the old net.isIP. I still think that's okay if better results are given. Still, I want to believe this could be faster. I do have some code lying around for converting an IPv6 string to a Buffer object.

It has also been run through the test-net-isip.js unit test without issue, which is more than I could have said for my last set of code.

Beyond these, perhaps delving into some C, sending the data straight to inet_pton, and letting it succeed or fail would bring the speed up.

isIP = function(input) {
  if (!input) {
    return 0;
  } else if (/^(\d?\d?\d)\.(\d?\d?\d)\.(\d?\d?\d)\.(\d?\d?\d)$/.test(input)) {
    var parts = input.split('.');
    for (var i = 0; i < parts.length; i++) {
      var part = parseInt(parts[i], 10); // explicit radix so "010" is never read as octal
      if (part < 0 || 255 < part) {
        return 0;
      }
    }
    return 4;
  } else if (/^::$|^::1$|^([a-fA-F0-9]{0,4}::?){1,7}([a-fA-F0-9]{0,4})$/.test(input) && !(/::.+::/.test(input))) {
    if (input.match(/:/g).length < 7 && !(/::/.test(input))) {
      return 0;
    }
    return 6;
  } else {
    return 0;
  }
};
Re: [nodejs] Re: A quest for a more accurate net.isIP() for IPv6 snoj 10/4/12 10:16 PM
Hmm, looks like I forgot to account for addresses like :a:b:c: now. I'm beginning to think a non-regex approach will be simpler in the long run.
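
To make the stray-colon problem concrete, here is the IPv6 branch of the isIP posted above, copied out on its own under the made-up name `ipv6Branch` so the edge cases can be run directly. Because each group is `{0,4}` hex digits, an empty group followed by a single colon is a legal repetition, so a lone leading or trailing colon can pass. As far as I can tell, the literal `:a:b:c:` is still rejected by the colon-count check, while the other three inputs come back as 6.

```javascript
// IPv6 branch of the isIP posted above, extracted unchanged.
// `ipv6Branch` is a hypothetical name used only for this illustration.
function ipv6Branch(input) {
  if (/^::$|^::1$|^([a-fA-F0-9]{0,4}::?){1,7}([a-fA-F0-9]{0,4})$/.test(input) &&
      !(/::.+::/.test(input))) {
    // Fewer than 7 colons is only acceptable when "::" fills the gap.
    if (input.match(/:/g).length < 7 && !(/::/.test(input))) {
      return 0;
    }
    return 6;
  }
  return 0;
}

// Inputs with a stray leading or trailing colon; none is a valid address.
[':a::b', 'a::b:', ':a:b:c:d:e:f:', ':a:b:c:'].forEach(function (s) {
  console.log(JSON.stringify(s), '->', ipv6Branch(s));
});
```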
Re: [nodejs] Re: A quest for a more accurate net.isIP() for IPv6 snoj 10/5/12 8:57 PM
It could be the cider talking, but I'm giving up on regex and on attempting my own JavaScript reworking of what inet_pton/uv_inet_pton already does. So instead, my cider-swimming brain has decided to take its first steps into C. Probably not its best idea, but so far the code is compiling... though the build may not have reached my source files yet.

From my POV, this should make things way faster or at the very least provide the most accurate validation the genius minds behind inet_pton/uv_inet_pton could come up with.

//in src/cares_wrap.cc
static Handle<Value> Inet_PToN(const Arguments&* args) {
  HandleScope scope;
  //int length, family;
  char address_buffer[sizeof(struct in6_addr)];
  char ip = new char[strlen(args[0])];
  ip = strcpy(ip, args[0]);
  if (uv_inet_pton(AF_INET, ip, &address_buffer).code == UV_OK) {
    //length = sizeof(struct in_addr);
    //family = AF_INET;
    delete ip;
    return scope.Close(4);
  } else if (uv_inet_pton(AF_INET6, ip, &address_buffer).code == UV_OK) {
    //length = sizeof(struct in6_addr);
    //family = AF_INET;
    delete ip;
    return scope.Close(6);
  } else {
    delete ip;
    return scope.Close(0);
  }
  delete ip;
  return scope.Close(0);
}

//in static void Initialize()
NODE_SET_METHOD(target, "inet_pton", Inet_PToN);
Re: [nodejs] Re: A quest for a more accurate net.isIP() for IPv6 snoj 10/7/12 8:23 PM
Wee! Cider and learning C++ can go together! The code from the last post doesn't work, and I needed to learn some more about Node, V8 and C++ to get it running. Also, Eclipse helped a lot.

Anyway, adding an inet_pton wrapper to cares_wrap.cc is a success! Its speed is sometimes comparable to what net.isIP had delivered, but the accuracy is through the roof, which was my aim. net.isIP now even supports IPv6 with IPv4 dotted notation! Probably should have included some tests for that though.

https://github.com/snoj/node/commit/275878d2cdc584a3827e83452612c6b0f6c2d63f
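
A quick sanity check along these lines, assuming a Node build whose net.isIP is backed by a strict parser (as the patched build above is described to be; I believe current releases behave this way too). The input list is my own choice for illustration: the cases raised earlier in the thread plus an IPv6 address with an IPv4 dotted tail.

```javascript
var net = require('net');

// Inputs discussed in this thread, plus an IPv6 address with an IPv4
// dotted tail. net.isIP returns 4, 6, or 0 (not an IP address).
var inputs = ['::t', 'a::b::c', 'a:b:c', ':a:b:c:', '::', '2001:aaa::41',
              '::ffff:192.0.2.1', '192.0.2.1'];
inputs.forEach(function (s) {
  console.log(JSON.stringify(s), '->', net.isIP(s));
});
```

With a strict implementation, the first four inputs should be rejected and the rest accepted.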