The problem is that I cannot know what is A (text, object, reg exp,
array) and what is B (text, object, array).
Also, I must assume that B may be an array or a key/value object and
that A may be anywhere (including 'children' objects).
This is usually the case when I have a JSON response from an external
source and want to find something in there but without hard-coding any
specific methods or search cases.
I think the best way to achieve this is to have a method loop through
the objects and arrays that are passed (and found) and call it self
until there only two string/number/regexp values to be compared
against.
What I've done up to now is herehttp://jsbin.com/ahezi/edit- if you
haven't understood what I just said perhaps the example on this page
will help.
Even tho it works as expected it is quite slow and I have no idea
where the problem is. So my question is this: Is there a better way to
achieve this result, and if not, how can I make the Find function
faster?
Cheers
> I have an array of objects and I want to retrieve specific elements
> that match certain criteria, i.e. find value A in an object B
>
> The problem is that I cannot know what is A (text, object, reg exp,
> array) and what is B (text, object, array).
>
> Also, I must assume that B may be an array or a key/value object and
> that A may be anywhere (including 'children' objects).
Do you know *anything* about A or B? About the criteria used to find
A in B? How can A be a regexp and be found in B, if B can't contain
regexps?
> This is usually the case when I have a JSON response from an external
> source and want to find something in there but without hard-coding any
> specific methods or search cases.
Can we assume that B is a parsed JSON value? I.e., only objects, arrays
and primitive values.
> I think the best way to achieve this is to have a method loop through
> the objects and arrays that are passed (and found) and call it self
> until there only two string/number/regexp values to be compared
> against.
That sounds like a recursive match (i.e., {x:42} is found in [{y:1},{x:42}]).
You should also ensure that you don't match {x:42,z:"foo"} - unless that
is what you want.
> What I've done up to now is herehttp://jsbin.com/ahezi/edit- if you
> haven't understood what I just said perhaps the example on this page
> will help.
>
> Even tho it works as expected it is quite slow and I have no idea
> where the problem is.
It is a computationally hard problem. You call recursively both to
match sub-patterns of the pattern you are searching for, and to search
for matches.
I haven't completely grokked your code yet, but it seems to me that
the sameArrays function uses Find on its elements, not comparing the
values directly.
That means that [{x:2}] matches [{y:[{x:2}]}]. Should it? It definitly
increases the computational requirements.
I also spot a few places where you test for "undefined" after having
taken the "constructor" property of the value. That would crash if it
actually was undefined :)
What I would do is:
function find(pattern, value) {
function match(pattern, value) {
if (pattern == undefined || value == undefined) {
return pattern === value;
}
if (pattern == null || value == null) {
return pattern === value;
}
if (pattern.constructor == Object) {
return (value.constructor == Object && matchObject(pattern, value));
}
if (pattern.constructor == Array) {
return (value.constructor == Array && matchArray(pattern, value));
}
if (pattern === value) { return true; }
if (pattern.constructor == RegExp) {
return pattern.test(value);
}
if (pattern.constructor == Function) {
return Boolean(pattern(value));
}
return String(pattern) == String(value);
}
function matchObject(objectPattern, object) {
for (var prop in objectPattern) {
var propPattern = objectPattern[prop];
var propValue = object[prop];
if (!match(propPattern, propValue)) { return false; }
}
return true;
}
function matchArray(arrayPattern, array) {
if (arrayPattern.length != array.length) { return false; }
for (var i = 0; i < array.length; i++) {
if (!match(arrayPattern[i], array[i])) { return false; }
}
return true;
}
function findPattern(pattern, value, path) {
path = path || [];
if (match(pattern, value)) { return {result: value, path: path}; }
if (value.constructor == Object || value.constructor == Array) {
var path_index = path.length;
for (var prop in value) {
path[path_index] = prop;
var found = findPattern(pattern, value[prop], path);
if (found) return found;
}
path.length = path_index;
}
return null;
}
return findPattern(pattern, value);
}
(I assume using the constructor to differentiate will be sufficient
for this case, since that's what your original code uses).
A quick test:
function test(n) { return n < 10; }
find({x:[test,test]}, [{x:[2,12]},{y:{x:[1,9]}}])
returns
{result:{x:[1,9]}, path:["1","y"]}
> So my question is this: Is there a better way to
> achieve this result, and if not, how can I make the Find function
> faster?
I wouldn't try to be general. If you know something about the JSON
value you are looking in, use that. If you don't, how did you come
up with the pattern to begin with?
And try to be very specific about how you want to match. I think your
code is trying to be more general than it needs to (but I can't tell
without a specific description of what it should do).
/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'
[snip]
> What I would do is:
>
> function find(pattern, value) {
> function match(pattern, value) {
> if (pattern == undefined || value == undefined) {
> return pattern === value;
> }
> if (pattern == null || value == null) {
> return pattern === value;
> }
This (`null`) branch will never be entered, will it?
[snip]
--
kangax
"How can A be a regexp and be found in B, if B can't contain
regexps?"
If A is a regexp, then a match is found if A matches B's value. I'm
not trying to compare like with like and from what I understood, you
say this is reason performance is bad.
Yes, B is a JSON object, so no regexps and functions found there.
"That sounds like a recursive match (i.e., {x:42} is found in [{y:1},
{x:42}]).
You should also ensure that you don't match {x:42,z:"foo"} - unless
that
is what you want. "
Yes I want to be able to match {x:42} (A) in {x:42,z:"foo"} (B). The B
would be a JSON item, and so I want to know which of the items has the
value 42 in key x, for example.
"That means that [{x:2}] matches [{y:[{x:2}]}]. Should it? It
definitly
increases the computational requirements."
Yes it should (see above). The reason I use the Find found for each
item in the sameArray function is because the items in the array may
be of same type (constructor).
"I also spot a few places where you test for "undefined" after having
taken the "constructor" property of the value. That would crash if it
actually was undefined :)"
Heh, good point :)
An example of how this would be applied:
I have a JSON response from an external server to display to the user.
I want the user to be able to find something in that data, for example
those whose name contains a user-provided term.
So I would do Find({ name : new RegExp( text_from_input_field,
'gi' ) }, myJSON.items) to find those items.
In fact, I actually 'copy' the original JSON object to another one
which only contains the fields that I want to search in, e.g. name and
description. So each myJSON.item has only two keys, name and
description. This does help performance but its not really acceptable.
Thank you for your reply, I will check your code and post back.
On Feb 20, 7:57 pm, Lasse Reichstein Nielsen <lrn.unr...@gmail.com>
wrote:
> Nicolas R <ruda...@googlemail.com> writes:
> > I have an array of objects and I want to retrieve specific elements
> > that match certain criteria, i.e. find value A in an object B
>
> > The problem is that I cannot know what is A (text, object, reg exp,
> > array) and what is B (text, object, array).
>
> > Also, I must assume that B may be an array or a key/value object and
> > that A may be anywhere (including 'children' objects).
>
> Do you know *anything* about A or B? About the criteria used to find
> A in B? How can A be a regexp and be found in B, if B can't contain
> regexps?
>
> > This is usually the case when I have a JSON response from an external
> > source and want to find something in there but without hard-coding any
> > specific methods or search cases.
>
> Can we assume that B is a parsed JSON value? I.e., only objects, arrays
> and primitive values.
>
> > I think the best way to achieve this is to have a method loop through
> > the objects and arrays that are passed (and found) and call it self
> > until there only two string/number/regexp values to be compared
> > against.
>
> That sounds like a recursive match (i.e., {x:42} is found in [{y:1},{x:42}]).
> You should also ensure that you don't match {x:42,z:"foo"} - unless that
> is what you want.
>
> > What I've done up to now is herehttp://jsbin.com/ahezi/edit-if you
> "That means that [{x:2}] matches [{y:[{x:2}]}]. Should it? It
> definitly
> increases the computational requirements."
> Yes it should (see above). The reason I use the Find found for each
> item in the sameArray function is because the items in the array may
> be of same type (constructor).
Stupid example, sorry.
The point should have been that the array matches on the outside, but
the {x:...} inside. I.e.,
[{x:2}] matches [{y:{z:{x:2}}}] (all of it).
Which value is it that is matched?
I see. Then, it seems like a second branch can be removed altogether
(letting first non-strict operator to "catch" both `null` and `undefined`)
--
kangax
> I see. Then, it seems like a second branch can be removed altogether
> (letting first non-strict operator to "catch" both `null` and
> `undefined`)
That should work, yes.
I am still open for general ideas, suggestions and code of course.
Thank you very much for your time.
On Feb 21, 7:42 am, Lasse Reichstein Nielsen <lrn.unr...@gmail.com>
wrote:
> I tried your code and although it is faster than mine it achieves
> different results. From what I've seen you answer the question of 'is
> A in B', and if there are many A's in that B (e.g. { name : 'adam' })
> you don't return all of them.
That is correct. I can see your code dos collect them all.
It's a fairly simple fix, though.
function find(pattern, value) {
function match(pattern, value) {
if (pattern === value) { return true; }
if (pattern == undefined || value == undefined) {
// rule out null or undefined.
return false;
}
if (pattern.constructor == Object) {
return (value.constructor == Object && matchObject(pattern, value));
}
if (pattern.constructor == Array) {
return (value.constructor == Array && matchArray(pattern, value));
}
if (pattern.constructor == RegExp) {
return typeof value == "string" && pattern.test(value);
}
if (pattern.constructor == Function) {
return Boolean(pattern(value));
}
return String(pattern) == String(value);
}
function matchObject(objectPattern, object) {
for (var prop in objectPattern) {
var propPattern = objectPattern[prop];
var propValue = object[prop];
if (!match(propPattern, propValue)) { return false; }
}
return true;
}
function matchArray(arrayPattern, array) {
if (arrayPattern.length != array.length) { return false; }
for (var i = 0; i < array.length; i++) {
if (!match(arrayPattern[i], array[i])) { return false; }
}
return true;
}
var results = [];
// findPattern travereses the value structure to find sub-parts
// matching pattern.
function findPattern(value) {
if (match(pattern, value)) { results.push(value); }
if (value.constructor == Object || value.constructor == Array) {
for (var prop in value) {
findPattern(value[prop]);
}
}
}
findPattern(value);
return results;
}
(also removed the path computation, which wasn't in your version anyway,
and restricted regexps to work on strings - it's a mess if you convert
arrays to strings and then match with a regexp).
Best of luck
Yours took ~16 ms, mine ~21, for 227 items (key/value objects) with 6
properties each.
I noticed in your code that you never called the find function within
it self. You defined a findPattern function that does the 'init'
stuff, and a match function that does the matching. In your code you
never called find called, you called match instead.
So I wrapped up all my init code (that sits at the bottom of the Find
function) in a match function, and call match instead of Find. The
performance improvements are incredible, I am astonished.
For the same set of items, it now takes 8 ms instead of 21. That's
almost 3x faster. Incredible. Thanks for the hint!