I tried the multi-map/reduce approach, but I seem to have run into a logistical problem. It would work fine if Person<->Company were a 1:1 relationship, but I'm not sure how to get it to work given that multiple People can be associated with a single Company.
Just for simplicity, assume I have 4 People and 2 Companies. In my reduce, if I group by CompanyId then I obviously only get 2 results back (one per Company). I want all 4 People with their Company information present. And I can't group by PersonId, because then all the Company information is lost in the reduce (since Company doesn't know anything about Person).
    AddMap<Company>(companies => from c in companies
                                 select new
                                 {
                                     PersonId = (string)null, // Here lies the issue
                                     c.CompanyId,
                                     // All Company fields populated here
                                     // All Person fields defined here and set to null
                                 });

    AddMap<Person>(people => from p in people
                             select new
                             {
                                 PersonId = p.Id,
                                 p.CompanyId,
                                 // All Person fields populated here
                                 // All Company fields defined here and set to null
                             });

    Reduce = results => from result in results
                        group result by result.PersonId
                        into g
                        select new
                        {
                            PersonId = g.Key,
                            FirstName = g.Select(x => x.FirstName).FirstOrDefault(x => x != null),
                            // etc.
                            CompanyId = g.Select(x => x.CompanyId).FirstOrDefault(x => x != null),
                            Industry = g.Select(x => x.Industry).FirstOrDefault(x => x != null),
                            // etc.
                        };
Does this mean I need to keep a list of PersonIds in the Company document? My AddMap<Company> would then use 2 "from" clauses (one over the Companies and the other over each Company's list of PersonIds). That's really the only thing I can think of. Do I have any other options, other than cramming all this into one document called CompanyPerson? That would cause me more headaches than just keeping a list of PersonIds in Company.
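To make the two-"from" idea concrete, here is a minimal sketch of its logic in plain LINQ to Objects rather than in an actual index definition, so it says nothing about how RavenDB itself executes or performs this. The PersonIds field on Company, the Result class, and the FanOutSketch.Run helper are all hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document shapes; PersonIds on Company is the assumed new field.
public class Company
{
    public string CompanyId;
    public string Industry;
    public List<string> PersonIds = new List<string>();
}

public class Person
{
    public string Id;
    public string CompanyId;
    public string FirstName;
}

// Common shape emitted by both maps and by the reduce.
public class Result
{
    public string PersonId;
    public string CompanyId;
    public string FirstName;
    public string Industry;
}

public static class FanOutSketch
{
    public static List<Result> Run(IEnumerable<Company> companies, IEnumerable<Person> people)
    {
        // Company map: the second "from" emits one entry per (Company, PersonId) pair,
        // so PersonId is no longer null on the Company side.
        var companyEntries = from c in companies
                             from personId in c.PersonIds
                             select new Result { PersonId = personId, CompanyId = c.CompanyId, Industry = c.Industry };

        // Person map: one entry per Person, with the Company-only fields left null.
        var personEntries = from p in people
                            select new Result { PersonId = p.Id, CompanyId = p.CompanyId, FirstName = p.FirstName };

        // Reduce: group by PersonId and take the first non-null value of each field.
        var reduced = from r in companyEntries.Concat(personEntries)
                      group r by r.PersonId into g
                      select new Result
                      {
                          PersonId = g.Key,
                          FirstName = g.Select(x => x.FirstName).FirstOrDefault(x => x != null),
                          CompanyId = g.Select(x => x.CompanyId).FirstOrDefault(x => x != null),
                          Industry = g.Select(x => x.Industry).FirstOrDefault(x => x != null),
                      };
        return reduced.ToList();
    }
}
```

With 4 People spread over 2 Companies, and each Company listing its own PersonIds, grouping by PersonId yields one result per Person with both the Person and the Company fields filled in. The cost is that every Company document has to keep its PersonIds list in sync with the People that point at it.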
Also, just as an aside, I tried out the facets, and to my surprise, they performed quite well. And I'm quite looking forward to trying out the facet approach proposed here: