Jaroslav Semancik raised a point about cases where representatives
would have the same name.
Here is what he wrote at the time:
> It is very likely that in case of many representatives there will be different ones having completely the same name. It's useful to introduce a "disambiguation" column into table Persons, like Wikipedia does - a suffix to distinguish the ones with the same name.
> Examples:
> George Bush (jr.)
> George Bush (sr.)
> Danko Nikolic (MP)
> Danko Nikolic (Zajecar)
> or so.
>
> Then the tuple (first_name, last_name, disambiguation) or possibly (first_name, middle_names, last_name, disambiguation) should have a unique constraint on the table.
I ran into exactly this issue when I started implementing a web-based
data upload mechanism. I upload new representatives as a CSV file and
then import the changed / new data. So far I have hit several (~13)
cases of duplicate representative names, and I am trying to think up
a solution for this.
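
To find such clashes before importing, something like the following could work. It is only a sketch: the column names first_name and last_name follow Jaroslav's tuple, while the note column, the sample rows and the find_duplicate_names helper are made up for illustration and are not the actual upload code.

```python
import csv
import io
from collections import Counter


def find_duplicate_names(csv_text):
    """Return {(first_name, last_name): count} for names that occur more than once."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        (row["first_name"].strip(), row["last_name"].strip()) for row in reader
    )
    return {name: n for name, n in counts.items() if n > 1}


# Illustrative data only; a real upload would read the submitted file instead.
sample = """first_name,last_name,note
Danko,Nikolic,MP
Danko,Nikolic,Zajecar
George,Bush,jr.
"""

duplicates = find_duplicate_names(sample)
print(duplicates)
```

Every name reported here would need a disambiguation value before the import can proceed.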
Jaroslav's idea of a disambiguation column looks good, and I am
thinking of implementing it. The only real problem is where to get
the values for the disambiguation column. It might have to be a manual
process: upload the CSV file to the site, look at the diff and any
errors reported while generating it, then either download the diff or
update the original CSV file with the new data. Does that sound OK to
everyone?
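
A rough sketch of how the unique constraint could behave, using Python's built-in sqlite3 module purely for illustration. The column names follow Jaroslav's suggestion; the persons table layout, the add_person helper and the choice of an empty string as the default disambiguation are assumptions, not the site's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE persons (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL,
        -- Empty string rather than NULL: SQLite treats NULLs as distinct
        -- in a UNIQUE constraint, so NULL would not block duplicates.
        disambiguation TEXT NOT NULL DEFAULT '',
        UNIQUE (first_name, last_name, disambiguation)
    )
    """
)


def add_person(first_name, last_name, disambiguation=""):
    """Insert a person; return False if the same tuple already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO persons (first_name, last_name, disambiguation) "
                "VALUES (?, ?, ?)",
                (first_name, last_name, disambiguation),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_person("Danko", "Nikolic", "MP"),
    add_person("Danko", "Nikolic", "Zajecar"),
    add_person("Danko", "Nikolic", "MP"),  # same tuple again, rejected
]
print(results)
```

With this in place, the import could list the rejected rows as errors in the diff, and the person uploading would fill in the disambiguation by hand.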
You can have a look at a half-baked diff solution here:
http://parasykjiems.lt/data/update/civilparish/
http://parasykjiems.lt/data/update/mayor/
or use the upload form and supply your own CSV file:
http://parasykjiems.lt/data/update/upload/
Let me know what you think.
Darius