Did you find any information about the license of the data? I couldn't.
1. The geonameId is implemented as a database sequence, and geonameIds
will not be reused even if a record is deleted. We are aware that there
is a potential problem with deleting and reinserting the same record
with a different geonameId. This is why deleting a record in the
GeoNames wiki interface requires additional rights. We want to make sure
inexperienced users don't accidentally delete records. My impression is
that this is not a serious problem so far. To deal further with it, we
keep the history of the records and could implement a redirection for
deleted records to the replacing record if we see that there is need for
it. What sometimes happens is that we (GeoNames users) discover
duplicates and delete them. So far this has happened rarely. In this case we
write the geonameId of the record we keep in the comments of the deleted
record. It would be nice to have a more formal log and handling of the
process for duplicate elimination. As for the 'guarantee' of a
'permanent' id by Yahoo: it is kind of funny for that to be claimed by a
company whose days are obviously numbered ;-)
2. Technically it should certainly be possible to map a great number of
ids and I don't think it would be a legal problem to store the ids. What
is more interesting, however, is the question of what we are allowed to
do with the data itself (lat/lng, polygons, i18n, admin divisions, etc.).
As I said, I could only find some very generic terms of use, and this
could mean that we are not allowed to use the data for anything real.
Otherwise, with a liberal license, it could become an interesting
additional source of data for the GeoNames project.
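
To make the redirection idea from point 1 concrete, here is a minimal
sketch of how a lookup for a deleted geonameId could be forwarded to the
record that replaced it. This is not how GeoNames is implemented; the
dictionary, the function names and the ids in the usage note are all
hypothetical and only stand in for whatever table would hold this.

```python
# Hypothetical in-memory stand-in for a redirection table. None of these
# names come from the GeoNames code base.
redirects = {}  # deleted geonameId -> geonameId of the record that was kept


def record_duplicate(deleted_id, kept_id):
    """Note that deleted_id was removed as a duplicate of kept_id."""
    if deleted_id == kept_id:
        raise ValueError("a record cannot replace itself")
    redirects[deleted_id] = kept_id


def resolve(geoname_id):
    """Follow redirects until reaching an id that has no redirect entry."""
    seen = set()
    while geoname_id in redirects:
        if geoname_id in seen:
            # Guard against a loop such as A -> B -> A.
            raise ValueError("redirect cycle at %d" % geoname_id)
        seen.add(geoname_id)
        geoname_id = redirects[geoname_id]
    return geoname_id
```

With made-up ids, calling record_duplicate(101, 100) and then
record_duplicate(100, 42) would let resolve(101) follow both steps to
the surviving record, while an id that was never deleted is returned
unchanged.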
Marc
Finding a place with given lat/lng coordinates is called 'reverse
geocoding'. GeoNames has some reverse geocoding services:
http://www.geonames.org/export/web-services.html#findNearbyPlaceName
The closest point will not be enough to match GeoNames toponyms with
Yahoo toponyms. You will also have to compare feature codes and name
similarity. For name similarity I normally use a combination of
Levenshtein distance (edit distance) and letter-pair similarity.
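
A rough sketch of such a name comparison follows. How the two measures
are combined is an assumption made for illustration (a plain average of
a normalized edit-distance score and a Dice coefficient over adjacent
letter pairs), and all function names are made up here. It is not the
exact formula used for GeoNames.

```python
def levenshtein(a, b):
    """Edit distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def letter_pairs(name):
    """Adjacent letter pairs of each whitespace-separated word."""
    pairs = []
    for word in name.split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def letter_pair_similarity(a, b):
    """Dice coefficient over letter pairs, between 0 and 1."""
    pairs_a, pairs_b = letter_pairs(a), letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        # Names too short to form any pair: fall back to exact match.
        return 1.0 if a == b else 0.0
    remaining = list(pairs_b)
    shared = 0
    for pair in pairs_a:
        if pair in remaining:
            remaining.remove(pair)  # each pair may be matched only once
            shared += 1
    return 2.0 * shared / total


def name_similarity(a, b):
    """Assumed combination: average of the two scores, between 0 and 1."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    edit_score = 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
    return (edit_score + letter_pair_similarity(a, b)) / 2.0
```

A candidate pair would then be accepted only if the feature codes agree
and name_similarity is above some threshold. That threshold has to be
tuned on real data; no value is suggested here.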
Best
Marc