Something that has been in the back of our minds when sites integrate Geograph images is that a small sample obtained via the API can often be dominated by one type of image (be it the contributor, the type of shot, the subject, season etc.), whereas in many situations a 'broad range' of images would be more useful...
Below is a possible strategy for discussion. If you are not interested in this, you can stop reading now :)
The broad outline I have been considering is:
Assigning a score to each image, based upon its uniqueness in each of the following:
Title words ("wordnet" is our internal name)
Category or Tags
Exact location (or centisquare, a 100m square)
For each image, loop through all images within (say) a 5km radius, and count up the number of images that match on each condition (a separate tally for each condition).
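To make the tallying step concrete, here is a rough Python sketch. Everything in it is an assumption for illustration only: the `Image` fields, the helper names, the crude title-word matching (which stands in for the real wordnet data), and the brute-force loop (a real version would want a spatial index). The contributor tally is included as an optional extra condition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    # Hypothetical record; field names are illustrative, not the real schema.
    id: int
    title: str
    category: str
    contributor: str
    easting: int   # metres
    northing: int  # metres


def centisquare(img):
    """Identify the 100m square the image falls in."""
    return (img.easting // 100, img.northing // 100)


def title_words(img):
    """Crude stand-in for wordnet: lower-cased words longer than 2 letters."""
    return {w for w in img.title.lower().split() if len(w) > 2}


def tally_matches(images, radius_m=5000):
    """For each image, count neighbours within radius_m matching each condition."""
    tallies = {}
    for img in images:
        counts = {"title": 0, "category": 0, "location": 0, "contributor": 0}
        for other in images:
            if other.id == img.id:
                continue
            dist = math.hypot(other.easting - img.easting,
                              other.northing - img.northing)
            if dist > radius_m:
                continue  # outside the radius, so it does not count
            if title_words(img) & title_words(other):
                counts["title"] += 1
            if other.category == img.category:
                counts["category"] += 1
            if centisquare(other) == centisquare(img):
                counts["location"] += 1
            if other.contributor == img.contributor:
                counts["contributor"] += 1
        tallies[img.id] = counts
    return tallies


# Made-up sample data, purely for illustration.
images = [
    Image(1, "Parish church", "Church", "alice", 451230, 312340),
    Image(2, "Church tower", "Church", "alice", 451260, 312370),
    Image(3, "Canal bridge", "Bridge", "bob", 452900, 313100),
    Image(4, "Distant farm", "Farm", "alice", 470000, 330000),
]
tallies = tally_matches(images)
```

Here images 1 and 2 match each other on every condition, image 3 matches nothing nearby, and image 4 is too far away for its contributor match to count.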
At its simplest, just add up all the tallies to make an overall score, which can then be sorted in ascending order (as arbitrary units). A more complex option would be some sort of non-linear scale to create a points score (probably higher for more unique; something like a 1-10 scale would be more familiar units).
...including the contributor in the list of conditions is perhaps not required, but IMHO it would make the results in 'API snapshots' more interesting, as each contributor will have a different 'take' on the area.
...I guess we could drop or reduce the 'geo' factor (the radius), but as we have this detail, why not exploit it!
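The two scoring options (plain sum, or a non-linear points scale) might look something like the sketch below. The logarithmic mapping is just one possible choice of non-linear scale, not a decided design, and the function names and tally numbers are made up for illustration.

```python
import math


def overall_score(counts):
    """Simplest option: add up all the per-condition tallies."""
    return sum(counts.values())


def points_1_to_10(total, max_total):
    """One possible non-linear scale (an assumption, not a settled design).

    A total of 0 (nothing similar nearby) gives 10 points, and the largest
    total in the set gives 1 point, with a logarithmic curve in between.
    """
    if max_total <= 0:
        return 10
    total = min(total, max_total)
    frac = math.log1p(total) / math.log1p(max_total)
    return round(10 - 9 * frac)


# Illustrative tallies only (image id -> matches per condition).
tallies = {
    101: {"title": 12, "category": 30, "location": 5},
    102: {"title": 0, "category": 2, "location": 0},
    103: {"title": 3, "category": 8, "location": 1},
}

totals = {img_id: overall_score(c) for img_id, c in tallies.items()}

# Ascending sort: the lowest total (most unique image) comes first.
ranked = sorted(totals, key=totals.get)

max_total = max(totals.values())
points = {img_id: points_1_to_10(t, max_total) for img_id, t in totals.items()}
```

With the plain sum, lower is more unique. The points version flips that so higher means more unique, which is probably easier for integrators to read.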
A few questions:
* Any other suggestions? (improvements, a different method, or even a better way of achieving the same goal)
* Would integrators be interested in using such a score (or having the API results sorted by it)? It would always be optional!
* or is it a 'non issue'...
Thanks for listening, (and apologies for length!)
... a thought that just occurred to me is that this is a Geograph version of the Patent Pending Flickr Interestingness system ;) [there are a lot of parallels between the two sites: we are both amassing a huge archive of images, and finding the best starts to become a problem. Yes, I know there are other photo-sharing sites, but arguably flickr leads the pack]