I gave this a little thought, and here is what I think would be the simplest approach.
Currently you have a document like this representing each job:
{
    "_id" : "job 1",
    "host" : [
        {
            "timestamp" : [
                "2012-10-02 00:58:15", "2012-10-02 01:58:15"
            ],
            "_id" : "10.0.0.1",
            "fqhn" : "host1.example.com"
        },
        {
            "timestamp" : [
                "2012-10-02 00:58:40", "2012-10-02 01:58:40"
            ],
            "_id" : "10.0.0.2",
            "fqhn" : "host2.example.com"
        }
    ]
}
I suggest changing it so that each document represents one job/host combination,
so the single document above would instead become these two:
{
    "jobName" : "job 1",
    "host" : "10.0.0.1",
    "timestamp" : [
        "2012-10-02 00:58:15", "2012-10-02 01:58:15"
    ]
}
{
    "jobName" : "job 1",
    "host" : "10.0.0.2",
    "timestamp" : [
        "2012-10-02 00:58:40", "2012-10-02 01:58:40"
    ]
}
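If you already have data in the old shape, the split into per-job/host documents could be sketched roughly like this. It is plain JavaScript with no database calls; the function name splitJobDocument is my own invention, and I am assuming fqhn is dropped because the two example documents above do not carry it.

```javascript
// Split one old-style job document into one document per job/host pair.
// splitJobDocument is a hypothetical helper; the field names follow the
// example documents in this post. fqhn is dropped (assumption).
function splitJobDocument(jobDoc) {
  return jobDoc.host.map(function (h) {
    return {
      jobName: jobDoc._id,
      host: h._id,
      timestamp: h.timestamp.slice() // copy so the arrays are not shared
    };
  });
}

const oldDoc = {
  _id: "job 1",
  host: [
    { _id: "10.0.0.1", fqhn: "host1.example.com",
      timestamp: ["2012-10-02 00:58:15", "2012-10-02 01:58:15"] },
    { _id: "10.0.0.2", fqhn: "host2.example.com",
      timestamp: ["2012-10-02 00:58:40", "2012-10-02 01:58:40"] }
  ]
};

const newDocs = splitJobDocument(oldDoc);
console.log(JSON.stringify(newDocs, null, 2));
```

Each element of newDocs could then be inserted as its own document, with the _id left for the driver to generate.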
New timestamps are then recorded with an upsert like this:
db.collection.update({"jobName":"job 5", "host":"10.0.0.2"}, {$push:{timestamp:"new time here"}}, true)
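The third argument (true) turns the update into an upsert, so this takes care of all three cases: a new job (a document is inserted), an existing job with a new host (a document is inserted), and an existing job with an existing host (the new timestamp is pushed onto the existing document).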
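To make the three cases concrete, here is a small in-memory stand-in for that upsert. It is only a sketch of the behaviour described above, written in plain JavaScript over an array; pushTimestamp and jobs are names I made up, and nothing here talks to MongoDB.

```javascript
// In-memory imitation of update(query, {$push: {timestamp: t}}, true).
// pushTimestamp is a hypothetical helper, not a MongoDB API.
function pushTimestamp(collection, jobName, host, time) {
  const existing = collection.find(function (d) {
    return d.jobName === jobName && d.host === host;
  });
  if (existing) {
    // Existing job, existing host: append to the stored array.
    existing.timestamp.push(time);
    return "updated";
  }
  // New job, or existing job with a new host: the upsert creates the
  // document from the query fields plus the pushed value.
  collection.push({ jobName: jobName, host: host, timestamp: [time] });
  return "inserted";
}

const jobs = [];
console.log(pushTimestamp(jobs, "job 5", "10.0.0.1", "2012-10-02 00:58:15")); // new job
console.log(pushTimestamp(jobs, "job 5", "10.0.0.2", "2012-10-02 00:58:40")); // new host
console.log(pushTimestamp(jobs, "job 5", "10.0.0.2", "2012-10-02 01:58:40")); // existing pair
console.log(JSON.stringify(jobs, null, 2));
```

The point is that the caller never has to check which case it is in; the same call works for all three.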
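Assuming these are the only interactions (and that later you query only by job name), you would need an index on {jobName:1, host:1}. It serves the update above, and it also serves queries on jobName alone because jobName is the leading field of the index. I would just let the _id be generated by the driver.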
If you really want the unique _id index to enforce uniqueness of the job/host combination, you can make your _id field {jobName:"name", host:"ip"}. But be careful: you can use this index to query on {_id:{jobName:"x",host:"y"}}, but you *cannot* use it to query on {"_id.jobName":"Z"}, since the index covers the whole _id value and not its individual subfields.
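A toy model may help show why the whole-value lookup is cheap and the subfield lookup is not. This is only an illustration in plain JavaScript using a Map keyed by the full pair; it is not how MongoDB stores its indexes, and all the names (byId, keyOf, insertDoc, findByWholeId, findByJobName) are hypothetical.

```javascript
// Toy model of an index on the whole _id value: a map keyed by the full
// {jobName, host} pair. Illustration only, not MongoDB internals.
const byId = new Map();

function keyOf(id) {
  return JSON.stringify([id.jobName, id.host]);
}

function insertDoc(doc) {
  byId.set(keyOf(doc._id), doc);
}

// Like {_id: {jobName: "x", host: "y"}}: the full key is known, so the
// lookup goes straight to the entry.
function findByWholeId(id) {
  return byId.get(keyOf(id));
}

// Like {"_id.jobName": "Z"}: no key can be built from jobName alone, so
// every entry has to be examined.
function findByJobName(jobName) {
  return Array.from(byId.values()).filter(function (d) {
    return d._id.jobName === jobName;
  });
}

insertDoc({ _id: { jobName: "job 1", host: "10.0.0.1" }, timestamp: [] });
insertDoc({ _id: { jobName: "job 1", host: "10.0.0.2" }, timestamp: [] });
console.log(findByWholeId({ jobName: "job 1", host: "10.0.0.2" }) !== undefined);
console.log(findByJobName("job 1").length);
```

If you expect to query by job name alone, that is one more reason to prefer the separate {jobName:1, host:1} index.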
Asya