I've noticed while unit testing that TupleSolrOutputFormat doesn't clean up some temp files:
/tmp/solr2311067538494006727zip
/tmp/d75e66bf-2c88-40d4-8ec3-c06b0f94ab46.solr.zip
Inspecting the code, I found that in TupleSolrOutputFormat (line 149) the first file, '/tmp/solr2311067538494006727zip', is created by:
File tmpZip = File.createTempFile("solr", "zip");
This local file is immediately copied to an HDFS file, so could it be safely removed afterwards?
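To make the first idea concrete, here is a minimal sketch of deleting the local temp file once the copy is done. It is not the actual TupleSolrOutputFormat code: the class and method names (TempZipCleanupSketch, copyViaTempFile) are hypothetical, and a plain local Files.copy stands in for the copy to HDFS so the example runs without Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch, not the real TupleSolrOutputFormat code.
final class TempZipCleanupSketch {

    // Writes zipBytes to a temp file, copies it to target, then removes the temp file.
    // Returns the (already deleted) temp file so callers can check the cleanup.
    static File copyViaTempFile(byte[] zipBytes, Path target) throws IOException {
        // Same call as in the post; the "zip" suffix has no dot, hence names like solr123zip.
        File tmpZip = File.createTempFile("solr", "zip");
        try {
            Files.write(tmpZip.toPath(), zipBytes);
            // Assumption: a local copy stands in for the copy to HDFS.
            Files.copy(tmpZip.toPath(), target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Runs whether or not the copy succeeded, so nothing is left in the temp dir.
            Files.deleteIfExists(tmpZip.toPath());
        }
        return tmpZip;
    }
}
```

Deleting in a finally block means the temp file also goes away when the copy throws, which matters in unit tests that exercise failure paths.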
On the other hand, the file '/tmp/d75e66bf-2c88-40d4-8ec3-c06b0f94ab46.solr.zip' is handled by SolrRecordWriter, which accesses it via DC (it doesn't receive the full path, only: d75e66bf-2c88-40d4-8ec3-c06b0f94ab46.solr.zip).
Should SolrRecordWriter be responsible for cleaning this up in SolrRecordWriter.close()?
If so, I think it would be safe to reconstruct the full path inside SolrRecordWriter by prepending /tmp to it.
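A rough sketch of that reconstruction, under some assumptions: the class and method names (LocalZipCleanupSketch, resolveInTmpDir, deleteLocalZip) are hypothetical, and instead of hardcoding /tmp it reads the java.io.tmpdir system property, which is typically /tmp on Linux. Whether the .solr.zip file really lives in that directory is something the patch would need to confirm.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of what SolrRecordWriter.close() could call; not the real code.
final class LocalZipCleanupSketch {

    // Rebuilds the full local path from the bare zip name.
    // Assumption: the zip was written to the JVM temp directory.
    static Path resolveInTmpDir(String zipName) {
        return Paths.get(System.getProperty("java.io.tmpdir")).resolve(zipName);
    }

    // Deletes the local zip if present; returns true only if a file was actually removed.
    static boolean deleteLocalZip(String zipName) throws IOException {
        return Files.deleteIfExists(resolveInTmpDir(zipName));
    }
}
```

Using deleteIfExists keeps close() safe to call more than once: the second call simply returns false instead of throwing.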
What do you think, guys? I have a patch for this.
Regards