Hi, I'm new to Lucene and fairly inexperienced, and I'm turning to you after a series of failed attempts.
My goal is to write a simple Java program that computes precision and recall for a collection of documents.
The documents are stored on the file system, and I index them from the command prompt with:
java -cp C:/lucene/jar/lucene-core-4.3.1.jar;C:/lucene/jar/lucene-analyzers-common-4.3.1.jar;C:/lucene/jar/lucene-demo-4.3.1.jar org.apache.lucene.demo.IndexFiles -index C:/lucene/indice -docs C:/lucene/confusion_track/original
Here is an example of one of the topics in the file C:/lucene/topics.txt:
<top>
<num> Number: CF11
<title> apache
<desc> Description:
I am looking for a document about the dismissal of a lawsuit Involving
Adventist Health Systems.
<narr> Narrative:
</top>
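To make the layout of a topic concrete, here is a minimal sketch in plain Java (no Lucene dependency) that splits one such block into its tagged parts. It is only a simplified reading of the format shown above, not a description of how TrecTopicsReader works internally. The class name TopicSketch and the method parse are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
public class TopicSketch {

    // Splits one <top>...</top> block into tag -> text pairs (num, title, desc, narr).
    static Map<String, String> parse(String block) {
        Map<String, String> parts = new LinkedHashMap<>();
        String current = null;                  // tag currently being filled
        StringBuilder text = new StringBuilder();
        for (String line : block.split("\\r?\\n")) {
            String t = line.trim();
            if (t.equals("<top>") || t.equals("</top>")) {
                continue;                       // block delimiters carry no text
            }
            if (t.startsWith("<") && t.contains(">")) {
                // A new tag starts: store the previous part first.
                if (current != null) {
                    parts.put(current, text.toString().trim());
                }
                int close = t.indexOf('>');
                current = t.substring(1, close);
                String rest = t.substring(close + 1).trim();
                // Drop the human-readable label that follows some tags.
                rest = rest.replaceFirst("^(Number|Description|Narrative):\\s*", "");
                text = new StringBuilder(rest);
            } else if (current != null) {
                // Continuation line of the current part.
                text.append(' ').append(t);
            }
        }
        if (current != null) {
            parts.put(current, text.toString().trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        String block = "<top>\n"
                + "<num> Number: CF11\n"
                + "<title> apache\n"
                + "<desc> Description:\n"
                + "I am looking for a document about the dismissal of a lawsuit Involving\n"
                + "Adventist Health Systems.\n"
                + "<narr> Narrative:\n"
                + "</top>";
        for (Map.Entry<String, String> e : parse(block).entrySet()) {
            System.out.println(e.getKey() + " = [" + e.getValue() + "]");
        }
    }
}
```

In this sketch the "title" part of CF11 is the single word apache, and the longer request text sits in the "desc" part.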
I am also including the Java code, to make it easier for you to spot my error:
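import java.io.*;
import org.apache.lucene.benchmark.quality.*;
import org.apache.lucene.benchmark.quality.trec.*;
import org.apache.lucene.benchmark.quality.utils.*;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.*;

public static void main(String[] args) throws Exception {
    File topicsFile = new File("C:/lucene/topics.txt");
    File qrelsFile = new File("C:/lucene/confusion.known_items");
    Directory fsDir = FSDirectory.open(new File("C:/lucene/indice"));
    IndexSearcher searcher = new IndexSearcher(IndexReader.open(fsDir));
    // Name of the index field that should hold each document's name
    String docNameField = "Federal";
    PrintWriter logger = new PrintWriter(System.out, true);
    // Read the TREC topics as quality queries
    TrecTopicsReader qReader = new TrecTopicsReader();
    QualityQuery qqs[] = qReader.readQueries(new BufferedReader(new FileReader(topicsFile)));
    // Load the relevance judgments (qrels)
    Judge judge = new TrecJudge(new BufferedReader(new FileReader(qrelsFile)));
    // Check that the queries and the judgments match
    judge.validateData(qqs, logger);
    // Build each query from the topic's "title" part and search it in the index field "description"
    QualityQueryParser qqParser = new SimpleQQParser("title", "description");
    QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
    SubmissionReport submitLog = null;
    // Run all queries and collect per-query statistics
    QualityStats[] stats = qrun.execute(judge, submitLog, logger);
    // Average the per-query statistics and print the summary
    QualityStats avg = QualityStats.average(stats);
    avg.log("SUMMARY", 2, logger, " ");
    fsDir.close();
}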
The result obtained is the following:
CF11 - description:apache
CF11 Stats:
Search Seconds: 0.014
DocName Seconds: 0.000
Num Points: 0.000
Num Good Points: 0.000
Max Good Points: 1.000
Average Precision: 0.000
MRR: 0.000
Recall: 0.000
CF12 - description:apache
CF12 Stats:
Search Seconds: 0.000
DocName Seconds: 0.000
Num Points: 0.000
Num Good Points: 0.000
Max Good Points: 1.000
Average Precision: 0.000
MRR: 0.000
Recall: 0.000
CF13 - description:apache
CF13 Stats:
Search Seconds: 0.001
DocName Seconds: 0.000
Num Points: 0.000
Num Good Points: 0.000
Max Good Points: 1.000
Average Precision: 0.000
MRR: 0.000
Recall: 0.000
SUMMARY
Search Seconds: 0.003
DocName Seconds: 0.000
Num Points: 0.000
Num Good Points: 0.000
Max Good Points: 1.000
Average Precision: 0.000
MRR: 0.000
Recall: 0.000
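For reference, this is how I understand the figures in the report: Num Points is the number of results returned, Num Good Points is how many of them are judged relevant, and Max Good Points is the number of relevant documents listed in the judgments. The sketch below computes recall, average precision and reciprocal rank from those quantities using the textbook definitions with binary relevance. It is plain Java and an assumption on my part, not Lucene's QualityStats code, and the class name RankMetricsSketch and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
// Assumes the ranked list contains no duplicate document names.
public class RankMetricsSketch {

    // Fraction of the relevant documents that appear in the ranked list.
    static double recall(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;
        int good = 0;
        for (String doc : ranked) {
            if (relevant.contains(doc)) good++;
        }
        return (double) good / relevant.size();
    }

    // Mean of the precision values taken at each rank where a relevant
    // document appears, divided by the total number of relevant documents.
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;
        int good = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                good++;
                sum += (double) good / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    // 1 / rank of the first relevant document, or 0 if none is returned.
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) return 1.0 / (i + 1);
        }
        return 0.0;
    }

    public static void main(String[] args) {
        Set<String> relevant = new HashSet<>(Arrays.asList("doc1"));
        List<String> hits = Arrays.asList("doc3", "doc1", "doc7");
        System.out.println("recall = " + recall(hits, relevant));
        System.out.println("average precision = " + averagePrecision(hits, relevant));
        System.out.println("reciprocal rank = " + reciprocalRank(hits, relevant));

        // An empty result list gives 0 for every measure.
        List<String> none = Collections.emptyList();
        System.out.println("recall with no hits = " + recall(none, relevant));
    }
}
```

With an empty result list every measure is 0, which is the pattern in my report: Num Points is 0 for every query, while Max Good Points is 1.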
Thank you all for your time and your help.