Does anyone know at which time points these formats change?
> I saw some discussion on when fields such as clean_html and
> clean_visible are filled in. Now I noticed a few other idiosyncrasies.
> Can someone tell me whether this is expected, or a bug in the thrift
> protocol (I used the java bindings).
Did you use this java example?
https://github.com/trec-kba/streamcorpus/blob/master/java/src/test/ReadThrift.java
Thank you very much for your kind consideration! We are looking forward to your answers!
Best,
wim
...
try {
  s.read(protocol)
  successful = true
} catch {
  case e: java.lang.OutOfMemoryError =>
    logError("OOM Error: %s".format(e.getStackTrace.mkString("\n"))); None
  case e: TTransportException => e.getType match {
    case TTransportException.END_OF_FILE => logDebug("mkstream Finished."); None
    case TTransportException.ALREADY_OPEN => logError("mkstream already opened."); None
    case TTransportException.NOT_OPEN => logError("mkstream not open."); None
    case TTransportException.TIMED_OUT => logError("mkstream timed out."); None
    case TTransportException.UNKNOWN => logError("mkstream unknown."); None
    case e => logError("Error in mkStreamItem: %s".format(e.toString)); None
  }
  case e: Exception => logDebug("Error in mkStreamItem"); None
...