The easiest approach would be to write a curation task. The curation
system is a well-integrated interface in DSpace that lets you call your
OCR plugin from the command line or the GUI on any specified set of
DSpace objects (items, collections, communities or the whole site). In
theory, you can write a curation task in any language that runs on the
JVM. In practice, Java, Jython, JRuby and Groovy have been tried.
The curation system will hand items to your task one at a time. Your
code should read the item's PDF bitstream, pass it to the OCR library,
and store the resulting text as a bitstream in the item's TEXT bundle.
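
To make the per-item flow concrete, here is a minimal sketch in plain Java. It deliberately does not use the DSpace API: FakeItem and OcrEngine are hypothetical stand-ins I made up for illustration, and the ".txt" suffix for the output name is just an assumption. A real task would plug the same logic into the curation task interface described in the documentation linked below.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative only: none of these types are DSpace classes.
class OcrCurationSketch {

    /** Hypothetical wrapper around whatever OCR library you end up calling. */
    interface OcrEngine {
        String recognize(byte[] pdfBytes);
    }

    /** Stand-in for an item: bundle name -> (bitstream name -> content). */
    static class FakeItem {
        final Map<String, Map<String, byte[]>> bundles = new LinkedHashMap<>();

        Map<String, byte[]> bundle(String name) {
            return bundles.computeIfAbsent(name, k -> new LinkedHashMap<>());
        }
    }

    /**
     * Called once per item, mirroring how the curation system serves items.
     * Returns the number of text bitstreams stored in the TEXT bundle.
     */
    static int perform(FakeItem item, OcrEngine ocr) {
        int stored = 0;
        for (Map.Entry<String, byte[]> e : item.bundle("ORIGINAL").entrySet()) {
            // Skip anything that does not look like a PDF.
            if (!e.getKey().toLowerCase(Locale.ROOT).endsWith(".pdf")) {
                continue;
            }
            String text = ocr.recognize(e.getValue());
            // Assumed naming scheme: original file name plus ".txt".
            item.bundle("TEXT").put(e.getKey() + ".txt",
                    text.getBytes(StandardCharsets.UTF_8));
            stored++;
        }
        return stored;
    }

    public static void main(String[] args) {
        FakeItem item = new FakeItem();
        item.bundle("ORIGINAL").put("scan.pdf", new byte[] {1, 2, 3});
        // Dummy engine so the sketch runs without a real OCR library.
        OcrEngine dummy = pdf -> "recognized " + pdf.length + " bytes";
        int n = perform(item, dummy);
        System.out.println(n + " text bitstream(s) stored: "
                + item.bundle("TEXT").keySet());
    }
}
```

Keeping the OCR call behind a small interface like this means the COM (or other) binding can be swapped or mocked without touching the part that deals with bundles.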
https://wiki.duraspace.org/display/DSDOC5x/Curation+System
https://wiki.duraspace.org/display/DSPACE/Curation+Task+Cookbook
Regarding COM, this may be useful:
ftp://ftp.tuwien.ac.at/.vhost/tutorialbox.com/tutors/J++/ch16.htm
Regards,
~~helix84