I understand that the best place to limit or change sizes is nutch-site.xml,
so that I'm not passing large in-memory chunks from one place to another
until indexerModule.properties crops them.
Is that correct? Do you have any other comments or thoughts?
--------- /hounder/crawler/conf/nutch-site.xml ---------
<property>
<name>file.content.limit</name>
<value>65536</value>
</property>
--------- /hounder/crawler/conf/indexerModule.properties ---------
page.text.max.length=65536
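One related point: file.content.limit only covers content fetched through the
file: protocol. If the crawl also goes over HTTP, I assume the matching setting
would be Nutch's http.content.limit in the same file. The value below is simply
mirrored from the one above as an untested sketch, not a recommendation:
--------- /hounder/crawler/conf/nutch-site.xml (assumed addition) ---------
<property>
<name>http.content.limit</name>
<value>65536</value>
</property>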
Thanks a lot!
Gustavo Arjones