Solr Field value removes white space


Siva

Jun 28, 2011, 10:21:38 AM
to SolrNet
Hi,
In my Solr schema file I have defined the field as below:
<field name="producttype" type="textgen" indexed="true" stored="true"
omitNorms="true" />


The SolrNet field mapping in my app is:
[SolrField("producttype")]
public string ProductType { get; set; }

I sent the ProductType value "Pen Plastic" to Solr to be indexed.

When I run the query below
http://localhost:8983/solr/sc4/select/?fl=producttype&rows=0&q=*:*&facet=true&facet.field=producttype&facet.limit=20&facet.mincount=1&facet.prefix=pen&wt=xml

I get
<int name="penplastic">3516</int>

Pen Plastic => penplastic

Any idea why?

The field type "textgen" is defined as below in the schema file:
<fieldType name="textgen" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>

Thanks
Siva

Mauricio Scheffer

Jun 28, 2011, 10:31:18 AM
to sol...@googlegroups.com
Hi Siva,
This seems to be a question about Solr, not SolrNet. Please use the Solr mailing list for questions about Solr.

--
Mauricio






Ken Foster

Jun 28, 2011, 1:05:40 PM
to sol...@googlegroups.com
As Mauricio pointed out, Solr-specific questions are better answered on the Solr user list, which is very active and helpful. Since I'm here, though: your problem is that you are faceting on a text or textgen field (a tokenized field). If you are faceting on a field, you really should use a string or maybe a lowercase field type. Facet values come from the indexed terms, not the stored value, so a text-based field gives you whatever the analyzer produced: parts of words and, in your case, multiple words combined into one. Notice the setting catenateWords="1" on WordDelimiterFilterFactory in the index analyzer? That tells the indexer to combine word parts into new words where possible.

You may want to add a field of type string called product_type_facet or product_type_s (for string) or something like that. Index to that field, and add a <copyField> directive to your schema to copy the value to your producttype field. Then you can search against producttype but facet on product_type_s and get the best of both worlds.
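
A minimal sketch of what that could look like in schema.xml. The name product_type_s is just the example name suggested above, and the string type definition is assumed to match the one shipped in the stock Solr example schema; check it against your own schema before copying.

```xml
<!-- Untokenized type: the whole value is indexed as a single term.
     Assumed to already exist in your schema (it does in the stock example schema). -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

<!-- Hypothetical facet field; the application indexes into this one. -->
<field name="product_type_s" type="string" indexed="true" stored="true"/>

<!-- Existing analyzed field, kept for full-text search. -->
<field name="producttype" type="textgen" indexed="true" stored="true" omitNorms="true"/>

<!-- Copy the raw value into the analyzed field at index time. -->
<copyField source="product_type_s" dest="producttype"/>
```

With this in place, the SolrNet property would map to the string field, i.e. [SolrField("product_type_s")], and the facet query would use facet.field=product_type_s. Because a string field is not lowercased, facet.prefix matches the value exactly as sent, so the prefix would need to be Pen rather than pen. A full reindex is needed after changing the schema.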

Ishanika sivakumar

Jun 29, 2011, 5:02:26 AM
to sol...@googlegroups.com
Hi Ken/Mauricio,
Thanks for your reply.
Ken, thanks. I followed your suggestion and it is now working fine.

Thanks
Siva