.NET - Lucene PorterStemmer question
Given the following code:

    Dim stemmer As New Lucene.Net.Analysis.PorterStemmer()
    Response.Write(stemmer.Stem("mattress table") & "<br />") ' outputs: mattress t
    Response.Write(stemmer.Stem("mattress") & "<br />") ' outputs: mattress
    Response.Write(stemmer.Stem("table") & "<br />") ' outputs: tabl

Could someone explain why the PorterStemmer produces different results when there is a space in the word? I was expecting 'mattress table' to be stemmed to 'mattress tabl'.
Also, what confuses me further is the following code:

    Dim parser As Lucene.Net.QueryParsers.QueryParser = New Lucene.Net.QueryParsers.QueryParser("myfield", New PorterStemmerAnalyzer)
    Dim q As Lucene.Net.Search.Query = parser.Parse("mattress table")
    Response.Write(q.ToString & "<br />") ' outputs: myfield:mattress myfield: tabl
    q = parser.Parse("""mattress table""")
    Response.Write(q.ToString & "<br />") ' outputs: field:"mattress tabl"

Could someone explain why I am getting different results from QueryParser() and the Stem() function for the same word(s) when using the same analyzer?

Thanks, Kyle
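The PorterStemmerAnalyzer is composed of a series of tokenizers and filters. The PorterStemmer is only one of the filters applied to the TokenStream that gets generated. The tokenizer splits the text into words first, so the QueryParser stems 'mattress' and 'table' separately, whereas calling Stem() directly treats the whole string, space included, as a single word. If you want to verify that the other filters are at work, try changing the case of your query: the QueryParser output will be in lowercase due to the LowerCaseFilter on the TokenStream.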
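To get the same effect with the bare stemmer, the string has to be split into words before stemming. Here is a minimal sketch, assuming the same PorterStemmer class and Stem() call used in the question and a page that imports System.Linq; the variable names are only for illustration. As for the 'mattress t' result, my reading of the Porter algorithm is that the space is treated like any other consonant, so the stemmer sees one long word ending in the suffix 'able' and strips that suffix because the part before it is long enough. In 'table' on its own, the part before 'able' is just 't', which is too short, so only the final 'e' is removed.

```vbnet
' Sketch only: assumes the Lucene.Net.Analysis.PorterStemmer from the question
' and that System.Linq is imported.
Dim stemmer As New Lucene.Net.Analysis.PorterStemmer()

' Split on spaces ourselves, because Stem() treats its argument as one word.
Dim words As String() = "mattress table".Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)

' Lowercase and stem each word separately, roughly what the analyzer chain does.
Dim stems As String() = words.Select(Function(w) stemmer.Stem(w.ToLowerInvariant())).ToArray()

' Going by the single-word outputs reported in the question, this should print: mattress tabl
Response.Write(String.Join(" ", stems) & "<br />")
```

This only mimics the analyzer for simple space-separated input. A real tokenizer also deals with punctuation and other separators, so for anything that has to match the index it is safer to run the text through the analyzer itself.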
Some sample code for a custom analyzer can be checked here. It will give you a peek inside the analyzer.
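For reference, a stemming analyzer in older Lucene.Net releases can be as small as the sketch below. This is not the PorterStemmerAnalyzer from the question, whose source is not shown. The class name SketchStemAnalyzer is made up, and the code assumes the 2.9/3.0-era API in which Analyzer exposes an overridable TokenStream(fieldName, reader) method; newer releases build analyzers differently.

```vbnet
Imports System.IO
Imports Lucene.Net.Analysis

' Hypothetical analyzer for illustration: tokenize and lowercase, then stem.
Public Class SketchStemAnalyzer
    Inherits Analyzer

    ' Assumes the older API where TokenStream(fieldName, reader) is the method to override.
    Public Overrides Function TokenStream(ByVal fieldName As String, ByVal reader As TextReader) As Lucene.Net.Analysis.TokenStream
        ' LowerCaseTokenizer splits the text at non-letters and lowercases each token;
        ' PorterStemFilter then stems the tokens one at a time.
        Return New PorterStemFilter(New LowerCaseTokenizer(reader))
    End Function
End Class
```

The order matters here: the stem filter sits on top of the tokenizer, so it only ever sees individual lowercase words, never a phrase containing a space.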