Class Ferret::Analysis::StandardAnalyzer
In: ext/r_analysis.c
Parent: Ferret::Analysis::Analyzer

Summary

The StandardAnalyzer is the most advanced of the available analyzers. If it were implemented in Ruby, it would look like this:

  class StandardAnalyzer
    def initialize(stop_words = ENGLISH_STOP_WORDS, lower = true)
      @lower = lower
      @stop_words = stop_words
    end

    # Builds the filter chain; the field argument is not used here.
    def token_stream(field, str)
      ts = StandardTokenizer.new(str)
      ts = LowerCaseFilter.new(ts) if @lower   # skipped when lower is false
      ts = StopFilter.new(ts, @stop_words)
      ts = HyphenFilter.new(ts)                # last filter; its result is returned
    end
  end

As you can see, it uses the StandardTokenizer, and you can supply your own list of stop-words if you wish.
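To see how the two constructor arguments change the output, the chain can be modelled in plain Ruby without Ferret. This is only a rough sketch under stated assumptions: SketchAnalyzer and SKETCH_STOP_WORDS are hypothetical names, the tokenizer is a crude regular expression rather than the StandardTokenizer, the hyphen handling is left out, and the method returns an array of strings instead of a token stream.

```ruby
# Illustrative stand-in only; this is NOT Ferret's implementation.
# SketchAnalyzer and SKETCH_STOP_WORDS are hypothetical names.
SKETCH_STOP_WORDS = %w[a an and the of to is].freeze

class SketchAnalyzer
  def initialize(stop_words = SKETCH_STOP_WORDS, lower = true)
    @stop_words = stop_words
    @lower = lower
  end

  # Returns an array of token strings rather than a token stream.
  def tokens(str)
    ts = str.scan(/[[:alnum:]]+/)              # crude tokenizer: runs of letters and digits
    ts = ts.map(&:downcase) if @lower          # optional lowercasing step
    ts.reject { |t| @stop_words.include?(t) }  # drop stop-words (exact match)
  end
end

text = "The Quick Brown Fox"

p SketchAnalyzer.new.tokens(text)                          # default stop-words, lowercased
p SketchAnalyzer.new(SKETCH_STOP_WORDS, false).tokens(text) # case left as is
p SketchAnalyzer.new(%w[quick]).tokens(text)                # custom stop-word list
```

In this sketch, turning lowercasing off also means that a capitalised "The" no longer matches the lowercase stop-word list, because the stop step compares strings exactly. Whether Ferret's StopFilter behaves the same way is not stated here, so check it before relying on it.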

Methods

new  

Public Class methods

Create a new StandardAnalyzer which downcases tokens by default but can optionally leave case as is. Lowercasing will be done based on the current locale. You can also set the list of stop-words to be used by the StopFilter.

lower: set to false if you don't want the field's tokens to be downcased
stop_words: list of stop-words to pass to the StopFilter
