Class: Ferret::Index::IndexReader
In: ext/r_index.c
Parent: Object
IndexReader is used for reading data from the index. This class is usually used directly for more advanced tasks like iterating through terms in an index, accessing term-vectors or deleting documents by document id. It is also used internally by IndexSearcher.
Create a new IndexReader. You can either pass a string path to a file-system directory or an actual Ferret::Store::Directory object. For example:

  dir = RAMDirectory.new()
  iw = IndexReader.new(dir)

  dir = FSDirectory.new("/path/to/index")
  iw = IndexReader.new(dir)

  iw = IndexReader.new("/path/to/index")
You can also create what used to be known as a MultiReader by passing an array of IndexReader objects, Ferret::Store::Directory objects or file-system paths:

  iw = IndexReader.new([dir, dir2, dir3])

  iw = IndexReader.new([reader1, reader2, reader3])

  iw = IndexReader.new(["/path/to/index1", "/path/to/index2"])
Retrieve a document from the index. See LazyDoc for more details on the document returned. Documents are referenced internally by document ids, which are returned by the Searcher's search methods.
Close the IndexReader. This method also commits any deletions made by this IndexReader. It will be called implicitly by the garbage collector, but you should call it explicitly to commit any changes as soon as possible and to release any locks held by the object, which prevents locking errors.
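Since closing both commits deletions and releases locks, one way to make sure it always happens is to wrap the work in a block with an ensure clause. This is only a minimal sketch: the helper name with_reader is hypothetical and not part of Ferret, and it works with any object that responds to close.

```ruby
# Hypothetical helper (not part of Ferret): yields the reader to the block
# and always closes it afterwards, even if the block raises.
def with_reader(reader)
  yield reader
ensure
  reader.close
end

# Usage with Ferret might look like:
#   with_reader(IndexReader.new("/path/to/index")) do |reader|
#     reader.delete(doc_id)
#   end
```

The helper returns the value of the block, so a result computed inside it is still available after the reader has been closed.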
Commit any deletes made by this particular IndexReader to the index. This will open a commit lock.
Delete the document referenced internally by document id doc_id. The doc_id is the number used to reference documents in the index and is returned by search methods.
Returns an array of field names in the index. This array can be passed to the QueryParser so that the QueryParser knows how to expand the "*" wild-card to all fields in the index. A list of field names can also be gathered from the FieldInfos object.
Return true if the index has any deletions, either uncommitted by this IndexReader or committed by any other IndexReader.
Return true if the index version referenced by this IndexReader is the latest version of the index. If it isn't, you should close and reopen the index to search the latest documents added to the index.
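The close-and-reopen step can be sketched as a small helper. The name refresh_reader is hypothetical and not part of Ferret; the sketch only assumes the reader responds to latest? and close, and leaves it to the caller's block to open the replacement reader.

```ruby
# Hypothetical helper (not part of Ferret). Returns the reader unchanged if
# it already sees the latest index version; otherwise closes it and returns
# whatever the block produces, which is expected to be a freshly opened reader.
def refresh_reader(reader)
  return reader if reader.latest?
  reader.close
  yield
end

# Usage with Ferret might look like:
#   reader = refresh_reader(reader) { IndexReader.new("/path/to/index") }
```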
Returns 1 + the maximum document id in the index. It is the document_id that will be used by the next document added to the index. If there are no deletions, this number also refers to the number of documents in the index.
Expert: Returns a string containing the norm values for a field. The string length will be equal to the number of documents in the index, and the string may contain null bytes.
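Because the returned string is binary data with one byte per document, it is usually easier to work with as an array of integers. A minimal sketch follows; the helper name norms_by_doc_id is hypothetical, and the sample string is a made-up stand-in for a real return value.

```ruby
# Hypothetical helper (not part of Ferret): turn a norms string into an
# array of integers in 0..255, where index i is the norm byte of document i.
def norms_by_doc_id(norms)
  norms.bytes
end

# Stand-in for a real norms string; the values are made up for illustration.
# The embedded null byte is ordinary data, not a terminator.
sample = "\x7C\x00\x78".b

norms_by_doc_id(sample).each_with_index do |byte, doc_id|
  puts "doc #{doc_id}: encoded norm #{byte}"
end
```

String#bytes reads every byte regardless of encoding, so null bytes are preserved rather than truncating the result.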
Returns the number of accessible (not deleted) documents in the index. This will be equal to IndexReader#max_doc if there have been no documents deleted from the index.
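Taken together with IndexReader#max_doc, this gives a quick way to count the deleted documents. The helper below is a hypothetical sketch, not part of Ferret, and it assumes document ids are assigned contiguously starting from 0, as the descriptions above imply.

```ruby
# Hypothetical helper (not part of Ferret): number of deleted documents,
# derived from the two counts. Assumes ids run from 0 to max_doc - 1.
def deleted_doc_count(reader)
  reader.max_doc - reader.num_docs
end

# Usage with Ferret might look like:
#   puts "#{deleted_doc_count(reader)} deleted documents"
```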
Expert: Change the boost value for a field in the document at doc_id. val should be an integer in the range 0..255 which corresponds to an encoded float value.
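This page does not say how the byte maps to a float. As an illustration only, the sketch below decodes a byte assuming the single-byte float format used by Lucene (3 mantissa bits, 5 exponent bits). That Ferret uses the same layout is an assumption and should be checked against the Ferret source before relying on it. The function name byte_to_float is hypothetical.

```ruby
# Decode one norm byte into a float, ASSUMING the Lucene-style layout:
# low 3 bits are the mantissa, high 5 bits the exponent, 0 means 0.0.
# This layout is an assumption, not something this page confirms for Ferret.
def byte_to_float(byte)
  return 0.0 if byte.zero?
  mantissa = byte & 0x07
  exponent = (byte >> 3) & 0x1f
  # Place the pieces into the bit pattern of a 32-bit IEEE 754 float.
  bits = (mantissa << 21) | ((exponent + 48) << 24)
  [bits].pack("N").unpack1("g")
end

[0, 120, 124, 128].each do |byte|
  puts "#{byte} -> #{byte_to_float(byte)}"
end
```

Under this assumed layout only 256 distinct values are representable, so a boost written this way is stored with very coarse precision.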
Builds a TermDocEnum (term-document enumerator) for the index. You can use this object to iterate through the documents in which certain terms occur. See TermDocEnum for more info.
Builds a TermDocEnum to iterate through the documents that contain the term term in the field field. See TermDocEnum for more info.
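A typical use is collecting how often a term occurs in each document. The sketch below assumes the returned enumerator has an each method that yields a document id and a frequency for every matching document; see TermDocEnum to confirm. The helper name term_frequencies is hypothetical and not part of Ferret.

```ruby
# Hypothetical helper (not part of Ferret): build a hash of
# doc_id => frequency for one term in one field.
# Assumes the enumerator's each yields (doc_id, freq) pairs.
def term_frequencies(reader, field, term)
  freqs = {}
  reader.term_docs_for(field, term).each do |doc_id, freq|
    freqs[doc_id] = freq
  end
  freqs
end

# Usage with Ferret might look like:
#   term_frequencies(reader, :content, "ferret")
```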
Same as IndexReader#term_docs except the TermDocEnum will also allow you to scan through the positions at which a term occurs. See TermDocEnum for more info.
Same as IndexReader#term_docs_for(field, term) except the TermDocEnum will also allow you to scan through the positions at which a term occurs. See TermDocEnum for more info.
Return the TermVector for the field field in the document at doc_id in the index. Return nil if no such term_vector exists. See TermVector.
Return the TermVectors for the document at doc_id in the index. The value returned is a hash of the TermVectors for each field in the document and they are referenced by field names (as symbols).
Returns a term enumerator which allows you to iterate through all the terms in the field field in the index.
Same as IndexReader#terms(field) except that it starts the enumerator off at term term.
Returns an array of field names of all of the tokenized fields in the index. This array can be passed to the QueryParser so that the QueryParser knows how to expand the "*" wild-card to all fields in the index. A list of field names can also be gathered from the FieldInfos object.
Undelete all deleted documents in the index. This is kind of like a rollback feature. Note that once an index is committed or a merge happens during indexing, deletions will be committed and undelete_all will have no effect on these documents.