public class UniformSplitTerms extends Terms implements Accountable
Terms based on the Uniform Split technique.
The index dictionary is lazily loaded, only when TermsEnum.seekCeil(org.apache.lucene.util.BytesRef) or TermsEnum.seekExact(org.apache.lucene.util.BytesRef) is called (it is not loaded for a direct terms enumeration).
See Also: UniformSplitTermsWriter
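The lazy-loading behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene API: `LazyDictionary` and `LazyTermsSketch` are hypothetical stand-ins, and a `TreeSet` of strings plays the role of the index dictionary. The only point shown is that enumeration never triggers the load, while the seek analogues trigger it once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical stand-in for an index dictionary that is costly to load. */
final class LazyDictionary {
  private final Supplier<TreeSet<String>> loader;
  private TreeSet<String> index; // stays null until the first seek
  int loadCount = 0;             // exposed only so the sketch can be checked

  LazyDictionary(Supplier<TreeSet<String>> loader) {
    this.loader = loader;
  }

  /** Loads the dictionary on first use and reuses it afterwards. */
  TreeSet<String> get() {
    if (index == null) {
      index = loader.get();
      loadCount++;
    }
    return index;
  }
}

/** Hypothetical stand-in for a terms view; not the UniformSplitTerms implementation. */
final class LazyTermsSketch {
  private final List<String> sortedTerms;
  final LazyDictionary dictionary;

  LazyTermsSketch(List<String> terms) {
    List<String> copy = new ArrayList<>(terms);
    this.sortedTerms = copy;
    this.dictionary = new LazyDictionary(() -> new TreeSet<>(copy));
  }

  /** Direct enumeration: reads the terms without touching the dictionary. */
  List<String> enumerate() {
    return new ArrayList<>(sortedTerms);
  }

  /** Analogue of seekCeil: smallest term >= target, or null if none. */
  String seekCeil(String target) {
    return dictionary.get().ceiling(target);
  }

  /** Analogue of seekExact: true if the exact term exists. */
  boolean seekExact(String target) {
    return dictionary.get().contains(target);
  }
}
```

In this sketch, calling `enumerate()` leaves `dictionary.loadCount` at 0, and the first call to `seekCeil` or `seekExact` raises it to 1.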
Modifier and Type | Field and Description |
---|---|
private static long | BASE_RAM_USAGE |
protected BlockDecoder | blockDecoder |
protected IndexInput | blockInput |
protected DictionaryBrowserSupplier | dictionaryBrowserSupplier |
protected FieldMetadata | fieldMetadata |
protected PostingsReaderBase | postingsReader |
Fields inherited from class Terms: EMPTY_ARRAY
Modifier | Constructor and Description |
---|---|
protected | UniformSplitTerms(IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder, DictionaryBrowserSupplier dictionaryBrowserSupplier) |
protected | UniformSplitTerms(IndexInput dictionaryInput, IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder) |
Modifier and Type | Method and Description |
---|---|
protected void | checkIntersectAutomatonType(CompiledAutomaton automaton) |
long | getDictionaryRamBytesUsed() |
int | getDocCount() - Returns the number of documents that have at least one term for this field. |
BytesRef | getMax() - Returns the largest term (in lexicographic order) in the field. |
long | getSumDocFreq() - Returns the sum of TermsEnum.docFreq() for all terms in this field. |
long | getSumTotalTermFreq() - Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. |
boolean | hasFreqs() - Returns true if documents in this field store per-document term frequency (PostingsEnum.freq()). |
boolean | hasOffsets() - Returns true if documents in this field store offsets. |
boolean | hasPayloads() - Returns true if documents in this field store payloads. |
boolean | hasPositions() - Returns true if documents in this field store positions. |
TermsEnum | intersect(CompiledAutomaton compiled, BytesRef startTerm) - Returns a TermsEnum that iterates over all terms and documents that are accepted by the provided CompiledAutomaton. |
TermsEnum | iterator() - Returns an iterator that will step through all terms. |
long | ramBytesUsed() - Return the memory usage of this object in bytes. |
long | ramBytesUsedWithoutDictionary() |
long | size() - Returns the number of terms for this field, or -1 if this measure isn't stored by the codec. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Accountable: getChildResources
private static final long BASE_RAM_USAGE
protected final IndexInput blockInput
protected final FieldMetadata fieldMetadata
protected final PostingsReaderBase postingsReader
protected final BlockDecoder blockDecoder
protected final DictionaryBrowserSupplier dictionaryBrowserSupplier
protected UniformSplitTerms(IndexInput dictionaryInput, IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder) throws java.io.IOException
Parameters: blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
Throws: java.io.IOException
protected UniformSplitTerms(IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder, DictionaryBrowserSupplier dictionaryBrowserSupplier)
Parameters: blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
public TermsEnum iterator() throws java.io.IOException
Description copied from class: Terms
Returns an iterator that will step through all terms.
public TermsEnum intersect(CompiledAutomaton compiled, BytesRef startTerm) throws java.io.IOException
Description copied from class: Terms
Returns a TermsEnum that iterates over all terms and documents that are accepted by the provided CompiledAutomaton. If the startTerm is provided then the returned enum will only return terms > startTerm, but you still must call next() first to get to the first term. Note that the provided startTerm must be accepted by the automaton.
This is an expert low-level API and will only work for NORMAL compiled automata. To handle any compiled automata you should use CompiledAutomaton.getTermsEnum(org.apache.lucene.index.Terms) instead.
NOTE: the returned TermsEnum cannot seek.
protected void checkIntersectAutomatonType(CompiledAutomaton automaton)
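The startTerm contract of intersect (only terms strictly greater than startTerm are returned, and nothing is positioned until next() is called) can be sketched without Lucene. Everything here is an assumption made for illustration: `IntersectSketch` is a hypothetical helper, a `Predicate<String>` stands in for the compiled automaton, and Java string ordering stands in for Lucene's byte-wise term ordering.

```java
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.function.Predicate;

/** Hypothetical model of the intersect(compiled, startTerm) contract; not Lucene code. */
final class IntersectSketch {

  /**
   * Returns an iterator over the accepted terms that are strictly greater
   * than startTerm, or over all accepted terms when startTerm is null.
   * The caller must call next() to reach the first term.
   */
  static Iterator<String> intersect(
      NavigableSet<String> terms, Predicate<String> accepts, String startTerm) {
    // tailSet(..., false) excludes startTerm itself, mirroring "terms > startTerm".
    NavigableSet<String> candidates =
        (startTerm == null) ? terms : terms.tailSet(startTerm, false);
    return candidates.stream().filter(accepts).iterator();
  }
}
```

For the terms ab, abc, abd, abz, b with a predicate accepting the prefix "ab" and startTerm "abc", the iterator yields abd and then abz.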
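public BytesRef getMax()
Description copied from class: Terms
Returns the largest term (in lexicographic order) in the field.
public long size()
Description copied from class: Terms
Returns the number of terms for this field, or -1 if this measure isn't stored by the codec.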
public long getSumTotalTermFreq()
Description copied from class: Terms
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getSumTotalTermFreq in class Terms
public long getSumDocFreq()
Description copied from class: Terms
Returns the sum of TermsEnum.docFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getSumDocFreq in class Terms
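public int getDocCount()
Description copied from class: Terms
Returns the number of documents that have at least one term for this field.
Specified by: getDocCount in class Terms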
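public boolean hasFreqs()
Description copied from class: Terms
Returns true if documents in this field store per-document term frequency (PostingsEnum.freq()).
public boolean hasOffsets()
Description copied from class: Terms
Returns true if documents in this field store offsets.
Specified by: hasOffsets in class Terms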
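public boolean hasPositions()
Description copied from class: Terms
Returns true if documents in this field store positions.
Specified by: hasPositions in class Terms
public boolean hasPayloads()
Description copied from class: Terms
Returns true if documents in this field store payloads.
Specified by: hasPayloads in class Terms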
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes.
Specified by: ramBytesUsed in interface Accountable
public long ramBytesUsedWithoutDictionary()
public long getDictionaryRamBytesUsed()