public class CitationParser extends AbstractParser
| Modifier and Type | Field and Description |
|---|---|
| Lexicon | lexicon |

Fields inherited from class AbstractParser: analyzer, cntManager

| Constructor and Description |
|---|
| CitationParser(EngineParsers parsers) |
| CitationParser(EngineParsers parsers, CntManager cntManager) |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| BiblioItem | consolidateCitation(BiblioItem resCitation, java.lang.String rawCitation, int consolidate) Consolidate an existing list of recognized citations based on access to external internet bibliographic databases. |
| BiblioItem | processing(java.util.List<LayoutToken> tokens, int consolidate) |
| BiblioItem | processing(java.lang.String input, int consolidate) |
| java.util.List<BibDataSet> | processingReferenceSection(Document doc, ReferenceSegmenter referenceSegmenter, int consolidate) |
| java.util.List<BibDataSet> | processingReferenceSection(DocumentSource documentSource, ReferenceSegmenter referenceSegmenter, int consolidate) |
| java.util.List<BibDataSet> | processingReferenceSection(java.io.File input, ReferenceSegmenter referenceSegmenter, int consolidate) |
| java.util.List<BibDataSet> | processingReferenceSection(java.lang.String referenceTextBlock, ReferenceSegmenter referenceSegmenter) |
| BiblioItem | resultExtractionLayoutTokens(java.lang.String result, boolean volumePostProcess, java.util.List<LayoutToken> tokenizations) Extract results from a labeled sequence. |
| java.lang.StringBuilder | trainingExtraction(java.util.List<java.lang.String> inputs) Extract results from a list of citation strings in the training format without any string modification. |

Methods inherited from class AbstractParser: label, label

public Lexicon lexicon
public CitationParser(EngineParsers parsers, CntManager cntManager)
public CitationParser(EngineParsers parsers)
public BiblioItem processing(java.lang.String input, int consolidate)
public BiblioItem processing(java.util.List<LayoutToken> tokens, int consolidate)
public java.util.List<BibDataSet> processingReferenceSection(java.lang.String referenceTextBlock, ReferenceSegmenter referenceSegmenter)
public java.util.List<BibDataSet> processingReferenceSection(Document doc, ReferenceSegmenter referenceSegmenter, int consolidate)
public java.util.List<BibDataSet> processingReferenceSection(java.io.File input, ReferenceSegmenter referenceSegmenter, int consolidate)
public java.util.List<BibDataSet> processingReferenceSection(DocumentSource documentSource, ReferenceSegmenter referenceSegmenter, int consolidate)
public BiblioItem resultExtractionLayoutTokens(java.lang.String result, boolean volumePostProcess, java.util.List<LayoutToken> tokenizations)
Parameters:
result - the labeled result sequence
volumePostProcess - whether to post-process the volume
tokenizations - list of tokens

public BiblioItem consolidateCitation(BiblioItem resCitation, java.lang.String rawCitation, int consolidate)
Parameters:
resCitation - citation

public java.lang.StringBuilder trainingExtraction(java.util.List<java.lang.String> inputs)
Parameters:
inputs - list of input data

public void close() throws java.io.IOException
Specified by:
close in interface java.io.Closeable
close in interface java.lang.AutoCloseable
Overrides:
close in class AbstractParser
Throws:
java.io.IOException
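
Because close() is specified by java.io.Closeable and java.lang.AutoCloseable, an instance can presumably be managed with try-with-resources. The sketch below illustrates that pattern only. StubCitationParser is a hypothetical stand-in written for this example so that it compiles without the library; it is not the real CitationParser, its processing method returns a String rather than a BiblioItem, and the value 0 passed as consolidate is a placeholder, since this page does not document the accepted values.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CloseableParserSketch {

    /** Hypothetical stand-in for CitationParser; NOT the real class. */
    static class StubCitationParser implements Closeable {
        private final List<String> events;
        private boolean closed = false;

        StubCitationParser(List<String> events) {
            this.events = events;
        }

        /** Mirrors the shape of processing(String, int) only; returns the trimmed input. */
        String processing(String input, int consolidate) {
            if (closed) {
                throw new IllegalStateException("parser already closed");
            }
            events.add("processing");
            return input.trim();
        }

        /** Same signature as the documented close(): declared to throw IOException. */
        @Override
        public void close() throws IOException {
            closed = true;
            events.add("close");
        }
    }

    /** Runs one parse inside try-with-resources and returns the recorded call order. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        // The resource is closed automatically when the try block exits,
        // whether it exits normally or with an exception.
        try (StubCitationParser parser = new StubCitationParser(events)) {
            String result = parser.processing("  example raw citation string  ", 0); // 0 is a placeholder
            if (!result.isEmpty()) {
                events.add("used");
            }
        } catch (IOException e) {
            // close() declares IOException, so the caller has to handle or rethrow it.
            throw new UncheckedIOException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With the real class, the same structure would apply with CitationParser constructed from an EngineParsers instance in the resource clause, assuming the caller owns the parser and is responsible for closing it.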