public class PatentDocument extends Document
Modifier and Type | Field and Description |
---|---|
static java.util.regex.Pattern | FamilyMembers |
static java.util.regex.Pattern | searchReport |
Fields inherited from class Document: acknowledgementBlocks, analyzer, beginBody, beginReferences, bibDataSets, blockDocumentHeaders, blockFigures, blockFooters, blockHeaders, blockHeadFigures, blockHeadTables, blockReferences, blocks, blockSectionTitles, blockTables, clusters, documentLenghtChar, documentSource, equations, featureFactory, figures, images, imagesPerPage, labeledBlocks, lang, LOGGER, m, MAX_FIG_BOX_DISTANCE, maxBlockSpacing, maxCharacterDensity, metadata, MIN_DISTANCE, minBlockSpacing, minCharacterDensity, nbBins, outlineRoot, pages, pathXML, pdfAnnotations, referenceMarkerMatcher, resHeader, serialVersionUID, tables, tei, teiIdToBibDataSets, titleMatchNum, tokenizations, validGraphicObjectPredicate
Constructor and Description |
---|
PatentDocument(DocumentSource documentSource) |
Modifier and Type | Method and Description |
---|---|
int | getBeginBlockPAReport() |
java.lang.String | getWOPriorArtBlocks(): Return all blocks corresponding to the prior art report of a WO patent publication |
void | setBeginBlockPAReport(int begin) |
Methods inherited from class Document: addBlock, addPage, addTokenizedDocument, assignGraphicObjectsToFigures, calculateTeiIdToBibDataSets, createFromText, fromText, getAllBlocksClean, getAnalyzer, getBibDataSetByTeiId, getBibDataSets, getBlockDocumentHeaders, getBlocks, getBody, getClusters, getConnectedGraphics, getCoordItem, getDocumentLenghtChar, getDocumentPart, getDocumentPartText, getDocumentPieceText, getDocumentPieceText, getDocumentPieceTokenization, getDocumentSource, getDOIMatches, getEquations, getFigureLayoutTokens, getFigures, getHeader, getHeaderByIntroduction, getHeaderFeatured, getHeaderLastHope, getImages, getLabeledBlocks, getLanguage, getMaxBlockSpacing, getMaxCharacterDensity, getMetadata, getMinBlockSpacing, getMinCharacterDensity, getOutlineRoot, getPage, getPages, getPDFAnnotations, getReferenceMarkerMatcher, getResHeader, getTables, getTei, getTokenizationParts, getTokenizations, getTokenizationsFulltext, getTokenizationsHeader, getTokenizationsReferences, getTokens, getTokensFrom, glueImagesIfNecessary, isTitleMatchNum, isValidBitmapGraphicObject, postProcessTables, produceStatistics, recalculateVectorBoxCoords, setAcknowledgementBlocks, setAnalyzer, setBibDataSets, setBlockDocumentHeaders, setBlockFigures, setBlockFooters, setBlockHeaders, setBlockHeadFigures, setBlockHeadTables, setBlockReferences, setBlockSectionTitles, setBlockTables, setClusters, setConnectedGraphics2, setEquations, setFigures, setImages, setLabeledBlocks, setLanguage, setOutlineRoot, setPages, setPathXML, setResHeader, setTables, setTei, setTitleMatchNum
public static java.util.regex.Pattern searchReport
public static java.util.regex.Pattern FamilyMembers
public PatentDocument(DocumentSource documentSource)
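
The summary above pairs a static `searchReport` pattern with a begin-block index (`getBeginBlockPAReport()` / `setBeginBlockPAReport(int begin)`) and a method that returns the prior art report blocks as a `String`. This reference does not show the actual pattern or method bodies, so the following is only a minimal sketch of how such pieces might fit together. The class `PriorArtReportSketch`, the regular expression, and both helper methods are hypothetical stand-ins and are not part of `PatentDocument`.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the PatentDocument implementation.
final class PriorArtReportSketch {

    // Assumed pattern: the real value of PatentDocument.searchReport is not
    // given in this reference, so this heading regex is a placeholder.
    static final Pattern SEARCH_REPORT =
            Pattern.compile("INTERNATIONAL\\s+SEARCH\\s+REPORT", Pattern.CASE_INSENSITIVE);

    private PriorArtReportSketch() {
    }

    // Returns the index of the first block whose text contains the assumed
    // search report heading, or -1 when no block matches.
    static int findBeginBlock(List<String> blocks) {
        for (int i = 0; i < blocks.size(); i++) {
            if (SEARCH_REPORT.matcher(blocks.get(i)).find()) {
                return i;
            }
        }
        return -1;
    }

    // Joins every block from the begin index to the end, one block per line.
    // Returns an empty string when the begin index is out of range.
    static String priorArtText(List<String> blocks, int begin) {
        if (begin < 0 || begin >= blocks.size()) {
            return "";
        }
        return String.join("\n", blocks.subList(begin, blocks.size()));
    }
}
```

In this sketch, the value returned by `findBeginBlock` plays the role of the index a caller would store with `setBeginBlockPAReport(int begin)`, and `priorArtText` stands in for the kind of result `getWOPriorArtBlocks()` is described as returning.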