FullTextParser

java.lang.Object
- org.grobid.core.engines.AbstractParser
- - org.grobid.core.engines.FullTextParser

All Implemented Interfaces:

java.io.Closeable, java.lang.AutoCloseable, GenericTagger
```
public class FullTextParser
extends AbstractParser
```

Field Summary
- Fields inherited from class org.grobid.core.engines.AbstractParser
  analyzer, cntManager

Constructor Summary

Constructors
Constructor and Description
`FullTextParser(EngineParsers parsers)` TODO some documentation...

Method Summary

Modifier and Type	Method and Description
`void`	`close()`
`Document`	`createTraining(java.io.File inputFile, java.lang.String pathFullText, java.lang.String pathTEI, int id)` Processes the specified PDF and formats the result as training data for all the models.
`static <any>`	`getBodyTextFeatured(Document doc, java.util.SortedSet<DocumentPiece> documentBodyParts)`
`static java.util.List<LayoutTokenization>`	`getDocumentFullTextTokens(java.util.List<TaggingLabel> labels, java.lang.String labeledResult, java.util.List<LayoutToken> tokenizations)`
`protected static java.lang.String`	`postProcessLabeledAbstract(java.lang.String labeledAbstract)`
`Document`	`processing(DocumentSource documentSource, GrobidAnalysisConfig config)` Machine-learning recognition of the complete full text structures.
`Document`	`processing(java.io.File inputPdf, GrobidAnalysisConfig config)`
`<any>`	`processShort(java.util.List<LayoutToken> tokens, Document doc)`
`<any>`	`processShortNew(java.util.List<LayoutToken> tokens, Document doc)` Process a simple segment of layout tokens with the full text model.
`static boolean`	`writeField(java.lang.StringBuilder buffer, java.lang.String s1, java.lang.String lastTag0, java.lang.String s2, java.lang.String field, java.lang.String outField, boolean addSpace, int nbIndent, boolean generateIDs)` TODO some documentation...
`static boolean`	`writeFieldBeginEnd(java.lang.StringBuilder buffer, java.lang.String s1, java.lang.String lastTag0, java.lang.String s2, java.lang.String field, java.lang.String outField, boolean addSpace, int nbIndent, boolean generateIDs)` Writes fields for which the beginning and end of the field matter, such as paragraphs or items.

Methods inherited from class org.grobid.core.engines.AbstractParser
label, label

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail
- FullTextParser
```
public FullTextParser(EngineParsers parsers)
```
  TODO some documentation...

Method Detail

processing

```
public Document processing(java.io.File inputPdf,
                           GrobidAnalysisConfig config)
                    throws java.lang.Exception
```
Throws:

java.lang.Exception

processing
```
public Document processing(DocumentSource documentSource,
                           GrobidAnalysisConfig config)
```
Machine-learning recognition of the complete full text structures.

Parameters:

documentSource - input

config - config

Returns:

the document object with built TEI

processShortNew
```
public <any> processShortNew(java.util.List<LayoutToken> tokens,
                             Document doc)
```
Processes a simple segment of layout tokens with the full text model. Returns null if the provided list of layout tokens is empty or if structuring fails.
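
Because this method can return null, callers need an explicit "no result" branch. The following is a minimal sketch of one way to do that, not GROBID code: `NullResultSketch`, `structureOrNull` and `structure` are hypothetical names, and plain strings stand in for layout tokens and for the unresolved `<any>` return type.

```java
import java.util.List;
import java.util.Optional;

final class NullResultSketch {
    /**
     * Hypothetical stand-in for a call that returns null on empty input,
     * as documented for processShortNew. NOT the GROBID implementation.
     */
    static String structureOrNull(List<String> tokens) {
        if (tokens == null || tokens.isEmpty()) {
            return null;
        }
        return String.join(" ", tokens);
    }

    /** Wraps the nullable result so the "no result" case must be handled explicitly. */
    static Optional<String> structure(List<String> tokens) {
        return Optional.ofNullable(structureOrNull(tokens));
    }
}
```

A caller could then write `NullResultSketch.structure(tokens).ifPresent(...)` instead of a bare null check.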

processShort

```
public <any> processShort(java.util.List<LayoutToken> tokens,
                          Document doc)
```

postProcessLabeledAbstract

```
protected static java.lang.String postProcessLabeledAbstract(java.lang.String labeledAbstract)
```

getBodyTextFeatured

```
public static <any> getBodyTextFeatured(Document doc,
                                        java.util.SortedSet<DocumentPiece> documentBodyParts)
```

createTraining

```
public Document createTraining(java.io.File inputFile,
                               java.lang.String pathFullText,
                               java.lang.String pathTEI,
                               int id)
```
Processes the specified PDF and formats the result as training data for all the models.

Parameters:

inputFile - input file

pathFullText - path to fulltext

pathTEI - path to TEI

id - id

writeField

```
public static boolean writeField(java.lang.StringBuilder buffer,
                                 java.lang.String s1,
                                 java.lang.String lastTag0,
                                 java.lang.String s2,
                                 java.lang.String field,
                                 java.lang.String outField,
                                 boolean addSpace,
                                 int nbIndent,
                                 boolean generateIDs)
```
TODO some documentation...

Parameters:

buffer - buffer

s1, lastTag0, s2, field, outField, addSpace, nbIndent - (undocumented)

writeFieldBeginEnd

```
public static boolean writeFieldBeginEnd(java.lang.StringBuilder buffer,
                                         java.lang.String s1,
                                         java.lang.String lastTag0,
                                         java.lang.String s2,
                                         java.lang.String field,
                                         java.lang.String outField,
                                         boolean addSpace,
                                         int nbIndent,
                                         boolean generateIDs)
```
Writes fields for which the beginning and end of the field matter, such as paragraphs or items.

Parameters:

buffer, s1, lastTag0, s2, field, outField, addSpace, nbIndent - (undocumented)
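
To illustrate why the beginning and end of a field matter for elements such as paragraphs or items, here is a simplified sketch. It is not the implementation of `writeFieldBeginEnd`. It assumes a labeling scheme in which an `I-` prefix marks the first token of a field, and the names `BeginEndSketch`, `Labeled` and `render`, as well as the bare tag name `p`, are hypothetical.

```java
import java.util.List;

final class BeginEndSketch {
    /** Hypothetical (token, label) pair; a label starting with "I-" is assumed to begin a field. */
    static final class Labeled {
        final String token;
        final String label;

        Labeled(String token, String label) {
            this.token = token;
            this.label = label;
        }
    }

    /** Emits an opening tag when a field begins and a closing tag when it ends. */
    static String render(List<Labeled> tokens) {
        StringBuilder buffer = new StringBuilder();
        String open = null; // name of the currently open tag, or null if none
        for (Labeled t : tokens) {
            boolean begins = t.label.startsWith("I-");
            String tag = begins ? t.label.substring(2) : t.label;
            if (begins || !tag.equals(open)) {
                // A new field starts: close the previous one, then open the new one.
                if (open != null) {
                    buffer.append("</").append(open).append(">");
                }
                buffer.append("<").append(tag).append(">");
                open = tag;
            } else {
                // Same field continues: separate tokens with a space.
                buffer.append(' ');
            }
            buffer.append(t.token);
        }
        if (open != null) {
            buffer.append("</").append(open).append(">");
        }
        return buffer.toString();
    }
}
```

Without the begin marker, two adjacent paragraphs carrying the same label would be merged into a single element; with it, the sketch closes the first element before opening the second.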

getDocumentFullTextTokens

```
public static java.util.List<LayoutTokenization> getDocumentFullTextTokens(java.util.List<TaggingLabel> labels,
                                                                           java.lang.String labeledResult,
                                                                           java.util.List<LayoutToken> tokenizations)
```

close
```
public void close()
           throws java.io.IOException
```
Specified by:

close in interface java.io.Closeable

Specified by:

close in interface java.lang.AutoCloseable

Overrides:

close in class AbstractParser

Throws:

java.io.IOException
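
Since the class implements `java.io.Closeable`, an instance can be managed with try-with-resources. The sketch below shows the pattern with a hypothetical stand-in, `StubParser`, so that it runs without GROBID on the classpath; `CloseSketch`, `StubParser`, `process` and `runOnce` are illustrative names and not part of the library.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CloseSketch {
    /** Hypothetical stand-in for a closeable parser; NOT a GROBID class. */
    static final class StubParser implements Closeable {
        private boolean closed = false;

        String process(String input) {
            if (closed) {
                throw new IllegalStateException("parser already closed");
            }
            return "processed:" + input;
        }

        @Override
        public void close() throws IOException {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Uses the stub inside try-with-resources and returns it so its state can be inspected. */
    static StubParser runOnce(String input) {
        StubParser parser = new StubParser();
        try (StubParser p = parser) {
            p.process(input);
        } catch (IOException e) {
            // close() declares IOException, as in the documented signature.
            throw new UncheckedIOException(e);
        }
        return parser;
    }
}
```

The resource is closed when the try block exits, whether normally or through an exception, so no explicit `finally` block is needed.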
