com.norconex.importer.parser
Class DefaultDocumentParserFactory
java.lang.Object
com.norconex.importer.parser.DefaultDocumentParserFactory
- All Implemented Interfaces:
- IXMLConfigurable, IDocumentParserFactory, Serializable
public class DefaultDocumentParserFactory
- extends Object
- implements IDocumentParserFactory, IXMLConfigurable
Uses Apache Tika for all of its supported content types. For unknown
content types, falls back to Tika's generic media detector/parser.
XML configuration usage (optional, since this factory is the default; format is one of "text" or "xml"):
<documentParserFactory class="com.norconex.importer.parser.DefaultDocumentParserFactory" format="text|xml" />
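As an illustration of the configuration element above, the following sketch reads the format attribute using only the JDK's DOM parser. It is not the factory's actual loadFromXML implementation, which is not shown on this page. The class name FormatAttributeSketch, the method readFormat, and the constant ASSUMED_DEFAULT_FORMAT are hypothetical; the "text" default is an assumption based on the no-argument constructor's description.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the Norconex Importer API.
class FormatAttributeSketch {

    // Assumption: the default format is "text", as the no-argument
    // constructor's description suggests.
    static final String ASSUMED_DEFAULT_FORMAT = "text";

    // Returns the value of the "format" attribute on the root element,
    // or the assumed default when the attribute is absent or empty.
    static String readFormat(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            // getAttribute returns an empty string for a missing attribute.
            String format = root.getAttribute("format");
            return format.isEmpty() ? ASSUMED_DEFAULT_FORMAT : format;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read configuration", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<documentParserFactory "
                + "class=\"com.norconex.importer.parser.DefaultDocumentParserFactory\" "
                + "format=\"xml\" />";
        System.out.println(readFormat(xml));
    }
}
```

Omitting the attribute entirely makes readFormat return the assumed default, which mirrors the idea that the element itself is optional.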
- Author:
- Pascal Essiembre
- See Also:
- Serialized Form
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DEFAULT_FORMAT
public static final String DEFAULT_FORMAT
- See Also:
- Constant Field Values
DefaultDocumentParserFactory
public DefaultDocumentParserFactory()
- Creates a new document parser factory of "text" format.
DefaultDocumentParserFactory
public DefaultDocumentParserFactory(String format)
- Creates a new document parser factory of the given format.
- Parameters:
format
- depends on parser expectations, but is typically one
of "text" or "xml"
getParser
public final IDocumentParser getParser(String documentReference,
ContentType contentType)
- Gets a parser based on content type only; the document reference
is ignored.
- Specified by:
getParser
in interface IDocumentParserFactory
- Parameters:
documentReference
- document reference
contentType
- content type
- Returns:
- document parser
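The lookup described for getParser can be pictured as a map keyed by content type with a single fallback entry. The sketch below is a minimal model of that idea, not the class's real internals, which this page does not show. SketchParser and ParserRegistrySketch are hypothetical stand-ins, and plain strings replace the library's ContentType and IDocumentParser types so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a document parser; NOT the Norconex interface.
interface SketchParser {
    String name();
}

// Hypothetical registry modelling "named parsers plus one fallback".
class ParserRegistrySketch {

    private final Map<String, SketchParser> namedParsers = new HashMap<>();
    private SketchParser fallbackParser;

    // Associates a parser with one content type, replacing any earlier one.
    void registerNamedParser(String contentType, SketchParser parser) {
        namedParsers.put(contentType, parser);
    }

    // Sets the parser used when no named parser matches.
    void registerFallbackParser(SketchParser parser) {
        this.fallbackParser = parser;
    }

    // The document reference is accepted but deliberately unused:
    // only the content type drives the choice.
    SketchParser getParser(String documentReference, String contentType) {
        SketchParser parser = namedParsers.get(contentType);
        return parser != null ? parser : fallbackParser;
    }

    public static void main(String[] args) {
        ParserRegistrySketch registry = new ParserRegistrySketch();
        registry.registerNamedParser("application/pdf", () -> "pdf-parser");
        registry.registerFallbackParser(() -> "fallback-parser");

        System.out.println(
                registry.getParser("report.pdf", "application/pdf").name());
        System.out.println(
                registry.getParser("blob.bin", "application/x-unknown").name());
    }
}
```

In this model, two calls with the same content type and different references always return the same parser, which is the behaviour the method description states.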
getFormat
public String getFormat()
setFormat
public void setFormat(String format)
registerNamedParser
protected final void registerNamedParser(ContentType contentType,
IDocumentParser parser)
registerFallbackParser
protected final void registerFallbackParser(IDocumentParser parser)
getFallbackParser
protected final IDocumentParser getFallbackParser()
loadFromXML
public void loadFromXML(Reader in)
throws IOException
- Specified by:
loadFromXML
in interface IXMLConfigurable
- Throws:
IOException
saveToXML
public void saveToXML(Writer out)
throws IOException
- Specified by:
saveToXML
in interface IXMLConfigurable
- Throws:
IOException
Copyright © 2009-2013 Norconex Inc. All Rights Reserved.