Package com.ctc.wstx.io
Class StreamBootstrapper
java.lang.Object
com.ctc.wstx.io.InputBootstrapper
com.ctc.wstx.io.StreamBootstrapper
Input bootstrapper used with streams when the encoding is not known in advance.
(When the application specifies the encoding, a reader is constructed
and the reader-based bootstrapper is used instead.)
The encoding of an entity (including the main document entity) is determined using the algorithm suggested in Appendix F of the XML 1.0 specification (Third Edition).
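The detection approach outlined in XML 1.0 Appendix F can be sketched as follows. This is a minimal illustration, not the Woodstox implementation: the class name EncodingSniffer, the method sniff, and the returned labels are hypothetical, and the sketch only inspects the first four bytes (a byte order mark, or the byte pattern of a leading "<?xm").

```java
/** Illustrative sketch only; names and labels are assumptions, not Woodstox API. */
final class EncodingSniffer {
    /** Guesses an encoding family from the first four bytes of an entity. */
    static String sniff(byte[] b) {
        if (b.length < 4) {
            return "UTF-8"; // too short to tell; fall back to the XML default
        }
        int quad = ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
                 | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);

        // 1. Byte order marks (4-byte forms are checked before 2-byte forms).
        if (quad == 0x0000FEFF) return "UCS-4BE";
        if (quad == 0xFFFE0000) return "UCS-4LE";
        if ((quad >>> 8) == 0xEFBBBF) return "UTF-8";
        if ((quad >>> 16) == 0xFEFF) return "UTF-16BE";
        if ((quad >>> 16) == 0xFFFE) return "UTF-16LE";

        // 2. No BOM: look for the byte pattern of '<' or "<?xm".
        switch (quad) {
            case 0x0000003C: return "UCS-4BE";
            case 0x3C000000: return "UCS-4LE";
            case 0x003C003F: return "UTF-16BE";
            case 0x3C003F00: return "UTF-16LE";
            case 0x3C3F786D: return "UTF-8";   // ASCII-compatible; the declaration decides
            case 0x4C6FA794: return "EBCDIC";  // "<?xm" in EBCDIC
            default:         return "UTF-8";   // no declaration: XML default
        }
    }
}
```

For ASCII-compatible and EBCDIC input, the guess is only good enough to read the XML declaration; the encoding pseudo-attribute found there would then name the actual encoding.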
-
Field Summary
Fields
Modifier and Type, Field, Description
(package private) boolean mBigEndian
private byte[] mByteBuffer
(package private) boolean mByteSizeFound
(package private) int mBytesPerChar
    For most encodings, number of physical characters needed for decoding xml declaration characters (which for variable length encodings like UTF-8 will be 1).
(package private) boolean mEBCDIC
    Special case for 1-byte encodings: EBCDIC is problematic as it's not 7-bit ascii compatible.
(package private) boolean mHadBOM
(package private) final InputStream mIn
    Underlying InputStream to use for reading content.
(package private) static final int MIN_BUF_SIZE
    Let's size buffer at least big enough to contain the longest possible prefix of a document needed to positively identify it starts with the XML declaration.
(package private) String mInputEncoding
private int mInputEnd
private int mInputPtr
private final boolean mRecycleBuffer
    Whether byte buffer is recyclable or not
(package private) int[] mSingleByteTranslation
    For single-byte non-ascii-compatible encodings (ok ok, really just EBCDIC), we'll have to use a lookup table.
Fields inherited from class com.ctc.wstx.io.InputBootstrapper
BYTE_CR, BYTE_LF, BYTE_NULL, CHAR_CR, CHAR_LF, CHAR_NEL, CHAR_NULL, CHAR_SPACE, ERR_XMLDECL_END_MARKER, ERR_XMLDECL_EXP_ATTRVAL, ERR_XMLDECL_EXP_EQ, ERR_XMLDECL_EXP_SPACE, ERR_XMLDECL_KW_ENCODING, ERR_XMLDECL_KW_STANDALONE, ERR_XMLDECL_KW_VERSION, mDeclaredXmlVersion, mFoundEncoding, mInputProcessed, mInputRow, mInputRowStart, mKeywordBuffer, mPublicId, mStandalone, mSystemId, mXml11Handling
-
Constructor Summary
Constructors
Modifier, Constructor, Description
private StreamBootstrapper(String pubId, SystemId sysId, byte[] data, int start, int end)
private StreamBootstrapper(String pubId, SystemId sysId, InputStream in)
-
Method Summary
Modifier and Type, Method, Description
Reader bootstrapInput(ReaderConfig cfg, boolean mainDoc, int xmlVersion)
protected int checkKeyword(String exp)
protected int checkMbKeyword(String expected)
protected int checkSbKeyword(String expected)
protected int checkTranslatedKeyword(String expected)
protected boolean ensureLoaded(int minimum)
int getInputColumn()
String getInputEncoding
    Since this class only gets used when encoding is not explicitly passed, we need to use the encoding that was auto-detected.
int getInputTotal()
static StreamBootstrapper getInstance(String pubId, SystemId sysId, byte[] data, int start, int end)
    Factory method used when the underlying data provider is a pre-allocated block source, and no stream is used.
static StreamBootstrapper getInstance(String pubId, SystemId sysId, InputStream in)
    Factory method used when the underlying data provider is an actual stream.
protected Location getLocation
protected int getNext()
protected int getNextAfterWs(boolean reqWs)
protected boolean hasXmlDecl
protected void loadMore()
protected byte nextByte()
protected int nextMultiByte
protected int nextTranslated
protected void pushback()
protected int readQuotedValue(char[] outputBuffer, int quoteChar)
private void reportWeirdUCS4(String type)
protected void resolveStreamEncoding
    Method called to try to figure out physical encoding the underlying input stream uses.
protected void skipMbLF(int lf)
protected int skipMbWs()
protected void skipSbLF(byte lfByte)
protected int skipSbWs()
protected void skipTranslatedLF(int lf)
protected int skipTranslatedWs
private void verifyEncoding(String id, int bpc)
private void verifyEncoding(String id, int bpc, boolean bigEndian)
protected String verifyXmlEncoding(String enc)
Methods inherited from class com.ctc.wstx.io.InputBootstrapper
declaredXml11, getDeclaredEncoding, getDeclaredVersion, getInputRow, getPublicId, getStandalone, getSystemId, initFrom, readXmlDecl, reportNull, reportUnexpectedChar, reportXmlProblem
-
Field Details
-
MIN_BUF_SIZE
static final int MIN_BUF_SIZE
Let's size the buffer at least big enough to contain the longest possible prefix of a document needed to positively identify that it starts with the XML declaration. That means having an (optional) BOM, and then the first 6 characters ("<?xml "), in whatever encoding. With 4-byte encodings (UCS-4), that comes to 28 bytes. And for good measure, let's pad that a bit as well.
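The 28-byte figure follows from a 4-byte BOM plus six 4-byte characters. The arithmetic can be sketched as below; the class and method names are hypothetical and only illustrate the calculation, not how the constant is actually defined.

```java
/** Illustrative arithmetic only; not part of the documented class. */
final class MinPrefixSize {
    /** Bytes needed for an optional BOM plus the six characters of "<?xml ". */
    static int prefixBytes(int bomBytes, int bytesPerChar) {
        final int declPrefixChars = "<?xml ".length(); // 6 characters
        return bomBytes + declPrefixChars * bytesPerChar;
    }
}
```

With a 4-byte BOM and 4 bytes per character this gives 4 + 6 * 4 = 28; UTF-16 with a 2-byte BOM needs 14, and UTF-8 with its 3-byte BOM needs 9.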
-
mIn
final InputStream mIn
Underlying InputStream to use for reading content. May be null if the actual data source is not stream-based but a block source.
-
mByteBuffer
private byte[] mByteBuffer
-
mRecycleBuffer
private final boolean mRecycleBuffer
Whether byte buffer is recyclable or not
-
mInputPtr
private int mInputPtr
-
mInputEnd
private int mInputEnd
-
mBigEndian
boolean mBigEndian
-
mHadBOM
boolean mHadBOM
-
mByteSizeFound
boolean mByteSizeFound
-
mBytesPerChar
int mBytesPerChar
For most encodings, number of physical characters needed for decoding xml declaration characters (which for variable length encodings like UTF-8 will be 1). Exception is EBCDIC, which while a single-byte encoding, is denoted by -1 since it needs an additional translation lookup.
-
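To illustrate how a bytes-per-character count and a byte-order flag could be combined when reading XML declaration characters from a fixed-width encoding, here is a minimal sketch. The class FixedWidthDecodeSketch and its method are hypothetical; the sketch does not cover the EBCDIC case (denoted by -1), which needs a lookup table instead.

```java
/** Illustrative sketch only; not the Woodstox implementation. */
final class FixedWidthDecodeSketch {
    /**
     * Assembles one character code from bytesPerChar bytes starting at offset.
     * bytesPerChar is expected to be 1, 2 or 4.
     */
    static int decodeChar(byte[] buf, int offset, int bytesPerChar, boolean bigEndian) {
        int c = 0;
        for (int i = 0; i < bytesPerChar; i++) {
            int b = buf[offset + i] & 0xFF; // treat the byte as unsigned
            if (bigEndian) {
                c = (c << 8) | b;           // most significant byte first
            } else {
                c |= b << (8 * i);          // least significant byte first
            }
        }
        return c;
    }
}
```

For example, the bytes 00 3C decode to '<' (0x3C) as big-endian 2-byte input, and 3C 00 decode to the same value as little-endian input.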
mEBCDIC
boolean mEBCDIC
Special case for 1-byte encodings: EBCDIC is problematic as it's not 7-bit ASCII compatible. We can still deal with it, but only with a bit of extra state.
-
mInputEncoding
String mInputEncoding
-
mSingleByteTranslation
int[] mSingleByteTranslation
For single-byte non-ASCII-compatible encodings (really just EBCDIC), we'll have to use a lookup table.
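A lookup table of this kind maps each unsigned byte value to a character code. The sketch below is an assumption-laden illustration: the class EbcdicLookupSketch is hypothetical, and its table fills in only the six EBCDIC code points needed to recognise "<?xml ", whereas a real table would cover all 256 values.

```java
import java.util.Arrays;

/** Illustrative sketch only; a deliberately partial EBCDIC table. */
final class EbcdicLookupSketch {
    /** Table indexed by unsigned byte value; -1 marks bytes this sketch does not map. */
    static int[] buildTable() {
        int[] table = new int[256];
        Arrays.fill(table, -1);
        table[0x40] = ' ';
        table[0x4C] = '<';
        table[0x6F] = '?';
        table[0x93] = 'l';
        table[0x94] = 'm';
        table[0xA7] = 'x';
        return table;
    }

    /** Translates each input byte through the table; unmapped bytes become U+FFFD. */
    static String decode(byte[] input, int[] table) {
        StringBuilder sb = new StringBuilder(input.length);
        for (byte b : input) {
            int c = table[b & 0xFF];
            sb.append(c < 0 ? '\uFFFD' : (char) c);
        }
        return sb.toString();
    }
}
```

Decoding the bytes 4C 6F A7 94 93 40 with this table yields the six-character declaration prefix "<?xml ".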
-
-
Constructor Details
-
StreamBootstrapper
-
StreamBootstrapper
- Parameters:
start - Pointer to the first valid byte in the buffer
end - Pointer to the offset after the last valid byte in the buffer
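The start and end parameters describe a half-open range: start is inclusive, end is exclusive. A small sketch of that convention, using a hypothetical helper class that is not part of the library:

```java
import java.util.Arrays;

/** Illustrative sketch of the half-open [start, end) buffer convention. */
final class BufferRangeSketch {
    /** Number of valid bytes when end points one past the last valid byte. */
    static int validLength(int start, int end) {
        return end - start;
    }

    /** Copies exactly the valid bytes; Arrays.copyOfRange also treats 'to' as exclusive. */
    static byte[] validBytes(byte[] data, int start, int end) {
        return Arrays.copyOfRange(data, start, end);
    }
}
```

With this convention an empty range is simply start == end, and the whole array is the range from 0 to data.length.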
-
-
Method Details
-
getInstance
public static StreamBootstrapper getInstance(String pubId, SystemId sysId, InputStream in)
Factory method used when the underlying data provider is an actual stream.
-
getInstance
public static StreamBootstrapper getInstance(String pubId, SystemId sysId, byte[] data, int start, int end)
Factory method used when the underlying data provider is a pre-allocated block source, and no stream is used. Additionally, the buffer passed is not owned by the bootstrapper or the Reader that is created, so it is not to be recycled.
-
bootstrapInput
public Reader bootstrapInput(ReaderConfig cfg, boolean mainDoc, int xmlVersion) throws IOException, XMLStreamException
- Specified by:
bootstrapInput in class InputBootstrapper
- Parameters:
xmlVersion - Optional xml version identifier of the main parsed document (if not bootstrapping the main document). Currently only relevant for checking that XML 1.0 document does not include XML 1.1 external parsed entities. If null, no checks will be done; when bootstrapping parsing of the main document, null should be passed for this argument.
- Throws:
IOException
XMLStreamException
-
getInputEncoding
Since this class only gets used when encoding is not explicitly passed, we need to use the encoding that was auto-detected.
- Specified by:
getInputEncoding in class InputBootstrapper
- Returns:
- Input encoding in use, if it could be determined or was passed by the calling application
-
getInputTotal
public int getInputTotal()
- Specified by:
getInputTotal in class InputBootstrapper
- Returns:
- Total number of characters read from bootstrapped input (stream, reader)
-
getInputColumn
public int getInputColumn()
- Specified by:
getInputColumn in class InputBootstrapper
-
resolveStreamEncoding
Method called to try to figure out the physical encoding that the underlying input stream uses.
- Throws:
IOException
WstxException
-
verifyXmlEncoding
- Returns:
- Normalized encoding name
- Throws:
WstxException
-
ensureLoaded
- Throws:
IOException
-
loadMore
- Throws:
IOException
WstxException
-
pushback
protected void pushback()
- Specified by:
pushback in class InputBootstrapper
-
getNext
- Specified by:
getNext in class InputBootstrapper
- Throws:
IOException
WstxException
-
getNextAfterWs
- Specified by:
getNextAfterWs in class InputBootstrapper
- Throws:
IOException
WstxException
-
checkKeyword
- Specified by:
checkKeyword in class InputBootstrapper
- Returns:
- First character that does not match expected, if any; CHAR_NULL if match succeeded
- Throws:
IOException
WstxException
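The return convention described above (the first mismatching character, or CHAR_NULL on a full match) can be sketched as follows. This is a simplified illustration over a String rather than a byte buffer; the class KeywordMatchSketch, the offset parameter, the exception used for premature end of input, and the value chosen for CHAR_NULL are all assumptions.

```java
/** Illustrative sketch of the "first mismatch or CHAR_NULL" return convention. */
final class KeywordMatchSketch {
    // Assumed sentinel value; the documentation names CHAR_NULL but does not show its value.
    static final char CHAR_NULL = '\0';

    /**
     * Compares input, starting at offset, against the expected keyword.
     * Returns the first character that differs, or CHAR_NULL if all matched.
     */
    static int checkKeyword(String input, int offset, String expected) {
        for (int i = 0; i < expected.length(); i++) {
            int pos = offset + i;
            if (pos >= input.length()) {
                throw new IllegalStateException("Unexpected end of input in keyword");
            }
            char actual = input.charAt(pos);
            if (actual != expected.charAt(i)) {
                return actual; // caller can report this as the unexpected character
            }
        }
        return CHAR_NULL;
    }
}
```

Returning the offending character rather than a boolean lets the caller build a precise error message without re-reading the input.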
-
readQuotedValue
- Specified by:
readQuotedValue in class InputBootstrapper
- Throws:
IOException
WstxException
-
hasXmlDecl
- Throws:
IOException
WstxException
-
getLocation
- Specified by:
getLocation in class InputBootstrapper
-
nextByte
- Throws:
IOException
WstxException
-
skipSbWs
- Throws:
IOException
WstxException
-
skipSbLF
- Throws:
IOException
WstxException
-
checkSbKeyword
- Returns:
- First character that does not match expected, if any; CHAR_NULL if match succeeded
- Throws:
IOException
WstxException
-
nextMultiByte
- Throws:
IOException
WstxException
-
nextTranslated
- Throws:
IOException
WstxException
-
skipMbWs
- Throws:
IOException
WstxException
-
skipTranslatedWs
- Throws:
IOException
WstxException
-
skipMbLF
- Throws:
IOException
WstxException
-
skipTranslatedLF
- Throws:
IOException
WstxException
-
checkMbKeyword
- Returns:
- First character that does not match expected, if any; CHAR_NULL if match succeeded
- Throws:
IOException
WstxException
-
checkTranslatedKeyword
- Throws:
IOException
WstxException
-
verifyEncoding
- Throws:
WstxException
-
verifyEncoding
- Throws:
WstxException
-
reportWeirdUCS4
- Throws:
IOException
-