org.apache.xerces.impl

Class XMLScanner

public abstract class XMLScanner extends Object implements XMLComponent

This class holds the scanning methods that are common to scanning XML document structure and content and to scanning DTD structure and content. Both XMLDocumentScanner and XMLDTDScanner inherit from this base class.

This component requires the following features and properties from the component manager that uses it:

INTERNAL:

Usage of this class is not supported. It may be altered or removed at any time.

Version: $Id: XMLScanner.java,v 1.51 2004/10/04 21:45:48 mrglavas Exp $

Author: Andy Clark, IBM; Arnaud Le Hors, IBM; Eric Ye, IBM

Field Summary
protected static boolean DEBUG_ATTR_NORMALIZATION
Debug attribute normalization.
protected static String ENTITY_MANAGER
Property identifier: entity manager.
protected static String ERROR_REPORTER
Property identifier: error reporter.
protected static String fAmpSymbol
Symbol: "amp".
protected static String fAposSymbol
Symbol: "apos".
protected String fCharRefLiteral
Literal value of the last character reference scanned.
protected static String fEncodingSymbol
Symbol: "encoding".
protected int fEntityDepth
Entity depth.
protected XMLEntityManager fEntityManager
Entity manager.
protected XMLEntityScanner fEntityScanner
Entity scanner.
protected XMLErrorReporter fErrorReporter
Error reporter.
protected static String fGtSymbol
Symbol: "gt".
protected static String fLtSymbol
Symbol: "lt".
protected boolean fNamespaces
Namespaces.
protected boolean fNotifyCharRefs
Character references notification.
protected boolean fParserSettings
Internal parser-settings feature.
protected static String fQuotSymbol
Symbol: "quot".
protected boolean fReportEntity
Report entity boundary.
protected XMLResourceIdentifierImpl fResourceIdentifier
protected boolean fScanningAttribute
Scanning attribute.
protected static String fStandaloneSymbol
Symbol: "standalone".
protected SymbolTable fSymbolTable
Symbol table.
protected boolean fValidation
Validation.
protected static String fVersionSymbol
Symbol: "version".
protected static String NAMESPACES
Feature identifier: namespaces.
protected static String NOTIFY_CHAR_REFS
Feature identifier: notify character references.
protected static String PARSER_SETTINGS
protected static String SYMBOL_TABLE
Property identifier: symbol table.
protected static String VALIDATION
Feature identifier: validation.
Method Summary
void endEntity(String name, Augmentations augs)
This method notifies of the end of an entity.
boolean getFeature(String featureId)
protected String getVersionNotSupportedKey()
protected boolean isInvalid(int value)
protected boolean isInvalidLiteral(int value)
protected int isUnchangedByNormalization(XMLString value)
Checks whether this string would be unchanged by normalization.
protected boolean isValidNameChar(int value)
protected boolean isValidNameStartChar(int value)
protected boolean isValidNameStartHighSurrogate(int value)
protected boolean isValidNCName(int value)
protected void normalizeWhitespace(XMLString value)
Normalize whitespace in an XMLString converting all whitespace characters to space characters.
protected void normalizeWhitespace(XMLString value, int fromIndex)
Normalize whitespace in an XMLString converting all whitespace characters to space characters.
protected void reportFatalError(String msgId, Object[] args)
Convenience function used in all XML scanners.
void reset(XMLComponentManager componentManager)
protected void reset()
protected boolean scanAttributeValue(XMLString value, XMLString nonNormalizedValue, String atName, boolean checkEntities, String eleName)
Scans an attribute value and normalizes whitespace, converting all whitespace characters to space characters.
protected int scanCharReferenceValue(XMLStringBuffer buf, XMLStringBuffer buf2)
Scans a character reference and appends the corresponding chars to the specified buffer.
protected void scanComment(XMLStringBuffer text)
Scans a comment.
protected void scanExternalID(String[] identifiers, boolean optionalSystemId)
Scans an external ID and returns the public and system IDs.
protected void scanPI()
Scans a processing instruction.
protected void scanPIData(String target, XMLString data)
Scans processing instruction data.
String scanPseudoAttribute(boolean scanningTextDecl, XMLString value)
Scans a pseudo attribute.
protected boolean scanPubidLiteral(XMLString literal)
Scans a public ID literal.
protected boolean scanSurrogates(XMLStringBuffer buf)
Scans surrogates and appends them to the specified buffer.
protected void scanXMLDeclOrTextDecl(boolean scanningTextDecl, String[] pseudoAttributeValues)
Scans an XML or text declaration.
void setFeature(String featureId, boolean value)
void setProperty(String propertyId, Object value)
Sets the value of a property during parsing.
void startEntity(String name, XMLResourceIdentifier identifier, String encoding, Augmentations augs)
This method notifies of the start of an entity.
protected boolean versionSupported(String version)

Field Detail

DEBUG_ATTR_NORMALIZATION

protected static final boolean DEBUG_ATTR_NORMALIZATION
Debug attribute normalization.

ENTITY_MANAGER

protected static final String ENTITY_MANAGER
Property identifier: entity manager.

ERROR_REPORTER

protected static final String ERROR_REPORTER
Property identifier: error reporter.

fAmpSymbol

protected static final String fAmpSymbol
Symbol: "amp".

fAposSymbol

protected static final String fAposSymbol
Symbol: "apos".

fCharRefLiteral

protected String fCharRefLiteral
Literal value of the last character reference scanned.

fEncodingSymbol

protected static final String fEncodingSymbol
Symbol: "encoding".

fEntityDepth

protected int fEntityDepth
Entity depth.

fEntityManager

protected XMLEntityManager fEntityManager
Entity manager.

fEntityScanner

protected XMLEntityScanner fEntityScanner
Entity scanner.

fErrorReporter

protected XMLErrorReporter fErrorReporter
Error reporter.

fGtSymbol

protected static final String fGtSymbol
Symbol: "gt".

fLtSymbol

protected static final String fLtSymbol
Symbol: "lt".

fNamespaces

protected boolean fNamespaces
Namespaces.

fNotifyCharRefs

protected boolean fNotifyCharRefs
Character references notification.

fParserSettings

protected boolean fParserSettings
Internal parser-settings feature.

fQuotSymbol

protected static final String fQuotSymbol
Symbol: "quot".

fReportEntity

protected boolean fReportEntity
Report entity boundary.

fResourceIdentifier

protected XMLResourceIdentifierImpl fResourceIdentifier

fScanningAttribute

protected boolean fScanningAttribute
Scanning attribute.

fStandaloneSymbol

protected static final String fStandaloneSymbol
Symbol: "standalone".

fSymbolTable

protected SymbolTable fSymbolTable
Symbol table.

fValidation

protected boolean fValidation
Validation. This feature identifier is: http://xml.org/sax/features/validation

fVersionSymbol

protected static final String fVersionSymbol
Symbol: "version".

NAMESPACES

protected static final String NAMESPACES
Feature identifier: namespaces.

NOTIFY_CHAR_REFS

protected static final String NOTIFY_CHAR_REFS
Feature identifier: notify character references.

PARSER_SETTINGS

protected static final String PARSER_SETTINGS

SYMBOL_TABLE

protected static final String SYMBOL_TABLE
Property identifier: symbol table.

VALIDATION

protected static final String VALIDATION
Feature identifier: validation.

Method Detail

endEntity

public void endEntity(String name, Augmentations augs)
This method notifies of the end of an entity. The document entity has the pseudo-name "[xml]"; the DTD has the pseudo-name "[dtd]"; parameter entity names start with '%'; and general entities are specified by their name alone.

Parameters:
name - The name of the entity.
augs - Additional information that may include infoset augmentations.

Throws:
XNIException - Thrown by handler to signal an error.
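
The naming convention described above can be sketched as a small classifier. This is an illustration only: the class EntityNameSketch, the enum Kind, and the method classify are hypothetical names and are not part of Xerces.

```java
// Hypothetical helper illustrating the entity naming convention; not Xerces code.
final class EntityNameSketch {
    enum Kind { DOCUMENT, DTD, PARAMETER, GENERAL }

    /** Classifies an entity name as passed to startEntity/endEntity. */
    static Kind classify(String name) {
        if ("[xml]".equals(name)) return Kind.DOCUMENT;   // document entity pseudo-name
        if ("[dtd]".equals(name)) return Kind.DTD;        // DTD pseudo-name
        if (name.startsWith("%")) return Kind.PARAMETER;  // parameter entities start with '%'
        return Kind.GENERAL;                              // anything else is a general entity
    }

    public static void main(String[] args) {
        System.out.println(classify("[xml]"));
        System.out.println(classify("%common.attrs"));
        System.out.println(classify("copy"));
    }
}
```

A handler that tracks entity boundaries could use such a check to ignore the document and DTD pseudo-entities.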

getFeature

public boolean getFeature(String featureId)

getVersionNotSupportedKey

protected String getVersionNotSupportedKey()

isInvalid

protected boolean isInvalid(int value)

isInvalidLiteral

protected boolean isInvalidLiteral(int value)

isUnchangedByNormalization

protected int isUnchangedByNormalization(XMLString value)
Checks whether this string would be unchanged by normalization.

Returns: -1 if the value would be unchanged by normalization, otherwise the index of the first whitespace character which would be transformed.
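
The return convention can be illustrated with a standalone sketch. It works on a plain char array rather than an XMLString, and it assumes that the characters "which would be transformed" are tab, line feed, and carriage return, and that the returned index is relative to the start of the value. The class and method names are hypothetical; this is not the Xerces implementation.

```java
// Hypothetical standalone sketch; operates on char[] instead of XMLString.
final class UnchangedByNormalizationSketch {
    /**
     * Returns -1 if the range [offset, offset + length) contains no tab, LF, or CR;
     * otherwise returns the index, relative to offset, of the first such character.
     */
    static int firstTransformedIndex(char[] ch, int offset, int length) {
        int end = offset + length;
        for (int i = offset; i < end; i++) {
            char c = ch[i];
            if (c == '\t' || c == '\n' || c == '\r') {
                return i - offset;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "a b\tc".toCharArray();
        System.out.println(firstTransformedIndex(text, 0, text.length));
    }
}
```

A caller can skip normalization entirely when the result is -1, or start normalizing at the returned index.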

isValidNameChar

protected boolean isValidNameChar(int value)

isValidNameStartChar

protected boolean isValidNameStartChar(int value)

isValidNameStartHighSurrogate

protected boolean isValidNameStartHighSurrogate(int value)

isValidNCName

protected boolean isValidNCName(int value)

normalizeWhitespace

protected void normalizeWhitespace(XMLString value)
Normalize whitespace in an XMLString converting all whitespace characters to space characters.

normalizeWhitespace

protected void normalizeWhitespace(XMLString value, int fromIndex)
Normalize whitespace in an XMLString converting all whitespace characters to space characters.
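
As a rough illustration of both overloads, the sketch below replaces whitespace in place in a plain char array. It assumes that fromIndex is relative to the start of the value and that the whitespace characters to convert are tab, line feed, and carriage return. The class name is hypothetical and the code does not use XMLString.

```java
// Hypothetical standalone sketch; operates on char[] instead of XMLString.
final class NormalizeWhitespaceSketch {
    /** Converts tab, LF, and CR to a space, in place, over the whole range. */
    static void normalizeWhitespace(char[] ch, int offset, int length) {
        normalizeWhitespace(ch, offset, length, 0);
    }

    /** Same as above, but starts at fromIndex (relative to offset). */
    static void normalizeWhitespace(char[] ch, int offset, int length, int fromIndex) {
        int end = offset + length;
        for (int i = offset + fromIndex; i < end; i++) {
            char c = ch[i];
            if (c == '\t' || c == '\n' || c == '\r') {
                ch[i] = ' ';
            }
        }
    }

    public static void main(String[] args) {
        char[] text = "a\tb\nc".toCharArray();
        normalizeWhitespace(text, 0, text.length);
        System.out.println(new String(text));
    }
}
```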

reportFatalError

protected void reportFatalError(String msgId, Object[] args)
Convenience function used in all XML scanners.

reset

public void reset(XMLComponentManager componentManager)

Parameters:
componentManager - The component manager.

Throws:
SAXException - Thrown if required features and properties cannot be found.

reset

protected void reset()

scanAttributeValue

protected boolean scanAttributeValue(XMLString value, XMLString nonNormalizedValue, String atName, boolean checkEntities, String eleName)
Scans an attribute value and normalizes whitespace, converting all whitespace characters to space characters.

 [10] AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"

Parameters:
value - The XMLString to fill in with the value.
nonNormalizedValue - The XMLString to fill in with the non-normalized value.
atName - The name of the attribute being parsed (for error messages).
checkEntities - true if undeclared entities should be reported as a VC violation, false if undeclared entities should be reported as a WFC violation.
eleName - The name of the element to which this attribute belongs.

Returns: true if the non-normalized and normalized values are the same.

Note: This method uses fStringBuffer2; anything in it at the time of calling is lost.

scanCharReferenceValue

protected int scanCharReferenceValue(XMLStringBuffer buf, XMLStringBuffer buf2)
Scans a character reference and appends the corresponding chars to the specified buffer.

 [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
 
Note: This method uses fStringBuffer; anything in it at the time of calling is lost.

Parameters:
buf - The character buffer to append chars to.
buf2 - The character buffer to append non-normalized chars to.

Returns: the character value or (-1) on conversion failure
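
To make production [66] and the -1 convention concrete, here is a minimal standalone sketch that converts a complete character reference string into its code point. It is not the Xerces scanner: the class CharRefSketch and its methods are hypothetical, it works on a String rather than the entity scanner, and it only checks the upper bound 0x10FFFF, not whether the result is a legal XML character.

```java
// Hypothetical standalone sketch of [66] CharRef; not the Xerces implementation.
final class CharRefSketch {
    /** Returns the code point named by a reference such as "&#65;" or "&#x41;", or -1 on failure. */
    static int charRefValue(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) return -1;
        boolean hex = ref.startsWith("&#x");
        String digits = ref.substring(hex ? 3 : 2, ref.length() - 1);
        if (digits.isEmpty()) return -1;           // the production requires at least one digit
        int radix = hex ? 16 : 10;
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = digitValue(digits.charAt(i), hex);
            if (d < 0) return -1;                  // not in [0-9] or [0-9a-fA-F]
            value = value * radix + d;
            if (value > 0x10FFFF) return -1;       // beyond the Unicode range
        }
        return value;
    }

    /** ASCII-only digit conversion, matching the character classes in the production. */
    private static int digitValue(char c, boolean hex) {
        if (c >= '0' && c <= '9') return c - '0';
        if (hex && c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (hex && c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(charRefValue("&#x41;"));
        System.out.println(charRefValue("&#xZZ;"));
    }
}
```

The sketch converts digits by hand rather than with Character.digit, because the latter also accepts non-ASCII digits that the production does not allow.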

scanComment

protected void scanComment(XMLStringBuffer text)
Scans a comment.

 [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
 

Note: Called after scanning past '<!--'.

Note: This method uses fString; anything in it at the time of calling is lost.

Parameters:
text - The buffer to fill in with the text.

scanExternalID

protected void scanExternalID(String[] identifiers, boolean optionalSystemId)
Scans an external ID and returns the public and system IDs.

Parameters:
identifiers - An array of size 2 to return the system ID and public ID (in that order).
optionalSystemId - Specifies whether the system ID is optional.

Note: This method uses fString and fStringBuffer; anything in them at the time of calling is lost.

scanPI

protected void scanPI()
Scans a processing instruction.

 [16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
 [17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
 
Note: This method uses fString; anything in it at the time of calling is lost.

scanPIData

protected void scanPIData(String target, XMLString data)
Scans processing instruction data. This is needed to handle the situation where a document starts with a processing instruction whose target name starts with "xml" (e.g. xmlfoo).

Note: This method uses fStringBuffer; anything in it at the time of calling is lost.

Parameters:
target - The PI target.
data - The string to fill in with the data.
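
The distinction between the reserved target and a target that merely starts with "xml" follows from production [17], which excludes only the three-letter name in any mix of upper and lower case. The sketch below mirrors that production directly; PITargetSketch and isReservedTarget are hypothetical names, not Xerces methods.

```java
// Hypothetical sketch of the exclusion in [17] PITarget; not Xerces code.
final class PITargetSketch {
    /** True only for "xml" in any letter case, i.e. ('X' | 'x') ('M' | 'm') ('L' | 'l'). */
    static boolean isReservedTarget(String target) {
        return target.length() == 3
            && (target.charAt(0) == 'x' || target.charAt(0) == 'X')
            && (target.charAt(1) == 'm' || target.charAt(1) == 'M')
            && (target.charAt(2) == 'l' || target.charAt(2) == 'L');
    }

    public static void main(String[] args) {
        System.out.println(isReservedTarget("XmL"));
        System.out.println(isReservedTarget("xmlfoo"));
    }
}
```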

scanPseudoAttribute

public String scanPseudoAttribute(boolean scanningTextDecl, XMLString value)
Scans a pseudo attribute.

Parameters:
scanningTextDecl - True if scanning this pseudo-attribute for a TextDecl; false if scanning XMLDecl. This flag is needed to report the correct type of error.
value - The string to fill in with the attribute value.

Returns: The name of the attribute.

Note: This method uses fStringBuffer2; anything in it at the time of calling is lost.

scanPubidLiteral

protected boolean scanPubidLiteral(XMLString literal)
Scans a public ID literal.

 [12] PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
 [13] PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]

The returned string is normalized according to the following rule, from http://www.w3.org/TR/REC-xml#dt-pubid:

Before a match is attempted, all strings of white space in the public identifier must be normalized to single space characters (#x20), and leading and trailing white space must be removed.

Parameters:
literal - The string to fill in with the public ID literal.

Returns: True on success.

Note: This method uses fStringBuffer; anything in it at the time of calling is lost.
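
The normalization rule quoted above can be sketched in isolation. The example assumes the literal has already been read, without its quotes, into a String, and treats the three white space characters allowed by PubidChar (#x20, #xD, #xA) as white space. PubidNormalizationSketch and normalizePubid are hypothetical names; the real method fills an XMLString while scanning.

```java
// Hypothetical standalone sketch of public ID normalization; not the Xerces implementation.
final class PubidNormalizationSketch {
    /** Collapses runs of white space to one space and trims both ends. */
    static String normalizePubid(String literal) {
        StringBuilder out = new StringBuilder(literal.length());
        boolean pendingSpace = false;
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c == ' ' || c == '\n' || c == '\r') {
                // Remember the run, but only if something precedes it (drops leading space).
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) out.append(' ');
                pendingSpace = false;
                out.append(c);
            }
        }
        // A pending space at the end is never written, which drops trailing space.
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalizePubid("  -//W3C//DTD \n XHTML 1.0//EN "));
    }
}
```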

scanSurrogates

protected boolean scanSurrogates(XMLStringBuffer buf)
Scans surrogates and appends them to the specified buffer.

Note: This assumes the current char has already been identified as a high surrogate.

Parameters:
buf - The StringBuffer to append the read surrogates to.

Returns: True if it succeeded.
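
For readers unfamiliar with surrogate handling, the following sketch shows the general idea of checking a high/low pair and appending it to a buffer. It uses only java.lang.Character and a StringBuilder; the class and method names are hypothetical, and the real method reads from the entity scanner and reports errors through the error reporter.

```java
// Hypothetical standalone sketch of surrogate pair handling; not the Xerces implementation.
final class SurrogateSketch {
    /** Appends the pair to buf and returns true if it is a valid high/low surrogate pair. */
    static boolean appendSurrogates(char high, char low, StringBuilder buf) {
        if (!Character.isHighSurrogate(high) || !Character.isLowSurrogate(low)) {
            return false;                        // nothing is appended on failure
        }
        buf.append(high).append(low);
        return true;
    }

    public static void main(String[] args) {
        StringBuilder buf = new StringBuilder();
        boolean ok = appendSurrogates('\uD83D', '\uDE00', buf);
        // The two chars together encode one supplementary code point.
        System.out.println(ok + " U+" + Integer.toHexString(buf.codePointAt(0)).toUpperCase());
    }
}
```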

scanXMLDeclOrTextDecl

protected void scanXMLDeclOrTextDecl(boolean scanningTextDecl, String[] pseudoAttributeValues)
Scans an XML or text declaration.

 [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
 [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
 [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )
 [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
 [32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
                 | ('"' ('yes' | 'no') '"'))

 [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
 

Parameters:
scanningTextDecl - True if a text declaration is to be scanned instead of an XML declaration.
pseudoAttributeValues - An array of size 3 to return the version, encoding and standalone pseudo attribute values (in that order).

Note: This method uses fString; anything in it at the time of calling is lost.
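
The shape of the size-3 result array can be illustrated with a simplified, regex-based sketch. It is deliberately loose: it does not enforce the order of pseudo attributes, does not validate VersionNum or EncName, and does not report errors, all of which a real scanner would do. XmlDeclSketch and pseudoAttributes are hypothetical names, not Xerces API.

```java
// Hypothetical, simplified sketch; a real scanner validates order and values and reports errors.
final class XmlDeclSketch {
    private static final java.util.regex.Pattern PSEUDO_ATTR = java.util.regex.Pattern.compile(
        "\\s+(version|encoding|standalone)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    /** Returns {version, encoding, standalone}; entries stay null when the attribute is absent. */
    static String[] pseudoAttributes(String decl) {
        String[] values = new String[3];
        java.util.regex.Matcher m = PSEUDO_ATTR.matcher(decl);
        while (m.find()) {
            // Group 2 holds a double-quoted value, group 3 a single-quoted one.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            switch (m.group(1)) {
                case "version":  values[0] = value; break;
                case "encoding": values[1] = value; break;
                default:         values[2] = value; break;  // "standalone"
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String[] v = pseudoAttributes("<?xml version=\"1.0\" encoding='UTF-8' standalone=\"yes\"?>");
        System.out.println(v[0] + " " + v[1] + " " + v[2]);
    }
}
```

For a text declaration such as one carrying only an encoding, the version and standalone slots remain null in this sketch.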

setFeature

public void setFeature(String featureId, boolean value)

setProperty

public void setProperty(String propertyId, Object value)
Sets the value of a property during parsing.

Parameters: propertyId value

startEntity

public void startEntity(String name, XMLResourceIdentifier identifier, String encoding, Augmentations augs)
This method notifies of the start of an entity. The document entity has the pseudo-name "[xml]"; the DTD has the pseudo-name "[dtd]"; parameter entity names start with '%'; and general entities are specified by their name alone.

Parameters:
name - The name of the entity.
identifier - The resource identifier.
encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
augs - Additional information that may include infoset augmentations.

Throws:
XNIException - Thrown by handler to signal an error.

versionSupported

protected boolean versionSupported(String version)

Copyright © 1999-2005 Apache XML Project. All Rights Reserved.