Class XMLReader

  • All Implemented Interfaces:
    org.xml.sax.XMLReader

    public class XMLReader
    extends java.lang.Object
    implements org.xml.sax.XMLReader
    SAX parser. Generates callbacks on the ContentHandler based on encountered nodes.
    Preliminary.
     org.xml.sax.XMLReader reader = org.xml.sax.helpers.XMLReaderFactory.createXMLReader ("org.htmlparser.sax.XMLReader");
     org.xml.sax.ContentHandler content = new MyContentHandler ();
     reader.setContentHandler (content);
     org.xml.sax.ErrorHandler errors = new MyErrorHandler ();
     reader.setErrorHandler (errors);
     reader.parse ("http://cbc.ca");
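
    MyContentHandler and MyErrorHandler in the snippet above are placeholders supplied by the application. The following is a minimal sketch of what such a content handler might look like. The names ContentHandlerSketch and NameCollector are hypothetical, and the JDK's built-in SAX parser is used as a stand-in for org.htmlparser.sax.XMLReader so that the example runs without extra libraries; both are driven through the same org.xml.sax.XMLReader interface.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class ContentHandlerSketch {

    /** Hypothetical stand-in for MyContentHandler: collects element names in document order. */
    static final class NameCollector extends DefaultHandler {
        final StringBuilder names = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(qName);
        }
    }

    /** Parses in-memory markup and returns the element names seen, comma separated. */
    static String elementNames(String markup) {
        NameCollector collector = new NameCollector();
        try {
            // Assumption: the JDK parser stands in for org.htmlparser.sax.XMLReader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(collector);
            reader.setErrorHandler(collector); // DefaultHandler also implements ErrorHandler
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return collector.names.toString();
    }

    public static void main(String[] args) {
        // prints html,body,p
        System.out.println(elementNames("<html><body><p>hi</p></body></html>"));
    }
}
```

    Extending DefaultHandler keeps the sketch short, because every callback that is not overridden is a no-op.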
     
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected org.xml.sax.ContentHandler mContentHandler
      The content callback object.
      protected org.xml.sax.DTDHandler mDTDHandler
      Not implemented.
      protected org.xml.sax.EntityResolver mEntityResolver
      Not implemented.
      protected org.xml.sax.ErrorHandler mErrorHandler
      The error handler object.
      protected boolean mNameSpacePrefixes
      Determines if namespace prefix handling is on.
      protected boolean mNameSpaces
      Determines if namespace handling is on.
      protected Parser mParser
      The underlying DOM parser.
      protected java.lang.String[] mParts
      Qualified name parts.
      protected org.xml.sax.helpers.NamespaceSupport mSupport
      Namespace utility object.
    • Constructor Summary

      Constructors 
      Constructor Description
      XMLReader()
      Create a SAX parser.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected void doSAX​(Node node)
      Process nodes recursively on the ContentHandler.
      org.xml.sax.ContentHandler getContentHandler()
      Return the current content handler.
      org.xml.sax.DTDHandler getDTDHandler()
      Return the current DTD handler.
      org.xml.sax.EntityResolver getEntityResolver()
      Return the current entity resolver.
      org.xml.sax.ErrorHandler getErrorHandler()
      Return the current error handler.
      boolean getFeature​(java.lang.String name)
      Look up the value of a feature flag.
      java.lang.Object getProperty​(java.lang.String name)
      Look up the value of a property.
      void parse​(java.lang.String systemId)
      Parse an XML document from a system identifier (URI).
      void parse​(org.xml.sax.InputSource input)
      Parse an XML document.
      void setContentHandler​(org.xml.sax.ContentHandler handler)
      Allow an application to register a content event handler.
      void setDTDHandler​(org.xml.sax.DTDHandler handler)
      Allow an application to register a DTD event handler.
      void setEntityResolver​(org.xml.sax.EntityResolver resolver)
      Allow an application to register an entity resolver.
      void setErrorHandler​(org.xml.sax.ErrorHandler handler)
      Allow an application to register an error event handler.
      void setFeature​(java.lang.String name, boolean value)
      Set the value of a feature flag.
      void setProperty​(java.lang.String name, java.lang.Object value)
      Set the value of a property.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • mNameSpaces

        protected boolean mNameSpaces
        Determines if namespace handling is on. All XMLReaders are required to recognize the feature names:
        • http://xml.org/sax/features/namespaces - a value of "true" indicates that namespace URIs and unprefixed local names for element and attribute names will be available.
        • http://xml.org/sax/features/namespace-prefixes - a value of "true" indicates that XML qualified names (with prefixes) and attributes (including xmlns* attributes) will be available.
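
        To illustrate what the namespaces feature changes in the callbacks, here is a small sketch. It assumes the JDK's built-in SAX parser as a stand-in reader; NamespaceFeatureSketch and rootName are hypothetical names. With namespaces set to "true" and namespace-prefixes set to "false", the handler receives the namespace URI and the unprefixed local name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class NamespaceFeatureSketch {

    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    /** Returns "uri|localName" as reported for the root element with namespace handling on. */
    static String rootName(String markup) {
        StringBuilder seen = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (seen.length() == 0) {
                    seen.append(uri).append('|').append(localName);
                }
            }
        };
        try {
            // Assumption: the JDK parser stands in for any org.xml.sax.XMLReader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACES, true);          // report URIs and local names
            reader.setFeature(NAMESPACE_PREFIXES, false); // do not report xmlns* attributes
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        // prints urn:demo|a
        System.out.println(rootName("<x:a xmlns:x='urn:demo'/>"));
    }
}
```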
      • mNameSpacePrefixes

        protected boolean mNameSpacePrefixes
        Determines if namespace prefix handling is on.
        See Also:
        mNameSpaces
      • mEntityResolver

        protected org.xml.sax.EntityResolver mEntityResolver
        Not implemented.
      • mDTDHandler

        protected org.xml.sax.DTDHandler mDTDHandler
        Not implemented.
      • mContentHandler

        protected org.xml.sax.ContentHandler mContentHandler
        The content callback object.
      • mErrorHandler

        protected org.xml.sax.ErrorHandler mErrorHandler
        The error handler object.
      • mParser

        protected Parser mParser
        The underlying DOM parser.
      • mSupport

        protected org.xml.sax.helpers.NamespaceSupport mSupport
        Namespace utility object.
      • mParts

        protected java.lang.String[] mParts
        Qualified name parts.
    • Constructor Detail

      • XMLReader

        public XMLReader()
        Create a SAX parser.
    • Method Detail

      • getFeature

        public boolean getFeature​(java.lang.String name)
                           throws org.xml.sax.SAXNotRecognizedException,
                                  org.xml.sax.SAXNotSupportedException
        Look up the value of a feature flag.

        The feature name is any fully-qualified URI. It is possible for an XMLReader to recognize a feature name but temporarily be unable to return its value. Some feature values may be available only in specific contexts, such as before, during, or after a parse. Also, some feature values may not be programmatically accessible. (In the case of an adapter for a SAX1 Parser, there is no implementation-independent way to expose whether the underlying parser is performing validation, expanding external entities, and so forth.)

        All XMLReaders are required to recognize the http://xml.org/sax/features/namespaces and the http://xml.org/sax/features/namespace-prefixes feature names.

        Typical usage is something like this:

         XMLReader r = new MySAXDriver();

         // try to activate validation
         try {
           r.setFeature("http://xml.org/sax/features/validation", true);
         } catch (SAXException e) {
           System.err.println("Cannot activate validation.");
         }

         // register event handlers
         r.setContentHandler(new MyContentHandler());
         r.setErrorHandler(new MyErrorHandler());

         // parse the first document
         try {
           r.parse("http://www.foo.com/mydoc.xml");
         } catch (IOException e) {
           System.err.println("I/O exception reading XML document");
         } catch (SAXException e) {
           System.err.println("XML exception reading document.");
         }
         

        Implementors are free (and encouraged) to invent their own features, using names built on their own URIs.
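
        The usage above only sets a feature. Reading one back involves the two exceptions declared by getFeature, which the following sketch handles separately. FeatureProbeSketch, probe and newReader are hypothetical names, the second feature URI is made up so that no reader should recognize it, and the JDK's built-in parser is assumed as a stand-in reader.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

final class FeatureProbeSketch {

    /** Describes a feature as "true", "false", "not recognized" or "not supported". */
    static String probe(XMLReader reader, String featureUri) {
        try {
            return String.valueOf(reader.getFeature(featureUri));
        } catch (SAXNotRecognizedException e) {
            // The reader does not know this feature name at all.
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            // The name is known, but the value cannot be determined right now.
            return "not supported";
        }
    }

    /** Assumption: the JDK parser stands in for any org.xml.sax.XMLReader. */
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        System.out.println(probe(reader, "http://xml.org/sax/features/namespaces"));
        System.out.println(probe(reader, "http://example.invalid/features/no-such-feature"));
    }
}
```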

        Specified by:
        getFeature in interface org.xml.sax.XMLReader
        Parameters:
        name - The feature name, which is a fully-qualified URI.
        Returns:
        The current value of the feature (true or false).
        Throws:
        org.xml.sax.SAXNotRecognizedException - If the feature value can't be assigned or retrieved.
        org.xml.sax.SAXNotSupportedException - When the XMLReader recognizes the feature name but cannot determine its value at this time.
        See Also:
        setFeature(java.lang.String, boolean)
      • setFeature

        public void setFeature​(java.lang.String name,
                               boolean value)
                        throws org.xml.sax.SAXNotRecognizedException,
                               org.xml.sax.SAXNotSupportedException
        Set the value of a feature flag.

        The feature name is any fully-qualified URI. It is possible for an XMLReader to expose a feature value but to be unable to change the current value. Some feature values may be immutable or mutable only in specific contexts, such as before, during, or after a parse.

        All XMLReaders are required to support setting http://xml.org/sax/features/namespaces to true and http://xml.org/sax/features/namespace-prefixes to false.

        Specified by:
        setFeature in interface org.xml.sax.XMLReader
        Parameters:
        name - The feature name, which is a fully-qualified URI.
        value - The requested value of the feature (true or false).
        Throws:
        org.xml.sax.SAXNotRecognizedException - If the feature value can't be assigned or retrieved.
        org.xml.sax.SAXNotSupportedException - When the XMLReader recognizes the feature name but cannot set the requested value.
        See Also:
        getFeature(java.lang.String)
      • getProperty

        public java.lang.Object getProperty​(java.lang.String name)
                                     throws org.xml.sax.SAXNotRecognizedException,
                                            org.xml.sax.SAXNotSupportedException
        Look up the value of a property.

        The property name is any fully-qualified URI. It is possible for an XMLReader to recognize a property name but temporarily be unable to return its value. Some property values may be available only in specific contexts, such as before, during, or after a parse.

        XMLReaders are not required to recognize any specific property names, though an initial core set is documented for SAX2.

        Implementors are free (and encouraged) to invent their own properties, using names built on their own URIs.

        Specified by:
        getProperty in interface org.xml.sax.XMLReader
        Parameters:
        name - The property name, which is a fully-qualified URI.
        Returns:
        The current value of the property.
        Throws:
        org.xml.sax.SAXNotRecognizedException - If the property value can't be assigned or retrieved.
        org.xml.sax.SAXNotSupportedException - When the XMLReader recognizes the property name but cannot determine its value at this time.
        See Also:
        setProperty(java.lang.String, java.lang.Object)
      • setProperty

        public void setProperty​(java.lang.String name,
                                java.lang.Object value)
                         throws org.xml.sax.SAXNotRecognizedException,
                                org.xml.sax.SAXNotSupportedException
        Set the value of a property.

        The property name is any fully-qualified URI. It is possible for an XMLReader to recognize a property name but to be unable to change the current value. Some property values may be immutable or mutable only in specific contexts, such as before, during, or after a parse.

        XMLReaders are not required to recognize setting any specific property names, though a core set is defined by SAX2.

        This method is also the standard mechanism for setting extended handlers.
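
        As an illustration of registering an extended handler through setProperty, the sketch below attaches a lexical handler using the SAX2 property name http://xml.org/sax/properties/lexical-handler and collects comment text. LexicalHandlerSketch and comments are hypothetical names, and the JDK's built-in parser is assumed as the reader; a reader that does not support this property is expected to throw SAXNotRecognizedException instead, and this page does not state whether org.htmlparser.sax.XMLReader supports it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class LexicalHandlerSketch {

    static final String LEXICAL_HANDLER = "http://xml.org/sax/properties/lexical-handler";

    /** Returns the trimmed text of every comment reported to the lexical handler. */
    static List<String> comments(String markup) {
        List<String> found = new ArrayList<>();
        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                found.add(new String(ch, start, length).trim());
            }
        };
        try {
            // Assumption: the JDK parser stands in for any org.xml.sax.XMLReader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.setProperty(LEXICAL_HANDLER, handler); // extended handler goes in as a property
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [note]
        System.out.println(comments("<a><!-- note --></a>"));
    }
}
```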

        Specified by:
        setProperty in interface org.xml.sax.XMLReader
        Parameters:
        name - The property name, which is a fully-qualified URI.
        value - The requested value for the property.
        Throws:
        org.xml.sax.SAXNotRecognizedException - If the property value can't be assigned or retrieved.
        org.xml.sax.SAXNotSupportedException - When the XMLReader recognizes the property name but cannot set the requested value.
      • setEntityResolver

        public void setEntityResolver​(org.xml.sax.EntityResolver resolver)
        Allow an application to register an entity resolver.

        If the application does not register an entity resolver, the XMLReader will perform its own default resolution.

        Applications may register a new or different resolver in the middle of a parse, and the SAX parser must begin using the new resolver immediately.

        Specified by:
        setEntityResolver in interface org.xml.sax.XMLReader
        Parameters:
        resolver - The entity resolver.
        See Also:
        getEntityResolver()
      • getEntityResolver

        public org.xml.sax.EntityResolver getEntityResolver()
        Return the current entity resolver.
        Specified by:
        getEntityResolver in interface org.xml.sax.XMLReader
        Returns:
        The current entity resolver, or null if none has been registered.
        See Also:
        setEntityResolver(org.xml.sax.EntityResolver)
      • setDTDHandler

        public void setDTDHandler​(org.xml.sax.DTDHandler handler)
        Allow an application to register a DTD event handler.

        If the application does not register a DTD handler, all DTD events reported by the SAX parser will be silently ignored.

        Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.

        Specified by:
        setDTDHandler in interface org.xml.sax.XMLReader
        Parameters:
        handler - The DTD handler.
        See Also:
        getDTDHandler()
      • getDTDHandler

        public org.xml.sax.DTDHandler getDTDHandler()
        Return the current DTD handler.
        Specified by:
        getDTDHandler in interface org.xml.sax.XMLReader
        Returns:
        The current DTD handler, or null if none has been registered.
        See Also:
        setDTDHandler(org.xml.sax.DTDHandler)
      • setContentHandler

        public void setContentHandler​(org.xml.sax.ContentHandler handler)
        Allow an application to register a content event handler.

        If the application does not register a content handler, all content events reported by the SAX parser will be silently ignored.

        Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.
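
        The mid-parse replacement described above can be sketched as follows. The first handler hands over to a second one as soon as it sees the root element, and later events go to the second handler. HandlerSwapSketch and trace are hypothetical names, and the JDK's built-in parser is assumed as a stand-in reader.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class HandlerSwapSketch {

    /** Returns a log showing which handler received each startElement event. */
    static String trace(String markup) {
        StringBuilder log = new StringBuilder();
        try {
            // Assumption: the JDK parser stands in for any org.xml.sax.XMLReader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

            DefaultHandler second = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    log.append("second:").append(qName).append(' ');
                }
            };
            DefaultHandler first = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    log.append("first:").append(qName).append(' ');
                    reader.setContentHandler(second); // takes effect for the next event
                }
            };

            reader.setContentHandler(first);
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return log.toString().trim();
    }

    public static void main(String[] args) {
        // prints first:a second:b second:c
        System.out.println(trace("<a><b/><c/></a>"));
    }
}
```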

        Specified by:
        setContentHandler in interface org.xml.sax.XMLReader
        Parameters:
        handler - The content handler.
        See Also:
        getContentHandler()
      • getContentHandler

        public org.xml.sax.ContentHandler getContentHandler()
        Return the current content handler.
        Specified by:
        getContentHandler in interface org.xml.sax.XMLReader
        Returns:
        The current content handler, or null if none has been registered.
        See Also:
        setContentHandler(org.xml.sax.ContentHandler)
      • setErrorHandler

        public void setErrorHandler​(org.xml.sax.ErrorHandler handler)
        Allow an application to register an error event handler.

        If the application does not register an error handler, all error events reported by the SAX parser will be silently ignored; however, normal processing may not continue. It is highly recommended that all SAX applications implement an error handler to avoid unexpected bugs.

        Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.
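
        A minimal sketch of an error handler in the spirit of that recommendation follows. It records warnings and recoverable errors and rethrows fatal errors so that the parse stops. ErrorHandlerSketch, CollectingErrorHandler and parsesCleanly are hypothetical names, and the JDK's built-in parser is assumed as a stand-in reader.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

final class ErrorHandlerSketch {

    /** Hypothetical stand-in for MyErrorHandler: records problems, aborts on fatal ones. */
    static final class CollectingErrorHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            messages.add("fatal: " + e.getMessage());
            throw e; // normal processing cannot continue after a fatal error
        }
    }

    /** Returns true when the markup parsed without a fatal error. */
    static boolean parsesCleanly(String markup, CollectingErrorHandler errors) {
        try {
            // Assumption: the JDK parser stands in for any org.xml.sax.XMLReader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(errors);
            reader.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXParseException e) {
            return false; // already recorded by the handler
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        CollectingErrorHandler errors = new CollectingErrorHandler();
        System.out.println(parsesCleanly("<a><b></a>", errors)); // prints false
        System.out.println(errors.messages.size());              // prints 1
    }
}
```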

        Specified by:
        setErrorHandler in interface org.xml.sax.XMLReader
        Parameters:
        handler - The error handler.
        See Also:
        getErrorHandler()
      • getErrorHandler

        public org.xml.sax.ErrorHandler getErrorHandler()
        Return the current error handler.
        Specified by:
        getErrorHandler in interface org.xml.sax.XMLReader
        Returns:
        The current error handler, or null if none has been registered.
        See Also:
        setErrorHandler(org.xml.sax.ErrorHandler)
      • parse

        public void parse​(org.xml.sax.InputSource input)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse an XML document.

        The application can use this method to instruct the XML reader to begin parsing an XML document from any valid input source (a character stream, a byte stream, or a URI).

        Applications may not invoke this method while a parse is in progress (they should create a new XMLReader instead for each nested XML document). Once a parse is complete, an application may reuse the same XMLReader object, possibly with a different input source. Configuration of the XMLReader object (such as handler bindings and values established for feature flags and properties) is unchanged by completion of a parse, unless the definition of that aspect of the configuration explicitly specifies other behavior. (For example, feature flags or properties exposing characteristics of the document being parsed.)

        During the parse, the XMLReader will provide information about the XML document through the registered event handlers.

        This method is synchronous: it will not return until parsing has ended. If a client application wants to terminate parsing early, it should throw an exception.
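
        Early termination by throwing an exception can be sketched like this. The handler throws a marker exception once it has seen enough elements, and the caller treats that particular failure as a normal stop. EarlyStopSketch, StopParsing and countUpTo are hypothetical names, and the JDK's built-in parser is assumed as a stand-in reader.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class EarlyStopSketch {

    /** Hypothetical marker exception used only to abandon the parse on purpose. */
    static final class StopParsing extends SAXException {
        StopParsing() {
            super("parse stopped by the application");
        }
    }

    /** Counts start tags, giving up as soon as more than 'limit' would be seen. */
    static int countUpTo(String markup, int limit) {
        int[] seen = {0};
        boolean[] stopped = {false};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts)
                    throws SAXException {
                if (seen[0] == limit) {
                    stopped[0] = true;
                    throw new StopParsing();
                }
                seen[0]++;
            }
        };
        try {
            // Assumption: the JDK parser stands in for any org.xml.sax.XMLReader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (SAXException e) {
            if (!stopped[0]) {
                throw new IllegalStateException("parse failed", e); // a genuine error
            }
            // otherwise: deliberate early exit, nothing to do
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // prints 2
        System.out.println(countUpTo("<a><b/><c/><d/></a>", 2));
    }
}
```

        The flag is checked instead of the exception type so that the sketch also works with a reader that wraps handler exceptions before rethrowing them.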

        Specified by:
        parse in interface org.xml.sax.XMLReader
        Parameters:
        input - The input source for the top-level of the XML document.
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
        See Also:
        InputSource, parse(java.lang.String), setEntityResolver(org.xml.sax.EntityResolver), setDTDHandler(org.xml.sax.DTDHandler), setContentHandler(org.xml.sax.ContentHandler), setErrorHandler(org.xml.sax.ErrorHandler)
      • parse

        public void parse​(java.lang.String systemId)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse an XML document from a system identifier (URI).

        This method is a shortcut for the common case of reading a document from a system identifier. It is the exact equivalent of the following:

         parse(new InputSource(systemId));
         

        If the system identifier is a URL, it must be fully resolved by the application before it is passed to the parser.
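
        One way an application might resolve a system identifier before calling parse is sketched below, using java.net.URI for relative references and java.nio.file for local paths. SystemIdSketch, resolve and fromPath are hypothetical helper names, and the base URL in the example is only illustrative.

```java
import java.net.URI;
import java.nio.file.Paths;

final class SystemIdSketch {

    /** Resolves a possibly relative reference against an absolute base URI. */
    static String resolve(String baseUri, String reference) {
        return URI.create(baseUri).resolve(reference).toString();
    }

    /** Turns a local file path into an absolute file: URI. */
    static String fromPath(String path) {
        return Paths.get(path).toAbsolutePath().normalize().toUri().toString();
    }

    public static void main(String[] args) {
        // prints http://www.foo.com/docs/mydoc.xml
        System.out.println(resolve("http://www.foo.com/docs/", "mydoc.xml"));
        // prints a file: URI that depends on the current working directory
        System.out.println(fromPath("mydoc.xml"));
    }
}
```

        The resulting string can then be passed to parse as the systemId argument.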

        Specified by:
        parse in interface org.xml.sax.XMLReader
        Parameters:
        systemId - The system identifier (URI).
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
        See Also:
        parse(org.xml.sax.InputSource)
      • doSAX

        protected void doSAX​(Node node)
                      throws ParserException,
                             org.xml.sax.SAXException
        Process nodes recursively on the ContentHandler. Calls methods on the handler based on the node type and whether it is an end tag. Processes composite tags recursively. Does rudimentary namespace processing according to the state of mNameSpaces and mNameSpacePrefixes.
        Parameters:
        node - The htmlparser node to traverse.
        Throws:
        ParserException - If a parse error occurs.
        org.xml.sax.SAXException - If a SAX error occurs.
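
        The general idea of walking a node tree and emitting SAX events can be sketched as follows. This is not the library's implementation: SimpleNode is a hypothetical tree type used in place of the htmlparser Node, namespace processing is omitted, and RecursiveSaxSketch, walk and trace are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

final class RecursiveSaxSketch {

    /** Hypothetical tree node: an element when name is non-null, otherwise a text node. */
    static final class SimpleNode {
        final String name;
        final String text;
        final List<SimpleNode> children = new ArrayList<>();

        private SimpleNode(String name, String text) {
            this.name = name;
            this.text = text;
        }

        static SimpleNode element(String name, SimpleNode... kids) {
            SimpleNode node = new SimpleNode(name, null);
            for (SimpleNode kid : kids) {
                node.children.add(kid);
            }
            return node;
        }

        static SimpleNode text(String text) {
            return new SimpleNode(null, text);
        }
    }

    /** Emits start/characters/end events for the node and, recursively, its children. */
    static void walk(SimpleNode node, ContentHandler handler) throws SAXException {
        if (node.name == null) {
            char[] chars = node.text.toCharArray();
            handler.characters(chars, 0, chars.length);
            return;
        }
        handler.startElement("", "", node.name, new AttributesImpl());
        for (SimpleNode child : node.children) {
            walk(child, handler); // composite nodes are processed recursively
        }
        handler.endElement("", "", node.name);
    }

    /** Renders the event stream as a string so that the call order is visible. */
    static String trace(SimpleNode root) {
        StringBuilder out = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                out.append('<').append(qName).append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        };
        try {
            walk(root, recorder);
        } catch (SAXException e) {
            throw new IllegalStateException("handler failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SimpleNode tree = SimpleNode.element("p", SimpleNode.text("hi"), SimpleNode.element("b"));
        System.out.println(trace(tree)); // prints <p>hi<b></b></p>
    }
}
```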