Package org.htmlparser.visitors
Class NodeVisitor
java.lang.Object
    org.htmlparser.visitors.NodeVisitor
Direct Known Subclasses:
HtmlPage, LinkFindingVisitor, ObjectFindingVisitor, StringBean, StringFindingVisitor, TagFindingVisitor, TextExtractingVisitor, UrlModifyingVisitor
public abstract class NodeVisitor extends java.lang.Object
The base class for the 'Visitor' pattern. Classes that wish to use visitAllNodesWith() subclass this class and override the methods for the node types they are interested in processing.
The operation of visitAllNodesWith() is to call beginParsing(), then visitXXX() according to the types of nodes encountered in depth-first order, and finally finishedParsing().
Typical code to print every tag and text node:
    import org.htmlparser.Parser;
    import org.htmlparser.Tag;
    import org.htmlparser.Text;
    import org.htmlparser.util.ParserException;
    import org.htmlparser.visitors.NodeVisitor;

    public class MyVisitor extends NodeVisitor
    {
        public MyVisitor ()
        {
        }

        // Print the name and start position of each tag.
        public void visitTag (Tag tag)
        {
            System.out.println ("\n" + tag.getTagName () + tag.getStartPosition ());
        }

        // Print each text node.
        public void visitStringNode (Text string)
        {
            System.out.println (string);
        }

        public static void main (String[] args) throws ParserException
        {
            Parser parser = new Parser ("http://cbc.ca");
            NodeVisitor visitor = new MyVisitor ();
            parser.visitAllNodesWith (visitor);
        }
    }
If you want to handle more than one tag type with the same visitor, you will need to check the tag type in the visitTag method. You can do that either by checking the tag name:
    public void visitTag (Tag tag)
    {
        if (tag.getTagName ().equals ("BODY"))
            ... do something with the BODY tag
        else if (tag.getTagName ().equals ("FRAME"))
            ... do something with the FRAME tag
    }
or by using instanceof, if all the tags you want to handle have a registered tag (i.e. they are generated by the NodeFactory):
    public void visitTag (Tag tag)
    {
        if (tag instanceof BodyTag)
        {
            BodyTag body = (BodyTag)tag;
            ... do something with body
        }
        else if (tag instanceof FrameTag)
        {
            FrameTag frame = (FrameTag)tag;
            ... do something with frame
        }
        else // other specific tags and generic TagNode objects
        {
        }
    }
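The call order described above (beginParsing(), then the visitXXX() callbacks in depth-first order, then finishedParsing()) can be illustrated with a small self-contained model. This is only a sketch: SketchNode, SketchVisitor, accept and demo are hypothetical stand-ins written for this illustration and are not part of org.htmlparser, and the placement of the end-tag callback after the children is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the visitor call order.
// None of these types belong to org.htmlparser.
final class VisitOrderSketch {

    /** Stand-in for a parsed node: a name plus child nodes (none for text). */
    static final class SketchNode {
        final String name;
        final boolean isText;
        final List<SketchNode> children = new ArrayList<>();

        SketchNode(String name, boolean isText, SketchNode... kids) {
            this.name = name;
            this.isText = isText;
            for (SketchNode kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Stand-in for NodeVisitor: every callback records itself in a log. */
    static class SketchVisitor {
        final List<String> log = new ArrayList<>();

        void beginParsing() { log.add("beginParsing"); }
        void visitTag(SketchNode tag) { log.add("visitTag " + tag.name); }
        void visitEndTag(SketchNode tag) { log.add("visitEndTag " + tag.name); }
        void visitStringNode(SketchNode text) { log.add("visitStringNode " + text.name); }
        void finishedParsing() { log.add("finishedParsing"); }
    }

    /** Depth-first walk: the tag first, then its children, then its end tag (assumed). */
    static void accept(SketchNode node, SketchVisitor visitor) {
        if (node.isText) {
            visitor.visitStringNode(node);
            return;
        }
        visitor.visitTag(node);
        for (SketchNode child : node.children) {
            accept(child, visitor);
        }
        visitor.visitEndTag(node);
    }

    /** Models visitAllNodesWith(): begin, walk each top level node, finish. */
    static List<String> visitAllNodesWith(List<SketchNode> roots, SketchVisitor visitor) {
        visitor.beginParsing();
        for (SketchNode root : roots) {
            accept(root, visitor);
        }
        visitor.finishedParsing();
        return visitor.log;
    }

    /** Walks a tiny BODY > P > "hello" tree and returns the recorded callbacks. */
    static List<String> demo() {
        SketchNode body = new SketchNode("BODY", false,
                new SketchNode("P", false, new SketchNode("hello", true)));
        return visitAllNodesWith(List.of(body), new SketchVisitor());
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this model the log starts with beginParsing and ends with finishedParsing, with the tag and text callbacks nested between them in document order. A real subclass would typically set up state in beginParsing() and report results in finishedParsing().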
-
Constructor Summary
Constructor and Description
NodeVisitor()
Creates a node visitor that visits the node itself and its children.
NodeVisitor(boolean recurseChildren)
Creates a node visitor that visits the node itself, and its children only if recurseChildren is true.
NodeVisitor(boolean recurseChildren, boolean recurseSelf)
Creates a node visitor that visits the node itself only if recurseSelf is true, and its children only if recurseChildren is true.
-
Method Summary
Modifier and Type, Method, and Description
void beginParsing()
Override this method if you wish to do special processing prior to the start of parsing.
void finishedParsing()
Override this method if you wish to do special processing upon completion of parsing.
boolean shouldRecurseChildren()
Depth traversal predicate.
boolean shouldRecurseSelf()
Self traversal predicate.
void visitEndTag(Tag tag)
Called for each Tag visited that is an end tag.
void visitRemarkNode(Remark remark)
Called for each RemarkNode visited.
void visitStringNode(Text string)
Called for each StringNode visited.
void visitTag(Tag tag)
Called for each Tag visited.
-
Constructor Detail
-
NodeVisitor
public NodeVisitor()
Creates a node visitor that visits the node itself and its children.
-
NodeVisitor
public NodeVisitor(boolean recurseChildren)
Creates a node visitor that visits the node itself, and its children only if recurseChildren is true.
Parameters:
recurseChildren - If true, the visitor will visit children; otherwise only the top level nodes are visited.
-
NodeVisitor
public NodeVisitor(boolean recurseChildren, boolean recurseSelf)
Creates a node visitor that visits the node itself only if recurseSelf is true, and its children only if recurseChildren is true.
Parameters:
recurseChildren - If true, the visitor will visit children; otherwise only the top level nodes are visited.
recurseSelf - If true, the visitor will visit the top level node.
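To make the two flags concrete, here is a minimal self-contained model of how a traversal might consult them. It is a sketch under stated assumptions, not the library's code: SketchNode, SketchVisitor, accept and visit are hypothetical names, and the model assumes both predicates are checked at every node, which this page does not confirm for leaf nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the recurseChildren / recurseSelf flags.
// None of these types belong to org.htmlparser.
final class RecurseFlagsSketch {

    /** Stand-in for a tag node with optional children. */
    static final class SketchNode {
        final String name;
        final List<SketchNode> children = new ArrayList<>();

        SketchNode(String name, SketchNode... kids) {
            this.name = name;
            for (SketchNode kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Stand-in for NodeVisitor that stores the two flags and records visits. */
    static final class SketchVisitor {
        private final boolean recurseChildren;
        private final boolean recurseSelf;
        final List<String> visited = new ArrayList<>();

        SketchVisitor(boolean recurseChildren, boolean recurseSelf) {
            this.recurseChildren = recurseChildren;
            this.recurseSelf = recurseSelf;
        }

        boolean shouldRecurseChildren() { return recurseChildren; }
        boolean shouldRecurseSelf() { return recurseSelf; }
        void visitTag(SketchNode tag) { visited.add(tag.name); }
    }

    /** Assumed traversal: visit the node if allowed, then descend if allowed. */
    static void accept(SketchNode node, SketchVisitor visitor) {
        if (visitor.shouldRecurseSelf()) {
            visitor.visitTag(node);
        }
        if (visitor.shouldRecurseChildren()) {
            for (SketchNode child : node.children) {
                accept(child, visitor);
            }
        }
    }

    /** Walks HTML > (HEAD, BODY > P) with the given flags and returns visited names. */
    static List<String> visit(boolean recurseChildren, boolean recurseSelf) {
        SketchNode root = new SketchNode("HTML",
                new SketchNode("HEAD"),
                new SketchNode("BODY", new SketchNode("P")));
        SketchVisitor visitor = new SketchVisitor(recurseChildren, recurseSelf);
        accept(root, visitor);
        return visitor.visited;
    }

    public static void main(String[] args) {
        System.out.println(visit(true, true));
        System.out.println(visit(false, true));
    }
}
```

In this model, visit(true, true) reaches every node in depth-first order, while visit(false, true) stops at the top level node. How the real library treats leaf tags and text nodes when recurseSelf is false is not stated on this page, so check the accept() implementations before relying on that case.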
-
Method Detail
-
beginParsing
public void beginParsing()
Override this method if you wish to do special processing prior to the start of parsing.
-
visitTag
public void visitTag(Tag tag)
Called for each Tag visited.
Parameters:
tag - The tag being visited.
-
visitEndTag
public void visitEndTag(Tag tag)
Called for each Tag visited that is an end tag.
Parameters:
tag - The end tag being visited.
-
visitStringNode
public void visitStringNode(Text string)
Called for each StringNode visited.
Parameters:
string - The string node being visited.
-
visitRemarkNode
public void visitRemarkNode(Remark remark)
Called for each RemarkNode visited.
Parameters:
remark - The remark node being visited.
-
finishedParsing
public void finishedParsing()
Override this method if you wish to do special processing upon completion of parsing.
-
shouldRecurseChildren
public boolean shouldRecurseChildren()
Depth traversal predicate.
Returns:
true if children are to be visited.
-
shouldRecurseSelf
public boolean shouldRecurseSelf()
Self traversal predicate.
Returns:
true if a node itself is to be visited.