Class StringExtractor
- java.lang.Object
  - org.htmlparser.parserapplications.StringExtractor

public class StringExtractor extends java.lang.Object

Extract plaintext strings from a web page. Illustrative program to gather the textual contents of a web page. Uses a StringBean to accumulate the user-visible text (what a browser would display) into a single string.

Constructor Summary
- StringExtractor(java.lang.String resource): Construct a StringExtractor to read from the given resource.

Method Summary
- java.lang.String extractStrings(boolean links): Extract the text from a page.
- static void main(java.lang.String[] args): Mainline.

Method Detail
extractStrings
public java.lang.String extractStrings(boolean links) throws ParserException

Extract the text from a page.
- Parameters:
  links - if true, include hyperlinks in output.
- Returns:
  The textual contents of the page.
- Throws:
  ParserException - If a parse error occurs.
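
The constructor and extractStrings can be combined roughly as in the following minimal sketch. It assumes the HTML Parser library is on the classpath and that ParserException lives in the org.htmlparser.util package (the page above gives only the simple class name). The class name StringExtractorDemo and the helper pageText are hypothetical names introduced for illustration.

```java
import org.htmlparser.parserapplications.StringExtractor;
import org.htmlparser.util.ParserException; // assumed package for ParserException

// Hypothetical demo class; not part of the library.
public class StringExtractorDemo {

    // Hypothetical helper: builds an extractor for the resource and
    // returns the text reported by extractStrings.
    static String pageText(String resource, boolean links) throws ParserException {
        StringExtractor extractor = new StringExtractor(resource);
        return extractor.extractStrings(links);
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: StringExtractorDemo <resource>");
            return;
        }
        try {
            // Pass false to leave hyperlinks out of the output.
            System.out.println(pageText(args[0], false));
        } catch (ParserException e) {
            // extractStrings is documented to throw this on a parse error.
            System.err.println("Parse failed: " + e.getMessage());
        }
    }
}
```

Passing true instead of false asks for hyperlinks to be included in the returned text, as described for the links parameter.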
main
public static void main(java.lang.String[] args)

Mainline.
- Parameters:
  args - The command line arguments.