Class Attribute
- java.lang.Object
-
- org.htmlparser.Attribute
-
- All Implemented Interfaces:
java.io.Serializable
- Direct Known Subclasses:
PageAttribute
public class Attribute extends java.lang.Object implements java.io.Serializable
An attribute within a tag. Holds the name, assignment string, value and quote character.
This class was made deliberately simple. Except for RawValue, the properties are completely orthogonal, that is: each property is independent of the others. This means you have enough rope here to hang yourself, and it's very easy to create malformed HTML. Where it's obvious, warnings and notes have been provided in the setters' javadocs, but it is up to you -- the programmer -- to ensure that the contents of the four fields will yield valid HTML (if that's what you want).
Be especially mindful of quotes and assignment strings. These are handled by the constructors where it's obvious, but in general, you need to set them explicitly when building an attribute. For example, to construct the attribute
    label="A multi word value."
you could use:
    Attribute attribute = new Attribute ();
    attribute.setName ("label");
    attribute.setAssignment ("=");
    attribute.setValue ("A multi word value.");
    attribute.setQuote ('"');
or
    Attribute attribute = new Attribute ();
    attribute.setName ("label");
    attribute.setAssignment ("=");
    attribute.setRawValue ("A multi word value.");
or
    Attribute attribute = new Attribute ("label", "A multi word value.");
Note that the assignment value and quoting need to be set separately when building the attribute from scratch using the properties.
Valid States for Attributes.
Description                    toString()    Name    Assignment  Value    Quote
whitespace attribute           value         null    null        "value"  0
standalone attribute           name          "name"  null        null     0
empty attribute                name=         "name"  "="         null     0
empty single quoted attribute  name=''       "name"  "="         null     '
empty double quoted attribute  name=""       "name"  "="         null     "
naked attribute                name=value    "name"  "="         "value"  0
single quoted attribute        name='value'  "name"  "="         "value"  '
double quoted attribute        name="value"  "name"  "="         "value"  "
In words:
If Name is null, and Assignment is null, and Quote is zero, it's whitespace and Value has the whitespace text -- value
If Name is not null, and both Assignment and Value are null it's a standalone attribute -- name
If Name is not null, and Assignment is an equals sign, and Quote is zero it's an empty attribute -- name=
If Name is not null, and Assignment is an equals sign, and Value is "" or null, and Quote is ' it's an empty single quoted attribute -- name=''
If Name is not null, and Assignment is an equals sign, and Value is "" or null, and Quote is " it's an empty double quoted attribute -- name=""
If Name is not null, and Assignment is an equals sign, and Value is something, and Quote is zero it's a naked attribute -- name=value
If Name is not null, and Assignment is an equals sign, and Value is something, and Quote is ' it's a single quoted attribute -- name='value'
If Name is not null, and Assignment is an equals sign, and Value is something, and Quote is " it's a double quoted attribute -- name="value"
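The eight rules above can be sketched as a small classifier over the four fields. This is only an illustration: the class AttributeStates and the method describe are hypothetical names that are not part of org.htmlparser, and treating an empty string like null when Quote is zero is an assumption made here for symmetry with the quoted cases.

```java
// Sketch only: AttributeStates and describe() are hypothetical names,
// not part of org.htmlparser.
final class AttributeStates {
    static final char NO_QUOTE = '\0';

    private AttributeStates() {}

    /** Classify the four fields using the eight rules listed above. */
    static String describe(String name, String assignment, String value, char quote) {
        if (name == null) {
            // Whitespace: no name, no assignment, no quote.
            return (assignment == null && quote == NO_QUOTE) ? "whitespace" : "invalid";
        }
        if (assignment == null) {
            // Standalone: a name with neither assignment nor value.
            return (value == null && quote == NO_QUOTE) ? "standalone" : "invalid";
        }
        if (!"=".equals(assignment)) {
            return "invalid";
        }
        // Assumption: "" is treated like null for every quote setting.
        boolean hasValue = value != null && !value.isEmpty();
        switch (quote) {
            case NO_QUOTE:
                return hasValue ? "naked" : "empty";
            case '\'':
                return hasValue ? "single quoted" : "empty single quoted";
            case '"':
                return hasValue ? "double quoted" : "empty double quoted";
            default:
                return "invalid";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("label", "=", "A multi word value.", '"'));
        System.out.println(describe("name", null, null, NO_QUOTE));
    }
}
```

A state such as a null Name combined with an equals sign falls through to "invalid", which matches the closing remark of the list.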
All other states are invalid HTML.
From the HTML 4.01 Specification, W3C Recommendation 24 December 1999 http://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.2:
3.2.2 Attributes
Elements may have associated properties, called attributes, which may have values (by default, or set by authors or scripts). Attribute/value pairs appear before the final ">" of an element's start tag. Any number of (legal) attribute value pairs, separated by spaces, may appear in an element's start tag. They may appear in any order.
In this example, the id attribute is set for an H1 element:
    <H1 id="section1">
    This is an identified heading thanks to the id attribute
    </H1>
In certain cases, authors may specify the value of an attribute without any quotation marks. The attribute value may only contain letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), periods (ASCII decimal 46), underscores (ASCII decimal 95), and colons (ASCII decimal 58). We recommend using quotation marks even when it is possible to eliminate them.
Attribute names are always case-insensitive.
Attribute values are generally case-insensitive. The definition of each attribute in the reference manual indicates whether its value is case-insensitive.
All the attributes defined by this specification are listed in the attribute index.
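The character rule quoted above for unquoted values can be checked mechanically. The following is a minimal sketch, not code from the library: UnquotedValues and canOmitQuotes are hypothetical names, and rejecting null and the empty string is an assumption, since the quoted passage does not address them.

```java
// Sketch only: UnquotedValues and canOmitQuotes() are hypothetical names.
final class UnquotedValues {
    private UnquotedValues() {}

    /**
     * Returns true if every character is an ASCII letter, a digit,
     * a hyphen, a period, an underscore or a colon.
     */
    static boolean canOmitQuotes(String value) {
        // Assumption: null and "" are rejected; the quoted rule is silent on them.
        if (value == null || value.isEmpty()) {
            return false;
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean allowed = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == ':';
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canOmitQuotes("section1"));
        System.out.println(canOmitQuotes("A multi word value."));
    }
}
```

A value such as "A multi word value." fails the check because of its spaces, which is why the label example in the class description needs a quote character.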
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected java.lang.String mAssignment
    The assignment string of the attribute.
protected java.lang.String mName
    The name of this attribute.
protected char mQuote
    The quote, if any, surrounding the value of the attribute.
protected java.lang.String mValue
    The value of the attribute.
-
Constructor Summary
Constructors (Constructor, Description)
Attribute()
    Create an empty attribute.
Attribute(java.lang.String value)
    Create a whitespace attribute with the value given.
Attribute(java.lang.String name, java.lang.String value)
    Create an attribute with the name and value given.
Attribute(java.lang.String name, java.lang.String value, char quote)
    Create an attribute with the name, value and quote given.
Attribute(java.lang.String name, java.lang.String assignment, java.lang.String value)
    Create an attribute with the name, assignment string and value given.
Attribute(java.lang.String name, java.lang.String assignment, java.lang.String value, char quote)
    Create an attribute with the name, assignment, value and quote given.
-
Method Summary
Methods (Modifier and Type, Method, Description)
java.lang.String getAssignment()
    Get the assignment string of this attribute.
void getAssignment(java.lang.StringBuffer buffer)
    Get the assignment string of this attribute.
int getLength()
    Get the length of the string value of this attribute.
java.lang.String getName()
    Get the name of this attribute.
void getName(java.lang.StringBuffer buffer)
    Get the name of this attribute.
char getQuote()
    Get the quote, if any, surrounding the value of the attribute.
void getQuote(java.lang.StringBuffer buffer)
    Get the quote, if any, surrounding the value of the attribute.
java.lang.String getRawValue()
    Get the raw value of the attribute.
void getRawValue(java.lang.StringBuffer buffer)
    Get the raw value of the attribute.
java.lang.String getValue()
    Get the value of the attribute.
void getValue(java.lang.StringBuffer buffer)
    Get the value of the attribute.
boolean isEmpty()
    Predicate to determine if this attribute has an equals sign but no value.
boolean isStandAlone()
    Predicate to determine if this attribute has no equals sign (or value).
boolean isValued()
    Predicate to determine if this attribute has a value.
boolean isWhitespace()
    Predicate to determine if this attribute is whitespace.
void setAssignment(java.lang.String assignment)
    Set the assignment string of this attribute.
void setName(java.lang.String name)
    Set the name of this attribute.
void setQuote(char quote)
    Set the quote surrounding the value of the attribute.
void setRawValue(java.lang.String value)
    Set the value of the attribute and the quote character.
void setValue(java.lang.String value)
    Set the value of the attribute.
java.lang.String toString()
    Get a text representation of this attribute.
void toString(java.lang.StringBuffer buffer)
    Get a text representation of this attribute.
-
-
-
Field Detail
-
mName
protected java.lang.String mName
The name of this attribute. The part before the equals sign, or the stand-alone attribute. This will be null if the attribute is whitespace.
-
mAssignment
protected java.lang.String mAssignment
The assignment string of the attribute. The equals sign. This will be null if the attribute is a stand-alone attribute.
-
mValue
protected java.lang.String mValue
The value of the attribute. The part after the equals sign. This will be null if the attribute is an empty or stand-alone attribute.
-
mQuote
protected char mQuote
The quote, if any, surrounding the value of the attribute. This will be zero if there are no quotes around the value.
-
-
Constructor Detail
-
Attribute
public Attribute(java.lang.String name, java.lang.String assignment, java.lang.String value, char quote)
Create an attribute with the name, assignment, value and quote given. If the quote value is zero, assigns the value using setRawValue(java.lang.String), which sets the quote character to a proper value if necessary.
- Parameters:
name - The name of this attribute.
assignment - The assignment string of this attribute.
value - The value of this attribute.
quote - The quote around the value of this attribute.
-
Attribute
public Attribute(java.lang.String name, java.lang.String value, char quote)
Create an attribute with the name, value and quote given. Uses an equals sign as the assignment string if the value is not null, and calls setRawValue(java.lang.String) to get the correct quoting if quote is zero.
- Parameters:
name - The name of this attribute.
value - The value of this attribute.
quote - The quote around the value of this attribute.
-
Attribute
public Attribute(java.lang.String value) throws java.lang.IllegalArgumentException
Create a whitespace attribute with the value given.
- Parameters:
value - The value of this attribute.
- Throws:
java.lang.IllegalArgumentException - if the value contains anything other than whitespace. To set a real value use Attribute(String,String).
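The whitespace-only contract of this constructor can be illustrated with a stand-alone check. This is a sketch under stated assumptions rather than the library's code: WhitespaceCheck and requireWhitespace are hypothetical names, Character.isWhitespace is assumed as the definition of whitespace, and passing null through unchanged is also an assumption.

```java
// Sketch only: WhitespaceCheck and requireWhitespace() are hypothetical names.
final class WhitespaceCheck {
    private WhitespaceCheck() {}

    /**
     * Returns the value unchanged if it holds only whitespace,
     * otherwise throws IllegalArgumentException.
     */
    static String requireWhitespace(String value) {
        // Assumption: null is passed through; the documentation does not say.
        if (value != null) {
            for (int i = 0; i < value.length(); i++) {
                if (!Character.isWhitespace(value.charAt(i))) {
                    throw new IllegalArgumentException("non whitespace value");
                }
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Accepted: spaces, tabs and newlines only.
        System.out.println(requireWhitespace(" \t\n").length());
    }
}
```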
-
Attribute
public Attribute(java.lang.String name, java.lang.String value)
Create an attribute with the name and value given. Uses an equals sign as the assignment string if the value is not null, and calls setRawValue(java.lang.String) to get the correct quoting.
- Parameters:
name - The name of this attribute.
value - The value of this attribute.
-
Attribute
public Attribute(java.lang.String name, java.lang.String assignment, java.lang.String value)
Create an attribute with the name, assignment string and value given. Calls setRawValue(java.lang.String) to get the correct quoting.
- Parameters:
name - The name of this attribute.
assignment - The assignment string of this attribute.
value - The value of this attribute.
-
Attribute
public Attribute()
Create an empty attribute. This will provide "" from the toString() and toString(StringBuffer) methods.
-
-
Method Detail
-
getName
public java.lang.String getName()
Get the name of this attribute. The part before the equals sign, or the contents of the stand-alone attribute.
- Returns:
- The name, or null if it's just a whitespace 'attribute'.
- See Also:
setName(java.lang.String)
-
getName
public void getName(java.lang.StringBuffer buffer)
Get the name of this attribute.
- Parameters:
buffer - The buffer to place the name in.
- See Also:
getName(), setName(java.lang.String)
-
setName
public void setName(java.lang.String name)
Set the name of this attribute. Set the part before the equals sign, or the contents of the stand-alone attribute. WARNING: Setting this to null can result in malformed HTML if the assignment string is not null.
- Parameters:
name - The new name.
- See Also:
getName(), getName(StringBuffer)
-
getAssignment
public java.lang.String getAssignment()
Get the assignment string of this attribute. This is usually just an equals sign, but in poorly formed attributes it can include whitespace on either or both sides of an equals sign.
- Returns:
- The assignment string.
- See Also:
setAssignment(java.lang.String)
-
getAssignment
public void getAssignment(java.lang.StringBuffer buffer)
Get the assignment string of this attribute.
- Parameters:
buffer - The buffer to place the assignment string in.
- See Also:
getAssignment(), setAssignment(java.lang.String)
-
setAssignment
public void setAssignment(java.lang.String assignment)
Set the assignment string of this attribute. WARNING: Setting this property to anything other than an equals sign or null will result in malformed HTML. In the case of a null, the value should also be set to null.
- Parameters:
assignment - The new assignment string.
- See Also:
getAssignment(), getAssignment(StringBuffer)
-
getValue
public java.lang.String getValue()
Get the value of the attribute. The part after the equals sign, or the text if it's just a whitespace 'attribute'. NOTE: This does not include any quotes that may have enclosed the value when it was read. To get the un-stripped value use getRawValue().
- Returns:
- The value, or null if it's a stand-alone or empty attribute, or the text if it's just a whitespace 'attribute'.
- See Also:
setValue(java.lang.String)
-
getValue
public void getValue(java.lang.StringBuffer buffer)
Get the value of the attribute.
- Parameters:
buffer - The buffer to place the value in.
- See Also:
getValue(), setValue(java.lang.String)
-
setValue
public void setValue(java.lang.String value)
Set the value of the attribute. The part after the equals sign, or the text if it's a whitespace 'attribute'. WARNING: Setting this property to a value that needs to be quoted without also setting the quote character will result in malformed HTML.
- Parameters:
value - The new value.
- See Also:
getValue(), getValue(StringBuffer)
-
getQuote
public char getQuote()
Get the quote, if any, surrounding the value of the attribute.
- Returns:
- Either ' or " if the attribute value was quoted, or zero if there are no quotes around it.
- See Also:
setQuote(char)
-
getQuote
public void getQuote(java.lang.StringBuffer buffer)
Get the quote, if any, surrounding the value of the attribute.
- Parameters:
buffer - The buffer to place the quote in.
- See Also:
getQuote(), setQuote(char)
-
setQuote
public void setQuote(char quote)
Set the quote surrounding the value of the attribute. WARNING: Setting this property to zero will result in malformed HTML if the value needs to be quoted (i.e. contains whitespace).
- Parameters:
quote - The new quote value.
- See Also:
getQuote(), getQuote(StringBuffer)
-
getRawValue
public java.lang.String getRawValue()
Get the raw value of the attribute. The part after the equals sign, or the text if it's just a whitespace 'attribute'. This includes the quotes around the value, if any.
- Returns:
- The value, or null if it's a stand-alone attribute, or the text if it's just a whitespace 'attribute'.
- See Also:
setRawValue(java.lang.String)
-
getRawValue
public void getRawValue(java.lang.StringBuffer buffer)
Get the raw value of the attribute. The part after the equals sign, or the text if it's just a whitespace 'attribute'. This includes the quotes around the value, if any.
- Parameters:
buffer - The string buffer to append the attribute value to.
- See Also:
getRawValue(), setRawValue(java.lang.String)
-
setRawValue
public void setRawValue(java.lang.String value)
Set the value of the attribute and the quote character. If the value is pure whitespace, assign it 'as is' and reset the quote character. If not, check for leading and trailing double or single quotes, and if found use this as the quote character and the inner contents of value as the real value. Otherwise, examine the string to determine whether quotes are needed and, if so, which quote character is appropriate. This may involve changing double quotes within the string to character references.
- Parameters:
value - The new value.
- See Also:
getRawValue(), getRawValue(StringBuffer)
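The steps described for setRawValue can be sketched as a stand-alone function that returns the resulting value and quote character. This is an illustration of the documented behaviour, not the library's implementation: RawValueSketch, parse and isNameChar are hypothetical names, and three details are assumptions made here: a value needs quotes when it contains any character outside letters, digits, hyphen, period, underscore and colon; double quotes are preferred over single quotes; and &quot; is the character reference used when both kinds of quote appear.

```java
// Sketch only: RawValueSketch and its members are hypothetical names,
// not part of org.htmlparser.
final class RawValueSketch {
    final String value;
    final char quote; // '\0' means no quote

    private RawValueSketch(String value, char quote) {
        this.value = value;
        this.quote = quote;
    }

    // Assumption: these are the characters that never force quoting.
    private static boolean isNameChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == ':';
    }

    static RawValueSketch parse(String raw) {
        // Null or pure whitespace: keep as is and reset the quote.
        if (raw == null || raw.trim().isEmpty()) {
            return new RawValueSketch(raw, '\0');
        }
        int last = raw.length() - 1;
        char first = raw.charAt(0);
        // Matching leading and trailing quotes: strip them, remember the quote.
        if (last >= 1 && (first == '"' || first == '\'') && raw.charAt(last) == first) {
            return new RawValueSketch(raw.substring(1, last), first);
        }
        // Otherwise scan to decide whether quoting is needed and which quote is free.
        boolean needed = false;
        boolean hasSingle = false;
        boolean hasDouble = false;
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\'') {
                hasSingle = true;
            } else if (c == '"') {
                hasDouble = true;
            } else if (!isNameChar(c)) {
                needed = true;
            }
        }
        if (!needed && !hasSingle && !hasDouble) {
            return new RawValueSketch(raw, '\0');
        }
        if (!hasDouble) {
            return new RawValueSketch(raw, '"');
        }
        if (!hasSingle) {
            return new RawValueSketch(raw, '\'');
        }
        // Both kinds present: convert double quotes to a character reference.
        return new RawValueSketch(raw.replace("\"", "&quot;"), '"');
    }

    public static void main(String[] args) {
        RawValueSketch r = parse("A multi word value.");
        System.out.println(r.quote + r.value + r.quote);
    }
}
```

Under these assumptions the unquoted input "A multi word value." ends up with a double quote as its quote character, which is consistent with the second construction example in the class description.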
-
isWhitespace
public boolean isWhitespace()
Predicate to determine if this attribute is whitespace.
- Returns:
- true if this attribute is whitespace, false if it is a real attribute.
-
isStandAlone
public boolean isStandAlone()
Predicate to determine if this attribute has no equals sign (or value).
- Returns:
- true if this attribute is a standalone attribute, false if it has an equals sign.
-
isEmpty
public boolean isEmpty()
Predicate to determine if this attribute has an equals sign but no value.
- Returns:
- true if this attribute is an empty attribute, false if it has an equals sign and a value.
-
isValued
public boolean isValued()
Predicate to determine if this attribute has a value.
- Returns:
- true if this attribute has a value, false if it is empty or standalone.
-
getLength
public int getLength()
Get the length of the string value of this attribute.
- Returns:
- The number of characters required to express this attribute.
-
toString
public java.lang.String toString()
Get a text representation of this attribute. Suitable for insertion into a tag, the output is one of the forms:
    value
    name
    name=
    name=value
    name='value'
    name="value"
- Overrides:
toString in class java.lang.Object
- Returns:
- A string that can be used within a tag.
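The output forms listed for toString() follow from concatenating the four fields in order. The sketch below shows one way that could look; it is not the library's implementation. AttributeText and render are hypothetical names, and emitting both quote characters even when the value is null (to produce name='' and name="") is an assumption taken from the valid-states table in the class description.

```java
// Sketch only: AttributeText and render() are hypothetical names,
// not part of org.htmlparser.
final class AttributeText {
    private AttributeText() {}

    /** Concatenate name, assignment and (optionally quoted) value, skipping nulls. */
    static String render(String name, String assignment, String value, char quote) {
        StringBuilder text = new StringBuilder();
        if (name != null) {
            text.append(name);
        }
        if (assignment != null) {
            text.append(assignment);
        }
        // Assumption: a non-zero quote is written on both sides even if value is null.
        if (quote != '\0') {
            text.append(quote);
        }
        if (value != null) {
            text.append(value);
        }
        if (quote != '\0') {
            text.append(quote);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("label", "=", "A multi word value.", '"'));
    }
}
```

With all four fields unset, the sketch yields the empty string, in line with the note on the no-argument constructor.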
-
toString
public void toString(java.lang.StringBuffer buffer)
Get a text representation of this attribute.
- Parameters:
buffer - The accumulator for placing the text into.
- See Also:
toString()
-
-