com.sun.org.apache.xerces.internal.util

public class XMLChar

extends Object
Imports
java.util.Arrays

This class defines the basic XML character properties. The data in this class can be used to verify that a character is a valid XML character or if the character is a space, name start, or name character.

A series of convenience methods is supplied to ease the burden on the developer. Because inlining the checks can improve per-character performance, the tables of character properties are public. Using the character as an index into the CHARS array and applying the appropriate mask flag (e.g. MASK_VALID) yields the same result as calling the convenience methods. There is one exception: see the comments for the isValid method for details.
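The table lookup described above can be sketched as follows. This is only an illustration of the technique, not the class's actual data: the class name `FlagTableSketch`, the table `FLAGS`, and the mask values are hypothetical, and only the ASCII range is filled in.

```java
final class FlagTableSketch {
    // Hypothetical mask values chosen for this sketch; the real MASK_* constants may differ.
    static final int MASK_VALID = 0x01;
    static final int MASK_SPACE = 0x02;

    // One flag byte per UTF-16 code unit; only U+0000..U+007F is filled in here.
    static final byte[] FLAGS = new byte[1 << 16];

    static {
        FLAGS[0x09] = MASK_VALID | MASK_SPACE; // tab
        FLAGS[0x0A] = MASK_VALID | MASK_SPACE; // line feed
        FLAGS[0x0D] = MASK_VALID | MASK_SPACE; // carriage return
        for (int c = 0x20; c < 0x80; c++) {
            FLAGS[c] = MASK_VALID;
        }
        FLAGS[0x20] |= MASK_SPACE; // space
    }

    // Convenience-method form of the check, with a bounds guard.
    static boolean isSpace(int c) {
        return c >= 0 && c < FLAGS.length && (FLAGS[c] & MASK_SPACE) != 0;
    }

    public static void main(String[] args) {
        char ch = ' ';
        // Inlined form: index the table with the character and apply the mask.
        boolean inlined = (FLAGS[ch] & MASK_SPACE) != 0;
        System.out.println(inlined == isSpace(ch));
    }
}
```

The inlined form skips the method call and the bounds guard, which is the trade-off the description refers to.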

Authors
Glenn Marcy, IBM, Andy Clark, IBM, Eric Ye, IBM, Arnaud Le Hors, IBM, Michael Glavassevich, IBM, Rahul Srivastava, Sun Microsystems Inc.

Field Summary

Modifier and Type
Field and Description
private static final byte[]
CHARS

Character flags.

public static final int
MASK_CONTENT

Content character mask.

public static final int
MASK_NAME

Name character mask.

public static final int
MASK_NAME_START

Name start character mask.

public static final int
MASK_NCNAME

NCName character mask.

public static final int
MASK_NCNAME_START

NCName start character mask.

public static final int
MASK_PUBID

Pubid character mask.

public static final int
MASK_SPACE

Space character mask.

public static final int
MASK_VALID

Valid character mask.

Constructor Summary

Access
Constructor and Description
public
XMLChar()

Method Summary

Modifier and Type
Method and Description
public static char
highSurrogate(int c)

Returns the high surrogate of a supplemental character.

public static boolean
isContent(int c)

Returns true if the specified character can be considered content.

public static boolean
isHighSurrogate(int c)

Returns whether the given character is a high surrogate.

public static boolean
isInvalid(int c)

Returns true if the specified character is invalid.

public static boolean
isLowSurrogate(int c)

Returns whether the given character is a low surrogate.

public static boolean
isMarkup(int c)

Returns true if the specified character can be considered markup.

public static boolean
isName(int c)

Returns true if the specified character is a valid name character as defined by production [4] in the XML 1.0 specification.

public static boolean
isNameStart(int c)

Returns true if the specified character is a valid name start character as defined by production [5] in the XML 1.0 specification.

public static boolean
isNCName(int c)

Returns true if the specified character is a valid NCName character as defined by production [5] in the Namespaces in XML recommendation.

public static boolean
isNCNameStart(int c)

Returns true if the specified character is a valid NCName start character as defined by production [4] in the Namespaces in XML recommendation.

public static boolean
isPubid(int c)

Returns true if the specified character is a valid Pubid character as defined by production [13] in the XML 1.0 specification.

public static boolean
isSpace(int c)

Returns true if the specified character is a space character as defined by production [3] in the XML 1.0 specification.

public static boolean
isSupplemental(int c)

Returns true if the specified character is a supplemental character.

public static boolean
isValid(int c)

Returns true if the specified character is valid.

public static boolean
isValidIANAEncoding(String ianaEncoding)

Returns true if the encoding name is a valid IANA encoding.

public static boolean
isValidJavaEncoding(String javaEncoding)

Returns true if the encoding name is a valid Java encoding.

public static boolean

isValidName(String name)

Checks whether a string is a valid Name according to production [5] in the XML 1.0 Recommendation.

public static boolean

isValidNCName(String ncName)

Checks whether a string is a valid NCName according to production [4] in the XML Namespaces 1.0 Recommendation.

public static boolean

isValidNmtoken(String nmtoken)

Checks whether a string is a valid Nmtoken according to production [7] in the XML 1.0 Recommendation.

public static char
lowSurrogate(int c)

Returns the low surrogate of a supplemental character.

public static int
supplemental(char h, char l)

Returns the supplemental character corresponding to the given surrogates.

public static String

trim(String value)

Trims space characters as defined by production [3] in the XML 1.0 specification from both ends of the given string.

Inherited from java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

CHARS
private static final byte[] CHARS

Character flags.

MASK_CONTENT
public static final int MASK_CONTENT

Content character mask. Special characters are those that can be considered the start of markup, such as '<' and '&'. The various newline characters are considered special as well. All other valid XML characters can be considered content.

This is an optimization for the inner loop of character scanning.
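The kind of inner loop this mask is meant to speed up might look like the sketch below. It is an assumption-laden illustration: the class `ContentScanSketch` and its methods are hypothetical, and it treats only '<', '&', line feed, and carriage return as special, whereas the real table may flag further characters.

```java
final class ContentScanSketch {
    // Assumption for this sketch: these four characters are the "special" ones.
    static boolean isSpecial(char c) {
        return c == '<' || c == '&' || c == '\n' || c == '\r';
    }

    // Returns how many consecutive content characters start at index 'start'.
    static int contentRunLength(String text, int start) {
        int i = start;
        while (i < text.length() && !isSpecial(text.charAt(i))) {
            i++;
        }
        return i - start;
    }

    public static void main(String[] args) {
        String text = "plain text<tag>";
        System.out.println(contentRunLength(text, 0));
    }
}
```

A scanner can consume the whole run in one step and fall back to slower, character-specific handling only when it reaches a special character.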

MASK_NAME
public static final int MASK_NAME

Name character mask.

MASK_NAME_START
public static final int MASK_NAME_START

Name start character mask.

MASK_NCNAME
public static final int MASK_NCNAME

NCName character mask.

MASK_NCNAME_START
public static final int MASK_NCNAME_START

NCName start character mask.

MASK_PUBID
public static final int MASK_PUBID

Pubid character mask.

MASK_SPACE
public static final int MASK_SPACE

Space character mask.

MASK_VALID
public static final int MASK_VALID

Valid character mask.

Constructor Detail

XMLChar
public XMLChar()

Method Detail

highSurrogate
public static char highSurrogate(int c)

Returns the high surrogate of a supplemental character.

Parameters
c:int

The supplemental character to "split".

isContent
public static boolean isContent(int c)

Returns true if the specified character can be considered content.

Parameters
c:int

The character to check.

isHighSurrogate
public static boolean isHighSurrogate(int c)

Returns whether the given character is a high surrogate.

Parameters
c:int

The character to check.

isInvalid
public static boolean isInvalid(int c)

Returns true if the specified character is invalid.

Parameters
c:int

The character to check.

isLowSurrogate
public static boolean isLowSurrogate(int c)

Returns whether the given character is a low surrogate.

Parameters
c:int

The character to check.
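The two surrogate checks amount to range tests on the UTF-16 surrogate blocks. A minimal sketch, assuming the standard Unicode ranges (the class name `SurrogateRangeSketch` is hypothetical and this is not necessarily how the class itself is written):

```java
final class SurrogateRangeSketch {
    // High (leading) surrogates occupy U+D800..U+DBFF.
    static boolean isHighSurrogate(int c) {
        return 0xD800 <= c && c <= 0xDBFF;
    }

    // Low (trailing) surrogates occupy U+DC00..U+DFFF.
    static boolean isLowSurrogate(int c) {
        return 0xDC00 <= c && c <= 0xDFFF;
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x1F600));
        System.out.println(isHighSurrogate(s.charAt(0)) && isLowSurrogate(s.charAt(1)));
    }
}
```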

isMarkup
public static boolean isMarkup(int c)

Returns true if the specified character can be considered markup. Markup characters include '<', '&', and '%'.

Parameters
c:int

The character to check.

isName
public static boolean isName(int c)

Returns true if the specified character is a valid name character as defined by production [4] in the XML 1.0 specification.

Parameters
c:int

The character to check.

isNameStart
public static boolean isNameStart(int c)

Returns true if the specified character is a valid name start character as defined by production [5] in the XML 1.0 specification.

Parameters
c:int

The character to check.

isNCName
public static boolean isNCName(int c)

Returns true if the specified character is a valid NCName character as defined by production [5] in the Namespaces in XML recommendation.

Parameters
c:int

The character to check.

isNCNameStart
public static boolean isNCNameStart(int c)

Returns true if the specified character is a valid NCName start character as defined by production [4] in the Namespaces in XML recommendation.

Parameters
c:int

The character to check.

isPubid
public static boolean isPubid(int c)

Returns true if the specified character is a valid Pubid character as defined by production [13] in the XML 1.0 specification.

Parameters
c:int

The character to check.

isSpace
public static boolean isSpace(int c)

Returns true if the specified character is a space character as defined by production [3] in the XML 1.0 specification.

Parameters
c:int

The character to check.
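Production [3] of XML 1.0 builds white space from four characters: space, tab, carriage return, and line feed. A direct, table-free sketch of that check (the class name `XmlSpaceSketch` is hypothetical):

```java
final class XmlSpaceSketch {
    // S ::= (#x20 | #x9 | #xD | #xA)+ ; this tests a single character.
    static boolean isSpace(int c) {
        return c == 0x20 || c == 0x9 || c == 0xD || c == 0xA;
    }

    public static void main(String[] args) {
        System.out.println(isSpace('\t'));
        System.out.println(isSpace(0xA0)); // no-break space is not an XML space
    }
}
```

Note that this set is narrower than what `Character.isWhitespace` accepts; form feed, for example, is not an XML space.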

isSupplemental
public static boolean isSupplemental(int c)

Returns true if the specified character is a supplemental character.

Parameters
c:int

The character to check.

isValid
public static boolean isValid(int c)

Returns true if the specified character is valid. This method also checks the supplemental character range from 0x10000 to 0x10FFFF, which UTF-16 represents with surrogate pairs.

A program that applies the mask directly to the CHARS array is itself responsible for checking that range.

Parameters
c:int

The character to check.
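For reference, the Char production of XML 1.0 (production [2]) can be written out as plain range tests. The sketch below follows that production; the class name `XmlValidSketch` is hypothetical, and the real method is described as using a table lookup plus a range check rather than this form.

```java
final class XmlValidSketch {
    // Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValid(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF); // beyond any 16-bit table
    }

    public static void main(String[] args) {
        System.out.println(isValid('A'));
        System.out.println(isValid(0x1F600));
        System.out.println(isValid(0xFFFE));
    }
}
```

The last clause is the part a caller must add by hand when indexing a 16-bit table directly, because a code point above 0xFFFF cannot be used as an index into it.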

isValidIANAEncoding
public static boolean isValidIANAEncoding(String ianaEncoding)

Returns true if the encoding name is a valid IANA encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for an IANA encoding name.

Parameters
ianaEncoding:String

The IANA encoding name.
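A purely syntactic check of this kind could follow the EncName production of XML 1.0 (production [81]): an ASCII letter followed by ASCII letters, digits, '.', '_', or '-'. The sketch below assumes that rule; whether the method applies exactly this rule is an assumption, and the class name `EncNameSketch` is hypothetical.

```java
final class EncNameSketch {
    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
    static boolean isEncName(String name) {
        if (name == null || name.isEmpty() || !isAsciiLetter(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = isAsciiLetter(c) || (c >= '0' && c <= '9')
                    || c == '.' || c == '_' || c == '-';
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isEncName("UTF-8"));
        System.out.println(isEncName("8859_1"));
    }
}
```

As the description says, passing such a check says nothing about whether a decoder for the named encoding exists.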

isValidJavaEncoding
public static boolean isValidJavaEncoding(String javaEncoding)

Returns true if the encoding name is a valid Java encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for a Java encoding name.

Parameters
javaEncoding:String

The Java encoding name.

isValidName
public static boolean isValidName(String name)

Checks whether a string is a valid Name according to production [5] in the XML 1.0 Recommendation.

Parameters
name:String

string to check

Returns:boolean

true if name is a valid Name
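The string-level check combines the two per-character tests: the first character must be a name start character and every later one a name character. The sketch below shows that shape only. It is deliberately restricted to ASCII, so it rejects valid names containing non-ASCII letters, and the class `NameCheckSketch` and its helpers are hypothetical.

```java
final class NameCheckSketch {
    // ASCII-only simplification: letters, '_' and ':' may start a name.
    static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_' || c == ':';
    }

    // ASCII-only simplification: name characters add digits, '.' and '-'.
    static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
    }

    static boolean isValidName(String name) {
        if (name == null || name.isEmpty() || !isNameStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!isNameChar(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("xs:element"));
        System.out.println(isValidName("1abc"));
    }
}
```

An NCName check has the same structure with ':' removed from both character sets, and an Nmtoken check drops the special rule for the first character.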

isValidNCName
public static boolean isValidNCName(String ncName)

Checks whether a string is a valid NCName according to production [4] in the XML Namespaces 1.0 Recommendation.

Parameters
ncName:String

string to check

Returns:boolean

true if ncName is a valid NCName

isValidNmtoken
public static boolean isValidNmtoken(String nmtoken)

Checks whether a string is a valid Nmtoken according to production [7] in the XML 1.0 Recommendation.

Parameters
nmtoken:String

string to check

Returns:boolean

true if nmtoken is a valid Nmtoken

lowSurrogate
public static char lowSurrogate(int c)

Returns the low surrogate of a supplemental character.

Parameters
c:int

The supplemental character to "split".
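Splitting a supplemental character into its two surrogates is standard UTF-16 arithmetic. A minimal sketch, assuming the input lies in 0x10000 to 0x10FFFF (the class name `SurrogateSplitSketch` is hypothetical):

```java
final class SurrogateSplitSketch {
    // Upper 10 bits of (c - 0x10000), offset into the high-surrogate block.
    static char highSurrogate(int c) {
        return (char) (((c - 0x10000) >> 10) + 0xD800);
    }

    // Lower 10 bits of (c - 0x10000), offset into the low-surrogate block.
    static char lowSurrogate(int c) {
        return (char) (((c - 0x10000) & 0x3FF) + 0xDC00);
    }

    public static void main(String[] args) {
        int c = 0x1F600;
        System.out.println(Integer.toHexString(highSurrogate(c)));
        System.out.println(Integer.toHexString(lowSurrogate(c)));
    }
}
```

For U+1F600 this yields the pair 0xD83D, 0xDE00, the same values `java.lang.Character.highSurrogate` and `Character.lowSurrogate` return.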

supplemental
public static int supplemental(char h, char l)

Returns the supplemental character corresponding to the given surrogates.

Parameters
h:char

The high surrogate.

l:char

The low surrogate.
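Recombining a surrogate pair reverses the split: each surrogate contributes 10 bits, and 0x10000 is added back. A sketch under the assumption that `h` and `l` really are a high and a low surrogate (the class name `SurrogateJoinSketch` is hypothetical):

```java
final class SurrogateJoinSketch {
    // (h - 0xD800) supplies the upper 10 bits, (l - 0xDC00) the lower 10 bits.
    static int supplemental(char h, char l) {
        return (h - 0xD800) * 0x400 + (l - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        int c = supplemental((char) 0xD83D, (char) 0xDE00);
        System.out.println(Integer.toHexString(c));
    }
}
```

The result agrees with `java.lang.Character.toCodePoint` for valid pairs; no validation of the inputs is attempted here.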

trim
public static String trim(String value)

Trims space characters as defined by production [3] in the XML 1.0 specification from both ends of the given string.

Parameters
value:String

the string to be trimmed

Returns:String

the given string with the space characters trimmed from both ends
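Trimming by production [3] differs from `String.trim()`, which strips every character up to and including U+0020. A minimal sketch of the XML-specific behaviour (the class name `XmlTrimSketch` is hypothetical, and null handling is left out):

```java
final class XmlTrimSketch {
    // The four XML 1.0 space characters: space, tab, carriage return, line feed.
    static boolean isSpace(char c) {
        return c == 0x20 || c == 0x9 || c == 0xD || c == 0xA;
    }

    static String trim(String value) {
        int start = 0;
        int end = value.length();
        // Advance past leading XML spaces.
        while (start < end && isSpace(value.charAt(start))) {
            start++;
        }
        // Back up over trailing XML spaces.
        while (end > start && isSpace(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + trim(" \t\r\n abc \n") + "]");
    }
}
```

A leading form feed, for example, survives this trim but would be removed by `String.trim()`.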