com.sun.org.apache.xml.internal.serializer

pack-priv final Class CharInfo

extends Object
Class Inheritance
Imports
com.sun.org.apache.xml.internal.serializer.utils.MsgKey, .SystemIDResolver, .Utils, .WrappedRuntimeException, java.io.BufferedReader, .InputStream, .InputStreamReader, .UnsupportedEncodingException, java.net.URL, java.util.Enumeration, .HashMap, .Locale, .Map, .PropertyResourceBundle, .ResourceBundle, javax.xml.transform.TransformerException, jdk.xml.internal.SecuritySupport

This class provides services that tell if a character should have special treatment, such as entity reference substitution or normalization of a newline character. It also provides character-to-entity-reference lookup. DEVELOPERS: See Known Issue in the constructor.

Nested and Inner Type Summary

Modifier and TypeClass and Description
private static class
CharInfo.CharKey

Simple class for fast lookup of char values, when used with hashtables.

Field Summary

Modifier and TypeField and Description
private int[]
array_of_bits

An array of bits to record if the character is in the set.

private static final int
ASCII_MAX

Copy the first 0,1 ...

private int
firstWordNotUsed

public static final String
HTML_ENTITIES_RESOURCE

The name of the HTML entities file.

private boolean[]
isCleanTextASCII

private boolean[]
isSpecialAttrASCII

Array of values is faster access than a set of bits to quickly check ASCII characters in attribute values.

private boolean[]
isSpecialTextASCII

Array of values is faster access than a set of bits to quickly check ASCII characters in text nodes.

private static final int
LOW_ORDER_BITMASK

private Map<CharInfo.CharKey, String>
m_charToString

Given a character, lookup a String to output (e.g. a decorated entity reference).

private static Map<String, CharInfo>
m_getCharInfoCache

Table of user-specified char infos.

pack-priv final boolean
onlyQuotAmpLtGt

This flag is an optimization for HTML entities.

public static final char
S_CARRIAGERETURN

The carriage return character, which the parser should always normalize.

public static final char
S_HORIZONAL_TAB

The horizontal tab character, which the parser should always normalize.

public static final char
S_LINEFEED

The linefeed character, which the parser should always normalize.

private static final int
SHIFT_PER_WORD

public static final String
XML_ENTITIES_RESOURCE

The name of the XML entities file.

Constructor Summary

AccessConstructor and Description
private
CharInfo(String entitiesResource, String method)

Constructor that reads in a resource file that describes the mapping of characters to entity references.

private
CharInfo(String entitiesResource, String method, boolean internal)

Method Summary

Modifier and TypeMethod and Description
private static int
arrayIndex(int i)

Returns the array element holding the bit value for the given integer

private static int
bit(int i)

For a given integer in the set it returns the single bit value used within a given word that represents whether the integer is in the set or not.

private int[]
createEmptySetOfIntegers(int max)

Creates a new empty set of integers (characters)

private void
defineChar2StringMapping(String outputString, char inputChar)

private void
defineEntity(String name, char value)

Defines a new character reference.

private boolean
extraEntity(int entityValue)

private final boolean
get(int i)

Return true if the integer (character) is in the set of integers.

pack-priv static CharInfo
getCharInfo(String entitiesFileName, String method)

Constructs a CharInfo object using the following process to try reading the entitiesFileName parameter: 1) attempt to load it as a ResourceBundle; 2) try using the class loader to find the specified file; 3) try opening it as a URI. In cases 2 and 3, the resource file must be encoded in UTF-8 and have the following format:

# First char # is a comment
Entity numericValue
quot 34
amp 38

pack-priv static CharInfo
getCharInfoInternal(String entitiesFileName, String method)

Read an internal resource file that describes the mapping of characters to entity references; Construct a CharInfo object.

pack-priv String
getOutputStringForChar(char value)

Map a character to a String.

pack-priv final boolean
isSpecialAttrChar(int value)

Tell if the character argument that is from an attribute value should have special treatment.

pack-priv final boolean
isSpecialTextChar(int value)

Tell if the character argument that is from a text node should have special treatment.

pack-priv final boolean
isTextASCIIClean(int value)

This method is used to determine if an ASCII character in a text node (not an attribute value) is "clean".

private final void
set(int i)

Adds the integer (character) to the set of integers.

private void
setASCIIclean(int j)

If the character is a printable ASCII character then mark it as clean and not needing replacement with a String on output.

private void
setASCIIdirty(int j)

If the character is a printable ASCII character then mark it as not clean and needing replacement with a String on output.

Inherited from java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait

Field Detail

array_of_bits
private int[] array_of_bits

An array of bits to record if the character is in the set. Although information in this array is complete, the isSpecialAttrASCII array is used first because access to its values is common and faster.

ASCII_MAX
private static final int ASCII_MAX

Copy the first 0,1 ... ASCII_MAX values into an array

firstWordNotUsed
private int firstWordNotUsed

HTML_ENTITIES_RESOURCE
public static final String HTML_ENTITIES_RESOURCE

The name of the HTML entities file. If specified, the file will be resource loaded with the default class loader.

isCleanTextASCII
private boolean[] isCleanTextASCII

isSpecialAttrASCII
private boolean[] isSpecialAttrASCII

Array of values is faster access than a set of bits to quickly check ASCII characters in attribute values.

isSpecialTextASCII
private boolean[] isSpecialTextASCII

Array of values is faster access than a set of bits to quickly check ASCII characters in text nodes.

LOW_ORDER_BITMASK
private static final int LOW_ORDER_BITMASK

m_charToString
private Map<CharInfo.CharKey, String> m_charToString

Given a character, lookup a String to output (e.g. a decorated entity reference).

m_getCharInfoCache
private static Map<String, CharInfo> m_getCharInfoCache

Table of user-specified char infos.

onlyQuotAmpLtGt
pack-priv final boolean onlyQuotAmpLtGt

This flag is an optimization for HTML entities. It is false if entities other than quot (34), amp (38), lt (60) and gt (62) are defined in the range 0 to 127.
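
The rule stated above can be illustrated with a small check. This is only a sketch under assumptions: how the constructor actually computes the flag is not shown on this page, and the class name `OnlyQuotAmpLtGtSketch` and its input array are hypothetical.

```java
/** Hypothetical illustration of the onlyQuotAmpLtGt rule; not the JDK's own code. */
final class OnlyQuotAmpLtGtSketch {

    /** True for quot (34), amp (38), lt (60) and gt (62). */
    private static boolean isOneOfTheFour(int value) {
        return value == 34 || value == 38 || value == 60 || value == 62;
    }

    /**
     * Returns false as soon as an entity in the range 0 to 127 is found that
     * is not one of the four; entities outside that range do not affect the flag.
     */
    static boolean onlyQuotAmpLtGt(int[] definedEntityValues) {
        for (int value : definedEntityValues) {
            boolean inAsciiRange = value >= 0 && value <= 127;
            if (inAsciiRange && !isOneOfTheFour(value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // nbsp (160) lies outside 0..127, so the flag stays true.
        System.out.println(onlyQuotAmpLtGt(new int[] {34, 38, 60, 62, 160}));
        // apos (39) is inside the range and not one of the four.
        System.out.println(onlyQuotAmpLtGt(new int[] {34, 39}));
    }
}
```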

S_CARRIAGERETURN
public static final char S_CARRIAGERETURN

The carriage return character, which the parser should always normalize.

S_HORIZONAL_TAB
public static final char S_HORIZONAL_TAB

The horizontal tab character, which the parser should always normalize.

S_LINEFEED
public static final char S_LINEFEED

The linefeed character, which the parser should always normalize.

SHIFT_PER_WORD
private static final int SHIFT_PER_WORD

XML_ENTITIES_RESOURCE
public static final String XML_ENTITIES_RESOURCE

The name of the XML entities file. If specified, the file will be resource loaded with the default class loader.

Constructor Detail

CharInfo
private CharInfo(String entitiesResource, String method)

Constructor that reads in a resource file that describes the mapping of characters to entity references. This constructor is private, just to force the use of the getCharInfo(entitiesResource) factory. Resource files must be encoded in UTF-8 and can be properties files, with a .properties extension assumed. Alternatively, they can have the following form, with no particular extension assumed:

# First char # is a comment
Entity numericValue
quot 34
amp 38
Parameters
entitiesResource:String

Name of properties or resource file that should be loaded, which describes the mapping of characters to entity references.

CharInfo
private CharInfo(String entitiesResource, String method, boolean internal)

Method Detail

arrayIndex
private static int arrayIndex(int i)

Returns the array element holding the bit value for the given integer

Parameters
i:int

the integer that might be in the set of integers

bit
private static int bit(int i)

For a given integer in the set it returns the single bit value used within a given word that represents whether the integer is in the set or not.

createEmptySetOfIntegers
private int[] createEmptySetOfIntegers(int max)

Creates a new empty set of integers (characters)

Parameters
max:int

the maximum integer to be in the set.

defineChar2StringMapping
private void defineChar2StringMapping(String outputString, char inputChar)

defineEntity
private void defineEntity(String name, char value)

Defines a new character reference. The reference's name and value are supplied. Nothing happens if the character reference is already defined.

Unlike internal entities, character references are a string to single character mapping. They are used to map non-ASCII characters both on parsing and printing, primarily for HTML documents. '&amp;' is an example of a character reference.

Parameters
name:String

The entity's name

value:char

The entity's value

extraEntity
private boolean extraEntity(int entityValue)
Parameters
entityValue:int

The value of the character that has an entity defined for it.

Returns:boolean

true if the entity

get
private final boolean get(int i)

Return true if the integer (character) is in the set of integers. This implementation uses an array of integers with 32 bits per integer. If a bit is set to 1 the corresponding integer is in the set of integers.

Parameters
i:int

an integer that is tested to see if it is in the set of integers, or not.
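
The bit-per-integer layout described above can be sketched as follows. This is a minimal illustration, not the class's actual code: the values given to `SHIFT_PER_WORD` and `LOW_ORDER_BITMASK` are assumptions inferred from "32 bits per integer" (their real values are not shown on this page), and the class name `IntSetSketch` is hypothetical.

```java
/** Hypothetical sketch of a set of small non-negative integers, packed 32 per int. */
final class IntSetSketch {
    // Assumed values: 2^5 = 32 bits per word, so the low five bits select the bit.
    private static final int SHIFT_PER_WORD = 5;
    private static final int LOW_ORDER_BITMASK = 0x1f;

    private final int[] words;

    /** Creates an empty set able to hold the integers 0..max inclusive. */
    IntSetSketch(int max) {
        words = new int[arrayIndex(max) + 1];
    }

    /** Index of the array element (word) that holds the bit for i. */
    private static int arrayIndex(int i) {
        return i >> SHIFT_PER_WORD;
    }

    /** Single-bit mask for i within its word. */
    private static int bit(int i) {
        return 1 << (i & LOW_ORDER_BITMASK);
    }

    /** Adds i to the set; i must be between 0 and the max given at creation. */
    void set(int i) {
        words[arrayIndex(i)] |= bit(i);
    }

    /** Returns true if i is in the set; out-of-range values are reported as absent. */
    boolean get(int i) {
        if (i < 0) {
            return false;
        }
        int j = arrayIndex(i);
        return j < words.length && (words[j] & bit(i)) != 0;
    }

    public static void main(String[] args) {
        IntSetSketch set = new IntSetSketch(0xFFFF);
        set.set('<');
        System.out.println(set.get('<')); // true
        System.out.println(set.get('a')); // false
    }
}
```

One word covers 32 consecutive integers, so a set spanning every `char` value needs 2048 ints.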

getCharInfo
pack-priv static CharInfo getCharInfo(String entitiesFileName, String method)

Constructs a CharInfo object using the following process to try reading the entitiesFileName parameter: 1) attempt to load it as a ResourceBundle; 2) try using the class loader to find the specified file; 3) try opening it as a URI. In cases 2 and 3, the resource file must be encoded in UTF-8 and have the following format:

# First char # is a comment
Entity numericValue
quot 34
amp 38
Parameters
entitiesFileName:String

Name of entities resource file that should be loaded, which describes the mapping of characters to entity references.

method:String

the output method type, which should be one of "xml", "html", and "text".

Returns:CharInfo

an instance of CharInfo
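
To make the plain-text file format concrete, here is a minimal sketch of a parser for it. It covers only the "name value" format used in cases 2 and 3, not the ResourceBundle path, and it is not the JDK's own loading code. The class name `EntityFileSketch`, the `parse` method, and the handling of malformed lines are all assumptions made for illustration; the input is assumed to be already decoded from UTF-8.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for the plain "name value" entity format. */
final class EntityFileSketch {

    /** Parses already-decoded file content into an entity-name to character map. */
    static Map<String, Character> parse(String content) {
        Map<String, Character> entities = new LinkedHashMap<>();
        for (String rawLine : content.split("\\R")) {
            String line = rawLine.trim();
            // Blank lines and lines whose first character is '#' carry no entity.
            if (line.isEmpty() || line.charAt(0) == '#') {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length < 2) {
                continue; // assumption: silently skip incomplete lines
            }
            // Assumption: the value is a decimal number that fits in a char;
            // Integer.parseInt throws NumberFormatException otherwise.
            int code = Integer.parseInt(parts[1]);
            entities.put(parts[0], (char) code);
        }
        return entities;
    }

    public static void main(String[] args) {
        String sample = "# First char # is a comment\nquot 34\namp 38\n";
        System.out.println(parse(sample)); // {quot=", amp=&}
    }
}
```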

getCharInfoInternal
pack-priv static CharInfo getCharInfoInternal(String entitiesFileName, String method)

Read an internal resource file that describes the mapping of characters to entity references; Construct a CharInfo object.

Parameters
entitiesFileName:String

Name of entities resource file that should be loaded, which describes the mapping of characters to entity references.

method:String

the output method type, which should be one of "xml", "html", and "text".

Returns:CharInfo

an instance of CharInfo

getOutputStringForChar
pack-priv String getOutputStringForChar(char value)

Map a character to a String. For example, given the character '>' this method would return the fully decorated entity name "&gt;". Strings for entity references are loaded from a properties file, but additional mappings defined through calls to defineChar2StringMapping() are possible. Such entity reference mappings could be overridden.

This is reusing a stored key object, in an effort to avoid heap activity. Unfortunately, that introduces a threading risk. The simplest fix for now is to make it a synchronized method, or to give up the reuse; I see very little performance difference between them. A long-term solution would be to replace the hashtable with a sparse array keyed directly from the character's integer value; see DTM's string pool for a related solution.

Parameters
value:char

The character that should be resolved to a String, e.g. resolve '>' to "&gt;".

Returns:String

The String that the character is mapped to, or null if not found.
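
The long-term direction mentioned in the description, a sparse array keyed directly from the character's integer value, could look roughly like the sketch below. This is an illustration of that suggestion only, not what the class does today; the class name `CharToStringTableSketch`, the page size, and the method names are hypothetical.

```java
/** Hypothetical two-level (paged) table from char to output String. */
final class CharToStringTableSketch {
    private static final int PAGE_BITS = 8;               // assumed page size: 256 chars
    private static final int PAGE_SIZE = 1 << PAGE_BITS;

    // One slot per page; pages are allocated only when a mapping lands in them.
    private final String[][] pages =
            new String[(Character.MAX_VALUE + 1) >> PAGE_BITS][];

    /** Records the String to output for c, replacing any earlier mapping. */
    void define(char c, String output) {
        int page = c >> PAGE_BITS;
        if (pages[page] == null) {
            pages[page] = new String[PAGE_SIZE];
        }
        pages[page][c & (PAGE_SIZE - 1)] = output;
    }

    /** Returns the mapped String, or null if none; no key object is needed. */
    String lookup(char c) {
        String[] page = pages[c >> PAGE_BITS];
        return page == null ? null : page[c & (PAGE_SIZE - 1)];
    }

    public static void main(String[] args) {
        CharToStringTableSketch table = new CharToStringTableSketch();
        table.define('>', "&gt;");
        System.out.println(table.lookup('>')); // &gt;
        System.out.println(table.lookup('a')); // null
    }
}
```

Because a lookup touches no shared mutable key, concurrent readers do not interfere with each other, provided the table is fully populated and safely published before it is shared.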

isSpecialAttrChar
pack-priv final boolean isSpecialAttrChar(int value)

Tell if the character argument that is from an attribute value should have special treatment.

Parameters
value:int

the value of a character that is in an attribute value

Returns:boolean

true if the character should have any special treatment, such as when writing out attribute values, or entity references.
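
The field descriptions say the `isSpecialAttrASCII` array is consulted first because it is faster, with the bit set holding the complete information. A two-tier lookup along those lines might look like the sketch below. It is an assumption about the shape of the check, not the actual method body: `ASCII_MAX` is assumed to be 128, `java.util.BitSet` stands in for `array_of_bits`, and the class and `markSpecial` names are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical two-tier "is this character special?" lookup. */
final class SpecialCharLookupSketch {
    private static final int ASCII_MAX = 128; // assumed value

    // Fast path: one boolean per ASCII character.
    private final boolean[] isSpecialAttrASCII = new boolean[ASCII_MAX];
    // Complete information for every character; stand-in for array_of_bits.
    private final BitSet allSpecial = new BitSet();

    /** Marks a non-negative character value as needing special treatment. */
    void markSpecial(int ch) {
        allSpecial.set(ch);
        if (ch < ASCII_MAX) {
            isSpecialAttrASCII[ch] = true;
        }
    }

    /** Checks the ASCII array first and falls back to the bit set. */
    boolean isSpecialAttrChar(int value) {
        if (value < ASCII_MAX) {
            return isSpecialAttrASCII[value];
        }
        return allSpecial.get(value);
    }

    public static void main(String[] args) {
        SpecialCharLookupSketch lookup = new SpecialCharLookupSketch();
        lookup.markSpecial('&');
        lookup.markSpecial(0x00A0);
        System.out.println(lookup.isSpecialAttrChar('&'));    // true
        System.out.println(lookup.isSpecialAttrChar('a'));    // false
        System.out.println(lookup.isSpecialAttrChar(0x00A0)); // true
    }
}
```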

isSpecialTextChar
pack-priv final boolean isSpecialTextChar(int value)

Tell if the character argument that is from a text node should have special treatment.

Parameters
value:int

the value of a character that is in a text node

Returns:boolean

true if the character should have any special treatment, such as when writing out attribute values, or entity references.

isTextASCIIClean
pack-priv final boolean isTextASCIIClean(int value)

This method is used to determine if an ASCII character in a text node (not an attribute value) is "clean".

Parameters
value:int

the character to check (0 to 127).

Returns:boolean

true if the character can go to the writer as-is

set
private final void set(int i)

Adds the integer (character) to the set of integers.

Parameters
i:int

the integer to add to the set, valid values are 0, 1, 2 ... up to the maximum that was specified at the creation of the set.

setASCIIclean
private void setASCIIclean(int j)

If the character is a printable ASCII character then mark it as clean and not needing replacement with a String on output.

setASCIIdirty
private void setASCIIdirty(int j)

If the character is a printable ASCII character then mark it as not clean and needing replacement with a String on output.

com.sun.org.apache.xml.internal.serializer

private Class CharInfo.CharKey

extends Object
Class Inheritance

Simple class for fast lookup of char values, when used with hashtables. You can set the char, then use it as a key. This class is a copy of the one in com.sun.org.apache.xml.internal.utils. It exists to cut the serializer's dependency on that package.
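
The "set the char, then use it as a key" pattern can be sketched as below. This is an illustrative stand-in, not the nested class itself: the name `MutableCharKeySketch` is hypothetical, and the choice of the char value as the hash code is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mutable key that wraps a char for use in hash-based maps. */
final class MutableCharKeySketch {
    private char value;

    MutableCharKeySketch() {
    }

    MutableCharKeySketch(char value) {
        this.value = value;
    }

    /** Points this key at another character so it can be reused for a lookup. */
    void setChar(char c) {
        value = c;
    }

    @Override
    public int hashCode() {
        return value; // assumption: the char value itself serves as the hash
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof MutableCharKeySketch
                && ((MutableCharKeySketch) obj).value == value;
    }

    public static void main(String[] args) {
        Map<MutableCharKeySketch, String> map = new HashMap<>();
        // Stored keys get their own instances and are never mutated afterwards.
        map.put(new MutableCharKeySketch('<'), "&lt;");

        // A single probe key is reused for every lookup to avoid allocation.
        MutableCharKeySketch probe = new MutableCharKeySketch();
        probe.setChar('<');
        System.out.println(map.get(probe)); // &lt;
    }
}
```

Reusing one probe key avoids an allocation per lookup, but the probe is shared mutable state: two threads calling `setChar` on the same probe can corrupt each other's lookups.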

Field Summary

Modifier and TypeField and Description
private char
m_char

The char value of this key.

Constructor Summary

AccessConstructor and Description
public
CharKey(char key)

Constructor CharKey

public
CharKey()

Default constructor for a CharKey.

Method Summary

Modifier and TypeMethod and Description
public final boolean
equals(Object obj)

Overrides java.lang.Object.equals.

Override of equals() for this object

public final int
hashCode()

Overrides java.lang.Object.hashCode.

Get the hash value of the character.

public final void
setChar(char c)

Sets the char value of this key.

Inherited from java.lang.Object:
clone, finalize, getClass, notify, notifyAll, toString, wait

Field Detail

m_char
private char m_char

The char value of this key.

Constructor Detail

CharKey
public CharKey(char key)

Constructor CharKey

Parameters
key:char

char value of this object.

CharKey
public CharKey()

Default constructor for a CharKey.

Method Detail

equals
public final boolean equals(Object obj)

Overrides java.lang.Object.equals.

Override of equals() for this object

Parameters
obj:Object

to compare to

Returns:boolean

True if this object equals this string value

hashCode
public final int hashCode()

Overrides java.lang.Object.hashCode.

Get the hash value of the character.

Returns:int

hash value of the character.

setChar
public final void setChar(char c)

Sets the char value of this key.