Modifier and Type | Class and Description |
---|---|
private static class | CharInfo.CharKey
Simple class for fast lookup of char values, when used with hashtables. |
Modifier and Type | Field and Description |
---|---|
private int[] | array_of_bits
An array of bits to record if the character is in the set. |
private static final int | ASCII_MAX
Copy the first 0,1 ... ASCII_MAX values into an array |
private int | firstWordNotUsed |
public static final String | HTML_ENTITIES_RESOURCE
The name of the HTML entities file. |
private boolean[] | isCleanTextASCII |
private boolean[] | isSpecialAttrASCII
Array of values is faster access than a set of bits to quickly check ASCII characters in attribute values. |
private boolean[] | isSpecialTextASCII
Array of values is faster access than a set of bits to quickly check ASCII characters in text nodes. |
private static final int | LOW_ORDER_BITMASK |
private Map | m_charToString
Given a character, lookup a String to output (e.g. a decorated entity reference). |
private static Map | m_getCharInfoCache
Table of user-specified char infos. |
pack-priv final boolean | onlyQuotAmpLtGt
This flag is an optimization for HTML entities. |
public static final char | S_CARRIAGERETURN
The carriage return character, which the parser should always normalize. |
public static final char | S_HORIZONAL_TAB
The horizontal tab character, which the parser should always normalize. |
public static final char | S_LINEFEED
The linefeed character, which the parser should always normalize. |
private static final int | SHIFT_PER_WORD |
public static final String | XML_ENTITIES_RESOURCE
The name of the XML entities file. |
Access | Constructor and Description |
---|---|
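private | CharInfo(String entitiesResource, String method)
Constructor that reads in a resource file that describes the mapping of characters to entity references. |
private | CharInfo(String entitiesResource, String method, boolean internal) |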
Modifier and Type | Method and Description |
---|---|
private static int | arrayIndex(int i)
Returns the array element holding the bit value for the given integer. Parameters: i - the integer that might be in the set of integers. |
private static int | bit(int i)
For a given integer in the set it returns the single bit value used within a given word that represents whether the integer is in the set or not. |
private int[] | createEmptySetOfIntegers(int max)
Creates a new empty set of integers (characters). Parameters: max - the maximum integer to be in the set. |
private void | defineChar2StringMapping(String outputString, char inputChar) |
private void | defineEntity(String name, char value)
Defines a new character reference. Parameters: name - the entity's name; value - the entity's value. |
private boolean | extraEntity(int entityValue)
Parameters: entityValue - the value of the character that has an entity defined for it. Returns: true if the entity |
private final boolean | get(int i)
Return true if the integer (character) is in the set of integers. Parameters: i - an integer that is tested to see if it is in the set of integers, or not. |
pack-priv static CharInfo | getCharInfo(String entitiesFileName, String method)
Constructs a CharInfo object using the following process to try reading the entitiesFileName parameter: 1) attempt to load it as a ResourceBundle; 2) try using the class loader to find the specified file; 3) try opening it as a URI. In cases 2 and 3, the resource file must be encoded in UTF-8 and have one entry per line in the following format: "# First char # is a comment", "Entity numericValue", "quot 34", "amp 38". Parameters: entitiesFileName - name of the entities resource file that should be loaded, which describes the mapping of characters to entity references; method - the output method type, which should be one of "xml", "html", and "text". Returns: an instance of CharInfo. |
pack-priv static CharInfo | getCharInfoInternal(String entitiesFileName, String method)
Read an internal resource file that describes the mapping of characters to entity references; construct a CharInfo object. Parameters: entitiesFileName - name of the entities resource file that should be loaded, which describes the mapping of characters to entity references; method - the output method type, which should be one of "xml", "html", and "text". Returns: an instance of CharInfo. |
pack-priv String | getOutputStringForChar(char value)
Map a character to a String. Parameters: value - the character that should be resolved to a String, e.g. resolve '>' to "&gt;". Returns: the String that the character is mapped to, or null if not found. |
pack-priv final boolean | isSpecialAttrChar(int value)
Tell if the character argument that is from an attribute value should have special treatment. Parameters: value - the value of a character that is in an attribute value. Returns: true if the character should have any special treatment, such as when writing out attribute values, or entity references. |
pack-priv final boolean | isSpecialTextChar(int value)
Tell if the character argument that is from a text node should have special treatment. Parameters: value - the value of a character that is in a text node. Returns: true if the character should have any special treatment, such as when writing out attribute values, or entity references. |
pack-priv final boolean | isTextASCIIClean(int value)
This method is used to determine if an ASCII character in a text node (not an attribute value) is "clean". Parameters: value - the character to check (0 to 127). Returns: true if the character can go to the writer as-is. |
private final void | set(int i)
Adds the integer (character) to the set of integers. Parameters: i - the integer to add to the set; valid values are 0, 1, 2 ... up to the maximum that was specified at the creation of the set. |
private void | setASCIIclean(int j)
If the character is a printable ASCII character then mark it as clean and not needing replacement with a String on output. |
private void | setASCIIdirty(int j)
If the character is a printable ASCII character then mark it as not clean and needing replacement with a String on output. |
array_of_bits | back to summary |
---|---|
private int[] array_of_bits An array of bits to record if the character is in the set. Although information in this array is complete, the isSpecialAttrASCII array is used first because access to its values is common and faster. |
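The two-level lookup described here (a small boolean array consulted first for ASCII values, with the complete record as the fallback) might look roughly like the following. This is a minimal sketch, not the actual CharInfo code: the class name `SpecialCharLookup`, the method `markSpecial`, the table size of 128, and the use of `java.util.BitSet` as a stand-in for `array_of_bits` are all assumptions made for illustration.

```java
import java.util.BitSet;

// Hypothetical sketch of an ASCII fast path in front of a complete set.
final class SpecialCharLookup {
    // Assumed size of the ASCII fast-path table (values 0..127).
    private static final int ASCII_MAX = 128;

    // Fast path: one boolean per ASCII value, indexed directly.
    private final boolean[] isSpecialAttrASCII = new boolean[ASCII_MAX];

    // Complete record for all characters (stand-in for array_of_bits).
    private final BitSet allSpecial = new BitSet();

    // Records a non-negative character value in both structures.
    void markSpecial(int ch) {
        allSpecial.set(ch);
        if (ch < ASCII_MAX) {
            isSpecialAttrASCII[ch] = true;
        }
    }

    // Checks the boolean array first; falls back to the complete set.
    boolean isSpecialAttrChar(int ch) {
        if (ch >= 0 && ch < ASCII_MAX) {
            return isSpecialAttrASCII[ch];
        }
        return ch >= 0 && allSpecial.get(ch);
    }
}
```

The array duplicates information already held in the complete set; the trade is a little memory for a lookup that needs no shifting or masking on the most common inputs.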
ASCII_MAX | back to summary |
---|---|
private static final int ASCII_MAX Copy the first 0,1 ... ASCII_MAX values into an array |
firstWordNotUsed | back to summary |
---|---|
private int firstWordNotUsed |
HTML_ENTITIES_RESOURCE | back to summary |
---|---|
public static final String HTML_ENTITIES_RESOURCE The name of the HTML entities file. If specified, the file will be resource loaded with the default class loader. |
isCleanTextASCII | back to summary |
---|---|
private boolean[] isCleanTextASCII |
isSpecialAttrASCII | back to summary |
---|---|
private boolean[] isSpecialAttrASCII Array of values is faster access than a set of bits to quickly check ASCII characters in attribute values. |
isSpecialTextASCII | back to summary |
---|---|
private boolean[] isSpecialTextASCII Array of values is faster access than a set of bits to quickly check ASCII characters in text nodes. |
LOW_ORDER_BITMASK | back to summary |
---|---|
private static final int LOW_ORDER_BITMASK |
m_charToString | back to summary |
---|---|
private Map<CharInfo.CharKey, String> m_charToString Given a character, look up a String to output (e.g. a decorated entity reference). |
m_getCharInfoCache | back to summary |
---|---|
private static Map<String, CharInfo> m_getCharInfoCache Table of user-specified char infos. |
onlyQuotAmpLtGt | back to summary |
---|---|
pack-priv final boolean onlyQuotAmpLtGt This flag is an optimization for HTML entities. It is false if entities other than quot (34), amp (38), lt (60) and gt (62) are defined in the range 0 to 127. |
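The rule for this flag can be expressed as a small check over the defined entity values. The sketch below is illustrative only: the class `EntityFlagSketch` and both method names are hypothetical, and it only mirrors the rule as stated in the description (quot, amp, lt, gt are the standard four; anything else in 0 to 127 counts as extra).

```java
// Hypothetical sketch of the onlyQuotAmpLtGt rule.
final class EntityFlagSketch {
    // True if the value lies in 0..127 and is not quot, amp, lt, or gt.
    static boolean isExtraAsciiEntity(int entityValue) {
        if (entityValue < 0 || entityValue > 127) {
            return false; // outside the ASCII range, so it does not affect the flag
        }
        switch (entityValue) {
            case 34: // quot
            case 38: // amp
            case 60: // lt
            case 62: // gt
                return false;
            default:
                return true;
        }
    }

    // The flag stays true only while no extra ASCII entity has been defined.
    static boolean onlyQuotAmpLtGt(int[] definedEntityValues) {
        for (int value : definedEntityValues) {
            if (isExtraAsciiEntity(value)) {
                return false;
            }
        }
        return true;
    }
}
```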
S_CARRIAGERETURN | back to summary |
---|---|
public static final char S_CARRIAGERETURN The carriage return character, which the parser should always normalize. |
S_HORIZONAL_TAB | back to summary |
---|---|
public static final char S_HORIZONAL_TAB The horizontal tab character, which the parser should always normalize. |
S_LINEFEED | back to summary |
---|---|
public static final char S_LINEFEED The linefeed character, which the parser should always normalize. |
SHIFT_PER_WORD | back to summary |
---|---|
private static final int SHIFT_PER_WORD |
XML_ENTITIES_RESOURCE | back to summary |
---|---|
public static final String XML_ENTITIES_RESOURCE The name of the XML entities file. If specified, the file will be resource loaded with the default class loader. |
CharInfo | back to summary |
---|---|
private CharInfo(String entitiesResource, String method) Constructor that reads in a resource file that describes the mapping of characters to entity references. This constructor is private to force the use of the getCharInfo(entitiesResource) factory. Resource files must be encoded in UTF-8. They can be properties files, with a .properties extension assumed, or they can have the following form, one entry per line, with no particular extension assumed: "# First char # is a comment", "Entity numericValue", "quot 34", "amp 38".
|
CharInfo | back to summary |
---|---|
private CharInfo(String entitiesResource, String method, boolean internal) |
arrayIndex | back to summary |
---|---|
private static int arrayIndex(int i) Returns the array element holding the bit value for the given integer
|
bit | back to summary |
---|---|
private static int bit(int i) For a given integer in the set it returns the single bit value used within a given word that represents whether the integer is in the set or not. |
createEmptySetOfIntegers | back to summary |
---|---|
private int[] createEmptySetOfIntegers(int max) Creates a new empty set of integers (characters)
|
defineChar2StringMapping | back to summary |
---|---|
private void defineChar2StringMapping(String outputString, char inputChar) |
defineEntity | back to summary |
---|---|
private void defineEntity(String name, char value) Defines a new character reference. The reference's name and value are supplied. Nothing happens if the character reference is already defined. Unlike internal entities, character references are a string to single character mapping. They are used to map non-ASCII characters both on parsing and printing, primarily for HTML documents. '&amp;' is an example of a character reference.
|
extraEntity | back to summary |
---|---|
private boolean extraEntity(int entityValue)
|
get | back to summary |
---|---|
private final boolean get(int i) Return true if the integer (character) is in the set of integers. This implementation uses an array of integers with 32 bits per integer. If a bit is set to 1 the corresponding integer is in the set of integers.
|
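The set operations documented above (arrayIndex, bit, createEmptySetOfIntegers, get, set) follow the usual packed-bit layout with 32 bits per int. A minimal sketch of that layout is shown below. The class name `IntBitSetSketch` is hypothetical, and the values 5 and 0x1f for `SHIFT_PER_WORD` and `LOW_ORDER_BITMASK` are assumptions derived from the stated 32 bits per integer; this page does not list the constants' values.

```java
// Hypothetical sketch of a set of integers packed 32 per int.
final class IntBitSetSketch {
    private static final int SHIFT_PER_WORD = 5;       // assumed: 2^5 = 32 bits per word
    private static final int LOW_ORDER_BITMASK = 0x1f; // assumed: low 5 bits pick the bit

    private final int[] arrayOfBits;

    // Creates an empty set able to hold the integers 0..max.
    IntBitSetSketch(int max) {
        arrayOfBits = new int[arrayIndex(max) + 1];
    }

    // Index of the word that holds the bit for integer i.
    static int arrayIndex(int i) {
        return i >> SHIFT_PER_WORD;
    }

    // Single-bit mask for integer i within its word.
    static int bit(int i) {
        return 1 << (i & LOW_ORDER_BITMASK);
    }

    // Adds integer i to the set; i must be within 0..max.
    void set(int i) {
        arrayOfBits[arrayIndex(i)] |= bit(i);
    }

    // True if integer i is in the set; out-of-range values are simply absent.
    boolean get(int i) {
        int j = arrayIndex(i);
        return j >= 0 && j < arrayOfBits.length && (arrayOfBits[j] & bit(i)) != 0;
    }
}
```

For example, the integer 33 lives in word 1 (33 >> 5) under the mask 2 (1 << (33 & 31)).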
getCharInfo | back to summary |
---|---|
pack-priv static CharInfo getCharInfo(String entitiesFileName, String method) Constructs a CharInfo object using the following process to try reading the entitiesFileName parameter: 1) attempt to load it as a ResourceBundle; 2) try using the class loader to find the specified file; 3) try opening it as a URI. In cases 2 and 3, the resource file must be encoded in UTF-8 and have one entry per line in the following format: "# First char # is a comment", "Entity numericValue", "quot 34", "amp 38". |
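To make the plain-text entities format concrete (comment lines starting with `#`, then entries such as `quot 34` and `amp 38`), here is a small parser sketch. It is not the loader CharInfo uses: the class `EntityFileSketch` and its `parse` method are hypothetical, the input is taken as an already-decoded String, and the line "Entity numericValue" is assumed to describe the column layout and is not treated as data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of reading "name numericValue" entity lines.
final class EntityFileSketch {
    // Parses the text into an entity-name to character map, in file order.
    static Map<String, Character> parse(String text) {
        Map<String, Character> entities = new LinkedHashMap<>();
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.charAt(0) == '#') {
                continue; // blank line or comment
            }
            String[] parts = line.split("\\s+");
            if (parts.length < 2) {
                continue; // no numeric value on this line; skipped in this sketch
            }
            // The second column is the decimal code of the character.
            entities.put(parts[0], (char) Integer.parseInt(parts[1]));
        }
        return entities;
    }
}
```

Parsing the text `"# First char # is a comment\nquot 34\namp 38\n"` with this sketch yields two entries, mapping `quot` to `"` and `amp` to `&`.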
getCharInfoInternal | back to summary |
---|---|
pack-priv static CharInfo getCharInfoInternal(String entitiesFileName, String method) Read an internal resource file that describes the mapping of characters to entity references; Construct a CharInfo object. |
getOutputStringForChar | back to summary |
---|---|
pack-priv String getOutputStringForChar(char value) Map a character to a String. For example, given the character '>' this method would return the fully decorated entity name "&gt;". Strings for entity references are loaded from a properties file, but additional mappings defined through calls to defineChar2StringMapping() are possible. Such entity reference mappings could be overridden. This is reusing a stored key object, in an effort to avoid heap activity. Unfortunately, that introduces a threading risk. The simplest fix for now is to make it a synchronized method, or to give up the reuse; I see very little performance difference between them. A long-term solution would be to replace the hashtable with a sparse array keyed directly from the character's integer value; see DTM's string pool for a related solution.
|
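The trade-off described here, reusing one mutable key object to avoid allocating per lookup and then guarding it with synchronization, can be sketched as follows. This is an illustration under assumptions rather than the class's actual code: `CharLookupSketch`, its nested `Key` class, and the `define` and `lookup` methods are hypothetical names, with `Key` loosely modeled on the CharKey class documented further down.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a char-to-String lookup that reuses one key object.
final class CharLookupSketch {
    // Mutable key wrapping a single char.
    static final class Key {
        private char c;

        Key() {}

        Key(char c) {
            this.c = c;
        }

        void setChar(char c) {
            this.c = c;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).c == c;
        }

        @Override
        public int hashCode() {
            return c;
        }
    }

    private final Map<Key, String> charToString = new HashMap<>();

    // Shared scratch key: only ever used for lookups, never stored in the map.
    private final Key reusableKey = new Key();

    // Stored keys get their own instance so later mutation cannot corrupt the map.
    synchronized void define(String outputString, char inputChar) {
        charToString.put(new Key(inputChar), outputString);
    }

    // synchronized because reusableKey is shared mutable state.
    synchronized String lookup(char value) {
        reusableKey.setChar(value);
        return charToString.get(reusableKey);
    }
}
```

Dropping `synchronized` would let two threads overwrite the shared key between `setChar` and `get`, which is the threading risk the description mentions; allocating a fresh key per lookup avoids the lock at the cost of extra heap activity.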
isSpecialAttrChar | back to summary |
---|---|
pack-priv final boolean isSpecialAttrChar(int value) Tell if the character argument that is from an attribute value should have special treatment.
|
isSpecialTextChar | back to summary |
---|---|
pack-priv final boolean isSpecialTextChar(int value) Tell if the character argument that is from a text node should have special treatment.
|
isTextASCIIClean | back to summary |
---|---|
pack-priv final boolean isTextASCIIClean(int value) This method is used to determine if an ASCII character in a text node (not an attribute value) is "clean".
|
set | back to summary |
---|---|
private final void set(int i) Adds the integer (character) to the set of integers.
|
setASCIIclean | back to summary |
---|---|
private void setASCIIclean(int j) If the character is a printable ASCII character then mark it as clean and not needing replacement with a String on output. |
setASCIIdirty | back to summary |
---|---|
private void setASCIIdirty(int j) If the character is a printable ASCII character then mark it as not clean and needing replacement with a String on output. |
Modifier and Type | Field and Description |
---|---|
private char | m_char
String value |
Access | Constructor and Description |
---|---|
public | CharKey(char key)
Constructor CharKey |
public | CharKey()
Default constructor for a CharKey. |
Modifier and Type | Method and Description |
---|---|
public final boolean | equals(Object obj)
Override of equals() for this object. Parameters: obj - the object to compare to. Returns: true if this object equals this string value. |
public final int | hashCode()
Get the hash value of the character. Returns: hash value of the character. |
public final void | setChar(char c)
Set the character value of this key. |
m_char | back to summary |
---|---|
private char m_char String value |
CharKey | back to summary |
---|---|
public CharKey(char key) Constructor CharKey
|
CharKey | back to summary |
---|---|
public CharKey() Default constructor for a CharKey. |
equals | back to summary |
---|---|
public final boolean equals(Object obj) Overrides java.lang.Object.equals. Override of equals() for this object
|
hashCode | back to summary |
---|---|
public final int hashCode() Overrides java.lang.Object.hashCode. Get the hash value of the character.
|
setChar | back to summary |
---|---|
public final void setChar(char c) Set the character value of this key.
|