The DOM2DTM class serves up a DOM's contents via the
DTM API.
Note that it doesn't necessarily represent a full Document
tree. You can wrap a DOM2DTM around a specific node and its subtree
and the right things should happen. (I don't _think_ we currently
support DocumentFragment nodes as roots, though that might be worth
considering.)
Note too that we do not currently attempt to track document
mutation. If you alter the DOM after wrapping DOM2DTM around it,
all bets are off.
Modifier and Type | Class and Description |
---|---|
public static interface | DOM2DTM.CharacterNodeHandler |
Modifier and Type | Field and Description |
---|---|
pack-priv static final boolean | JJK_DEBUG |
pack-priv static final boolean | JJK_NEWCODE |
private int | m_last_kid
The current position in the DTM tree. |
private int | m_last_parent
The current position in the DTM tree. |
protected List | m_nodes
The node objects. |
private transient boolean | m_nodesAreProcessed
true if ALL the nodes in the m_root subtree have been processed; false if our incremental build has not yet finished scanning the DOM tree. |
private transient Node | m_pos
The current position in the DOM tree. |
pack-priv boolean | m_processedFirstElement
True iff the first element has been processed. |
private transient Node | m_root
The top of the subtree. |
pack-priv TreeWalker | m_walker |
pack-priv static final String | NAMESPACE_DECL_NS
Manifest constant |
Access | Constructor and Description |
---|---|
public | DOM2DTM(DTMManager mgr, DOMSource domSource, int dtmIdentity, DTMWSFilter whiteSpaceFilter, XMLStringFactory xstringfactory, boolean doIndexing) Construct a DOM2DTM object from a DOM node. Parameters: mgr: The DTMManager who owns this DTM; domSource: the DOM source that this DTM will wrap; dtmIdentity: The DTM identity ID for this DTM; whiteSpaceFilter: The white space filter for this DTM, which may be null; xstringfactory: XMLString factory for creating character content; doIndexing: true if the caller considers it worth it to use indexing schemes. |
Modifier and Type | Method and Description |
---|---|
protected int | addNode(Node node, int parentIndex, int previousSibling, int forceNodeType) Construct the node map from the node. Parameters: node: The node that is to be added to the DTM; parentIndex: The current parent index; previousSibling: The previous sibling index; forceNodeType: If not DTM.NULL, overrides the DOM node type. Used to force nodes to Text rather than CDATASection when their coalesced value includes ordinary Text nodes (current DTM behavior). Returns: The index identity of the node that was added. |
public void | dispatchCharactersEvents(int nodeHandle, ContentHandler ch, boolean normalize) Implements abstract com. Implements com. Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). Parameters: nodeHandle: The node ID; ch: A non-null reference to a ContentHandler; normalize: true if the content should be normalized according to the rules for the XPath normalize-space function. |
protected static void | dispatchNodeData(Node node, ContentHandler ch, int depth) Retrieve the text content of a DOM subtree, dispatching it to a user-supplied ContentHandler. Parameters: node: Node whose subtree is to be walked, gathering the contents of all Text or CDATASection nodes. |
public void | dispatchToEvents(int nodeHandle, ContentHandler ch) Implements abstract com. Implements com. Directly create SAX parser events from a subtree. Parameters: nodeHandle: The node ID; ch: A non-null reference to a ContentHandler. |
public int | getAttributeNode(int nodeHandle, String namespaceURI, String name) Implements abstract com. Implements com. Retrieves an attribute node by qualified name and namespace URI. Parameters: nodeHandle: Handle of the node upon which to look up this attribute; namespaceURI: The namespace URI of the attribute to retrieve, or null; name: The local name of the attribute to retrieve. Returns: The attribute node handle with the specified name (nodeName) or DTM.NULL if there is no such attribute. |
public ContentHandler | getContentHandler() Implements com. getContentHandler returns "our SAX builder" -- the thing that someone else should send SAX events to in order to extend this DTM model. Returns: null if this model doesn't respond to SAX events, "this" if the DTM object has a built-in SAX ContentHandler, the IncrementalSAXSource if we're bound to one and should receive the SAX stream via it for incremental build purposes... |
public DeclHandler | getDeclHandler() Implements com. Return this DTM's DeclHandler. Returns: null if this model doesn't respond to SAX Decl events. |
public String | getDocumentTypeDeclarationPublicIdentifier() Implements abstract com. Implements com. Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML]. Returns: the public identifier String object, or null if there is none. |
public String | getDocumentTypeDeclarationSystemIdentifier() Implements abstract com. Implements com. A document type declaration information item has the following properties: 1. [system identifier] The system identifier of the external subset, if it exists. Returns: the system identifier String object, or null if there is none. |
public DTDHandler | getDTDHandler() Implements com. Return this DTM's DTDHandler. Returns: null if this model doesn't respond to SAX dtd events. |
public int | getElementById(String elementId) Implements abstract com. Implements com. Returns the handle of the element whose unique id value matches elementId. Parameters: elementId: The unique id value for an element. Returns: The handle of the matching element. |
public EntityResolver | getEntityResolver() Implements com. Return this DTM's EntityResolver. Returns: null if this model doesn't respond to SAX entity ref events. |
public ErrorHandler | getErrorHandler() Implements com. Return this DTM's ErrorHandler. Returns: null if this model doesn't respond to SAX error events. |
private int | getHandleFromNode(Node node) Get the handle from a Node. Parameters: node: A node, which may be null. Returns: The node handle or DTM.NULL. |
public int | getHandleOfNode(Node node) Get the handle from a Node. Parameters: node: A node, which may be null. Returns: The node handle or DTM.NULL. |
public LexicalHandler | getLexicalHandler() Implements com. Return this DTM's lexical handler. Returns: null if this model doesn't respond to lexical SAX events, "this" if the DTM object has a built-in SAX ContentHandler, the IncrementalSAXSource if we're bound to one and should receive the SAX stream via it for incremental build purposes... |
public String | getLocalName(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its XPath-style localname. Parameters: nodeHandle: the id of the node. Returns: String Local name of this node. |
public String | getNamespaceURI(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its DOM-style namespace URI (As defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to.) %REVIEW% Null or ""? Parameters: nodeHandle: the id of the node. Returns: String URI value of this node's namespace, or null if no namespace was resolved. |
protected int | getNextNodeIdentity(int identity) Implements abstract com. Get the next node identity value in the list, and call the iterator if it hasn't been added yet. Parameters: identity: The node identity (index). Returns: identity+1, or DTM.NULL. |
public Node | getNode(int nodeHandle) Overrides com. Implements com. Return a DOM node for the given node. Parameters: nodeHandle: The node ID. Returns: A node representation of the DTM node. |
protected static void | getNodeData(Node node, FastStringBuffer buf) Retrieve the text content of a DOM subtree, appending it into a user-supplied FastStringBuffer object. Parameters: node: Node whose subtree is to be walked, gathering the contents of all Text or CDATASection nodes; buf: FastStringBuffer into which the contents of the text nodes are to be concatenated. |
public String | getNodeName(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its DOM-style node name. Parameters: nodeHandle: the id of the node. Returns: String Name of this node, which may be an empty string. %REVIEW% Document when empty string is possible... %REVIEW-COMMENT% It should never be empty, should it? |
public String | getNodeNameX(int nodeHandle) Overrides com. Implements com. Given a node handle, return the XPath node name. Parameters: nodeHandle: the id of the node. Returns: String Name of this node, which may be an empty string. |
public String | getNodeValue(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its node value. Parameters: nodeHandle: The node id. Returns: String Value of this node, or null if not meaningful for this node type. |
public int | getNumberOfNodes()
Implements abstract com. Get the number of nodes that have been added. |
public String | getPrefix(int nodeHandle) Implements abstract com. Implements com. Given a namespace handle, return the prefix that the namespace decl is mapping. Parameters: nodeHandle: the id of the node. Returns: String prefix of this node's name, or "" if no explicit namespace prefix was given. |
public SourceLocator | getSourceLocatorFor(int node) Implements com. No source information is available for DOM2DTM, so return null. Parameters: node: an int value. Returns: null. |
public XMLString | getStringValue(int nodeHandle) Implements abstract com. Implements com. Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). Parameters: nodeHandle: The node ID. Returns: A string object that represents the string-value of the given node. |
public String | getUnparsedEntityURI(String name) Implements abstract com. Implements com. The getUnparsedEntityURI function returns the URI of the unparsed entity with the specified name in the same document as the context node (see [3.3 Unparsed Entities]). Parameters: name: A string containing the Entity Name of the unparsed entity. Returns: String containing the URI of the Unparsed Entity, or an empty string if no such entity exists. |
public boolean | isAttributeSpecified(int attributeHandle) Implements abstract com. Implements com. 5. [specified] A flag indicating whether this attribute was actually specified in the start-tag of its element, or was defaulted from the DTD. Parameters: attributeHandle: the attribute handle. Returns: true if the attribute was specified; false if it was defaulted. |
private static boolean | isSpace(char ch) Returns whether the specified ch conforms to the XML 1.0 definition of whitespace. Parameters: ch: Character to check as XML whitespace. Returns: true if ch is XML whitespace; otherwise false. |
public boolean | isWhitespace(int nodeHandle) Determine if the string-value of a node is whitespace. Parameters: nodeHandle: The node Handle. Returns: true if the given node is whitespace. |
private Node | logicalNextDOMTextNode(Node n)
Utility function: Given a DOM Text node, determine whether it is logically followed by another Text or CDATASection node. |
protected Node | lookupNode(int nodeIdentity) Get a Node from an identity index. |
public boolean | needsTwoThreads() Implements com. Returns: true iff we're building this model incrementally (eg we're partnered with an IncrementalSAXSource) and thus require that the transformation and the parse run simultaneously. Guidance to the DTMManager. |
protected boolean | nextNode() Implements abstract com. This method iterates to the next node that will be added to the table. Returns: true if a next node is found or false if there are no more nodes. |
public void | setIncrementalSAXSource(IncrementalSAXSource source) Bind an IncrementalSAXSource to this DTM. Parameters: source: The IncrementalSAXSource that we want to receive events from on demand. |
public void | setProperty(String property, Object value) Implements com. For the moment all the run time properties are ignored by this class. Parameters: property: a String value; value: an Object value. |
JJK_DEBUG | back to summary |
---|---|
pack-priv static final boolean JJK_DEBUG Hides com. |
JJK_NEWCODE | back to summary |
---|---|
pack-priv static final boolean JJK_NEWCODE |
m_last_kid | back to summary |
---|---|
private int m_last_kid The current position in the DTM tree. This is the node that new children reference as their previous sibling. |
m_last_parent | back to summary |
---|---|
private int m_last_parent The current position in the DTM tree. This is the node that new children get appended to. |
m_nodes | back to summary |
---|---|
protected List<Node> m_nodes The node objects. The instance part of the handle indexes directly into this vector. Each DTM node may actually be composed of several DOM nodes (for example, if logically-adjacent Text/CDATASection nodes in the DOM have been coalesced into a single DTM Text node); this table points only to the first in that sequence. |
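
The coalescing rule described for m_nodes can be illustrated with plain DOM calls. The following is a minimal sketch, not the DOM2DTM implementation: the class name `TextCoalescingSketch` and its methods are hypothetical, it only looks at direct siblings, and it does not traverse into entity references the way logicalNextDOMTextNode is described as doing.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration class; not part of the DTM API.
final class TextCoalescingSketch {
    static boolean isTextLike(Node n) {
        short t = n.getNodeType();
        return t == Node.TEXT_NODE || t == Node.CDATA_SECTION_NODE;
    }

    // Lists the children of parent, merging each run of adjacent
    // Text/CDATASection siblings into one logical "#text:" entry.
    static List<String> coalesceChildren(Node parent) {
        List<String> out = new ArrayList<>();
        Node child = parent.getFirstChild();
        while (child != null) {
            if (isTextLike(child)) {
                StringBuilder run = new StringBuilder();
                while (child != null && isTextLike(child)) {
                    run.append(child.getNodeValue());
                    child = child.getNextSibling();
                }
                out.add("#text:" + run);
            } else {
                out.add(child.getNodeName());
                child = child.getNextSibling();
            }
        }
        return out;
    }

    // Builds <p>a<![CDATA[<b>]]><br/>c</p> programmatically.
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            doc.appendChild(p);
            p.appendChild(doc.createTextNode("a"));
            p.appendChild(doc.createCDATASection("<b>"));
            p.appendChild(doc.createElement("br"));
            p.appendChild(doc.createTextNode("c"));
            return p;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(coalesceChildren(sample()));
    }
}
```

In this sketch four DOM children collapse to three logical entries, which mirrors why a single DTM Text node may stand for several DOM nodes.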
m_nodesAreProcessed | back to summary |
---|---|
private transient boolean m_nodesAreProcessed true if ALL the nodes in the m_root subtree have been processed; false if our incremental build has not yet finished scanning the DOM tree. |
m_pos | back to summary |
---|---|
private transient Node m_pos The current position in the DOM tree. Last node examined for possible copying to DTM. |
m_processedFirstElement | back to summary |
---|---|
pack-priv boolean m_processedFirstElement True iff the first element has been processed. This is used to control synthesis of the implied xml: namespace declaration node. |
m_root | back to summary |
---|---|
private transient Node m_root The top of the subtree. %REVIEW%: 'may not be the same as m_context if "//foo" pattern.' |
m_walker | back to summary |
---|---|
pack-priv TreeWalker m_walker |
NAMESPACE_DECL_NS | back to summary |
---|---|
pack-priv static final String NAMESPACE_DECL_NS Manifest constant |
DOM2DTM | back to summary |
---|---|
public DOM2DTM(DTMManager mgr, DOMSource domSource, int dtmIdentity, DTMWSFilter whiteSpaceFilter, XMLStringFactory xstringfactory, boolean doIndexing) Construct a DOM2DTM object from a DOM node.
|
addNode | back to summary |
---|---|
protected int addNode(Node node, int parentIndex, int previousSibling, int forceNodeType) Construct the node map from the node.
|
dispatchCharactersEvents | back to summary |
---|---|
public void dispatchCharactersEvents(int nodeHandle, ContentHandler ch, boolean normalize) throws SAXException Implements abstract com. Implements com. Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). Multiple calls to the ContentHandler's characters methods may well occur for a single call to this method.
|
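
The normalize flag refers to the XPath normalize-space rules: leading and trailing whitespace is stripped and each internal run of whitespace becomes a single space. A minimal sketch of that behavior, dispatched to a SAX ContentHandler, might look as follows. The class `NormalizeDispatchSketch` and its methods are hypothetical, and the sketch works on an already computed string value rather than on a DTM node handle.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of the DTM API.
final class NormalizeDispatchSketch {
    static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // XPath normalize-space: trim, then collapse whitespace runs to one space.
    static String normalizeSpace(String s) {
        StringBuilder out = new StringBuilder(s.length());
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (isXmlSpace(c)) {
                // Only remember a space once some content has been emitted.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) out.append(' ');
                pendingSpace = false;
                out.append(c);
            }
        }
        return out.toString();
    }

    // Sends the (optionally normalized) value to the handler in one call.
    static void dispatch(String value, ContentHandler ch, boolean normalize) throws SAXException {
        String out = normalize ? normalizeSpace(value) : value;
        ch.characters(out.toCharArray(), 0, out.length());
    }

    // Convenience wrapper that collects what the handler received.
    static String collect(String value, boolean normalize) {
        StringBuilder sb = new StringBuilder();
        DefaultHandler h = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                sb.append(ch, start, length);
            }
        };
        try {
            dispatch(value, h, normalize);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + collect("  a \n b  ", true) + "]");
    }
}
```

A real implementation may issue several characters calls per node, as the description notes; the single call here is a simplification.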
dispatchNodeData | back to summary |
---|---|
protected static void dispatchNodeData(Node node, ContentHandler ch, int depth) throws SAXException Retrieve the text content of a DOM subtree, dispatching it to a user-supplied ContentHandler. Note that attributes are not considered part of the content of an element. There are open questions regarding whitespace stripping. Currently we make no special effort in that regard, since the standard DOM doesn't yet provide DTD-based information to distinguish whitespace-in-element-context from genuine #PCDATA. Note that we should probably also consider xml:space if/when we address this. DOM Level 3 may solve the problem for us. %REVIEW% Note that as a DOM-level operation, it can be argued that this routine _shouldn't_ perform any processing beyond what the DOM already does, and that whitespace stripping and so on belong at the DTM level. If you want a stripped DOM view, wrap DTM2DOM around DOM2DTM.
|
dispatchToEvents | back to summary |
---|---|
public void dispatchToEvents(int nodeHandle, ContentHandler ch) throws SAXException Implements abstract com. Implements com. Directly create SAX parser events from a subtree.
|
getAttributeNode | back to summary |
---|---|
public int getAttributeNode(int nodeHandle, String namespaceURI, String name) Implements abstract com. Implements com. Retrieves an attribute node by qualified name and namespace URI.
|
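
At the DOM level, the equivalent lookup by namespace URI and local name is Element.getAttributeNodeNS. The sketch below uses only that standard call; the class `AttributeLookupSketch` is hypothetical, and it returns a DOM Attr (or null) where the DTM method returns a node handle (or DTM.NULL).

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration class; not part of the DTM API.
final class AttributeLookupSketch {
    // Returns the matching attribute, or null when there is none.
    static Attr findAttribute(Element e, String namespaceURI, String localName) {
        return e.getAttributeNodeNS(namespaceURI, localName);
    }

    // Builds <ex:item ex:id="42" id="7"/> with ex bound to urn:example.
    static Element sample() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder().newDocument();
            Element e = doc.createElementNS("urn:example", "ex:item");
            doc.appendChild(e);
            e.setAttributeNS("urn:example", "ex:id", "42");
            e.setAttributeNS(null, "id", "7");
            return e;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Element e = sample();
        System.out.println(findAttribute(e, "urn:example", "id").getValue());
        System.out.println(findAttribute(e, null, "id").getValue());
        System.out.println(findAttribute(e, "urn:other", "id"));
    }
}
```

Two attributes can share a local name as long as their namespace URIs differ, which is why the lookup takes both arguments.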
getContentHandler | back to summary |
---|---|
public ContentHandler getContentHandler() Implements com. getContentHandler returns "our SAX builder" -- the thing that someone else should send SAX events to in order to extend this DTM model.
|
getDeclHandler | back to summary |
---|---|
public DeclHandler getDeclHandler() Implements com. Return this DTM's DeclHandler.
|
getDocumentTypeDeclarationPublicIdentifier | back to summary |
---|---|
public String getDocumentTypeDeclarationPublicIdentifier() Implements abstract com. Implements com. Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML]. If there is no external subset or if it has no public identifier, this property has no value.
|
getDocumentTypeDeclarationSystemIdentifier | back to summary |
---|---|
public String getDocumentTypeDeclarationSystemIdentifier() Implements abstract com. Implements com. A document type declaration information item has the following properties: 1. [system identifier] The system identifier of the external subset, if it exists. Otherwise this property has no value.
|
getDTDHandler | back to summary |
---|---|
public DTDHandler getDTDHandler() Implements com. Return this DTM's DTDHandler.
|
getElementById | back to summary |
---|---|
public int getElementById(String elementId) Implements abstract com. Implements com. Returns the handle of the element whose unique id value matches elementId. %REVIEW% Presumably IDs are still scoped to a single document, and this operation searches only within a single document, right? Wouldn't want collisions between DTMs in the same process.
|
getEntityResolver | back to summary |
---|---|
public EntityResolver getEntityResolver() Implements com. Return this DTM's EntityResolver.
|
getErrorHandler | back to summary |
---|---|
public ErrorHandler getErrorHandler() Implements com. Return this DTM's ErrorHandler.
|
getHandleFromNode | back to summary |
---|---|
private int getHandleFromNode(Node node) Get the handle from a Node. %OPT% This will be pretty slow. %OPT% An XPath-like search (walk up DOM to root, tracking path; walk down DTM reconstructing path) might be considerably faster on later nodes in large documents. That might also imply improving this call to handle nodes which would be in this DTM but have not yet been built, which might or might not be a Good Thing. %REVIEW% This relies on being able to test node-identity via object-identity. DTM2DOM proxying is a great example of a case where that doesn't work. DOM Level 3 will provide the isSameNode() method to fix that, but until then this is going to be flaky.
|
getHandleOfNode | back to summary |
---|---|
public int getHandleOfNode(Node node) Get the handle from a Node. This is a more robust version of getHandleFromNode, intended to be usable by the public. %OPT% This will be pretty slow. %REVIEW% This relies on being able to test node-identity via object-identity. DTM2DOM proxying is a great example of a case where that doesn't work. DOM Level 3 will provide the isSameNode() method to fix that, but until then this is going to be flaky.
|
getLexicalHandler | back to summary |
---|---|
public LexicalHandler getLexicalHandler() Implements com. Return this DTM's lexical handler. %REVIEW% Should this return null if construction already done/begun?
|
getLocalName | back to summary |
---|---|
public String getLocalName(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its XPath-style localname. (As defined in Namespaces, this is the portion of the name after any colon character).
|
getNamespaceURI | back to summary |
---|---|
public String getNamespaceURI(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its DOM-style namespace URI (As defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to.) %REVIEW% Null or ""? -sb
|
getNextNodeIdentity | back to summary |
---|---|
protected int getNextNodeIdentity(int identity) Implements abstract com. Get the next node identity value in the list, and call the iterator if it hasn't been added yet.
|
getNode | back to summary |
---|---|
public Node getNode(int nodeHandle) Overrides com. Implements com. Return a DOM node for the given node.
|
getNodeData | back to summary |
---|---|
protected static void getNodeData(Node node, FastStringBuffer buf) Retrieve the text content of a DOM subtree, appending it into a user-supplied FastStringBuffer object. Note that attributes are not considered part of the content of an element. There are open questions regarding whitespace stripping. Currently we make no special effort in that regard, since the standard DOM doesn't yet provide DTD-based information to distinguish whitespace-in-element-context from genuine #PCDATA. Note that we should probably also consider xml:space if/when we address this. DOM Level 3 may solve the problem for us. %REVIEW% Actually, since this method operates on the DOM side of the fence rather than the DTM side, it SHOULDN'T do any special handling. The DOM does what the DOM does; if you want DTM-level abstractions, use DTM-level methods.
|
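
The subtree walk described for getNodeData can be sketched with the standard DOM API. This is an illustration under stated assumptions rather than the actual method: the class `NodeDataSketch` is hypothetical, and a java.lang.StringBuilder stands in for FastStringBuffer.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration class; StringBuilder replaces FastStringBuffer.
final class NodeDataSketch {
    // Appends the text of every Text/CDATASection node under node, in document order.
    static void getNodeData(Node node, StringBuilder buf) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
            case Node.ELEMENT_NODE:
            case Node.ENTITY_REFERENCE_NODE:
                // Attributes are not children, so they are never visited here.
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                    getNodeData(c, buf);
                }
                break;
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                buf.append(node.getNodeValue());
                break;
            default:
                // Comments and processing instructions contribute nothing.
                break;
        }
    }

    static String textOf(Node node) {
        StringBuilder buf = new StringBuilder();
        getNodeData(node, buf);
        return buf.toString();
    }

    // Builds <p id="x">one<!--skip--><b>two</b></p>.
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            doc.appendChild(p);
            p.setAttribute("id", "x");
            p.appendChild(doc.createTextNode("one"));
            p.appendChild(doc.createComment("skip"));
            Element b = doc.createElement("b");
            b.appendChild(doc.createTextNode("two"));
            p.appendChild(b);
            return p;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf(sample()));
    }
}
```

No whitespace stripping is attempted, in line with the note that such handling belongs on the DTM side.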
getNodeName | back to summary |
---|---|
public String getNodeName(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its DOM-style node name. This will include names such as #text or #document.
|
getNodeNameX | back to summary |
---|---|
public String getNodeNameX(int nodeHandle) Overrides com. Implements com. Given a node handle, return the XPath node name. This should be the name as described by the XPath data model, NOT the DOM-style name.
|
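
The difference between the DOM-style name and the XPath-style name can be shown on ordinary DOM nodes. The sketch below assumes the XPath 1.0 rule that text, comment, and root nodes have an empty name; the class `NodeNameSketch` is hypothetical and does not treat xmlns attributes as namespace nodes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration class; not part of the DTM API.
final class NodeNameSketch {
    // XPath-style name: empty for node kinds that have no expanded name.
    static String xpathName(Node n) {
        switch (n.getNodeType()) {
            case Node.ELEMENT_NODE:
            case Node.ATTRIBUTE_NODE:
            case Node.PROCESSING_INSTRUCTION_NODE: // name is the PI target
                return n.getNodeName();
            default:
                return "";
        }
    }

    // Builds <p>hi</p> and returns the element.
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            doc.appendChild(p);
            p.appendChild(doc.createTextNode("hi"));
            return p;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element p = sample();
        Node text = p.getFirstChild();
        System.out.println(text.getNodeName() + " vs [" + xpathName(text) + "]");
        System.out.println(p.getNodeName() + " vs [" + xpathName(p) + "]");
    }
}
```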
getNodeValue | back to summary |
---|---|
public String getNodeValue(int nodeHandle) Implements abstract com. Implements com. Given a node handle, return its node value. This is mostly as defined by the DOM, but may ignore some conveniences.
|
getNumberOfNodes | back to summary |
---|---|
public int getNumberOfNodes() Implements abstract com. Get the number of nodes that have been added.
|
getPrefix | back to summary |
---|---|
public String getPrefix(int nodeHandle) Implements abstract com. Implements com. Given a namespace handle, return the prefix that the namespace decl is mapping. Given a node handle, return the prefix used to map to the namespace. %REVIEW% Are you sure you want "" for no prefix? %REVIEW-COMMENT% I think so... not totally sure. -sb
|
getSourceLocatorFor | back to summary |
---|---|
public SourceLocator getSourceLocatorFor(int node) Implements com. No source information is available for DOM2DTM, so return null.
|
getStringValue | back to summary |
---|---|
public XMLString getStringValue(int nodeHandle) Implements abstract com. Implements com. Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
|
getUnparsedEntityURI | back to summary |
---|---|
public String getUnparsedEntityURI(String name) Implements abstract com. Implements com. The getUnparsedEntityURI function returns the URI of the unparsed entity with the specified name in the same document as the context node (see [3.3 Unparsed Entities]). It returns the empty string if there is no such entity. XML processors may choose to use the System Identifier (if one is provided) to resolve the entity, rather than the URI in the Public Identifier. The details are dependent on the processor, and we would have to support some form of plug-in resolver to handle this properly. Currently, we simply return the System Identifier if present, and hope that it is a usable URI or that our caller can map it to one. Todo: Resolve Public Identifiers, or consider changing the function name. If we find a relative URI reference, XML expects it to be resolved in terms of the base URI of the document. The DOM doesn't do that for us, and it isn't entirely clear whether that should be done here; currently that's pushed up to a higher level of our application. (Note that DOM Level 1 didn't store the document's base URI.) Todo: Consider resolving relative URIs. (The DOM's statement that "An XML processor may choose to completely expand entities before the structure model is passed to the DOM" refers only to parsed entities, not unparsed, and hence doesn't affect this function.) |
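
The "return the System Identifier if present" behavior can be approximated with the standard DOM Entity interface. This is a sketch under assumptions, not the DOM2DTM code: the class `UnparsedEntitySketch` is hypothetical, and it treats an entity as unparsed when Entity.getNotationName() is non-null. No relative URI resolution is attempted, matching the description above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Entity;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the DTM API.
final class UnparsedEntitySketch {
    // Returns the system identifier of the named unparsed entity, or "" if none.
    static String unparsedEntityUri(Document doc, String name) {
        DocumentType doctype = doc.getDoctype();
        if (doctype == null) return "";
        Node n = doctype.getEntities().getNamedItem(name);
        if (!(n instanceof Entity)) return "";
        Entity entity = (Entity) n;
        // A null notation name means a parsed entity, which is out of scope.
        if (entity.getNotationName() == null) return "";
        String systemId = entity.getSystemId();
        return systemId != null ? systemId : "";
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Declares one unparsed entity (logo) and one parsed entity (txt).
    static Document sample() {
        return parse("<!DOCTYPE doc ["
                + "<!NOTATION gif SYSTEM \"image/gif\">"
                + "<!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>"
                + "<!ENTITY txt \"hello\">"
                + "]><doc/>");
    }

    public static void main(String[] args) {
        Document doc = sample();
        System.out.println(unparsedEntityUri(doc, "logo"));
        System.out.println("[" + unparsedEntityUri(doc, "missing") + "]");
    }
}
```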
isAttributeSpecified | back to summary |
---|---|
public boolean isAttributeSpecified(int attributeHandle) Implements abstract com. Implements com. 5. [specified] A flag indicating whether this attribute was actually specified in the start-tag of its element, or was defaulted from the DTD.
|
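
On the DOM side, the [specified] flag is exposed as Attr.getSpecified(). The following sketch parses a small document whose internal DTD subset supplies a default attribute value; the class `AttributeSpecifiedSketch` is hypothetical, and the example relies on the parser applying attribute defaults from the internal subset.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the DTM API.
final class AttributeSpecifiedSketch {
    // True if the attribute was written in the start-tag, false if defaulted from the DTD.
    static boolean isSpecified(Document doc, String attrName) {
        Attr attr = doc.getDocumentElement().getAttributeNode(attrName);
        if (attr == null) {
            throw new IllegalArgumentException("no such attribute: " + attrName);
        }
        return attr.getSpecified();
    }

    // Attribute a is defaulted by the DTD; attribute b appears in the start-tag.
    static Document sample() {
        String xml = "<!DOCTYPE doc [<!ATTLIST doc a CDATA \"dflt\">]><doc b=\"x\"/>";
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sample();
        System.out.println("a specified: " + isSpecified(doc, "a"));
        System.out.println("b specified: " + isSpecified(doc, "b"));
    }
}
```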
isSpace | back to summary |
---|---|
private static boolean isSpace(char ch) Returns whether the specified ch conforms to the XML 1.0 definition of whitespace. Refer to the definition of S in the XML 1.0 specification for details.
|
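
XML 1.0 defines whitespace (production S) as the characters #x20, #x9, #xD, and #xA. A minimal sketch of the per-character check, plus a string-level check in the spirit of isWhitespace, is shown below. The class `XmlSpaceSketch` and its method names are hypothetical, and treating an empty string as all-whitespace is a choice made for this sketch only.

```java
// Hypothetical illustration class; not part of the DTM API.
final class XmlSpaceSketch {
    // True only for the four characters in the XML 1.0 S production.
    static boolean isXmlSpace(char ch) {
        return ch == 0x20 || ch == 0x09 || ch == 0x0D || ch == 0x0A;
    }

    // True if every character of s is XML whitespace.
    static boolean isAllXmlSpace(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (!isXmlSpace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isXmlSpace('\t'));
        System.out.println(isXmlSpace('\u00A0'));
        System.out.println(isAllXmlSpace(" \r\n"));
    }
}
```

Note that this set is narrower than Character.isWhitespace: a no-break space (U+00A0), for example, is not XML whitespace.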
isWhitespace | back to summary |
---|---|
public boolean isWhitespace(int nodeHandle) Determine if the string-value of a node is whitespace
|
logicalNextDOMTextNode | back to summary |
---|---|
private Node logicalNextDOMTextNode(Node n) Utility function: Given a DOM Text node, determine whether it is logically followed by another Text or CDATASection node. This may involve traversing into Entity References. %REVIEW% DOM Level 3 is expected to add functionality which may allow us to retire this. |
lookupNode | back to summary |
---|---|
protected Node lookupNode(int nodeIdentity) Get a Node from an identity index. |
needsTwoThreads | back to summary |
---|---|
public boolean needsTwoThreads() Implements com.
|
nextNode | back to summary |
---|---|
protected boolean nextNode() Implements abstract com. This method iterates to the next node that will be added to the table. Each call to this method adds a new node to the table, unless the end is reached, in which case it returns false.
|
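
The incremental contract of nextNode (one node added per call, false once the subtree is exhausted) can be sketched as a resumable document-order walk. This is a simplified illustration, not the DOM2DTM algorithm: the class `IncrementalWalkSketch` and its fields are hypothetical, and it skips attributes, namespace nodes, and text coalescing.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration class; not part of the DTM API.
final class IncrementalWalkSketch {
    private final Node root;   // top of the wrapped subtree
    private Node pos;          // last node copied; null before the first call
    private boolean done;      // true once the whole subtree has been scanned
    final List<Node> nodes = new ArrayList<>();

    IncrementalWalkSketch(Node root) {
        this.root = root;
    }

    // Adds the next node in document order; returns false when none remain.
    boolean nextNode() {
        if (done) return false;
        Node next = null;
        if (pos == null) {
            next = root;
        } else if (pos.getFirstChild() != null) {
            next = pos.getFirstChild();
        } else {
            // Climb until a following sibling exists, never leaving the subtree.
            for (Node n = pos; n != root; n = n.getParentNode()) {
                if (n.getNextSibling() != null) {
                    next = n.getNextSibling();
                    break;
                }
            }
        }
        if (next == null) {
            done = true;
            return false;
        }
        pos = next;
        nodes.add(next);
        return true;
    }

    static int countAll(Node root) {
        IncrementalWalkSketch walk = new IncrementalWalkSketch(root);
        while (walk.nextNode()) {
            // Each iteration adds exactly one node.
        }
        return walk.nodes.size();
    }

    // Builds <a><b>t</b><c/></a>.
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element a = doc.createElement("a");
            doc.appendChild(a);
            Element b = doc.createElement("b");
            b.appendChild(doc.createTextNode("t"));
            a.appendChild(b);
            a.appendChild(doc.createElement("c"));
            return a;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countAll(sample()));
    }
}
```

Because the cursor holds a live DOM reference between calls, changing the DOM partway through the walk would invalidate it, which is consistent with the warning that document mutation is not tracked.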
setIncrementalSAXSource | back to summary |
---|---|
public void setIncrementalSAXSource(IncrementalSAXSource source) Bind an IncrementalSAXSource to this DTM. NOT RELEVANT for DOM2DTM, since we're wrapped around an existing DOM.
|
setProperty | back to summary |
---|---|
public void setProperty(String property, Object value) Implements com. For the moment all the run time properties are ignored by this class. |
Modifier and Type | Method and Description |
---|---|
public void | characters(Node node) |
characters | back to summary |
---|---|
public void characters(Node node) throws SAXException |