Element.getElementsByTagName Efficiency 2003-01-22 - By Joseph Kesselman
> what is the underlying architecture of the Element class and its impact
> on getElementsByTagName
Xerces is open source, and the DOM implementation isn't particularly complicated code; reading the code might be the best way to get the details you're looking for.
(I could discuss how it used to work -- basic tree-walk looking for matching nodes, plus caching for faster re-access, plus an extremely annoying cache flush to match the DOM's requirement that a NodeList be a "live view" -- but I haven't checked the implementation to see if it still works that way. I'd expect it does, though.)
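To make the "live view" requirement concrete, here is a minimal sketch using the standard org.w3c.dom interfaces through JAXP. It says nothing about how Xerces implements the cache internally. The class and method names are made up for illustration, and it assumes whatever DOM implementation your JAXP setup hands back.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class; not part of Xerces.
class LiveNodeListDemo {

    // Returns the list length before and after the tree is modified.
    static int[] lengthsBeforeAndAfterAppend() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        doc.appendChild(root);
        root.appendChild(doc.createElement("item"));

        // Obtain the list once, then keep using the same object.
        NodeList items = root.getElementsByTagName("item");
        int before = items.getLength();

        // Changing the tree must be visible through the existing list,
        // which is why an implementation cannot cache results blindly.
        root.appendChild(doc.createElement("item"));
        int after = items.getLength();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] r = lengthsBeforeAndAfterAppend();
        System.out.println(r[0] + " -> " + r[1]);
    }
}
```

The same NodeList object reports a different length after the append, so any cached search result has to be invalidated whenever the tree changes.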
You'll probably also want to look at the DOM Level 2 TreeWalker and NodeIterator as possible alternatives. With appropriate filtering they can perform the same kind of search, more efficiently if your filter understands how to skip irrelevant subtrees, and they avoid most or all of the cache/flush overhead.
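A rough sketch of the TreeWalker approach, using the DOM Level 2 Traversal interfaces in org.w3c.dom.traversal. The class name, method names, and the "skip" element name are hypothetical. The cast to DocumentTraversal assumes your DOM implementation supports the Traversal module (Xerces does; others may not, so check before relying on it).

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;

// Hypothetical helper class; not part of Xerces.
class TreeWalkerSearch {

    // Collects descendants of root named tag, never descending into
    // elements named skipTag.
    static List<Element> find(Element root, String tag, String skipTag) {
        // Assumption: the Document implements DocumentTraversal.
        DocumentTraversal traversal = (DocumentTraversal) root.getOwnerDocument();

        NodeFilter filter = node -> {
            if (node.getNodeName().equals(skipTag)) {
                // In a TreeWalker, REJECT prunes the node and its whole subtree.
                return NodeFilter.FILTER_REJECT;
            }
            if (node.getNodeName().equals(tag)) {
                return NodeFilter.FILTER_ACCEPT;
            }
            // SKIP passes over this node but still visits its children.
            return NodeFilter.FILTER_SKIP;
        };

        TreeWalker walker =
            traversal.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, filter, false);

        List<Element> hits = new ArrayList<>();
        for (Node n = walker.nextNode(); n != null; n = walker.nextNode()) {
            hits.add((Element) n);
        }
        return hits;
    }

    // Builds <root><item/><skip><item/></skip><group><item/></group></root>.
    static Document sample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        doc.appendChild(root);
        root.appendChild(doc.createElement("item"));
        Element skip = doc.createElement("skip");
        skip.appendChild(doc.createElement("item"));
        root.appendChild(skip);
        Element group = doc.createElement("group");
        group.appendChild(doc.createElement("item"));
        root.appendChild(group);
        return doc;
    }
}
```

Note that subtree pruning via FILTER_REJECT only works with a TreeWalker; a NodeIterator presents a flattened view and treats FILTER_REJECT the same as FILTER_SKIP. The result here is an ordinary snapshot list, so there is no live-view cache to maintain or flush.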
______________________________________
Joe Kesselman / IBM Research