Supports XPath 1.0
I just spent far too much time chasing this one....
When running an XPath query on a table, be careful with table-internal nodes (i.e. <tr></tr> and <td></td>). If the enclosing <table> tag is missing, query() (and likely evaluate() as well) will return unexpected results.
I had a DOMNode with a structure like this:
<td>
<table></table>
<table>
<tr>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
</table>
</td>
I was trying to run a relative query against it (e.g. <?php $xpath_obj->query('my/x/path', $relative_node); ?>).
Because the outer <td></td> had no enclosing <table>, the inner tag names were apparently treated as invalid, even though the nodes themselves were still recognized. That means the following query would work:
<?php $xpath_obj->query('*[2]/*[*[2]]', $relative_node); ?>
But as soon as I replaced any of the "*" tokens with the corresponding (and valid) "table", "tr", or "td" name, the query would inexplicably break.
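One way to debug a case like this is to run the wildcard and named forms side by side and print the paths the parser actually built. The sketch below is only an illustration: it assumes the fragment is loaded with loadXML(), which keeps the structure exactly as written, and the variable names are made up. With an HTML parser the resulting tree, and therefore the results, may differ.

```php
<?php
// Illustrative fragment, parsed as XML so the structure is kept as written.
$doc = new DOMDocument();
$doc->loadXML('<td><table/><table><tr><td/></tr><tr><td/><td/></tr></table></td>');
$xpath = new DOMXPath($doc);
$relativeNode = $doc->documentElement; // the outer <td>

// Wildcard and named forms of the same relative query.
$byWildcard = $xpath->query('*[2]/*[*[2]]', $relativeNode);
$byName     = $xpath->query('table[2]/tr[td[2]]', $relativeNode);

echo "wildcard matches: ", $byWildcard->length, "\n";
echo "named matches:    ", $byName->length, "\n";

// Print the path of every element below the context node,
// to see what the parser really produced.
foreach ($xpath->query('.//*', $relativeNode) as $node) {
    echo $node->getNodePath(), "\n";
}
```

If the two counts differ for your own document, the printed paths should show where the tree deviates from the markup you expected.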
When using DOMXPath on a document that has a default namespace, consider using an intermediate function that adds a prefix for the default namespace to all queries:
<?php
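// The document's default namespace (xmlns="http://...") is registered under the prefix "x"
$path = "/Book/Title";
// The pattern needs delimiters; "#" avoids having to escape the slash
$path = preg_replace('#/([a-zA-Z])#', '/x:$1', $path);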
// Result: /x:Book/x:Title
?>
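Put together, the idea might look like the sketch below. The helper name addDefaultPrefix(), the sample document, and the namespace URI http://example.com/ns are all made up for illustration; the prefix still has to be bound with registerNamespace() before the rewritten query can match anything.

```php
<?php
// Hypothetical helper: prefixes every element step that follows a "/" with $prefix.
function addDefaultPrefix(string $path, string $prefix = 'x'): string
{
    return preg_replace('#/([a-zA-Z])#', '/' . $prefix . ':$1', $path);
}

// Sample document with a default namespace (the URI is an assumption).
$doc = new DOMDocument();
$doc->loadXML('<Book xmlns="http://example.com/ns"><Title>Example</Title></Book>');

$xpath = new DOMXPath($doc);
// Bind the prefix "x" to the document's default namespace.
$xpath->registerNamespace('x', 'http://example.com/ns');

$query  = addDefaultPrefix('/Book/Title');
$titles = $xpath->query($query);

echo $query, "\n";
foreach ($titles as $title) {
    echo $title->nodeValue, "\n";
}
```

Note that a regex this simple also rewrites steps such as /text() or axis names like /child::Title, so it is only suitable for plain element paths.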
<?php
// To retrieve selected HTML data, try these DOMXPath examples:
$file = $_SERVER['DOCUMENT_ROOT'] . '/test.html';
$doc = new DOMDocument();
$doc->loadHTMLFile($file);
$xpath = new DOMXPath($doc);
// Example 1: every element that has an id attribute
//$elements = $xpath->query("//*[@id]");
// Example 2: the div with a given id, addressed by absolute path
//$elements = $xpath->query("/html/body/div[@id='yourTagIdHere']");
// Example 3: same as above, with wildcards in place of html and body
$elements = $xpath->query("/*/*/div[@id='yourTagIdHere']");
// query() returns false (not null) when the expression is invalid
if ($elements !== false) {
    foreach ($elements as $element) {
        echo "<br/>[" . $element->nodeName . "]";
        // Print the text content of each child node
        foreach ($element->childNodes as $node) {
            echo $node->nodeValue . "\n";
        }
    }
}
?>
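For a version that runs without a file on disk, the same kind of query can be tried against an inline HTML string. This is only a sketch: the markup and the id "main" are invented for illustration. It also uses evaluate(), which can return a scalar such as a count instead of a node list.

```php
<?php
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

// Made-up sample markup.
$html = '<html><body><div id="main"><p>Hello</p><p>World</p></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select the div by its id, wherever it sits in the tree.
$elements = $xpath->query("//div[@id='main']");

// Gather the text of each child node of every match.
$texts = [];
if ($elements !== false) {
    foreach ($elements as $element) {
        foreach ($element->childNodes as $node) {
            $texts[] = $node->nodeValue;
        }
    }
}
echo implode(", ", $texts), "\n";

// evaluate() returns a typed result where possible, here a number.
$idCount = $xpath->evaluate("count(//*[@id])");
echo "elements with an id: ", $idCount, "\n";
```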