To add to xafford's comment: when you iterate over a named node map with ->item() or foreach, removing an attribute with DOMElement::removeAttribute() or DOMElement::removeAttributeNode() alters the collection underneath the loop. The map is live, so the remaining attributes move down one position as soon as one is removed. To illustrate, the code below tries to remove all attributes from each element but only removes the first. One workaround is to copy the named node map into an array before removing attributes. Tested with PHP 5.2.9 on Windows XP.
<?php
error_reporting(E_ALL);
$html = '<h1 id="h1test" class="h1test">Heading</h1>';
$html .= '<p align="left" class="ptest">Hello World</p>';
$doc = new DOMDocument();
$doc->loadHTML($html);
// remove attributes from the h1 element
$h1 = $doc->getElementsByTagName('h1')->item(0);
$length = $h1->attributes->length;
for ($i = 0; $i < $length; ++$i) {
    $name = $h1->attributes->item($i)->name;
    $h1->removeAttribute($name);
    echo "h1: removed attribute `$name`<br>";
}
// remove attributes from the p element
$p = $doc->getElementsByTagName('p')->item(0);
foreach ($p->attributes as $name => $attrNode) {
    $p->removeAttribute($name);
    echo "p: removed attribute `$name`<br>";
}
?>
OUTPUT:
-------
h1: removed attribute `id`
Notice: Trying to get property of non-object in nodemap.php on line 13
h1: removed attribute ``
p: removed attribute `align`
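The array-copy workaround mentioned above could look roughly like this. It is only a sketch: it assumes iterator_to_array() is acceptable for taking the snapshot, and the $removed array is just a helper added here to show what happened.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<h1 id="h1test" class="h1test">Heading</h1>');
$h1 = $doc->getElementsByTagName('h1')->item(0);

// Snapshot the live map into a plain array (keys are the attribute names),
// so removals no longer affect what the loop iterates over.
$attrs = iterator_to_array($h1->attributes);

$removed = array(); // illustrative helper, not part of the original note
foreach ($attrs as $name => $attrNode) {
    $h1->removeAttribute($name);
    $removed[] = $name;
}

echo 'removed: ', implode(', ', $removed), "\n";
echo 'attributes left: ', $h1->attributes->length, "\n";
```

With the snapshot in place, both id and class are removed and the element ends up with no attributes.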
I stumbled upon a problem with DOMNamedNodeMap. If you use foreach to iterate over the DOMNamedNodeMap that holds the attributes of a DOMElement, and you call DOMElement::removeAttributeNode inside the loop, the loop ends after the first removal, so the remaining attributes are never visited.
Example (not complete):
<?php
/*
* Imagine you have a node like this:
* <a onclick="alert('evil')" href="http://example.com">evil</a>
* onclick should be removed, but href is never even tested.
*/
foreach ( $node->attributes as $attribute )
{
    echo 'checking attribute ', $attribute->name, '<br />';
    if ( ! in_array ( $attribute->name, $allowed_attributes ) )
    {
        $node->removeAttributeNode ( $attribute );
    }
}
?>
The output would be:
checking attribute onclick
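One possible way around this, as a rough sketch, is to walk the map by index from the end towards the start, so a removal never shifts an attribute that has not been checked yet. The markup and the contents of $allowed_attributes below are assumptions made up for the illustration.

```php
<?php
// Hypothetical whitelist and markup, chosen only for this illustration.
$allowed_attributes = array('href', 'title');

$doc = new DOMDocument();
$doc->loadHTML('<a onclick="alert(\'evil\')" href="http://example.com">evil</a>');
$node = $doc->getElementsByTagName('a')->item(0);

// Go from the last index down to 0: removing item $i only affects
// positions above $i, which have already been handled.
for ($i = $node->attributes->length - 1; $i >= 0; --$i) {
    $attribute = $node->attributes->item($i);
    if (!in_array($attribute->name, $allowed_attributes, true)) {
        $node->removeAttributeNode($attribute);
    }
}

var_dump($node->hasAttribute('onclick')); // bool(false)
var_dump($node->hasAttribute('href'));    // bool(true)
```

Every attribute gets tested this way; only onclick is dropped and href stays in place.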
You can iterate over an instance of DOMNamedNodeMap:
<?php
// suppose $node is a valid DOMElement object
// $node->attributes is a DOMNamedNodeMap for elements (it is NULL for other node types)
foreach ($node->attributes as $attrName => $attrNode) {
    // $attrName is the attribute name, $attrNode is the matching DOMAttr node
}
?>
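For a complete, runnable version of the same loop, something like the following should work. The sample markup and the $pairs array are assumptions added for the demonstration.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<p align="left" class="ptest">Hello World</p>');
$node = $doc->getElementsByTagName('p')->item(0);

$pairs = array(); // illustrative helper collecting name=value strings
if ($node->attributes !== null) {
    foreach ($node->attributes as $attrName => $attrNode) {
        // $attrNode is a DOMAttr; its text is available as ->value
        $pairs[] = $attrName . '=' . $attrNode->value;
    }
}

echo implode(' ', $pairs), "\n"; // align=left class=ptest
```

Read-only iteration like this is safe; the problems described in the other notes only appear when attributes are removed inside the loop.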