I just want to add something to
"Append/combine/merge one simplexml to another":
the parent's attributes should be copied too.
<?php
// $copy_attributes is true only for the outermost call; the recursive calls pass false
function append_simplexml($simplexml_to, $simplexml_from, $copy_attributes = true)
{
    // Here we add the attributes of the source parent to the target parent
    if ($copy_attributes) {
        foreach ($simplexml_from->attributes() as $attr_key => $attr_value) {
            $simplexml_to->addAttribute($attr_key, (string) $attr_value);
        }
    }
    foreach ($simplexml_from->children() as $simplexml_child) {
        $simplexml_temp = $simplexml_to->addChild($simplexml_child->getName(), (string) $simplexml_child);
        foreach ($simplexml_child->attributes() as $attr_key => $attr_value) {
            $simplexml_temp->addAttribute($attr_key, (string) $attr_value);
        }
        // The child's attributes were copied just above, so skip that step when recursing
        append_simplexml($simplexml_temp, $simplexml_child, false);
    }
}
?>
I had a problem assigning an object to a variable that is used as an object later. Then I found the code above, which solved half of my problem. I think it should carry over the parent's name too, but it doesn't. Is there any better code?
SimpleXML is really nice for loading/converting XML data into native PHP data structures. I was considering crude searching of SVG for the width/height and since it really is an XML file... I found a SUPER easy method for parsing out the information I wanted:
<?php
// putting incomplete SVG data inline for readability
$RawXML = <<< XML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://www.example.com/dc"
xmlns:cc="http://www.example.com/cc"
xmlns:rdf="http://www.example.com/rdf"
xmlns:svg="http://www.example.com/svg"
xmlns="http://www.example.com/"
xmlns:sodipodi="http://www.example.com/sodipodi"
xmlns:inkscape="http://www.example.com/inkscape"
id="svg2"
sodipodi:version="0.32"
inkscape:version="0.46"
width="586.25"
height="743.75"
xml:space="preserve"
sodipodi:docname="some_svg_file.svg"
inkscape:output_extension="org.inkscape.output.svg.inkscape">
<!-- portions removed to keep this short -->
</svg>
XML;
$svg = new SimpleXMLElement($RawXML);
$height = $svg['height'];
$width = $svg['width'];
// "{$var}" is used here because the "${var}" form is deprecated as of PHP 8.2
echo "The width x height is: {$width} x {$height} \n";
?>
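The prefixed attributes in that SVG (sodipodi:docname and so on) are not reachable through plain array access. A minimal sketch of reading one, assuming a shortened version of the document above with the same placeholder namespace URLs; the variable names are only illustrative:

```php
<?php
// Shortened copy of the SVG header from the post above
$rawXml = '<svg xmlns="http://www.example.com/"'
    . ' xmlns:sodipodi="http://www.example.com/sodipodi"'
    . ' width="586.25" height="743.75"'
    . ' sodipodi:docname="some_svg_file.svg"/>';

$svg = new SimpleXMLElement($rawXml);

// Un-prefixed attributes: array access, then cast to the type you need
$width = (float) $svg['width'];

// Prefixed attributes: ask for the attributes of that namespace first
$sodipodi = $svg->attributes('http://www.example.com/sodipodi');
$docname  = (string) $sodipodi['docname'];

echo "{$docname} is {$width} wide\n";
```

The casts matter: without them $width and $docname would be SimpleXMLElement objects rather than a float and a string.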
Append/combine/merge one simplexml to another:
<?php
include 'example.php';
$el1 = new SimpleXMLElement($xmlstr);
$el2 = new SimpleXMLElement($xmlstr);
// Wrong way!
// PHP silently converts $el2->movie[0] to a string
$el1->addChild('movie', $el2->movie[0]);
// Wrong way!
// PHP 5.1 converts $el2->movie[0] to a string and generates Warning: It is not yet possible to assign complex types to properties. It is not possible to assign complex types to nodes
// PHP 5.2 silently converts $el2->movie[0] to a string
$el1->addChild('movie');
$el1->movie[2] = $el2->movie[0];
// Right way!
$el1_movie = $el1->addChild('movie');
append_simplexml($el1_movie, $el2->movie[0]);
echo "<pre>"; print_r($el1); echo "</pre>";
/**
* Add one simplexml to another
*
* @param object $simplexml_to
* @param object $simplexml_from
* @author Boris Korobkov
* @link http://www.ajaxforum.ru/
*/
// Objects are passed by handle, so the parameters do not need to be references
function append_simplexml($simplexml_to, $simplexml_from)
{
foreach ($simplexml_from->children() as $simplexml_child)
{
$simplexml_temp = $simplexml_to->addChild($simplexml_child->getName(), (string) $simplexml_child);
foreach ($simplexml_child->attributes() as $attr_key => $attr_value)
{
$simplexml_temp->addAttribute($attr_key, $attr_value);
}
append_simplexml($simplexml_temp, $simplexml_child);
}
}
?>
Just wanted to add a post on how to extract the value from a SimpleXMLElement. It's not as straightforward as you might think: because it's a complex object, you can't just access the element's value directly. Here is a sample of data that represents a var_dump of a SimpleXMLElement:
array(1) {
[0]=>
object(SimpleXMLElement)#13 (2) {
["@attributes"]=>
array(1) {
["name"]=>
string(5) "title"
}
[0]=>
string(19) "PHP Tech Book"
}
}
If you want to extract the title of the book you have to cast the specified element to a string like so.
$newTitle = (string) $title[0];
The $title variable is the SimpleXMLElement that you have extracted from the xml document using simplexml_load_string for instance. To initially access the title element from the xml document you can do like so, using xpath.
$title = $doc->xpath('str[@name="title"]');
Hope this helps someone out there.
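Putting the two steps above in order gives something like the following sketch. The XML string is a made-up stand-in for the real document (only the str element with name="title" comes from the dump above), and the isset() check is my own addition for the case where the XPath matches nothing:

```php
<?php
// Hypothetical document shaped like the var_dump above
$doc = simplexml_load_string(
    '<doc><str name="title">PHP Tech Book</str><str name="lang">en</str></doc>'
);

// xpath() always returns an array of SimpleXMLElement objects
$title = $doc->xpath('str[@name="title"]');

// Cast the first match to a string to get the plain text value
$newTitle = isset($title[0]) ? (string) $title[0] : null;

echo $newTitle, "\n";
```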
While SimpleXML is a powerful tool, developers using it to process and handle large XML documents & strings should take into careful consideration its memory usage requirements.
SimpleXML requires the entire XML tree to be available prior to any processing actions on that tree -- this requires the entire tree to be in memory. Ok for a 40kb XML file but when dealing with > 100MB files you will see performance degradation, especially if you have a busy server.
An alternative to processing large XML files is the XMLReader class, which operates in streaming mode, of which an excellent tutorial is presented here:
http://www.ibm.com/developerworks/library/x-pullparsingphp.html
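As a rough sketch of the streaming approach, XMLReader can walk the document one node at a time and hand each small record to SimpleXML. The item element name and the inline string are assumptions for the demo; for a real large file you would call $reader->open($path) instead of $reader->XML(...):

```php
<?php
// Tiny stand-in for a large feed of <item> records (hypothetical structure)
$feed = '<items><item id="1">first</item><item id="2">second</item></items>';

$reader = new XMLReader();
$reader->XML($feed); // for a file on disk: $reader->open('/path/to/big.xml')

$ids = array();
while ($reader->read()) {
    // Only react to opening <item> tags
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        // Only this one record is turned into a SimpleXML tree
        $item = new SimpleXMLElement($reader->readOuterXml());
        $ids[] = (string) $item['id'];
    }
}
$reader->close();

print_r($ids);
```

Only one record is held as a tree at any time, so memory use should stay roughly constant however many records the file contains.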
XML is an incredibly verbose format. If you are dealing with large data structures, especially in web services, do you actually need to represent the data as an XML tree? Other serialisation formats such as JSON, serialised PHP, Google Protocol Buffers (http://code.google.com/p/protobuf/) and even CSV can remarkably reduce processing time, bandwidth and load when dealing with large files. A 110MB XML file can become a 65MB CSV file with the same data in it, simply because the data identifiers (tags) are only represented once in the entire document.
Be aware of quirks like this one:
<?php
$a = new SimpleXMLElement('<a><b/></a>');
$a->b = 'test & test';
$a->c = 'test & test';
print_r($a);
/*
OUTPUT:
SimpleXMLElement Object
(
[b] => test & test
[c] => test & test
)
*/
?>
I'm using SimpleXML to process data sent back from an API request, and I ran into the CDATA problem and the error you get from HTML entities. Here is a solution I came up with. I don't know if it's the most practical, but it's working.
<?php
$str = <<< XML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<menu>
<item>
<title><![CDATA[Café]]></title>
<description><![CDATA[Some description.]]></description>
</item>
</menu>
XML;
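// The /e modifier was removed in PHP 7, so a callback does the base64 encoding of each CDATA body
$str = preg_replace_callback('/<!\[CDATA\[(.*?)\]\]>/is', function ($m) {
    return '[CDATA]' . base64_encode($m[1]) . '[/CDATA]';
}, $str);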
$xml = new SimpleXMLElement($str);
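// Print each item's title
foreach ($xml->item as $item) {
$tmp = (string) $item->title;
// Decode the base64 payload again (callback instead of the removed /e modifier)
$tmp = preg_replace_callback('/\[CDATA\](.*?)\[\/CDATA\]/is', function ($m) {
    return base64_decode($m[1]);
}, $tmp);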
echo $tmp;
}
?>
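I am not sure which error the poster above was hitting, but for comparison here is a sketch without the base64 round trip. It assumes well-formed input like the menu above (the "& Bar" part is added for the demo) and relies on two things: casting a CDATA-only element to a string returns its text, and the LIBXML_NOCDATA option asks libxml to merge CDATA sections into ordinary text nodes:

```php
<?php
$str = '<menu><item><title><![CDATA[Café & Bar]]></title></item></menu>';

// Plain load: the string cast already returns the CDATA text
$xml   = new SimpleXMLElement($str);
$title = (string) $xml->item->title;

// LIBXML_NOCDATA merges CDATA into text nodes, which helps when the tree
// is later dumped or converted wholesale (print_r, json_encode, ...)
$merged      = new SimpleXMLElement($str, LIBXML_NOCDATA);
$mergedTitle = (string) $merged->item->title;

echo $title, "\n";
```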
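This fixes the function I posted before, which had a misplaced quote:
// $glue no longer has a default value: an optional parameter before a required one is deprecated in PHP 8
function implode2($glue, $var){
    if ($var){
        $array = array(); // initialise so implode() always receives an array
        foreach ($var as $value){
            $array[] = strval($value);
        }
        return implode($glue, $array);
    }
    else return false;
}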
TIP: if you run into memory issues using SimpleXML,
you may want to use PHP > 5.2.5 in order to avoid this bug: http://bugs.php.net/bug.php?id=38604
If that is not possible, avoid calls like "foreach ( $xmldata->node as $node )".
<?php
/**
* Remove node/nodes xml with xpath
*
* @param SimpleXMLElement $xml
* @param string XPath $path
* @param string ('one'|'child'|'all') $multi
*
* Use:
*
* Example xml file - http://ru2.php.net/manual/ru/ref.simplexml.php
*
* $xml = simplexml_load_file($xmlfile);
*
* // 1. Remove a single node (one that has no child nodes)
* // $path must match exactly one node, and that node must have no children
* removeNode($xml, '//movie/rating[@type="thumbs"]');
*
* // 2. Remove one node together with its child nodes
* // $path may match several nodes; only the first one is removed,
* // along with all of its child nodes
* removeNode($xml, '//characters', 'child');
*
* // 3. Remove all matching nodes (with their child nodes)
* // $path may match several nodes; all of them are removed,
* // along with their child nodes
* removeNode($xml, '//rating', 'all');
*
* $xml->asXML($xmlfile);
*
*/
function removeNode($xml, $path, $multi='one')
{
$result = $xml->xpath($path);
# for wrong $path
if (!isset($result[0])) return false;
switch ($multi) {
case 'all':
$errlevel = error_reporting(E_ALL & ~E_WARNING);
foreach ($result as $r) unset ($r[0]);
error_reporting($errlevel);
return true;
case 'child':
unset($result[0][0]);
return true;
case 'one':
if (count($result[0]->children())==0 && count($result)==1) {
unset($result[0][0]);
return true;
}
default:
return false;
}
}
?>
P.S. After use (if all the nodes were deleted), you can clean up the whitespace left behind in the file:
<?php
function trimXml($xmlfile)
{
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
if (!$dom->load($xmlfile)) return false;
$dom->formatOutput = true;
if (!$dom->save($xmlfile)) return false;
return true;
}
?>
It does not say so in the docs, but SimpleXML will convert all text into UTF-8 if the source XML declaration specifies another encoding. E.g., if the source has the following XML declaration:
<?xml version="1.0" encoding="windows-1251" ?>
all the text in the resulting SimpleXMLElement will be in UTF-8 automatically.
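A small sketch of that behaviour. I use ISO-8859-1 instead of windows-1251 here only because libxml handles it without any extra converter; the element name and text are made up:

```php
<?php
// "\xE9" is the single ISO-8859-1 byte for é
$latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root>caf\xE9</root>";

$root = new SimpleXMLElement($latin1Xml);
$text = (string) $root;

// The same character now arrives as the two-byte UTF-8 sequence C3 A9
echo bin2hex($text), "\n";
```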
If you need to do math calculations on values extracted from simplexml document, you might need to cast the value as float to prevent precision loss. Here is an example:
<?php
$objXML = new SimpleXMLElement('<test x="-123.45"></test>');
//Shows correctly
echo $objXML['x']."\n";
//We lose the decimals
echo $objXML['x'] + $objXML['x']."\n";
$x = $objXML['x'];
//This works if we cast the amounts
echo (float)$objXML['x'] + (float)$objXML['x']."\n";
//Calculated on a string, no problem
echo "-123.45" + "-123.45";
?>
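This is due to the fact that $objXML['x'] is neither a string (which PHP would cast automatically) nor a float, but a SimpleXMLElement object. (PHP 7.3 and later treat the text as an int or float as appropriate in arithmetic, but an explicit cast still makes the intent clear.)
"var_dump($x);" will output this: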
~~
object(SimpleXMLElement)#3 (1) {
[0]=>
string(7) "-123.45"
}
~~
I opened a bug request on php but here is the answer they gave me:
~~
Status: Won't fix
The behavior is defined by the engine not the extension. When performing mathematical operations on objects, they are treated as integers. It is up to the user to cast the object to the appropriate type to maintain proper precision.
~~
I had a problem with entities.
My first solution:
I saved the data this way:
$ENTRY_->addChild('Nachricht', htmlentities($_POST["blog"]));
with entity declarations in the XML file such as:
<!ENTITY auml "&auml">
and I loaded the data this way:
html_entity_decode($ENTRY->Nachname);
But after saving and reloading the XML file, the entity declaration
<!ENTITY auml "&auml">
had disappeared. Strange...
My second solution:
By saving the data this way:
$ENTRY_->addChild('Nachricht', htmlentities(htmlentities($_POST["blog"])));
I can now load it with html_entity_decode without the entity declaration in the XML file!
I tested it with äöü.
Hope it helps.
When creating a new XML document and adding text with umlauts and the like,
<?php $SimpleXMLElement->asXML(); ?>
will silently NOT output any content containing umlauts.
Use htmlentities() when adding umlauts and similar characters to work around the "problem".
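One possible cause, and this is only my assumption about the case above, is that the text being added was not UTF-8, which is what SimpleXML expects. A sketch of converting first instead of entity-encoding; the Latin-1 sample string and the element name are made up, and iconv() is assumed to be available:

```php
<?php
// "M\xFCller" is "Müller" in ISO-8859-1, e.g. from a legacy form or file
$latin1 = "M\xFCller";

// Convert to UTF-8 before handing the text to SimpleXML
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><root/>');
$xml->addChild('name', $utf8);

$out  = $xml->asXML();
$back = (string) $xml->name; // the UTF-8 text reads back intact

echo $out;
```

If the data is already UTF-8, no conversion or htmlentities() call should be needed at all.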
If you're handling lots of HTML or mixed-content XML you'll probably want to use the DOM functions instead of SimpleXML. Take this for example:
<?php
$html = new SimpleXMLElement('<div><p>Some text, <a href="#">a link,</a> more text</p></div>');
echo $html->p->a,"<br>\n"; // "a link,"
echo $html->p; // "Some text, more text" (!)
?>
In the above example reconstructing the original markup is impossible because of the way SimpleXML represents the data.
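For comparison, a sketch of the DOM view of the same paragraph, where the text nodes and the link element can be visited in document order. dom_import_simplexml() is the standard bridge function; the $parts array is just for the demo:

```php
<?php
$html = new SimpleXMLElement(
    '<div><p>Some text, <a href="#">a link,</a> more text</p></div>'
);

// Get the DOM node that sits behind the SimpleXML <p> element
$p = dom_import_simplexml($html->p);

$parts = array();
foreach ($p->childNodes as $node) {
    // Text nodes and element nodes come back interleaved, in order
    $type    = ($node instanceof DOMText) ? 'text' : 'element';
    $parts[] = array($type, $node->textContent);
}

print_r($parts);
```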
Concerning SimpleXML and sessions:
When you create a SimpleXML object (to be precise: a SimpleXMLElement object) in the context of a session and store it in $_SESSION['XMLobject'], this object does not "survive" the session!
On re-entering the session, print_r($_SESSION['XMLobject']) says:
['XMLobject'] => SimpleXMLElement Object
Warning: Node no longer exists in /your_php_file.php on line xyz
(
)
(Message simplified for the sake of better readability.)
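A common workaround is to keep the XML as a string and rebuild the object on the next request. In this sketch a plain array stands in for $_SESSION and serialize()/unserialize() stand in for the session being written and read back; the cart document and the XMLstring key are made up:

```php
<?php
$xml = new SimpleXMLElement('<cart><item>book</item></cart>');

// First request: store the markup, not the object
$session = array(); // stands in for $_SESSION
$session['XMLstring'] = $xml->asXML();

// Simulate the session being saved and loaded again
$session = unserialize(serialize($session));

// Next request: rebuild the SimpleXMLElement from the stored string
$restored = simplexml_load_string($session['XMLstring']);
$item     = (string) $restored->item;

echo $item, "\n";
```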
Sometimes it's nice to mix up data storage types. This is a very simple SQL to XML converter. Feed it a SQL query and it outputs the result in XML.
The first parameter should be a mysqli query result.
(Optional) The second is the XML name for each row (i.e. the second depth of XML).
(Optional) The third is the name of the XML document, the root name.
<?php
// $link is assumed to be an open mysqli connection (the old mysql_* functions were removed in PHP 7)
$result = mysqli_query($link, "SELECT * FROM users");
echo sql_to_xml($result, "users", "members");
function sql_to_xml($mysqli_result, $row_name = "row", $doc_name = "root")
{
    $xml = new SimpleXMLElement("<$doc_name></$doc_name>");
    // One child element per row, one grandchild per column
    while ($line = mysqli_fetch_assoc($mysqli_result)) {
        $row = $xml->addChild($row_name);
        foreach ($line as $column => $value) {
            $row->$column = "$value";
        }
    }
    return $xml->asXML();
}
?>
Memory leak when setting attributes as in example (#Example 2134. Setting values)
This probably goes unnoticed in web scripts (unless you do a LOT of xml manipulations), but I ran into this in my standalone script that processes a large number of XML files.
The following code will eat up memory quite fast:
<?php
include 'example.php';
while (true) {
$xml = new SimpleXMLElement($xmlstr);
$xml->movie[0]->characters->character[0]->name = 'Miss Coder';
}
?>
while this seems to behave correctly:
<?php
include 'example.php';
while (true) {
$xml = new SimpleXMLElement($xmlstr);
$c = $xml->xpath("//character");
$c[0]->name = 'Miss Coder';
}
?>
This looks like bug #38604, and I just confirmed that in 6.0.0-dev (on Windows at least) it is fixed. It is NOT fixed in 5.2.1 or 5.2.2-dev (2/21 build), so for 5.2 users, use the second form to avoid leaks.
It doesn't mention this anywhere, but creating a new SimpleXMLElement object from an invalid string throws an exception. It looks ugly in the PHP log, as it dumps the stack over multiple lines.
The correct way to create a new SimpleXMLElement object is like so:
<?php
$xmlstr = ''; // empty to throw an exception
try {
$xml = new SimpleXMLElement($xmlstr);
} catch (Exception $e) {
// handle the error
echo '$xmlstr is not a valid xml string';
}
?>
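If you would rather not deal with exceptions at all, another option is simplexml_load_string(), which returns false on failure, combined with libxml's internal error buffer. A sketch with a deliberately broken string (the input and the $messages array are only for the demo):

```php
<?php
// Collect parse errors instead of emitting warnings
libxml_use_internal_errors(true);

$xml = simplexml_load_string('<root><unclosed></root>');

$messages = array();
if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError with message, line and column
        $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
    }
    libxml_clear_errors();
}

print_r($messages);
```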
Be careful when using var_export to debug element attributes - it won't work! Always use print() or similar for checking the contents of element attributes.
As of PHP 5.1.4, trying to iterate on a non-existent node will actually create that node.
<?php
$xml = simplexml_load_string('<root></root>');
foreach($xml->b as $dummy);
echo $xml->asXML();
?>
Gives :
<?xml version="1.0"?>
<root><b/></root>
You might think it is a bug, but the PHP developers seem to consider it a feature: http://bugs.php.net/bug.php?id=39164
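One way to stay clear of this, whichever PHP version you are on, is to test for the node with isset() before iterating. A minimal sketch using the same empty root element; the $seen array is only there for the demo:

```php
<?php
$xml = simplexml_load_string('<root></root>');

// isset() only checks for the child; it does not create it
$seen = array();
if (isset($xml->b)) {
    foreach ($xml->b as $b) {
        $seen[] = (string) $b;
    }
}

$out = $xml->asXML();
echo $out;
```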
Here is my Simple XML to array function, it is recursive and has the benefit of maintaining key value relationships and has worked well for me.
<?php
function XMLToArray($xml)
{
    // Anything that is not a SimpleXMLElement cannot be converted
    if (!($xml instanceof SimpleXMLElement)) {
        return false;
    }
    $children = $xml->children();
    $return = null;
    foreach ($children as $element => $value) {
        if ($value instanceof SimpleXMLElement) {
            $values = (array)$value->children();
            if (count($values) > 0) {
                // The element has children of its own: recurse
                $return[$element] = XMLToArray($value);
            } else {
                if (!isset($return[$element])) {
                    $return[$element] = (string)$value;
                } else {
                    // Repeated element name: collect the values in a list
                    if (!is_array($return[$element])) {
                        $return[$element] = array($return[$element], (string)$value);
                    } else {
                        $return[$element][] = (string)$value;
                    }
                }
            }
        }
    }
    if (is_array($return)) {
        return $return;
    } else {
        return false;
    }
}
?>
SimpleXML handles namespaces, but it's not documented very well here at all.
If you wanted to parse, say, an open office manifest file or a piece of RDF/XML, you have to get elements and attributes by the namespace url.
Example:
<?php
function display($in) {
if (file_exists($in)) {
$xml = simplexml_load_file($in);
} else {
throw new Exception($in . " does not exist");
}
$manifest = $xml->children('http://openoffice.org/2001/manifest');
foreach ($manifest->xpath('//manifest:file-entry') as $file) {
foreach ($file->attributes('http://openoffice.org/2001/manifest') as $key => $value) {
print "Key:" . $key . "\n";
print "Value:" . $value . "\n";
}
}
}
?>
File: manifest.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE manifest:manifest PUBLIC "-//OpenOffice.org//DTD Manifest 1.0//EN" "Manifest.dtd">
<manifest:manifest xmlns:manifest="http://openoffice.org/2001/manifest">
<manifest:file-entry manifest:media-type="application/vnd.sun.xml.writer" manifest:full-path="/"/>
<manifest:file-entry manifest:media-type="application/vnd.sun.xml.ui.configuration" manifest:full-path="Configurations2/"/>
<manifest:file-entry manifest:media-type="application/binary" manifest:full-path="layout-cache"/>
<manifest:file-entry manifest:media-type="text/xml" manifest:full-path="content.xml"/>
<manifest:file-entry manifest:media-type="text/xml" manifest:full-path="styles.xml"/>
<manifest:file-entry manifest:media-type="text/xml" manifest:full-path="meta.xml"/>
<manifest:file-entry manifest:media-type="" manifest:full-path="Thumbnails/thumbnail.png"/>
<manifest:file-entry manifest:media-type="" manifest:full-path="Thumbnails/"/>
<manifest:file-entry manifest:media-type="text/xml" manifest:full-path="settings.xml"/>
</manifest:manifest>
Hi,
If you want to access an element that has a dash in its name, (as is common with the XML documents provided by the Library of Congress, as well as the NWS) you will need to handle it a little bit differently.
You can either use XPATH, which works fine, but will return an array of results every time, even if there is a single result.
eg.
<?php
$xml->xpath('/data/time-layout/start-valid-time');
?>
You can also choose just to encapsulate the element names containing a dash:
<?php $xml->data->{'time-layout'}->{'start-valid-time'} ?>
--
On an only partially related note, dealing with SimpleXML is one of the few times I have employed casting in PHP. While iterating (foreach) through the valid times, echoing the element worked great (it merely echoed the appropriate time), but assigning it to another variable resulted in a SimpleXML object containing the time being assigned, rather than just the time itself. This was resolved by casting the time to a string:
<?php
foreach($xml->data->{'time-layout'}->{'start-valid-time'} AS $time)
{
$weatherDates[] = (string) $time;
}
?>
Another method to parse an XML document into a PHP array with SimpleXML, inspired by Daniel FAIVRE's version:
<?php
function xml2php($xml)
{
    $fils = 0;
    $array = array();
    $multi = array(); // element names that were already turned into indexed arrays
    foreach ($xml->children() as $key => $value) {
        $child = xml2php($value);
        // To deal with the attributes
        foreach ($value->attributes() as $ak => $av) {
            // A leaf comes back as a string; wrap it (under an assumed 'value' key)
            // so that the attributes can be stored next to the text
            if (!is_array($child)) {
                $child = array('value' => $child);
            }
            $child[$ak] = (string)$av;
        }
        if (!array_key_exists($key, $array)) {
            // Add a simple element
            $array[$key] = $child;
        } elseif (empty($multi[$key])) {
            // This element is already in the array, so create an indexed array
            $array[$key] = array($array[$key], $child);
            $multi[$key] = true;
        } else {
            // Add an element to an existing indexed array
            $array[$key][] = $child;
        }
        $fils++;
    }
    if ($fils == 0) {
        return (string)$xml;
    }
    return $array;
}
?>
Note that SimpleXML expects to both read and output XML in UTF-8 encoding. You'll need to add a line such as this at the top of your input XML file if it isn't saved in UTF-8 (adjust to whatever encoding used):
<?xml version="1.0" encoding="ISO-8859-1" ?>
On the output side of things, if you're not serving/handling UTF-8, you'll need to use utf8_decode() [red. but that will only work for ISO-8859-1, not other encodings]. Common mistake: http://bugs.php.net/bug.php?id=28154
If you are looking to use SimpleXML for anything but reading XML documents, you should really reconsider and use the XML DOM library. By the time you have implemented enough utilities in DOM to handle all the setbacks in SimpleXML, you will have defeated the purpose of using SimpleXML. There are a few reasons for this, and there are already many workarounds, but the primary issues are these:
1) No complex node assignment. You cannot switch nodes or replace them.
2) No appending new child nodes
3) Whenever you do something like $new_node = $xml_doc->node you will always get a reference even if you use a clone method, which will crash the script.
Other than that, it's a great tool for reading docs.
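If you do end up mixing the two, here is a sketch of copying a whole node between documents by way of DOM, since both extensions work on the same underlying tree. The append_node() helper name and the sample movie data are made up for illustration:

```php
<?php
// Hypothetical helper: deep-copy $from (with attributes and children) under $to
function append_node(SimpleXMLElement $to, SimpleXMLElement $from)
{
    $toDom   = dom_import_simplexml($to);
    $fromDom = dom_import_simplexml($from);
    // importNode(..., true) makes a deep copy owned by the target document
    $copy = $toDom->ownerDocument->importNode($fromDom, true);
    $toDom->appendChild($copy);
}

$a = new SimpleXMLElement('<movies><movie><title>A</title></movie></movies>');
$b = new SimpleXMLElement('<movies><movie id="2"><title>B</title></movie></movies>');

append_node($a, $b->movie[0]);

echo $a->asXML();
```

The new node shows up in the SimpleXML object straight away, so you can carry on with the SimpleXML API afterwards.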