(PHP 4 >= 4.0.1, PHP 5)
fscanf - Parses input from a file according to a given format
The fscanf() function is similar to sscanf(), but it takes its input from the file associated with handle and interprets it according to the specified format. If only two parameters are passed, the parsed values are returned as an array. If optional parameters are also passed, the function returns the number of assigned values. The optional parameters must be passed by reference.
Any whitespace character (e.g. space, tab, etc.) in the format string matches any whitespace in the input stream. This means that even a tab \t in the format string can match a single space character in the input stream.
Example #1 fscanf()
<?php
$handle = fopen("users.txt", "r");
while ($userinfo = fscanf($handle, "%s\t%s\t%s\n")) {
    list ($name, $profession, $countrycode) = $userinfo;
    //... do something with the values
}
fclose($handle);
?>
Example #2 users.txt
javier argonaut pe
hiroshi sculptor jp
robert slacker us
luigi florist it
Note: Prior to PHP 4.3.0, the maximum number of characters read from the file was 512 (or up to the first \n, whichever came first). As of PHP 4.3.0, lines of arbitrary length are read and parsed.
See also fread(), fgets(), fgetss(), sscanf(), printf(), and sprintf().
The fgetd() function suggested by worldwideroach on 14-Jul-2005 04:33 does not handle the last buffer-load of data correctly. At least not for my requirement. It is possible for EOF to have been reached but for there still to be characters in $sRecord.
For the, er... record I got better results with this version:
<?php
function fgetd(&$rFile, $sDelim, $iBuffer=1024) {
    $sRecord = '';
    while(!feof($rFile)) {
        $iPos = strpos($sRecord, $sDelim);
        if ($iPos === FALSE) {
            $sRecord .= fread($rFile, $iBuffer);
        } else {
            fseek($rFile, 0-strlen($sRecord)+$iPos+strlen($sDelim), SEEK_CUR);
            return substr($sRecord, 0, $iPos);
        }
    }
    // Last read got some more data before hitting EOF?
    if ($sRecord != '') {
        if (($iPos = strpos($sRecord, $sDelim)) !== FALSE) {
            fseek($rFile, 0-strlen($sRecord)+$iPos+strlen($sDelim), SEEK_CUR);
            return substr($sRecord, 0, $iPos);
        }
        else {
            return $sRecord;
        }
    }
    else {
        return FALSE;
    }
}
?>
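If you want fscanf() to scan a single field from each of a large number of lines, e.g. an IP address at the start of a line that holds more fields, pass one variable as an optional parameter. Each call reads one whole line, so there is no need to explode() the line first.
<?php
$filename = "somefile.txt";
$fp = fopen($filename, "r") or die("Error opening file! \n");
$badipaddress = 0;
// $u receives the first whitespace-delimited field of each line, e.g. an IP address
while (($n = fscanf($fp, "%s", $u)) !== false) {
    // count lines whose first field is a prefix of the visitor's address
    if ($n === 1 && strpos($_SERVER['REMOTE_ADDR'], $u) === 0) { $badipaddress++; } // do something and continue scan
}
fclose($fp);
?>
Besides, fscanf() is much faster than fgets().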
It would be great to state in the fscanf documentation
that one call to the function reads a complete line,
and not just the number of values defined in the format.
If a text file contains 2 lines, each containing 4 integer values,
reading the file with 8 calls to fscanf($fd,"%d",$v) does not work!
You have to make 2 calls of the form
fscanf($fd,"%d %d %d %d",$v1,$v2,$v3,$v4);
That is, 1 fscanf per line.
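The line-at-a-time behaviour described in this note can be checked with a small sketch. The temporary file and its contents are made up for illustration; nothing beyond tmpfile(), fwrite(), rewind() and fscanf() is assumed.

```php
<?php
// Made-up test data: 2 lines with 4 integers each.
$fd = tmpfile();
fwrite($fd, "1 2 3 4\n5 6 7 8\n");
rewind($fd);

// One value per call: every call still consumes a whole line,
// so only the first integer of each line is ever seen.
$firsts = array();
while (fscanf($fd, "%d", $v) === 1) {
    $firsts[] = $v;
}

// One call per line: all four values of the line are assigned.
rewind($fd);
$rows = array();
while (fscanf($fd, "%d %d %d %d", $v1, $v2, $v3, $v4) === 4) {
    $rows[] = array($v1, $v2, $v3, $v4);
}
fclose($fd);

print_r($firsts);
print_r($rows);
```

The first loop stops after two iterations, not eight, because fscanf() returns false once the two lines have been consumed.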
To match all printable ASCII characters (space through tilde), you should try:
<?php fscanf($file_handler,"%[ -~]"); ?>
Yet another function to read a file and return a record/string by a delimiter. It is very much like fgets() with the delimiter being an additional parameter. Works great across multiple lines.
function fgetd(&$rFile, $sDelim, $iBuffer=1024) {
    $sRecord = '';
    while(!feof($rFile)) {
        $iPos = strpos($sRecord, $sDelim);
        if ($iPos === false) {
            $sRecord .= fread($rFile, $iBuffer);
        } else {
            fseek($rFile, 0-strlen($sRecord)+$iPos+strlen($sDelim), SEEK_CUR);
            return substr($sRecord, 0, $iPos);
        }
    }
    return false;
}
I have a function for reading delimited files; it works for records that span multiple lines too.
<?php
// $fh       - the file pointer
// $delim    - the separator
// $callback - function called once per record
// $len      - optional read buffer size
function file_read_delim($fh, $delim, $callback, $len = 1024)
{
    $rec = '';
    while (!feof($fh)) {
        // Append to the carry-over so that records (and delimiters)
        // straddling two reads are still split correctly.
        $rec .= fread($fh, $len);
        $strs = explode($delim, $rec);
        // The last piece may be incomplete; keep it for the next read.
        $rec = array_pop($strs);
        foreach ($strs as $ele) {
            call_user_func($callback, $ele);
        }
    }
    // Whatever follows the final delimiter is the last record.
    if ($rec !== '') {
        call_user_func($callback, $rec);
    }
}

// Here is an example of how to use the function
function cb($rec)
{
    echo "$rec \n";
}

$fh = fopen($filename, 'r');
if (!$fh) {
    die('Could not open file for reading');
}
// call the function
file_read_delim($fh, '-', 'cb');
fclose($fh);
?>
I have a simpler method I use to parse delimited text. Using the data posted by gozer at fanhunter dot com, here is my script. Maybe it is faster?
<?php
$fp = fopen("sections.dat", "r");
if (!$fp) { echo "<p>Unable to open remote file.</p>"; exit; }
// fgets() returns false at end of file, which ends the loop
while (($line = fgets($fp, 2048)) !== false):
    list ($id, $name, $description, $language, $directory, $id_uplevel, $order, $hassubsection) = explode("|", rtrim($line, "\r\n"));
    echo "$id-$name-$description-$language-$directory-$id_uplevel-$order-$hassubsection<br />\n";
endwhile;
fclose($fp);
?>
Notes:
Avoid the php extension on a data file: it will cause PHP to parse the file, but there is no PHP in it.
The "2048" value in the fgets() call is set for long lines. 1024 works fine, but I had to increase it for a large database that I read with a similar script.
You don't need to open and close the quotation marks (as in gozer at fanhunter dot com's example) in the echo line; just put the variables in between the dashes.
A better way to use fscanf() would be this:
<?php
$handle = fopen("users.txt", "r");
while (!feof($handle)) {
    $userinfo = fscanf($handle, "%s\t%s\t%s\n");
    if ($userinfo) {
        list ($name, $profession, $countrycode) = $userinfo;
        //... do something with the values
    }
    $userinfo=NULL;
}
fclose($handle);
?>
As you can see, instead of waiting for fscanf() to fail to return a value, the loop waits for the file pointer to reach the end of the file. This way, if for some reason one of your lines does not match your format, it will not kill the loop; it will simply go on to the next line.
Actually, instead of trying to think of every character that might be in your file, excluding the delimiter is much easier.
For example, if your delimiter is a comma, use:
%[^,]
instead of:
%[a-zA-Z0-9.| ... ]
Just make sure to use %[^,\n] on your last entry so you don't include the newline.
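A minimal sketch of the negated-set approach, assuming a comma-delimited file with three fields per line. The data and the temporary file are invented for illustration.

```php
<?php
// Hypothetical comma-delimited data, written to a temporary file.
$fh = tmpfile();
fwrite($fh, "javier,argonaut,pe\nhiroshi,stone sculptor,jp\n");
rewind($fh);

$rows = array();
// %[^,] takes everything up to the next comma (spaces included);
// the last field also excludes the newline.
while ($fields = fscanf($fh, "%[^,],%[^,],%[^,\n]")) {
    $rows[] = $fields;
}
fclose($fh);

print_r($rows);
```

Note that a field containing a space ("stone sculptor") survives intact, which a plain %s would not allow.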
Clear the variables before you scan them in. As a programmer this should have been very apparent. Goof.
If you want to read text files in CSV format or the like (no matter what character the fields are separated with), you should use fgetcsv() instead. When the text for a field is blank, fscanf() may skip it and fill it with the next text, whereas fgetcsv() correctly regards it as a blank field.
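The difference can be seen in a short sketch, assuming a tab-delimited line whose middle field is blank. The data is made up, and the enclosure and escape arguments to fgetcsv() are simply passed explicitly with their customary values.

```php
<?php
// Hypothetical data: the second of three tab-separated fields is blank.
$fh = tmpfile();
fwrite($fh, "a\t\tc\n");

rewind($fh);
// Whitespace in the format matches any run of whitespace, so both tabs
// are swallowed at once and "c" lands in the second slot.
$scanned = fscanf($fh, "%s\t%s\t%s");

rewind($fh);
// fgetcsv() splits on every delimiter and keeps the empty field.
$csv = fgetcsv($fh, 0, "\t", '"', "\\");
fclose($fh);

var_dump($scanned[1]);
print_r($csv);
```

With fscanf() the value "c" ends up in the position of the blank field; fgetcsv() returns three fields with an empty string in the middle.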
Hi,
A few days ago we had multiple MySQL crashes due to a hardware failure and other processes running.
Since we thought the MySQL daemon might be overloaded, we started looking for alternative ways to keep our little databases working, so we started using fscanf to parse files.
We ran into multiple problems due to the whitespace and other characters in our database. Finally, we made it work using sets, as james@zephyr-works.com remarked.
Our final function is:
function get_sections($include_dir){
    $filename = $include_dir . "sections.dat.php";
    $datafile = fopen ($filename ,"r");
    while ($sectioninfo = fscanf ($datafile, "%[0-9]|%[a-zA-Z0-9@&;:,. /!?-]|%[a-zA-Z0-9@&;:,. /!?-]|%[a-zA-Z]|%[a-zA-Z0-9@/?&;.+=-]|%[0-9]|%[0-9]|%[0-9]\n")) {
        list($id, $name, $description, $language, $directory, $id_uplevel, $order, $hassubsection) = $sectioninfo;
        // Show output
        echo $id . "-" . $name. "-" . $description . "-" . $language . "-" . $directory . "-" . $id_uplevel . "-" . $order . "-" . $hassubsection . "<br>\n";
    }
    fclose($datafile);
}
The contents of sections.dat.php (for example):
1|home|Página principal de Fanhunter.|castellano|==|0|0|0
2|fanhunter|Sección principal dedicada al universo Fanhunter.|castellano|fanhunter/|1|0|0
3|outfan|Sección principal dedicada al universo Outfan.|castellano|outfan/|1|0|0
4|fanpiro|Sección principal dedicada al universo Fanpiro.|castellano|fanpiro/|1|0|0
5|tienda|La tienda de Fanhunter.|castellano|tienda/|1|0|0
6|the zone|Sección principal Miscelánea.|castellano|thezone/|1|0|0
7|flfcn|Sección principal dedicada a Fan Letal/Fan con Nata.|castellano|fanletal/|1|0|0
8|foro|Nuestro foro de discusión.|castellano|foro/|1|0|0
9|chat|Sección para chatear.|castellano|chat/|1|0|0
10|links|Sección recopilatoria de enlaces de interés a otras páginas.|castellano|links/|1|0|0
Note: The '==' in directory means no directory needed to be specified.
Pay attention to linebreaks, as this forum puts some of them into the code I pasted.
Good luck guys.
fscanf works a little oddly, I've found. Instead of using just a plain %s you will probably need to use sets. A %s conversion does not look ahead in the format to see what is supposed to follow it, so a seemingly perfect function call like:
fscanf($fh, "%s::%s");
With a file like:
user::password
will not work. When fscanf looks for a string, it stops at nothing except whitespace, so :: and everything else up to the next whitespace is considered part of that string. However, you can make it a little smarter with:
fscanf($fh, "%[a-zA-Z0-9,. ]::%[a-zA-Z0-9,. ]", $var1, $var2);
This tells it to accept only a through z, A through Z, 0 through 9, a comma, a period and a space as input to the string; any other character causes it to stop reading that field and continue parsing the line. This is very useful if you want to get a sentence into the string and you're not sure exactly how many words it will contain.
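The two calls can be compared side by side in a short sketch. The one-line file is invented for the purpose, and $a and $b are placeholder names for the plain %s attempt.

```php
<?php
// Hypothetical one-line file in the user::password layout.
$fh = tmpfile();
fwrite($fh, "user::password\n");

rewind($fh);
// %s runs to the next whitespace, so it swallows the "::" as well.
$plain = fscanf($fh, "%s::%s", $a, $b);

rewind($fh);
// A set stops at the first character outside it, here the colon.
$sets = fscanf($fh, "%[a-zA-Z0-9,. ]::%[a-zA-Z0-9,. ]", $var1, $var2);
fclose($fh);

echo "$plain: $a\n";
echo "$sets: $var1 / $var2\n";
```

The first call assigns only one value, the whole line up to the newline; the second assigns both fields.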
For C/C++ programmers.
fscanf() does not work like it does in C/C++, because PHP's fscanf() implicitly moves the file pointer to the next line after each call.