LOAD DATA LOCAL INFILE not loading all records into MySQL from CSV file

ByteSize

Jun 25, 2013, 5:40:01 PM
to phr...@googlegroups.com
Using PHRETS, I have created two CSV files using Troy's example as my template (connect, download listing data in CSV format, disconnect). Using the Residential file as my example here, the record count returned by the initial search is 11,945. I use a limit of 2,500 for my batch loop and use fputcsv to create my CSV file. However, when I open the CSV file in Excel, it shows 12,501 rows (including the header row). First mystery.

I then use LOAD DATA LOCAL INFILE to transfer the contents of the CSV file to MySQL. However, only 2,502 records are written. It doesn't matter whether I run the import from my PHP script or through a query in phpMyAdmin, I get the same result: only a fraction of the records are added. Second mystery.
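
One way to see what happens to the missing rows is to ask the server directly. After a LOAD DATA statement, MySQL's result summary reports how many rows were skipped, and SHOW WARNINGS lists per-row problems. A diagnostic sketch, runnable as a query in phpMyAdmin, using the same table and file names as the scripts below:

```sql
-- Re-run the import, then inspect what the server did with each row.
-- The statement's result line shows "Records / Deleted / Skipped / Warnings";
-- SHOW WARNINGS then lists individual issues (duplicate keys, truncated
-- fields, bad line terminators) for the rows that did not make it in.
LOAD DATA LOCAL INFILE 'residentialproperty.csv'
INTO TABLE treb_freehold
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

SHOW WARNINGS;
```

A large "Skipped" count with duplicate-key warnings would point at repeated data rather than a file-format problem.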

I've tried changing the limit to alter the loop length, and the number of records imported into MySQL seems to correspond to the limit length of the initial loop. This suggests to me that something's happening when I write the second batch of records to the CSV file. I tried switching from $fh = fopen($file_name, 'w+') to $fh = fopen($file_name, 'a+'), but that made no difference.

I've tried checking for problem linebreaks other than "\r" using a hex editor, and it indicates all is fine. I also tried running some parsing code to replace anything other than "\r" with "\r" but that didn't make a difference either.
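
To rule line endings in or out more systematically than a hex editor pass, a small sketch that tallies each ending style in the file (the function name is mine, not part of PHRETS or the original code):

```php
<?php
// Count each line-ending style in a file. A mix of CRLF and bare LF/CR
// endings would explain MySQL stopping early when the import uses
// LINES TERMINATED BY '\n'.
function line_ending_counts($file_name) {
    $s    = file_get_contents($file_name);
    $crlf = substr_count($s, "\r\n");
    return array(
        'crlf' => $crlf,                          // Windows endings
        'lf'   => substr_count($s, "\n") - $crlf, // bare Unix endings
        'cr'   => substr_count($s, "\r") - $crlf, // bare old-Mac endings
    );
}
```

Running print_r(line_ending_counts('residentialproperty.csv')) should show exactly one style; anything mixed is worth normalizing before the import.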

I'm currently running my PHP scripts from a XAMPP installation and importing data into a MySQL database on Amazon's S3. I also tried running the PHP code from a LAMP installation on EC2, with the same frustrating results.

I've been working on this for three days straight and can't find a solution. I've posted my code below in the hopes that someone has a solution. I know it's not elegant, but right now I just want it to work!!

========== DOWNLOAD RECORDS AND CREATE CSV FILE  ===================
<?php
@include_once('login.php');

$rets = new PHRETS;
$connect = $rets->Connect($login, $un, $pw);
$sixmonths = date('Y-m-d\TH:i:s', time()-15778800); // get listings updated within last 6 months
if ($connect) {
        echo "  + Connected<br>\n";
}
else {
        echo "  + Not connected:<br>\n";
        print_r($rets->Error());
        exit;
}

$property_classes = array("ResidentialProperty","CondoProperty");
foreach ($property_classes as $class) {   
    echo "+ Property:{$class}<br>\n";   
    $file_name = strtolower("{$class}.csv");
    $fh = fopen($file_name, 'w+');
    $maxrows = true;
    $fields_order = array();       
    $limit = 2500; // max rows downloaded in a batch
    $offset = 1; // offset will be incremented as we progress leave it at 1   
    $search = $rets->Search('Property',$class,'((timestamp_sql='.$sixmonths.'+),(status=A))', array('Count'=> 2));  // just grab count first time so we can figure out how many batches
    $count = $rets->TotalRecordsFound();
    echo "count = $count records in $class<br />";
    unset($search);

    while ( $count >= $offset ) { // now continue searching as long as the offset is less than the count
        echo "offset = $offset <br />";
        $search = $rets->SearchQuery('Property',$class,'((timestamp_sql='.$sixmonths.'+),(status=A))',
            array(
                'Format'    => 'COMPACT-DECODED',
                'Offset'    => $offset,
                'Count'        => $count,
                'Limit'        => $limit
            )
        );   
        if ($offset == 1) {        // print filename headers as first line
                $fields_order = $rets->SearchGetFields($search);
                fputcsv($fh, $fields_order);
        }
        while ($record = $rets->FetchRow($search)) {
                $this_record = array();
                foreach ($fields_order as $fo) {
                        $this_record[] = $record[$fo];
                }
                fputcsv($fh, $this_record);
        }
        $rets->FreeResult($search);
        $offset = $offset + $limit; //next loop of the while will start at whatever limit +1 is   
       
    }
    fclose($fh);
    echo "  - done $class <br>\n";
}
echo "+ Disconnecting<br>\n";
$rets->Disconnect();
?>
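
A quick sanity check after the download loop would have flagged the first mystery immediately: compare the CSV's data-row count against the count the server reported. csv_data_rows is a hypothetical helper, not part of PHRETS:

```php
<?php
// Count data rows in a CSV (excluding the header line) so the file can be
// compared against $count from TotalRecordsFound(). Finding 12,500 rows
// for an 11,945-record search points at duplicated batches, not lost data.
function csv_data_rows($file_name) {
    $rows = 0;
    $fh   = fopen($file_name, 'r');
    while (fgetcsv($fh) !== false) {
        $rows++;
    }
    fclose($fh);
    return $rows - 1; // subtract the header row
}
```

For example, `if (csv_data_rows($file_name) != $count) { echo "row count mismatch for $class<br>\n"; }` placed just after fclose($fh) would surface the problem per property class.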

========== IMPORT RECORDS INTO MYSQL FROM CSV FILE  ===================
<?php
    $dbhost = "hostname";
    $dbName = "dbname";
    $dbuser = "dbuser";
    $dbpass = "dbpassword";
     
    $conn = mysql_connect($dbhost, $dbuser, $dbpass) or die(mysql_error());
    $db = mysql_select_db($dbName, $conn) or die (mysql_error());
   
    $csvfile_free = 'residentialproperty.csv';
    $csvfile_condo = 'condoproperty.csv';
   
    // "or die()" already aborts on failure, so no separate error branch is needed
    mysql_query("LOAD DATA LOCAL INFILE '$csvfile_free' INTO TABLE treb_freehold FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n' IGNORE 1 LINES") or die (mysql_error());
    echo "freehold success!";
    mysql_query("LOAD DATA LOCAL INFILE '$csvfile_condo' INTO TABLE treb_condo FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n' IGNORE 1 LINES") or die (mysql_error());
    echo "condo success!";
mysql_close($conn);
?>
================

ByteSize

Jul 3, 2013, 2:22:41 PM
to phr...@googlegroups.com
Okay, I finally figured out the problem: the RETS server I'm accessing doesn't support Offset, so I keep grabbing the same records over and over. Since the MLS number field in the MySQL table is a key index, the import into MySQL stops after the first loop because the MLS numbers repeat in each loop.
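
For anyone hitting the same wall: if the server ignores Offset, one workaround is to strip the duplicate rows from the CSV before the import, keyed on the MLS number. A sketch, where the key column index is an assumption to adjust against your own header row:

```php
<?php
// Copy a CSV, keeping only the first row seen for each key value.
// $key_index is the zero-based column holding the MLS number (assumed 0).
function dedupe_csv($in_file, $out_file, $key_index = 0) {
    $in   = fopen($in_file, 'r');
    $out  = fopen($out_file, 'w');
    $seen = array();
    fputcsv($out, fgetcsv($in)); // copy the header row through unchanged
    while (($row = fgetcsv($in)) !== false) {
        if (!isset($seen[$row[$key_index]])) {
            $seen[$row[$key_index]] = true;
            fputcsv($out, $row);
        }
    }
    fclose($in);
    fclose($out);
}
```

As an aside, this also explains why the import looked like it "stopped" rather than erroring out: with LOCAL, MySQL treats duplicate-key rows as IGNOREd by default, silently skipping everything after the first batch.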

Don't know if this will help anybody else out there but at least my hair can start growing back in!