Google Groups Home
Help | Sign in
Message from discussion Run query in background while showing "Please wait"
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
kdecapite  
View profile
 More options Mar 26, 7:39 pm
From: kdecapite <decap...@creationsite.com>
Date: Wed, 26 Mar 2008 16:39:06 -0700 (PDT)
Local: Wed, Mar 26 2008 7:39 pm
Subject: Re: Run query in background while showing "Please wait"
What I have is an ImportContact model which is simply a table
containing basic information about a person (name, email, address,
etc). This table gets populated either from a standard website
"Contact Us" form or by my client manually importing a CSV file. Now,
once this ImportContact model has new records in it, I want to display
a list of *potential* duplicates by various criteria. The "master"
contacts model is simply called Contact and contains 30,000+ records.
The Contact model does NOT include addresses, there is a separate
Address model for this (again, with 30,000+ records).

The reason I can't have the database do the searching for me is
because the criteria I want to search against to find potential
duplicates doesn't have a direct one-to-one correlation to the
database table fields.

For example this "duplicate search" criteria works great using Cake:

- Show potential duplicates based on field `email1`
- Cake will generate a SELECT...IN query, listing ALL email addresses
from the ImportContact model and returning all records from the
Contact model containing anyone of the listed emails

However, this example does NOT work:

- Show potential duplicates based on fields `fn` and `ln` (a person's
full name)
- Since I have to CONCAT(`fn`, `ln`) AS `full_name`, the SELECT...IN
syntax no longer works

So what I did was query ALL records from both ImportContact and
Contact models and store each of these result sets in their own
arrays. Then I perform the search using PHP array functions and store
the potential duplicates in a third array. This is working great, but
it's just on the slow side (sometimes 3+ minutes). It's not just the
array searching that's taking a while either, it's actually the
database query retrieving 30,000+ records that's slowing things down
as well.

I'm just giving a brief overview here and can provide specific code if
you like. Thanks for the quick replies, maybe I can get this thing to
run more efficiently but I'm not quite sure how :-/

- Kevin

On Mar 26, 4:14 pm, jonknee <m...@jongales.com> wrote:

> > I have a CakePHP CRM application with 30,000+ contact records. Each
> > Contact hasMany Address records. One feature of my app is to allow my
> > client to search for duplicate entries within the contacts table
> > before importing new contacts. In short, I am querying the entire
> > Contact table and LEFT JOINing with the Address table. I store the
> > result set into an array and then use PHP's array functions to filter
> > potential duplicates from a "pending contacts" table.

> What exactly are you doing? It can probably be handled pretty quickly
> by the database. You shouldn't have to manually fish out dupes.

> I always thought the please wait while we search messages were either
> a [poor] attempt to seem extra high-tech or a sign that they have a
> really inefficient system. Google searches the whole web--not just a
> day's commercial flights--and comes back in a flash.

> I have used a PHP app with a valid reason for it, it was a mailing
> list program and it accomplished sending large amounts of email out by
> batching (the page would keep reloading with incrementing GET
> variables, such as start=0&num=100). It wouldn't work in your case.


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2008 Google