Bulk database updates

1 view
Skip to first unread message

jeff

unread,
Jul 11, 2008, 4:26:51 PM7/11/08
to transfer-dev
I have to do some bulk updates with transfer, and my only idea so far
seems highly inefficient.

Basically, lets say a user (Userid =123) has a multi-select box and
they need to choose, "sports they like". Previously, the user had
picked "Soccer" (im american), "Football", "Lacrosse", "Golf", with
ID's of 1, 2, 3, 4

Now they want to update it to be

"Soccer", "Kayaking", "Lacrosse", "Golf", with ID's of 1, 5, 3, 4

They way I previously would do this would be to
1) Delete any records in the association table (User_Sports) with
userID = 123
2) Loop through a SQL insert statement and insert a new record into
the assocation table (User_Sports) for EACH sport.

So before this operation, our table would be:

USERID | SPORTID
123 | 1
123 | 2
123 | 3
123 | 4

Then we'd delete and the database would be blank.

Then we'd insert and have:

USERID | SPORTID
123 | 1
123 | 5 <---- The big change
123 | 3
123 | 4

----------------------
doing it the old way was fine, but now I HAVE to do this in transfer.
Which seems like I have to
1) Do a transfer() list to get a query of all my records
2) Loop through and load each record as an object
3) Delete it
4) Do ANOTHER loop where I take the newly inputted values and create
an object for each, then save it.

This seems highly inefficient, is there a better way to do this?

-Jeff

Jennifer Larkin

unread,
Jul 11, 2008, 4:34:50 PM7/11/08
to transf...@googlegroups.com
To clarify, and I'm about to go into a meeting, so if this doesn't make this make more sense, tough cookies..

The table that needs to be updated is a composite key table that contains only the two foreign keys. Basically, we are populating a table that links the content to a multi-select box. We will need to use a similar method to update composite key tables that have columns in addition to the composite keys, but if we have to do something technically different to manage that, I have no objections.

Jeff is assigned with figuring out the best way to do this. Good luck, Jeff! :D
--
I did not come here to get devoured by symbols of monarchy! -- Smirnov of the KGB, Casino Royale

Now blogging....
http://www.blivit.org/blog/index.cfm
http://www.blivit.org/mr_urc/index.cfm

Brian Kotek

unread,
Jul 11, 2008, 4:41:59 PM7/11/08
to transf...@googlegroups.com
I don't use composite keys, but with a normal many to many all you have to do is clear it, then repopulate the relationship, and then save the target object. I don't know if that works the same way with composite keys or not though.

jeff

unread,
Jul 11, 2008, 5:14:50 PM7/11/08
to transfer-dev
Is there a bulk delete for transfer, or do i have to loop through and
load EVERY object before deleting it?

On Jul 11, 1:41 pm, "Brian Kotek" <brian...@gmail.com> wrote:
> I don't use composite keys, but with a normal many to many all you have to
> do is clear it, then repopulate the relationship, and then save the target
> object. I don't know if that works the same way with composite keys or not
> though.
>
> On Fri, Jul 11, 2008 at 4:34 PM, Jennifer Larkin <jlar...@gmail.com> wrote:
> > To clarify, and I'm about to go into a meeting, so if this doesn't make
> > this make more sense, tough cookies..
>
> > The table that needs to be updated is a composite key table that contains
> > only the two foreign keys. Basically, we are populating a table that links
> > the content to a multi-select box. We will need to use a similar method to
> > update composite key tables that have columns in addition to the composite
> > keys, but if we have to do something technically different to manage that, I
> > have no objections.
>
> > Jeff is assigned with figuring out the best way to do this. Good luck,
> > Jeff! :D
>

Brian Kotek

unread,
Jul 11, 2008, 5:27:22 PM7/11/08
to transf...@googlegroups.com
There is for a Many to Many relationship, but not for One to Many or Many to One. If this table is ONLY a linking table, is there a reason why you aren't using a Many to Many relationship, because what you're describing is exactly what the Many to Many relationship is meant to be.?

Otherwise, yes, you'll either have to loop over them and delete them, then rebuild it. Another option would be to determine which values changed and then loop over and remove only the ones that were removed and add the ones that were added.

Jennifer Larkin

unread,
Jul 11, 2008, 6:40:58 PM7/11/08
to transf...@googlegroups.com
The reason that we aren't using a manytomany in this case is that I set it up and I am flying by the seat of my pants, which luckily are sturdier than they look.

So, we can use manytomany for some of the junction tables but not for others. It sounds like we actually need two different ways of doing this based on whether the junction table contains columns in addition to the composite keys, because we shouldn't be using manytomany on tables that contain additional columns.

Mark Mandel

unread,
Jul 11, 2008, 6:53:15 PM7/11/08
to transf...@googlegroups.com
If you have the collection from one object -> another

Why not just loop around your collection of objects, and then do a
check - if it's meant to be there, keep it, if it's not,delete it,and
if it's not there, add it.

Doesn't seem like that big an issue.

If you are using m2m, then it manages this for you.

Mark

--
E: mark....@gmail.com
W: www.compoundtheory.com

jeff

unread,
Jul 11, 2008, 6:54:07 PM7/11/08
to transfer-dev
My current idea for doing the plan B is this:

1) TQL Statement to "Delete where not in (1,2,3,....x) - this
deletes any ID's that were not passed into the function
2) transfer.list() function to get all the records for a given "key"
ID - the userID=123 in my example above
3) Loop through the list
3a) <meanwhile, inside the loop> check if this the current id (in
the loop) exists in the list of ID's we passed in
3b) If it does NOT, then we create a new transfer object and save
it, equivalent to a SQL insert statement.

This seems to be the best way I can think of to do it, with the fewest
amount of hits to the db.

On Jul 11, 3:40 pm, "Jennifer Larkin" <jlar...@gmail.com> wrote:
> The reason that we aren't using a manytomany in this case is that I set it
> up and I am flying by the seat of my pants, which luckily are sturdier than
> they look.
>
> So, we can use manytomany for some of the junction tables but not for
> others. It sounds like we actually need two different ways of doing this
> based on whether the junction table contains columns in addition to the
> composite keys, because we shouldn't be using manytomany on tables that
> contain additional columns.
>
>
>
> On Fri, Jul 11, 2008 at 2:27 PM, Brian Kotek <brian...@gmail.com> wrote:
> > There is for a Many to Many relationship, but not for One to Many or Many
> > to One. If this table is ONLY a linking table, is there a reason why you
> > aren't using a Many to Many relationship, because what you're describing is
> > exactly what the Many to Many relationship is meant to be.?
>
> > Otherwise, yes, you'll either have to loop over them and delete them, then
> > rebuild it. Another option would be to determine which values changed and
> > then loop over and remove only the ones that were removed and add the ones
> > that were added.
>

Jon Messer

unread,
Jul 11, 2008, 7:13:09 PM7/11/08
to transf...@googlegroups.com
If this really is a "bulk" operation, why don't you just use the sql and discardByClassAndKeyQuery (or one of the other discards)? If you're talking about updating a single objects related records (and those are the "bulk") then yeah you're gonna have to iterate probably.

Jennifer Larkin

unread,
Jul 11, 2008, 7:13:49 PM7/11/08
to transf...@googlegroups.com
This is the solution that we are going to have to go with in this case. The delete solution isn't going to work for all of the tables and for those tables, this would be the correct solution. We'll have to standardize to this method if we want to use only two methods of doing this and not three.

Jennifer Larkin

unread,
Jul 11, 2008, 7:39:10 PM7/11/08
to transf...@googlegroups.com
Sorry, I probably should have explained why so other people would be able to reference it.

We have business rules around whether or not things in certain junction tables can be deleted. Just because the user has requested that something be deleted, that doesn't mean that the record actually gets deleted. We have to detect the delete request, fire the business rules to determine if the record is deletable, then delete or error as appropriate.

So, for instance, we have a site management tool that allows the admin to select from a list of products to denote which products are supported by their site. When the end user adds their content, they can select from the list of products for that site and select any that are appropriate to their content. The admin tool can not allow the admin to delete the link between the site and the product if there is content assigned to that marketplace and assigned to that product. This could cause the content to fall completely out of the product navigation if it is assigned to only one product.

So in the case of the content to product link, we can use the built in manytomany. In the case of the site to product link, we have to loop and detect whether or not a delete has been requested so that we can then determine on a per row basis whether or not the row can be deleted. There are some cases where deletes can happen across the board but the tables can not be many to many. In that case we *could* delete everything and start over but then we would be doing things three different ways without much added value. I would rather consolidate the two cases that can not use manytomany, just so we are consistent.

Elliott Sprehn

unread,
Jul 15, 2008, 11:21:13 PM7/15/08
to transfer-dev
I second Jon's idea to use the a SQL delete query in a normal
<cfquery>, and then tell transfer to discard the stuff in the cache
that wouldn't be correct anymore. The database was designed to do this
really fast (tons faster than a loop over objects that might not even
be in transfer's cache yet). I think you could probably write one that
only deletes things that wouldn't orphan the stuff like you described.

Just wrap that up in a method in your service layer, and huzzah!

As a side note, TQL doesn't actually have a DELETE statement...
yet. :/

- Elliott
> > On Fri, Jul 11, 2008 at 3:53 PM, Mark Mandel <mark.man...@gmail.com>
> > wrote:
>
> >> If you have the collection from one object -> another
>
> >> Why not just loop around your collection of objects, and then do a
> >> check - if it's meant to be there, keep it, if it's not,delete it,and
> >> if it's not there, add it.
>
> >> Doesn't seem like that big an issue.
>
> >> If you are using m2m, then it manages this for you.
>
> >> Mark
>
> >> On Sat, Jul 12, 2008 at 8:40 AM, Jennifer Larkin <jlar...@gmail.com>
> >> wrote:
> >> > The reason that we aren't using a manytomany in this case is that I set
> >> it
> >> > up and I am flying by the seat of my pants, which luckily are sturdier
> >> than
> >> > they look.
>
> >> > So, we can use manytomany for some of the junction tables but not for
> >> > others. It sounds like we actually need two different ways of doing this
> >> > based on whether the junction table contains columns in addition to the
> >> > composite keys, because we shouldn't be using manytomany on tables that
> >> > contain additional columns.
>
> >> > On Fri, Jul 11, 2008 at 2:27 PM, Brian Kotek <brian...@gmail.com>
> >> wrote:
>
> >> >> There is for a Many to Many relationship, but not for One to Many or
> >> Many
> >> >> to One. If this table is ONLY a linking table, is there a reason why
> >> you
> >> >> aren't using a Many to Many relationship, because what you're
> >> describing is
> >> >> exactly what the Many to Many relationship is meant to be.?
>
> >> >> Otherwise, yes, you'll either have to loop over them and delete them,
> >> then
> >> >> rebuild it. Another option would be to determine which values changed
> >> and
> >> >> then loop over and remove only the ones that were removed and add the
> >> ones
> >> >> that were added.
>
> >> E: mark.man...@gmail.com

Jennifer Larkin

unread,
Jul 16, 2008, 1:39:43 PM7/16/08
to transf...@googlegroups.com
We can not delete any record with an ID that is in use in any of the tables with foreign keys to that ID. We have to delete what can be deleted and collect the name of anything that could not be deleted, returning that list of names to the browser for the user to eliminate the foreign key usage. Some of the referential integrity has to be enforced by the application because it can't be enforced within the database. There are parent-child relationships in the table and we need to cascade the deletes to the child records and run the referential integrity checks for those as well, which could cause a failure in the parent delete.

The request is going to manage actions on over 100 records at a time, inserting, updating, and deleting records as designated by the incoming data, where the individual records pass the business rules. We have to know on a per record basis whether or not the requested action is allowed. We may have 100 records change or we may have 2 of 100.

The objects in question make up the nav bar, so they had better be in the cache-- they are all used on every request. And this data-management function is expected to be used approximately every three months or so, by the single person who has access to the function.

I don't care about the performance; I care about the referential integrity. We have to run actions on a per incoming record basis to manage the actions, business rules, and error handling on a per record basis regardless of if we do that in Transfer or in the database. The objects will always be in the cache. The function will hardly ever get used.

I honestly don't see what the big deal would be in doing this in Transfer instead of in the database since we have to go through the same laborious, labor-intensive process either way.

Brian Kotek

unread,
Jul 16, 2008, 3:08:49 PM7/16/08
to transf...@googlegroups.com
On Wed, Jul 16, 2008 at 1:39 PM, Jennifer Larkin <jla...@gmail.com> wrote:

The request is going to manage actions on over 100 records at a time, inserting, updating, and deleting records as designated by the incoming data, where the individual records pass the business rules. We have to know on a per record basis whether or not the requested action is allowed. We may have 100 records change or we may have 2 of 100.
I honestly don't see what the big deal would be in doing this in Transfer instead of in the database since we have to go through the same laborious, labor-intensive process either way.

You definitely can do this in Transfer, its just not what it was designed to be good at. If you're comfortable with the performance hit of doing it this way, then by all means use it.



Reply all
Reply to author
Forward
0 new messages