To UDF or not to UDF

teddy...@hotmail.com

unread,

Jun 10, 2009, 4:55:56 AM6/10/09

to

Our new clients have a mature database but a new requirement to obtain
some summary data. Frankly, I don't think it's terribly well
designed, but we're stuck with it.

They want us to script a collection of views, and we've done the first
tranche of work. They are now quibbling about the cost of the second
batch, and their DBA has come back with some suggestions for keeping
down costs, specifically that we "could use user defined functions to
replace any replace, convert or case statements and indeed the look
ups from the lookup tables".

Isn't there a performance implication with UDFs as compared, say, to
in-line case/convert etc.?

What do you mavens think?

Edward

Uri Dimant

unread,

Jun 10, 2009, 5:39:51 AM6/10/09

to

Hi
What kind of UDF are you talking about?
Read Alex's article
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/04/21/not-all-udfs-are-bad-for-performance.aspx

<teddy...@hotmail.com> wrote in message
news:d5ee02c3-9d22-4ad4...@o14g2000vbo.googlegroups.com...

teddy...@hotmail.com

unread,

Jun 10, 2009, 6:49:34 AM6/10/09

to

On Jun 10, 10:39 am, "Uri Dimant" <u...@iscar.co.il> wrote:
> Hi
> What kind of UDF are you talking about?

> Read Alex's articlehttp://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/04/21/not-a...

Thanks for the link, Uri - very interesting.

The views need to convert all dates thus:

ISNULL(CONVERT(nvarchar(30), dbo.InterimVisit.VisitDate, 106)

There are any number of CASE statements that the client's DBA thinks
would be candidates for UDFs - e.g.

CASE InterimVisit.SatisfactoryVisit WHEN 1 THEN 'Yes' WHEN 0 THEN 'No'
ELSE '' END AS SatisfactoryVisit

The system also uses a monolithic DropDownList table so where a view
has multiple end points to this table there are multiple JOINS:

dbo.DropDownListItem AS PS ON dbo.InterimVisit.Status1ID =
PS.DropDownListItemID LEFT OUTER JOIN
dbo.DropDownListItem AS RS ON dbo.InterimVisit.RecordStatusID =
RS.DropDownListItemID LEFT OUTER JOIN
dbo.DropDownListItem AS VS ON dbo.InterimVisit.Status2ID =
VS.DropDownListItemID LEFT OUTER JOIN
dbo.DropDownListItem AS PAY ON dbo.InterimVisit.EligibilityID =
PAY.DropDownListItemID LEFT OUTER JOIN

etc.

Does that provide enough information? I don't want to give away any
proprietary data.....

Thanks

Edward

Tibor Karaszi

unread,

Jun 10, 2009, 6:40:18 AM6/10/09

to

My experience is that scalar functions that does data access can be very bad (depending on how you
use them, of course). Scalar function that doesn't do data access also have a higher cost than a
comparable in-line expression (CASE or whatever), but IMO not as drastic.

--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi

<teddy...@hotmail.com> wrote in message
news:d5ee02c3-9d22-4ad4...@o14g2000vbo.googlegroups.com...

Uri Dimant

unread,

Jun 10, 2009, 7:04:21 AM6/10/09

to

See another topic written by Alex :-))
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/05/23/reuse-your-code-with-cross-apply.aspx

<teddy...@hotmail.com> wrote in message
news:ff5be60d-9eb6-42df...@m19g2000yqk.googlegroups.com...

Greg D. Moore (Strider)

unread,

Jun 10, 2009, 1:11:16 PM6/10/09

to

"Tibor Karaszi" <tibor_please.n...@hotmail.nomail.com> wrote in
message news:%232dxafb...@TK2MSFTNGP05.phx.gbl...

> My experience is that scalar functions that does data access can be very
> bad (depending on how you use them, of course). Scalar function that
> doesn't do data access also have a higher cost than a comparable in-line
> expression (CASE or whatever), but IMO not as drastic.
>

This has been my experience also, especially when used in WHERE clauses.
They can make the code cleaner, but I've found also generally MUCH slower.

--
Greg Moore
Ask me about lily, an RPI based CMC.

samalex

unread,

Jun 16, 2009, 5:52:07 PM6/16/09

to

Hi Edward,

I run into this from time to time where someone adds a UDF to a query that
returns many rows. UDF's run in a seperate process, so each time you call
it you're creating a new Process for SQL to finish before it continues.
Due to this it really depends on how many records will be returned on how
the UDF will impact the query.

For example if you have a UDF that gets a customer's Zip Code you might
use something like this:

select CustomerName,
dbo.ufnGetZip(CustomerID.CustomerTable) as Zip --> Gets
AddressTable.AddressZip
from CustomerTable
where CustomerID = 123

If you run this for one customer it'll be speedy but if you run it for
50,000 customers it has to run the dbo.ufnGetZip() UDF 50,000 times which
will greatly increase the run time of the query. A simpler way would be
this:

select CustomerTable.CustomerName,
AddressTable.AddressZip
from CustomerTable
left outer join AddressTable on
CustomerTable.CustomerID = AddressTable.CustomerID

... which given your indexing is setup somewhat normal should come back
quickly whether you're hitting 1 or 100,000 records.

So back to your question, if you can void UDF's in such queries I would do
so, whether it's using a case, join, or whatever has to be done. Just
remember that everytime SQL has to spawn a new process to do something,
whether it be a UDF or whatever, your query speed will suffer.

Others can chime in with other suggestions, but this is based on my
experience.

Sam Alex

bill

unread,

Jun 17, 2009, 3:51:39 AM6/17/09

to

teddysn,

I'm a little confused about the client's concern. From the first
message:

"... They are now quibbling about the cost of the second

batch, and their DBA has come back with some suggestions for keeping

down costs..."

Do they mean query cost? If so, the UDF's they are suggesting are
definitely not the way to go, as they will make the queries more
expensive (take longer).

Do they mean monetary cost? That seems weird to me, because I can't
see that typing out a join vs invoking a UDF is going to take
materially more or less development time.

My experience with tables valued UDFs (even multi statement ones) is
consistent with the article cited earlier, i.e. table-valued UDF's are
fast.

I confess to never having used a scalar UDF. I don't see the point.
It's easy enough to just do the requisite join, inline function, etc
and actually seems to be a more natural way to code. I suppose if I
had some tricky mathematical transformations being repeated all the
time a scalar UDF might be useful, but I haven't run into those
situations.

Exactly how did the client's DBA feel UDF's would decrease cost? I'm
not asking that question to be a dork, but because I don't understand.

Thanks,

Bill

Thanks,

Bill

Geoff Schaller

unread,

Jun 17, 2009, 7:33:05 AM6/17/09

to

<ROFL>

Basically the DBA is engaging in some self-serving spin. Less kind
circles would call it wanking.

UDFs have some positive benefit in that they encapsulate business logic
and thereby promote consistency in column calculations and allow
flexibility. Create a UDF and maintain it from a central place and use
it everywhere. Need to change the business logic? Do so in one place.

But as others have explained, there is a cost. The cost is performance
overhead in that the UDF is evaluated for each row. So if your recordset
returns 1000 rows the cost is (probably) trivial. Return a million rows
and you may have to drink a lot of coffee while you wait.

Sometimes the complexity of the business logic screams out for a UDF
simply so that further selects are simple and easy to read/write. There
is no right or wrong way: it all depends on the context of the
implementation. The simple rule of thumb is to try something two ways.
Measure both and pick the most efficient.

Cheers,

Geoff Schaller
Software Objectives

"bill" <billma...@gmail.com> wrote in message
news:19b7e7b6-607b-4e8f...@j32g2000yqh.googlegroups.com:

--CELKO--

unread,

Jun 17, 2009, 2:19:29 PM6/17/09

to

>> I confess to never having used a scalar UDF. I don't see the point. It's easy enough to just do the requisite join, inline function, etc and actually seems to be a more natural way to code. <<

Agreed. That code is also portable, optimizable and cannot have side
effects. That last advantage is harder to explain to most
programmers. We are about to enter an era of Fuctional Programming
because of parallelism.

>> I suppose if I had some tricky mathematical transformations being repeated all the time a scalar UDF might be useful, but I haven't run into those situations.<<

I have. I found it was faster, more portable, etc. to put complex
math into look-up tables and do joins to it. I put a lot of that into
the book THINKING IN SETS. The classic example is non-SQL mindset
programmers who write temporal UDFs for business days, Easter and
other lunar/solar dates, financial functions, stats, etc. using a
procedural approach with various proprietary vendor functions.

The SQL mindset programmer builds a 100 year Calendar table instead!

>> Exactly how did the client's DBA feel UDF's would decrease cost? I'm not asking that question to be a dork, but because I don't understand. <<

He could hire cheaper, VB programmers to write BASIC re-written in T-
SQL? I wish I was kidding about that, but I have seen it.

In fact, I got a "do my job/homework" email yesterday from someone I
don't know. It is BASIC re-written in T-SQL -- if-then-else logic, it
does the audit trail in the same code, etc. I can reduce it to one
INSERT INTO statement and some DDL.