SELECT * vs SELECT col1, col2

Danny Burton

unread,

Sep 17, 2007, 4:25:30 PM9/17/07

to

I know that in order to reduce network traffic you should always select only
the columns you require but I have a stored procedure which must **always**
return all columns from a specific table.

When returning all columns from a table is there any performance gain (or
any other reason) for using SELECT col1, col2 etc... instead of just SELECT
* ?

I can immediately think of a disadvantage which is that if you include named
columns in your SELECT then you must update the stored procedure each time a
column is added to the table, using SELECT * gets around this so the stored
procedure will never need to be modified.

Would anyone care to share their thoughts on the above?

Thanks,

Clive

Leythos

unread,

Sep 17, 2007, 5:06:11 PM9/17/07

to

In article <Uv-dnaiBfeH...@giganews.com>, no@email says...

Imagine if you returned columns that the application requesting the info
didn't know how to handle. Select * means you didn't know what you were
expecting on one side or the other or both....

--

Leythos
- Igitur qui desiderat pacem, praeparet bellum.
- Calling an illegal alien an "undocumented worker" is like calling a
drug dealer an "unlicensed pharmacist"
spam9...@rrohio.com (remove 999 for proper email address)

David Portas

unread,

Sep 17, 2007, 5:11:18 PM9/17/07

to

"Danny Burton" <no@email> wrote in message
news:Uv-dnaiBfeH...@giganews.com...

When doing maintenance or making changes it's often useful to do a search
over your code to find exactly where a column is being referenced. If you
use * rather than column names then you will be less likely to find
potential dependencies in your code. That can make debugging a much harder
task.

In some environments it can be difficult to maintain total control over the
logical column order returned by *. For example it's easy to re-create a
table in development (maybe even by accident) but it's probably totally
unacceptable to do that in production. So if you use SELECT * you may get
columns returned in some unexpected order, which could be a problem
depending on how your client process consumes that data. By the way, I don't
recommend that data be processed based on fixed column order anyway, but
based on your remark about not wanting to make code changes when you add new
columns I assume that must be what you are doing.

Some code may break or give the wrong results due to column order
dependencies or new or missing columns. For example UNION or CHECKSUM(*) or
INSERT without a column list. Regarding INSERT, how will you populate your
table if you don't know what columns exist in it?.

It is also a foundation of relational database design, that columns are
identified by name and not position. The compromise you are suggesting is
only possible at all because SQL violates that principle.

In summary, experience suggests it is much less trouble to maintain code
with column names than without them.

One other consideration. One day someone else will review or even inherit
your code. Using SELECT * in your code just guarantees an oppotunity for
someone to point the finger and tut-tut about best practices and coding
standards...

--
David Portas

Erland Sommarskog

unread,

Sep 17, 2007, 5:12:22 PM9/17/07

to

Danny Burton (no@email) writes:
> I know that in order to reduce network traffic you should always select
> only the columns you require but I have a stored procedure which must
> **always** return all columns from a specific table.

Unless this is a temp table created within the procedure, I would
question the wise in this. A question that I often have reason to
ask about our database is "is this column actually used for something".
It is not uncommon that I find a couple of procedures that return this
column, but I also see that they return of columns that are of no
interest in that context, so it looks like "let's select all columns
while we're at it, in case we need them later".

A stored procedure should in my opinion only return the column that
the caller actually requests.

> When returning all columns from a table is there any performance gain
> (or any other reason) for using SELECT col1, col2 etc... instead of just
> SELECT * ?

No.

> I can immediately think of a disadvantage which is that if you include
> named columns in your SELECT then you must update the stored procedure
> each time a column is added to the table, using SELECT * gets around
> this so the stored procedure will never need to be modified.

As I said, I question the wise in this. And what if the colunms in the
tables are rearranged, renamed or a column is dropped? With SELECT *,
the procedure will continue to run, but the client may be utterly
confused. With listed columns, you would get an error if columns are
renamed or dropped. And if they are merely rearranged, the client
will not even notice.

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Hugo Kornelis

unread,

Sep 19, 2007, 3:03:26 PM9/19/07

to

On Mon, 17 Sep 2007 21:25:30 +0100, Danny Burton wrote:

>When returning all columns from a table is there any performance gain (or
>any other reason) for using SELECT col1, col2 etc... instead of just SELECT
>* ?

Hi Clive,

I have to contradict Erland here - there will be a small performance
hit, that might (in very rare cases) grow to be a annoying enough to be
noticeable.

For SELECT *, the first step in parsing is to find out which columns are
in the table. That requires a read on the syscolumns system table in SQL
Server 2000, or it's undocumented equivalent in SQL Server 2005. That in
itself means that you'll have a small bit of overhead, due to requesting
a lock, getting it, reading the data, and releasing the lock again.

If this happens frequently in a system that also frequently adds or
removes columns to of from tables, you might see blocking on the system
table, since the Sch-S (schema stability) lock taken for reading the
columns can't co-exist with the Sch-M (schema modification) lock taken
for changing the table structure.

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

cph...@gmail.com

unread,

Sep 19, 2007, 4:21:53 PM9/19/07

to

> I know that in order to reduce network traffic you should always select only
> the columns you require but I have a stored procedure which must **always**
> return all columns from a specific table.

Reducing traffic really isn't the main reason.

> When returning all columns from a table is there any performance gain (or
> any other reason) for using SELECT col1, col2 etc... instead of just SELECT
> * ?

Simply put, when you are writing production code you should KNOW for
certain exactly how and why everything works. You should know exactly
which columns are going to be returned; even if it happens to be all
of them.

> I can immediately think of a disadvantage which is that if you include named
> columns in your SELECT then you must update the stored procedure each time a
> column is added to the table, using SELECT * gets around this so the stored
> procedure will never need to be modified.

Yes, you will have to modify the procedure in that case. However,
while that does require that you spend time actually doing work, it is
still preferable to a situation where somebody changed a table
somewhere and your stored procedure keeps "working" despite returning
the wrong data. If all they did is add a column you might be okay -
what if they changed the order of some columns? What if they dropped
a column your application was expecting?

While there may not be a significant performance gain, there might
well be gains in terms of not getting phone calls in the middle of the
night because some application crashed, or because some job failed
when a table was changed earlier in the day.

> Would anyone care to share their thoughts on the above?

Using SELECT * in a procedure, view, or whatever is generally just
lazy and sloppy.

Erland Sommarskog

unread,

Sep 19, 2007, 5:10:46 PM9/19/07

to

Hugo Kornelis (hu...@perFact.REMOVETHIS.info.INVALID) writes:
> For SELECT *, the first step in parsing is to find out which columns are
> in the table. That requires a read on the syscolumns system table in SQL
> Server 2000, or it's undocumented equivalent in SQL Server 2005. That in
> itself means that you'll have a small bit of overhead, due to requesting
> a lock, getting it, reading the data, and releasing the lock again.

But if you list all columns in the table, the optimizer still has to
read all rows in sys.columns for the table verify that all columns
exists, to find their data types etc. So the amount of locking would be
the same.

Knowledgy

unread,

Jan 19, 2008, 7:08:39 AM1/19/08

to

Adding columns to tables frequently points to a poor database design.

--
Sincerely,
John K
Knowledgy Consulting
http://knowledgy.org

Atlanta's Business Intelligence and Data Warehouse Experts

"Danny Burton" <no@email> wrote in message
news:Uv-dnaiBfeH...@giganews.com...

Serge Rielau

unread,

Jan 19, 2008, 10:23:46 AM1/19/08

to

Knowledgy wrote:
> Adding columns to tables frequently points to a poor database design.

... or a vibrant changing company with new requirements in their IT...
I see no harm in adding columns. Things that don't change are dead...

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Brian Bunin

unread,

Jan 27, 2008, 1:02:51 PM1/27/08

to

In article <H72dnbkTto7UdQza...@comcast.com>, "Knowledgy" <atlanta business intelligence consultants> wrote:
>Adding columns to tables frequently points to a poor database design.
>

It must be nice to work for a company that doesnt change the requirements
either mid development, or after it's deployed.

Knowledgy

unread,

Feb 28, 2008, 8:51:04 PM2/28/08

to

It's called abstraction and proper database design

--
Sincerely,
John K
Knowledgy Consulting

http://knowledgy.org/

Atlanta's Business Intelligence and Data Warehouse Experts

"Brian Bunin" <bb...@hotmail.com> wrote in message
news:fH3nj.808$R84...@newssvr25.news.prodigy.net...

jhof...@googlemail.com

unread,

Mar 3, 2008, 8:11:59 AM3/3/08

to

> It's called abstraction and proper database design

BI != transactional DB

And even if it were, in my experience it is very (very very) rare that
a BI system doesn't evolve over time. New business areas, new
dimension attributes, new hierarchies, performance optimisations, etc,
etc...