Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

need help building an SQL Statement...

30 views
Skip to first unread message

SpreadTooThin

unread,
May 1, 2013, 4:46:34 PM5/1/13
to
I have a table like:

ImageID, SeriesID, StudyID

ImageID is unique Series and Study are not unique.

StudyID, SeriesID, ImageID
A, a, 1
A, b, 2
B, c, 3
B, c, 4
C, d, 5
C, d, 6
C, e, 7
C, f, 8


I need a report that says:
There are 2 series in Study A and a total of 2 images
There are 1 series in Study B and a total of 2 images
There are 3 series in Study C and a total of 4 images

so sql output should look like:

StudyID, #Series, #Images
A, 2, 2
B, 1, 2
C, 3, 4

Help?
TIA.





Geoff Muldoon

unread,
May 1, 2013, 7:02:23 PM5/1/13
to
In article <c676acec-3551-475e...@googlegroups.com>,
bjobr...@gmail.com says...
select StudyID, count(distinct SeriesID), count(ImageID)
.. etc
google "group by"

GM

SpreadTooThin

unread,
May 1, 2013, 8:11:33 PM5/1/13
to
my thought was:
Select Distinct StudyID, count(SeriesID), count(ImageID) from myTable group by StudyID, SeriesID, ImageID;

Geoff Muldoon

unread,
May 1, 2013, 9:26:01 PM5/1/13
to
bjobr...@gmail.com says...

> > select StudyID, count(distinct SeriesID), count(ImageID)
> >
> > .. etc
> >
> > google "group by"

> my thought was:
> Select Distinct StudyID, count(SeriesID), count(ImageID) from myTable group by StudyID, SeriesID, ImageID;

You shouldn't GROUP BY elements that aren't SELECTed in an un-aggregated
form. And GROUP BY negates the need to a SELECT-wide DISTINCT.

Select
StudyID,
count(distinct SeriesID) as SeriesCount,
count(ImageID) as ImageCount
from myTable
group by StudyID;

GM

SpreadTooThin

unread,
May 2, 2013, 12:53:58 AM5/2/13
to

> You shouldn't GROUP BY elements that aren't SELECTed in an un-aggregated
> form. And GROUP BY negates the need to a SELECT-wide DISTINCT.
>
> Select
> StudyID,
> count(distinct SeriesID) as SeriesCount,
> count(ImageID) as ImageCount
> from myTable
> group by StudyID;
>
> GM

Thank you so much!
I was wondering can sql be scripted into more than one statement?

In other words, could I have a select statement that creates a 'temporary table' then the next line of the script would access that table, do what ever it needs to do with it and then finally the script deletes the temporary table?

Geoff Muldoon

unread,
May 2, 2013, 1:16:33 AM5/2/13
to
bjobr...@gmail.com says...
Google mysql subquery

select year, foobar
from
(
select year, sum(foo) as foobar
from bar
group by year
)
where foobar > 20;

GM

SpreadTooThin

unread,
May 7, 2013, 10:33:02 PM5/7/13
to
I'm trying to improve the speed of this as this is really slow.
This is done in two steps... the first step creates a table of my StudyIDs... then for each of the StudyIDs in that table I count the series... (Or at least I think that this what the second line of sql should do.

create table MyIDs select distinct StudyID from MyTable;
select count(distinct SeriesID) from ImageTable where StudyID in (Select * from MyIDs);

I think 'in' is not right as the will always be true so should it be StudyID=MyIDs.StudyID;


Geoff Muldoon

unread,
May 7, 2013, 11:05:56 PM5/7/13
to
bjobr...@gmail.com says...

> I'm trying to improve the speed of this as this is really slow.
> This is done in two steps... the first step creates a table of my StudyIDs...

Error #1: If you are writing a query to retrieve a data set is is
nearly always wrong to create a new physical database object simply to
achieve that goal.

> then for each of the StudyIDs in that table I count the series... (Or
> at least I think that this what the second line of sql should do.
>
> create table MyIDs select distinct StudyID from MyTable;

> select count(distinct SeriesID) from ImageTable where StudyID in (Select * from MyIDs);

Error #2: The above statement will retrieve a single count of distinct
SeriesID, not one per StudyID.

> I think 'in' is not right as the will always be true so should it be
> StudyID=MyIDs.StudyID;

google SQL COUNT DISTINCT WITH GROUP BY

GM

SpreadTooThin

unread,
May 7, 2013, 11:33:59 PM5/7/13
to
On Tuesday, May 7, 2013 9:05:56 PM UTC-6, Geoff Muldoon wrote:
> bjobr...@gmail.com says...
>
>
>
> > I'm trying to improve the speed of this as this is really slow.
>
> > This is done in two steps... the first step creates a table of my StudyIDs...
>
>
>
> Error #1: If you are writing a query to retrieve a data set is is
>
> nearly always wrong to create a new physical database object simply to
>
> achieve that goal.
>
Nearly, unless you have a script to write and don't want to regenerate that table again and again.
So my question is how can I use that tables contents in the subsequent select, because I obviously don't have the syntax correct.

Geoff Muldoon

unread,
May 8, 2013, 12:44:46 AM5/8/13
to
bjobr...@gmail.com says...
>
> On Tuesday, May 7, 2013 9:05:56 PM UTC-6, Geoff Muldoon wrote:
> > bjobr...@gmail.com says...

> > > This is done in two steps... the first step creates a table of my
> > >StudyIDs...
> >
> > Error #1: If you are writing a query to retrieve a data set is is
> > nearly always wrong to create a new physical database object simply
> > to achieve that goal.
> >
> Nearly, unless you have a script to write and don't want to regenerate that table again and again.

But why create the table at all? You don't need it.

GM

Jerry Stuckle

unread,
May 8, 2013, 6:55:50 AM5/8/13
to
Sorry, coming in late on this one (Rebuilding a new laptop to replace
one that died).

I agree with Goeff. Your table is completely unnecessary and the wrong
way to go.

In fact, all of his advice has been right on; I suggest you seriously
look at his recommendations.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstu...@attglobal.net
==================

SpreadTooThin

unread,
May 8, 2013, 8:37:37 PM5/8/13
to
Well here is the issue.
I need a table with the StudyIDs.... Where they are generated by the Date Range.
Some records of the StudyID don't have their dates set correctly they are 0000-00-00
So when combined with the between dates the counts don't match because some records don't have their dates set.

table = select distinct StudyID from MyTable where StudyDT between '20130501' and '20130531';
select count(distinct SeriesID) from MyTable where StudyID=table.StudyID;

Maybe these can be combined.. by my newbie brain is not giving me the answer...

Geoff Muldoon

unread,
May 8, 2013, 9:35:46 PM5/8/13
to
In article <ce17cabe-60d7-4c4b...@googlegroups.com>,
bjobr...@gmail.com says...
What's wrong with:

select
StudyID,
count(distinct SeriesID)
from
MyTable
where StudyDT between '20130501' and '20130531'
group by
StudyID;

Or if you want the SeriesID to be included in the count even if it is
for a study date outside the date restiction, but where that StudyID has
some other record what IS within the date restiction, then you counld
use table aliasing and a self-join dependent subquery:

select
distinct TableOuter.StudyID,
(select
count(distinct TableInner.SeriesID)
from MyTable TableInner
where TableInner.StudyID = TableOuter.StudyID) as SeriesCount
from
MyTable TableOuter
where TableOuter.StudyDT between '20130501' and '20130531';

Note: the above is only one of a possible series of syntactical options,
some of which might be better than others depending on record counts,
indexing strategy, etc.

GM

SpreadTooThin

unread,
May 12, 2013, 12:16:56 PM5/12/13
to

> Or if you want the SeriesID to be included in the count even if it is
> for a study date outside the date restiction, but where that StudyID has
> some other record what IS within the date restiction, then you counld
> use table aliasing and a self-join dependent subquery:
>
> select
> distinct TableOuter.StudyID,
> (select
> count(distinct TableInner.SeriesID)
> from MyTable TableInner
> where TableInner.StudyID = TableOuter.StudyID) as SeriesCount
> from
> MyTable TableOuter
> where TableOuter.StudyDT between '20130501' and '20130531';
>

I'm confused by the line:
from MyTable TableInner.
Can you specify two table names in the from?

Erick T. Barkhuis

unread,
May 12, 2013, 2:05:44 PM5/12/13
to
SpreadTooThin:


>> select
>> distinct TableOuter.StudyID,
>> (select
>> count(distinct TableInner.SeriesID)
>> from MyTable TableInner
>> where...
>>
>
>I'm confused by the line:
>from MyTable TableInner.
>Can you specify two table names in the from?

No, it reads
from MyTable AS TableInner
but Geoff uses the shorthand (allowing him to post faster)


--
Erick

Geoff Muldoon

unread,
May 12, 2013, 7:39:43 PM5/12/13
to
In article <60fb64ea-1cfd-4d9c...@googlegroups.com>,
bjobr...@gmail.com says...
As Eric posted, TableInner and TableOuter are simply two instances of
aliases, both for MyTable. Because I've used MyTable twice in the same
statement aliases were necessary so I could do a self-join.

And of course you can select more than one table (comma separated, if no
comma the second word is treated as an alias) in the from, eg:
select a.name, b.type
from table1 a, table2 b
where a.pk = b.fk;

I think you better get a copy of SQL for Dummies.

GM
0 new messages