I have a table of servers that receive updates on a regular basis
CREATE TABLE [dbo].[tbl_ServerUpdate]( [ServerUpdateID] [int] NOT NULL, [Server] [nvarchar](256) NOT NULL, [UpdateStamp] [datetime] NOT NULL, )
I then have tolerance values in another table which I can get like this (they are in terms of minutes):
DECLARE @Interval INT
SELECT @Interval = IntervalValue FROM tbl_Interval (NOLOCK) WHERE IntervalID = 1
I want to know which servers have NOT received an update within the tolerance value. For example, give me the set of Server that have not received an update in the last 5 (@Interval) minutes.
Ideally, you would have a table of servers. But since you (appearently) don't, then
Select u.Server From dbo.tbl_ServerUpdate u Group By u.Server Having Max(u.UpdateStamp) < (Select DateAdd(mi, -1 * i.IntervalValue, GetDate()) From dbo.tbl_Interval i Where i.IntervalID = 1)
That will find any server with no row with UpdateStamp within i.IntervalValue minutes provided that server has at least one row in dbo.tbl_ServerUpdate. It also assumes there is only one row in dbo.tbl_Interval with IntervalID = 1.
>I have a table of servers that receive updates on a regular basis
> CREATE TABLE [dbo].[tbl_ServerUpdate]( > [ServerUpdateID] [int] NOT NULL, > [Server] [nvarchar](256) NOT NULL, > [UpdateStamp] [datetime] NOT NULL, > )
> I then have tolerance values in another table which I can get like this > (they are in terms of minutes):
> DECLARE @Interval INT
> SELECT @Interval = IntervalValue > FROM tbl_Interval (NOLOCK) > WHERE IntervalID = 1
> I want to know which servers have NOT received an update within the > tolerance value. For example, give me the set of Server that have not > received an update in the last 5 (@Interval) minutes.
What is an update_id? What would it mean in a logical data model? Surely, you did NOT just physically number the rows in a table!
This table also had no key, so I made an assumption that you want to use the (originally very long AND possibly in Chinese!) server names. But isn't the update interval logically an attribute of each server? Shouldn't it be in the Servers table? I cannot see an interval floating around as an entity in itself.
CREATE TABLE Servers (server_name VARCHAR(25) NOT NULL PRIMARY KEY, update_stamp DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL, update_interval INTEGER NOT NULL CHECK(update_interval > 0), etc. );
If you need this a lot, put it in a VIEW.
CREATE VIEW ExpiredServers (server_name) AS SELECT server_name FROM Servers WHERE update_stamp < DATEADD(MI, -update_interval, CURRENT_TIMESTAMP);
Well, I do have a table of servers....its the one that I described below. Its just a little misleading. The Server column is actually the ip address and the primary key is just the identity. But, it actually contains all the servers.
"Tom Cooper" <tomcoo...@comcast.no.spam.please.net> wrote in message
> Ideally, you would have a table of servers. But since you (appearently) > don't, then
> Select u.Server > From dbo.tbl_ServerUpdate u > Group By u.Server > Having Max(u.UpdateStamp) < > (Select DateAdd(mi, -1 * i.IntervalValue, GetDate()) > From dbo.tbl_Interval i Where i.IntervalID = 1)
> That will find any server with no row with UpdateStamp within > i.IntervalValue minutes provided that server has at least one row in > dbo.tbl_ServerUpdate. It also assumes there is only one row in > dbo.tbl_Interval with IntervalID = 1.
>> CREATE TABLE [dbo].[tbl_ServerUpdate]( >> [ServerUpdateID] [int] NOT NULL, >> [Server] [nvarchar](256) NOT NULL, >> [UpdateStamp] [datetime] NOT NULL, >> )
>> I then have tolerance values in another table which I can get like this >> (they are in terms of minutes):
>> DECLARE @Interval INT
>> SELECT @Interval = IntervalValue >> FROM tbl_Interval (NOLOCK) >> WHERE IntervalID = 1
>> I want to know which servers have NOT received an update within the >> tolerance value. For example, give me the set of Server that have not >> received an update in the last 5 (@Interval) minutes.
In the real world, everything is not a beautifully static picture of relational data. I didn't include the key because I don't think its relevant to the solution. But since you brought it up: is it better to have a natural key on a varchar or an identity key as an integer that doesn't really have any intrinsic business meaning
? In theory, it should be the varchar, but in practice its more performant to index an int rather than a varchar(25). Is that not true?
The interval is not an attribute of the server, but rather an attribute of the update type. In my case, there are seven different minute intervals. So, in the end, I need to know - for each update type - which server did not receive it within the last [interval] (in terms of minutes). The interval is configurable by the operations group - sometimes they may want to be alerted when a server didn't receive an update in the last 3 minutes, sometimes in the last 10 minutes. They can change this real-time in the database.
I am not a TSQL guru - thats why I use this newsgroup from time to time. I also have every one of your books on my shelf. However, the corporate world is NOT academic and there is a tradeoff between having an understandable data model that is easy to work with and having a data model that can serve as a model in CS405 (or whatever). Sometimes we have to denormalize or flatten data to be able to run reports in an acceptable time; sometimes we have to add bit columns that could easily be derived from another to improve query performance.
The balance you give is valuable, so I hope you continue to berate us when our designs violate, but I'll bet - despite all the companies you have worked with - you can't name one that had a data model that gave you a boner.
"--CELKO--" <jcelko...@earthlink.net> wrote in message
> What is an update_id? What would it mean in a logical data model? > Surely, you did NOT just physically number the rows in a table!
> This table also had no key, so I made an assumption that you want to > use the (originally very long AND possibly in Chinese!) server names. > But isn't the update interval logically an attribute of each server? > Shouldn't it be in the Servers table? I cannot see an interval > floating around as an entity in itself.
> CREATE TABLE Servers > (server_name VARCHAR(25) NOT NULL PRIMARY KEY, > update_stamp DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL, > update_interval INTEGER NOT NULL > CHECK(update_interval > 0), > etc. > );
> If you need this a lot, put it in a VIEW.
> CREATE VIEW ExpiredServers (server_name) > AS > SELECT server_name > FROM Servers > WHERE update_stamp > < DATEADD(MI, -update_interval, CURRENT_TIMESTAMP);
What if there are no rows for the server? I also need to consider that. So, in other words, if there should have been an update in the last 5 minutes and there are none, it should be present in the results.
"Tom Cooper" <tomcoo...@comcast.no.spam.please.net> wrote in message
> Ideally, you would have a table of servers. But since you (appearently) > don't, then
> Select u.Server > From dbo.tbl_ServerUpdate u > Group By u.Server > Having Max(u.UpdateStamp) < > (Select DateAdd(mi, -1 * i.IntervalValue, GetDate()) > From dbo.tbl_Interval i Where i.IntervalID = 1)
> That will find any server with no row with UpdateStamp within > i.IntervalValue minutes provided that server has at least one row in > dbo.tbl_ServerUpdate. It also assumes there is only one row in > dbo.tbl_Interval with IntervalID = 1.
>> CREATE TABLE [dbo].[tbl_ServerUpdate]( >> [ServerUpdateID] [int] NOT NULL, >> [Server] [nvarchar](256) NOT NULL, >> [UpdateStamp] [datetime] NOT NULL, >> )
>> I then have tolerance values in another table which I can get like this >> (they are in terms of minutes):
>> DECLARE @Interval INT
>> SELECT @Interval = IntervalValue >> FROM tbl_Interval (NOLOCK) >> WHERE IntervalID = 1
>> I want to know which servers have NOT received an update within the >> tolerance value. For example, give me the set of Server that have not >> received an update in the last 5 (@Interval) minutes.
>> is it better to have a natural key on a VARCHAR(n) or an IDENTITY key [SIC] as an integer that doesn't really have any intrinsic business meaning? <<
How important is data integrity? Why not use a pointer chain DB instead of mimicking it in SQL? At least they have features to do garbage collection, restore chains, etc. Eventually, denormalization and short-cuts will come back and bite you.
>> In theory, it should be the VARCHAR, but in practice its more performant to index an INTEGER rather than a VARCHAR(25). Is that not true? <<
It depends on the product. You automatically assumed indexing; look at hashing in Teradata. Longer strings for keys lead to easy perfect hashing, which is always one probe as opposed to multiple probes with an index when the database gets large. If you are using hidden pointer chains like SQL Anywhere to implement PK-FK, there is no difference. SQL Server does have a lot of "code museum" problems and this is one of them; one type of simple B+ Tree index is used for all data types and distributions. However, until you get to large DBs, it works fine and it will get a shot in the arm from 64-bit hardware, too.
>> The interval is not an attribute of the server, but rather an attribute of the update type. In my case, there are seven different minute intervals. So, in the end, I need to know - for each update type - which server did not receive it within the last [interval] (in terms of minutes). The interval is configurable by the operations group - sometimes they may want to be alerted when a server didn't receive an update in the last 3 minutes, sometimes in the last 10 minutes. They can change this real-time in the database. <<
A type is an attribute by definition, so what entity does the update_type belong to, if not a server? I would think from this description, you would have seven columns for the logically different updates, and seven columns for their corresponding intervals. Or is this a repeating group where not all update types apply to all servers and the update types can be changed?
>> I also have every one of your books on my shelf. <<
Neat! Number 7 is out in 2008 February.
>> .. I'll bet - despite all the companies you have worked with - you can't name one that had a data model that gave you a boner. <<
LOL! Remember what I do for a living -- this is like asking a doctor why all his patients are sick!
The best one I can remember just off hand was for a credit management company. They had just designed it and wanted a two day review from me. The only real problem I found was that a bunch of the columns were stubbed in with a magical CHAR(1) NOT NULL data type; they were still working on the encodings and waiting for advise from legal.
Another one was a software company with a portal product that manages corporate software access. We replaced ~60 tables with a nested set model that lead to ~6 tables for the core processes. In fairness, the original model had grown over time from a denormalized model on a small platform to a mainframe tool. People kept adding tables to it as a work-around and it had become a jungle.
That query (not tested since you didn't supply INSERT's of sample data for either table nor DDL for tbl_Interval) should return data for every server which has at least one row in that table, but does not have a row with a time within the last five minutes. So if a server does not have a row within the last five minutes, but does have one from 8 minutes ago, or one from 100 years ago, that server will be returned. But if there are no rows at all for a server, not just no rows in the last five minutes, that server will, of course, not be returned.
> What if there are no rows for the server? I also need to consider that. > So, in other words, if there should have been an update in the last 5 > minutes and there are none, it should be present in the results.
> "Tom Cooper" <tomcoo...@comcast.no.spam.please.net> wrote in message > news:OtmtZGKNIHA.3384@TK2MSFTNGP04.phx.gbl... >> Ideally, you would have a table of servers. But since you (appearently) >> don't, then
>> Select u.Server >> From dbo.tbl_ServerUpdate u >> Group By u.Server >> Having Max(u.UpdateStamp) < >> (Select DateAdd(mi, -1 * i.IntervalValue, GetDate()) >> From dbo.tbl_Interval i Where i.IntervalID = 1)
>> That will find any server with no row with UpdateStamp within >> i.IntervalValue minutes provided that server has at least one row in >> dbo.tbl_ServerUpdate. It also assumes there is only one row in >> dbo.tbl_Interval with IntervalID = 1.
>>> CREATE TABLE [dbo].[tbl_ServerUpdate]( >>> [ServerUpdateID] [int] NOT NULL, >>> [Server] [nvarchar](256) NOT NULL, >>> [UpdateStamp] [datetime] NOT NULL, >>> )
>>> I then have tolerance values in another table which I can get like this >>> (they are in terms of minutes):
>>> DECLARE @Interval INT
>>> SELECT @Interval = IntervalValue >>> FROM tbl_Interval (NOLOCK) >>> WHERE IntervalID = 1
>>> I want to know which servers have NOT received an update within the >>> tolerance value. For example, give me the set of Server that have not >>> received an update in the last 5 (@Interval) minutes.
>>> is it better to have a natural key on a VARCHAR(n) or an IDENTITY key >>> [SIC] as an integer that doesn't really have any intrinsic business >>> meaning? <<
> How important is data integrity? Why not use a pointer chain DB > instead of mimicking it in SQL? At least they have features to do > garbage collection, restore chains, etc. Eventually, denormalization > and short-cuts will come back and bite you.
Well, why not use the simplex method to determine the optimal distribution of resources? The answer is pretty simple - the complexity is not worth the investment. Data integrity is always important - but in this case, we are dealing with mostly read-only data. So, no need to worry about transacted operations. I agree that denormalization and short-cuts come back to bite you, but sometimes they are a fact of life. And, sometimes its not so much a bite, but just a little nibble, which most times is acceptable.
>>> In theory, it should be the VARCHAR, but in practice its more performant >>> to index an INTEGER rather than a VARCHAR(25). Is that not true? <<
> It depends on the product. You automatically assumed indexing; look > at hashing in Teradata. Longer strings for keys lead to easy perfect > hashing, which is always one probe as opposed to multiple probes with > an index when the database gets large. If you are using hidden > pointer chains like SQL Anywhere to implement PK-FK, there is no > difference. SQL Server does have a lot of "code museum" problems and > this is one of them; one type of simple B+ Tree index is used for all > data types and distributions. However, until you get to large DBs, it > works fine and it will get a shot in the arm from 64-bit hardware, > too.
I hate to throw additional address space at the problem; also hate to throw beefed up hardware at the problem. I don't really know what you mean by "pointer chains" in SQL. What are you saying? That there is an internal implementation that makes it faster to operate on indexed string data?
>>> The interval is not an attribute of the server, but rather an attribute >>> of the update type. In my case, there are seven different minute >>> intervals. So, in the end, I need to know - for each update type - which >>> server did not receive it within the last [interval] (in terms of >>> minutes). The interval is configurable by the operations group - >>> sometimes they may want to be alerted when a server didn't receive an >>> update in the last 3 minutes, sometimes in the last 10 minutes. They can >>> change this real-time in the database. <<
> A type is an attribute by definition, so what entity does the > update_type belong to, if not a server? I would think from this > description, you would have seven columns for the logically different > updates, and seven columns for their corresponding intervals. Or is > this a repeating group where not all update types apply to all servers > and the update types can be changed?
The update_type belongs to the "system" and is associated with the server. You would have seven columns for the logically different updates where? And where would you place the corresponding intervals? This seems to go against your typically relational advice. What datatype would these columns be...datetime? I am not sure I understand your suggestion.
>>> I also have every one of your books on my shelf. <<
> Neat! Number 7 is out in 2008 February.
And what is number 7? I hope it covers rank and a bunch of other features that MS has not been able to implement correctly, yet - until they are able to hire the developers from Oracle who did it 5 years ago.
>>> .. I'll bet - despite all the companies you have worked with - you >>> can't name one that had a data model that gave you a boner. <<
> LOL! Remember what I do for a living -- this is like asking a doctor > why all his patients are sick!
> The best one I can remember just off hand was for a credit management > company. They had just designed it and wanted a two day review from > me. The only real problem I found was that a bunch of the columns > were stubbed in with a magical CHAR(1) NOT NULL data type; they were > still working on the encodings and waiting for advise from legal.
Never underestimate the power of char(1) and the misdirection of legal.
> Another one was a software company with a portal product that manages > corporate software access. We replaced ~60 tables with a nested set > model that lead to ~6 tables for the core processes. In fairness, the > original model had grown over time from a denormalized model on a > small platform to a mainframe tool. People kept adding tables to it > as a work-around and it had become a jungle.
Hmmm...I would be interested in seeing that. How do you take 60 tables to 6 and what exactly is a "nested set" model? Are you talking about the query structure or the data storage?
>>> is it better to have a natural key on a VARCHAR(n) or an IDENTITY key >>> [SIC] as an integer that doesn't really have any intrinsic business >>> meaning? <<
> How important is data integrity? Why not use a pointer chain DB > instead of mimicking it in SQL? At least they have features to do > garbage collection, restore chains, etc. Eventually, denormalization > and short-cuts will come back and bite you.
>>> In theory, it should be the VARCHAR, but in practice its more performant >>> to index an INTEGER rather than a VARCHAR(25). Is that not true? <<
> It depends on the product. You automatically assumed indexing; look > at hashing in Teradata. Longer strings for keys lead to easy perfect > hashing, which is always one probe as opposed to multiple probes with > an index when the database gets large. If you are using hidden > pointer chains like SQL Anywhere to implement PK-FK, there is no > difference. SQL Server does have a lot of "code museum" problems and > this is one of them; one type of simple B+ Tree index is used for all > data types and distributions. However, until you get to large DBs, it > works fine and it will get a shot in the arm from 64-bit hardware, > too.
>>> The interval is not an attribute of the server, but rather an attribute >>> of the update type. In my case, there are seven different minute >>> intervals. So, in the end, I need to know - for each update type - which >>> server did not receive it within the last [interval] (in terms of >>> minutes). The interval is configurable by the operations group - >>> sometimes they may want to be alerted when a server didn't receive an >>> update in the last 3 minutes, sometimes in the last 10 minutes. They can >>> change this real-time in the database. <<
> A type is an attribute by definition, so what entity does the > update_type belong to, if not a server? I would think from this > description, you would have seven columns for the logically different > updates, and seven columns for their corresponding intervals. Or is > this a repeating group where not all update types apply to all servers > and the update types can be changed?
>>> I also have every one of your books on my shelf. <<
> Neat! Number 7 is out in 2008 February.
>>> .. I'll bet - despite all the companies you have worked with - you >>> can't name one that had a data model that gave you a boner. <<
> LOL! Remember what I do for a living -- this is like asking a doctor > why all his patients are sick!
> The best one I can remember just off hand was for a credit management > company. They had just designed it and wanted a two day review from > me. The only real problem I found was that a bunch of the columns > were stubbed in with a magical CHAR(1) NOT NULL data type; they were > still working on the encodings and waiting for advise from legal.
> Another one was a software company with a portal product that manages > corporate software access. We replaced ~60 tables with a nested set > model that lead to ~6 tables for the core processes. In fairness, the > original model had grown over time from a denormalized model on a > small platform to a mainframe tool. People kept adding tables to it > as a work-around and it had become a jungle.
On Sun, 2 Dec 2007 12:18:29 -0600, Karch wrote: >In the real world, everything is not a beautifully static picture of >relational data. I didn't include the key because I don't think its relevant >to the solution. But since you brought it up: is it better to have a natural >key on a varchar or an identity key as an integer that doesn't really have >any intrinsic business meaning
Hi Karch,
In spite of everything Joe Celko says, the best solution is often to have both. You use the natural key (I prefer the term real key, BTW) in the table for the entity, and add a surrogate key to use in other tables referencing the entity.
CREATE TABLE Servers (ServerID int NOT NULL IDENTITY, ServerName varchar(25) NOT NULL, -- other columns, PRIMARY KEY (ServerID), UNIQUE (ServerName) );
CREATE TABLE SomethingThatRefersToServers (SomethingThatRefersToServers int NOT NULL IDENTITY, -- Column(s) for the real key go here ServerID int NOT NULL, PRIMARY KEY (SomethingThatRefersToServers), -- UNIQUE constraint for real key, FOREIGN KEY (ServerID) REFERENCES Servers (ServerID) ON DELETE NO ACTION );
Don't forget to add the unique constraints on the real key column(s). And don't forget to hide the surrogate key values from end users. Use a view instead:
CREATE VIEW ViewForSomethingThatRefersToServers AS SELECT r.RealKeyColumns, s.ServerName FROM SomethingThatRefersToServers AS r INNER JOIN Servers AS s ON s.ServerID = r.ServerID;
>? In theory, it should be the varchar, but in practice its more performant >to index an int rather than a varchar(25). Is that not true?
With the above design, you have the performance of indexing an integer column, and the integrity assurances of using a real key.
On Sun, 2 Dec 2007 23:48:06 -0600, Karch wrote: >Hmmm...I would be interested in seeing that. How do you take 60 tables to 6 >and what exactly is a "nested set" model? Are you talking about the query >structure or the data storage?
Hi Karch,
Since you already own all books by Joe Celko, you can find the answer in his Trees and Hierarchies book.
>>Hmmm...I would be interested in seeing that. How do you take 60 tables to >>6 >>and what exactly is a "nested set" model? Are you talking about the query >>structure or the data storage?
> Hi Karch,
> Since you already own all books by Joe Celko, you can find the answer in > his Trees and Hierarchies book.
Elegant solution, but the fact is: I can't add views everytime I want to do something relatively simple as this. It just makes finding and deciphering things to complicated for developers. What I ended up doing is maintaining two static data tables: one for the servers and one for the intervals. Then the solution was easy - using a CASE statement, obtain the configuration value - @Window - (in minutes) and do this:
SET @Time = ( SELECT DATEADD(minute, @Window, GETUTCDATE()) )
UPDATE @Server SET HasUpdate = 1 WHERE EXISTS ( SELECT 1 FROM @Server N, Deliveries D ( NOLOCK ) WHERE N.ServerAddress = D.ServerAddress AND UpdateTypeID = @UpdateTypeID AND UpdateStamp > @Time ) ;
SELECT ServerAddress FROM @Server WHERE HasUpdate = 0 ORDER BY ServerAddress
How does that look to you (contained in a stored procedure)?
"Hugo Kornelis" <h...@perFact.REMOVETHIS.info.INVALID> wrote in message
>>In the real world, everything is not a beautifully static picture of >>relational data. I didn't include the key because I don't think its >>relevant >>to the solution. But since you brought it up: is it better to have a >>natural >>key on a varchar or an identity key as an integer that doesn't really have >>any intrinsic business meaning
> Hi Karch,
> In spite of everything Joe Celko says, the best solution is often to > have both. You use the natural key (I prefer the term real key, BTW) in > the table for the entity, and add a surrogate key to use in other tables > referencing the entity.
> CREATE TABLE Servers > (ServerID int NOT NULL IDENTITY, > ServerName varchar(25) NOT NULL, > -- other columns, > PRIMARY KEY (ServerID), > UNIQUE (ServerName) > );
> CREATE TABLE SomethingThatRefersToServers > (SomethingThatRefersToServers int NOT NULL IDENTITY, > -- Column(s) for the real key go here > ServerID int NOT NULL, > PRIMARY KEY (SomethingThatRefersToServers), > -- UNIQUE constraint for real key, > FOREIGN KEY (ServerID) REFERENCES Servers (ServerID) > ON DELETE NO ACTION > );
> Don't forget to add the unique constraints on the real key column(s). > And don't forget to hide the surrogate key values from end users. Use a > view instead:
> CREATE VIEW ViewForSomethingThatRefersToServers > AS SELECT r.RealKeyColumns, s.ServerName > FROM SomethingThatRefersToServers AS r > INNER JOIN Servers AS s > ON s.ServerID = r.ServerID;
>>? In theory, it should be the varchar, but in practice its more performant >>to index an int rather than a varchar(25). Is that not true?
> With the above design, you have the performance of indexing an integer > column, and the integrity assurances of using a real key.
Well, I don't mind his replies actually, arrogant they may be. I think its important to understand the fundamentals before taking advantage of the syntax, abbreviated keywords and tools provided today. I would rather know the rules I am breaking.
"Tony Rogerson" <tonyroger...@torver.net> wrote in message
>> So I am not the only one who thinks you are a JERK. You must think you >> are better than everyone else here because you are arrogant.
> He's too arrogant to realise the profession has completely lost respect > and interest in what he posts.
> Most of his drivel is wrong and even when we are lucky enough for it to > work syntactically it's of little practical use in the environment of > today.
> He got stuck in the 70/80's mind set and is too lazy and arrogant to move > on.