Saturday, February 25, 2012

Rounding and Grouping

Perhaps someone can settle an arguement for me ?

I have a set of data that I need to group together. SQL Script below.

CREATE TABLE [dbo].[CommTransactions] (
[ID] [id_type] NOT NULL ,
[TransactionID] [id_type] NULL ,
[ClientID] [id_type] NULL ,
[AccountCode] [varchar] (10) NULL ,
[Amount] [float] NULL ,
[CreateDateTime] [datetime] NULL

For the records I want to group the following applies.

The ID is unique and distinct.
The TransactionId is the same.
The ClientId is the same.
The AccountCode is different.
The Amount will be the same.
The CreateDateTime field is different by a few milliseconds.

I want to create a single line showing two account codes in different
fields. i.e. Staff and Manager (where their ID is the account code).

These can be entered in any order in the table mentioned.

The problem I have is I need to link two records together (that's the
problem in it's most simplistic terms). However, there may be
additional records with the same TransactionId, ClientId, AccountCode
and Amount, but happened at a slightly different time. It could be
done on the same day.

Now, the arguement is that we can group using the CreateDateTime
field. I argue that we can't as it will show down to the millisecond
and any rounding will not always allow for a match. If we added the
matching records once per day, then I can extract the date and group
on it, but if more than one group is added per day, then this would
cause the logic to fail.

So, are there any reliable methods for grouping date/time fields
reliably if there is a small difference (I suspect not)?

Is there anything I have missed ?

Any help or suggestions would be appreciated.

Thanks

RyanRyan,
Forgive me if I am not understanding the question correctly.
But I think the answer is that you don't have to group on a field; you
can group on an expression in most cases.
In this case, you can probably group by
convert(varchar,CreateDateTime,101), which is the date portion of
CreateDateTime.
I hate to suggest this because the performance will probably be terrible
unless your WHERE clause if very specific, but it may be the quick fix you
are looking for.

Best regards,
Chuck Conover
www.TechnicalVideos.net

"Ryan" <ryanofford@.hotmail.com> wrote in message
news:7802b79d.0402020100.41141655@.posting.google.c om...
> Perhaps someone can settle an arguement for me ?
> I have a set of data that I need to group together. SQL Script below.
> CREATE TABLE [dbo].[CommTransactions] (
> [ID] [id_type] NOT NULL ,
> [TransactionID] [id_type] NULL ,
> [ClientID] [id_type] NULL ,
> [AccountCode] [varchar] (10) NULL ,
> [Amount] [float] NULL ,
> [CreateDateTime] [datetime] NULL
> For the records I want to group the following applies.
> The ID is unique and distinct.
> The TransactionId is the same.
> The ClientId is the same.
> The AccountCode is different.
> The Amount will be the same.
> The CreateDateTime field is different by a few milliseconds.
> I want to create a single line showing two account codes in different
> fields. i.e. Staff and Manager (where their ID is the account code).
> These can be entered in any order in the table mentioned.
> The problem I have is I need to link two records together (that's the
> problem in it's most simplistic terms). However, there may be
> additional records with the same TransactionId, ClientId, AccountCode
> and Amount, but happened at a slightly different time. It could be
> done on the same day.
> Now, the arguement is that we can group using the CreateDateTime
> field. I argue that we can't as it will show down to the millisecond
> and any rounding will not always allow for a match. If we added the
> matching records once per day, then I can extract the date and group
> on it, but if more than one group is added per day, then this would
> cause the logic to fail.
> So, are there any reliable methods for grouping date/time fields
> reliably if there is a small difference (I suspect not)?
> Is there anything I have missed ?
> Any help or suggestions would be appreciated.
> Thanks
> Ryan|||Ryan (ryanofford@.hotmail.com) writes:
> Now, the arguement is that we can group using the CreateDateTime
> field. I argue that we can't as it will show down to the millisecond
> and any rounding will not always allow for a match. If we added the
> matching records once per day, then I can extract the date and group
> on it, but if more than one group is added per day, then this would
> cause the logic to fail.
> So, are there any reliable methods for grouping date/time fields
> reliably if there is a small difference (I suspect not)?

I'm not sure that I follow, but it sounds to me more like a business
problem.

You can group by the hour for instance:

SELECT yadadada, d, COUNT(*)
FROM (SELECT yadayada,
d = convert(char(8), CreateDateTime, 112) +
convert(char(5), CreateDateTime, 108)
FROM ...) AS a
GROUP BY yadayada, d

Of course, is a group is inserted so that some rows are inserted before
one o'clock, and others after you lose. Likewise, if two groups are
inserted the same hour.

A more complicated scheme may be devised where you compute the time
between two inserted rows, and if the difference is > some value,
those are two groups.

But you probably get a lot more robust application, by introducing a
marker which is unique for every batch you insert. This could still
be a datetime value, you just need to make sure that all rows in the
same batch gets the the same value.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Yep, pretty much as I suspected. Unfortunately this table is one
supplied by another company so I can't change it as easily as I want
without affecting their app. Our users expectation differs from what
this package does hence the problem.

I want the other company to change this slightly and there will be a
cost (fair enough), only problem is our company doesn't want to pay
for it. So, I'm trying to provide them with everything to prove they
either pay for the change or accept it won't work. They would rather
my team spend several days (at God knows what cost) examining
something I know won't work instead of paying for a days worth of
development.

Daft.

As you have guessed, I'm trying to steer them down the route of a
marker that I can group on.

Thanks for the help.

Ryan

Erland Sommarskog <sommar@.algonet.se> wrote in message news:<Xns94843E9F3EA3Yazorman@.127.0.0.1>...
> Ryan (ryanofford@.hotmail.com) writes:
> > Now, the arguement is that we can group using the CreateDateTime
> > field. I argue that we can't as it will show down to the millisecond
> > and any rounding will not always allow for a match. If we added the
> > matching records once per day, then I can extract the date and group
> > on it, but if more than one group is added per day, then this would
> > cause the logic to fail.
> > So, are there any reliable methods for grouping date/time fields
> > reliably if there is a small difference (I suspect not)?
> I'm not sure that I follow, but it sounds to me more like a business
> problem.
> You can group by the hour for instance:
> SELECT yadadada, d, COUNT(*)
> FROM (SELECT yadayada,
> d = convert(char(8), CreateDateTime, 112) +
> convert(char(5), CreateDateTime, 108)
> FROM ...) AS a
> GROUP BY yadayada, d
> Of course, is a group is inserted so that some rows are inserted before
> one o'clock, and others after you lose. Likewise, if two groups are
> inserted the same hour.
> A more complicated scheme may be devised where you compute the time
> between two inserted rows, and if the difference is > some value,
> those are two groups.
> But you probably get a lot more robust application, by introducing a
> marker which is unique for every batch you insert. This could still
> be a datetime value, you just need to make sure that all rows in the
> same batch gets the the same value.|||I have another thought that is worth a go. A slightly unusual approach
I must admit, but I think it may work.

I can establish the initial line that I want and take the
CreateDateTime from that. If I then add 1 minute to give me a start
time. Then subtract 1 minute to give me an end time, I can create a
table which holds the various ID fields, the accountcode I need and
the start and end times of a group.

I then use another query to pull out the second accountcode I want and
use a left join to the table I created previously, joining where the
createdatetime is between the start and end date. I add the
accountcode from the first table as a new field on the end of the
results of this query.

It means that the system will have a 2 minute window to commit the
transactions. Normally this is a few seconds, but I can adjust my
window.

I'll have to do some work checking where this can fail though, but
it's worth a little time doing this.

Feel free to pull this apart so I can check how well it will work.

No comments:

Post a Comment