Handling Edits The SF Way


Author
Message
Peter Jones
Peter Jones
Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)Advanced StrataFrame User (718 reputation)
Group: Forum Members
Posts: 386, Visits: 2.1K
Hi Bill,

We keep things effectively forever, at one plant the main widget table has 12 million rows. We also manage point in time inventory and recalculate what was the overall position at any point in time, e.g. the user can say 'what happened in December 2006' and get, what we call the Open/Close Inventory Report, which gives the opening inventory at a selected date, plus all the production, shipping, regrading and adjustments activity during the period and closing inventory (this is at the inventory item level). This report run 'today' would give exactly the same answer as the same report run on 31st Dec 2006.

I believe the key to performance in this kind of (SQL Server anyway) database is the clustered index we have on the widget production date and other strategic indexes but I believe the clusted date index is the most important for us as most reporting is selectable on a date range. For instance I've just VPNed to the plant with 12 million rows and a got a daily production summary for yesterday - 4000 widgets in a two page pivot table format in about 4 seconds from clicking go to having the report rendered - I ran the same report for 26 Feb 2006 and it rendered in about the same time. So that's a trip to the database that returned 4000 physical rows which had to be compiled into a pivot table report by Crystal Reports.

Cheers, Peter

Bill Cunnien
Bill Cunnien
StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)
Group: Forum Members
Posts: 785, Visits: 3.6K
That is excellent performance!  I must blow the dust off of my book, "Fun With Indexing, Vol. 3".  w00t  At this point, 12 million rows is way out of my league...that does give me a nice benchmark, though.  Thanks! Smile
Trent Taylor
Trent Taylor
StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)
Group: StrataFrame Developers
Posts: 6.6K, Visits: 7K
Wow....lots of good stuff on this post.  My point the other day, Bill, was that many times as developers we create fields that we continually have to update when this should be getting pulled from a stored procedure, view, or user-defined function (which the DDT now supports in the final 1.6.5 update...just FYI Smile).  Inventory can be a tricky thing at times, and the more stress that you place at the server level and fewer trips to and from the server you make, the faster and better things are going to be.
Bill Cunnien
Bill Cunnien
StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)
Group: Forum Members
Posts: 785, Visits: 3.6K
Help!

I have a query that results in a total row count of ~980.  It returns the total production for each active part by a certain date.  The production table consists of about 96000 records.  I created a UDF that does the totalling of the production table.  It is a simple query...at least, I thought so.  It takes way too long (~3 min) to run a query like this:

SELECT
 partnum, 
 dbo.fn_GetTotalProduction(partnum, '3/1/2008') AS produced
FROM
 PartsMaster
WHERE
 divisionindex = 1
 AND inactive = 0
ORDER BY
 partnum

The function look like this:

CREATE FUNCTION fn_GetTotalProduction(@partnum varchar(30), @enddate datetime) 
RETURNS float
AS 
BEGIN
 RETURN
  (
  SELECT
   ISNULL(SUM([Units Produced]),0)
  FROM
   PartsMaster Parts
   LEFT OUTER JOIN [4D_Romulus].dbo.[ORDER DATA] Orders ON Orders.[PartNum] = Parts.partnum
   LEFT OUTER JOIN [4D_Romulus].dbo.[PLANT SURVEY] Production ON Production.[Cust Job Num] = Orders.[Cust Job Num]
  WHERE
   Parts.partnum = @partnum
   AND Production.[Date] <= @enddate
  Wink
END

If I run the query without the UDF, it takes less than 3 seconds.  In order to get the production total, shipping total and adjustment total for each active part, I would like to use UDFs; however, it looks like there is a huge overhead in using UDFs.

What am I doing wrong?  I am trying to get my inventory setup like you all outlined above.  Any tip here would be helpful.

Thanks,
Bill

Bill Cunnien
Bill Cunnien
StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)
Group: Forum Members
Posts: 785, Visits: 3.6K
I got the 3 minutes down to 25 seconds.  Tolerable, but it could be much better.  The change that I made was in the UDF:

CREATE FUNCTION fn_GetTotalProduction(@partnum varchar(30), @enddate datetime) 
RETURNS float
AS 
BEGIN
 RETURN
  (
  SELECT
   ISNULL(SUM([Units Produced]),0)
  FROM
   PartsMaster Parts
   LEFT OUTER JOIN [4D_Romulus].dbo.[ORDER DATA] Orders ON Orders.[PartNum] = Parts.partnum
   LEFT OUTER JOIN [4D_Romulus].dbo.[PLANT SURVEY] Production ON Production.[Cust Job Num] = Orders.[Cust Job Num] AND Production.[Date] <= @enddate
  WHERE
   Parts.partnum = @partnum
  Wink
END

I moved the date limitation into the join rather than the where clause.  Can I still do better than 25 seconds?

Bill

Trent Taylor
Trent Taylor
StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)
Group: StrataFrame Developers
Posts: 6.6K, Visits: 7K
moved the date limitation into the join rather than the where clause.  Can I still do better than 25 seconds?

Yeah... you can get it much snappier that 25 seconds.  This is not an answer that I can just plop out here for you, but I can give you some suggestions to work towards a resolution.  First, SQL Server Management Studio has a feature that shows you the execution plan.  You can have SQL Server show you how it plans to execute the query:

Secondly, SQL Server doesn't like anything negative (i.e. NOT statements).  And if I recollect, a LEFT OUTER JOIN is somewhat negative.  Depending on what you are trying to accomplish, you would be better of doing a LEFT JOIN and then using COALESCE to make sure no NULL values end up in your results...or doing more than one query versus trying to get it all into one statement...thus the beauty of a UDF.

Third, dates can be picky as well.  When you have a date that uses a < or > operator, the query has to sort itself before testing (I am pretty sure...I know that a BETWEEN or doing a <= and >= on the same date does).  So this can be a performance issue as well.  So in this case, you can create a UDF that just does this test for you on a passed parameter so that it isn't part of the query and performance can increase as well.

Finally, SQL Server comes with some statistics command that can help.  For example, you can turn on the following:

SET STATISTICS PROFILE ON - This will spit out the actual execution path that was used when the query takes place.  You will use this within the Management Studio.

SET STATISTICS TIME ON - This will show you exactly how long each portion of the execution took in the Messages window when a query is performed. 

On both of these commands you will have to turn it on for each connection that you have (i.e. if you have more than one query window open).

But that is basically how we work through a speed and performance issue when dealing with large datasets and complicated queries.

Trent Taylor
Trent Taylor
StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)
Group: StrataFrame Developers
Posts: 6.6K, Visits: 7K
After looking a few things up, the LEFT OUTER JOIN isn't necessarily considered negative and many times a LEFT JOIN is just flat out converted into a LEFT OUTER JOIN.  But playing with your configurations is always a good idea when you are dealing with performance issues.
Bill Cunnien
Bill Cunnien
StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)
Group: Forum Members
Posts: 785, Visits: 3.6K
. . . you can create a UDF that just does this test for you on a passed parameter so that it isn't part of the query and performance can increase as well . . .

Thanks for the tips, Trent.  Can you explain how you handle the date constraint?  Not sure what you mean by your statement above.  Sorry for my density.

Bill

Trent Taylor
Trent Taylor
StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)StrataFrame Developer (14K reputation)
Group: StrataFrame Developers
Posts: 6.6K, Visits: 7K
It is just a matter of you create a UDF that doesn't perform a query, it would just test on the date and return a true or false.  This way the sorting of the date is not part of the query...your UDF is just called and determines if the record will be included or not. 
Bill Cunnien
Bill Cunnien
StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)StrataFrame VIP (1.2K reputation)
Group: Forum Members
Posts: 785, Visits: 3.6K
Couple of things...first, the checking of the date via UDF doubled the time involved in the query (I am likely not doing it right).  Second, I do not see a way to view the stats generated on the actual UDF used in my query.  Perhaps there is a bottleneck in there, but I am not seeing it.

I will keep tweaking my UDF.  Somehow, this thing should generate results in under 3 seconds.  I can do it manually; however, I need to do this with UDFs.

Thanks!
Bill

GO

Merge Selected

Merge into selected topic...



Merge into merge target...



Merge into a specific topic ID...




Similar Topics

Reading This Topic

Login

Explore
Messages
Mentions
Search