StrataFrame Forum

Optimizing a VFP Linked Server Query

http://forum.strataframe.net/Topic29646.aspx

By William Fields - 3/6/2011

Hello,

This is not a SF question, but I'm hoping someone may have run into the same issue...

I have a VFP linked server in SQL2008 (the databases/tables are all on the same workstation)and am trying to optimize a query that filters on a datetime field. The VFP table has an index on this field, but I'm not certain how to speed up my query.

For example:

SELECT *
FROM MyVFPLinkedServer...MyTable
WHERE
MyDateTimeField >= '03/01/2011 00:00:00' AND
MyDateTimeField <= '03/31/2011 23:59:59'

This returns 75 rows and takes 30 seconds to execute.

Filtering on an integer field in the same table is very fast, so I guess I'm just not formatting the WHERE clause properly to take advantage of the DATETIME field index.


Can anyone suggest how I can speed up the linked server query with DATETIME values?

Thanks.
By Greg McGuffey - 4/5/2011

I was poking around and decided to investigate this. Turns out William got an answer over on a msdn forum.

The answer is to use OpenQuery, which passes the query to VFP, which can optimize the query. Apparently SQL Server can't do that on datetime fields with a linked VFP server.  Here is the new code from the post:

select * from openquery(VCAL, 'select * FROM csHearin WHERE he_tdthearing = {06-02-1992 02:30 PM}')
By William Fields - 4/5/2011

Hi Greg,

This solution is not ideal, but yes, it works. I would much prefer finding the right view definition that would work properly with VFP DATETIME parameters (if that indeed exists).

The problem is that it precludes me from using an SQL Server View based on the linked table. The OPENQUERY method complicates passing parameters, and seems to be problematic for chained linked servers.

Thanks.
By Greg McGuffey - 4/5/2011

Thanks for the additional info. I don't have any solutions at this time though Sad
By Ger Cannoll - 4/5/2011

Hi William.

I have seen a few references to 'Linked Servers' here and wonder what they are and in what circumstances you would use them
By William Fields - 4/5/2011

Hello Gerard,

Here is a link to a design schematic I'd like to develop:

http://forum.strataframe.net/Attachment1611.aspx

Here is a link to the dicussion:

http://forum.strataframe.net/Topic13841.aspx

Basically, it would be nice if I could treat a VFP database on each side of a WAN connection as a Linked Server, giving my StrataFrame application access to the VFP data on each side simultaneously.

I've run into problems attempting to "chain" linked servers (as depicted in the schematic) and the only solution I've been able to arrive at so far is to have multiple SQL connections in my SF app. One for each SQL server.
By Ger Cannoll - 4/5/2011

Hi William.

I ahve a problem clicking on your link. When I click, all I get up is a window with all the threads again. (I think I have got this with other links here as well). Should I be able to click on it and see an image ?
By Ger Cannoll - 4/5/2011

Tried it again now...and the image comes up ok.

I see in the diagram that you have VFP databases 'LINKED' to Sql database.

Does this just mean your application us using some data from Sql and some data from VFP, or is there something else 'magical' going on
By William Fields - 4/5/2011

Yes, the idea is to have the StrataFrame app access the VFP data via SQL Linked Servers. The trick is accessing both the Phoenix and Tucson VFP databases at the same time.

Because of the DATETIME parameter issue talked about earlier in this thread, I'm unable to use a SQL View and have to resort to OPENQUERY. If I were able to design a SQL Server View optimized for a VFP DATETIME parameter, I believe I'd be able to make the design of my schema work.

But, the only solution I've been able to come up with so far is to have two database connections in the SF app. Which looks like it will work, but I end up having to copy data from one business object to another to get a single dataset with data from both Phoenix and Tucson.
By Michel Levy - 4/8/2011

Hi William,

I'm unable to get such results with a VFP linked server...

I've create a VFP table (ID int autoinc, ColDateTime Datetime), populate it with more than 10000 records where coldatetime value is randomized on insert. A single index (primary key) on ID. No index on ColDateTime.

Querying that table from within SSMS as :

SELECT
     [id],
     [coldatetime]
FROM [VFP_Northwind]...[testdates]
      where
        coldatetime >='20090101 01:01:01'
      and
        coldatetime <='20481231 23:59:59'

GO

requires less than 1 second to get a resultset of 44 lines between the 10142 (including VFP treatment, SQL treatment, and SSMS display)

It's a little slower with a filter as
        coldatetime >='01/01/2009 01:01:01'

No difference with and without index on coldatetime

I'ld suspect a pb of pagecode in VFP

(BTW, we presented the first alpha version of VFPinSQL at our last "Rencontres AtoutFox", are you still interested with it?)

By William Fields - 4/9/2011

Hello Michel!

> VFPinSQL

Yes, definitely! I imagine you had a lot of interest.

> suspect a pb of pagecode in VFP

CPDBF() says the table cp is 1252.

Attached in the zip file are two execution plans, one that has an integer field in the where clause, the other with a datetime field. As you can see, the integer query is passed on to the VFP engine and returns a single row. The datetime query brings the entire table into SQL which is then scanned.

Would you mind sending the execution plan for your query?

Thanks.
By Michel Levy - 4/9/2011

Hi William,

What are the statistics on this query? how many times did you run it?

Would you please send me (Private message if needed) your data (the dbf and its cdx) ?

For VfpInSql, we still have not found any workaround for varchar in dbf (unable to map it in SQL) but other fields run fine, for insert, update, delete (even with where type in the same way as in fox!). i need to check if all is ok for memo and blob.
By William Fields - 4/9/2011

I'm not sure how to look up the statistics, but I've run the query numerous (dozens of) times, all with about the same time to return the results (about 30 seconds).

Attached is a copy of the table in a new DBC. I created a new linked server in SSMS and ran this query with the same results:

SELECT *
  FROM [MYDBC]...[mytable]
  where he_tdthearing >= '20110301 00:00:00' and
  he_tdthearing <= '20110331 23:59:59'
GO

The new table/DBC query performed the same as the original.

I ran into the same VARCHAR issue with VFP Linked Server. We only use VARCHAR in 3 fields in our existing database, so we decided to go ahead and change them to CHAR fields since there would be no real impact.

I don't have a write requirement yet, but I did read somewhere that VFP linked servers do not support updates. I searched for the MS Knowledgebase article, but could not find it again. Is this true?

Thanks.
By Michel Levy - 4/9/2011

In a query in SSMS, you'll find the button for statistics near the "Real execution plan" button (see attachments)

I'll have a look to your files on Monday.

And for VFP linked servers, tou're right, they don't support update... and that's why we wrote VFPinSQL, which is a VFP linked server in SQL, that support update, insert, delete, in full respect with referential integrity triggers in VFP database.
By William Fields - 4/9/2011

Attached are the Statistics.

Thanks.

Bill
By Michel Levy - 4/13/2011

Hi,

I wanted to understand why you had such bad results with your VFP Linked server, so I've done a benchmark test with your table, and with other tables based on yours, but with some variations in the structure (see in MyDBC.prg.txt attached).

I got some expected results, and other unexpected!

first unexpected result was the tiny difference provided by the lack (or the presence) of the index, or the number or complexity of the tags in the cdx (you'll see it in the excel sheets named MyTable SimpleCdx and MyTableCdxNoDate).

Expected result was the benefit of a better choice of data type for numeric fields. Tou have 3 fields with N(x,0), and I've altered these fields as Integer. I was expecting an improvement, but not as important as what I found! Look at the sheet named "MyTableInt", it's amazing.

And the most unexpected result was on the datetime field. I created a new field of C(14), and update it with TTOC(he_tdthearing,1), and then delete the column "he_tdthearing". Wow! Read the results in the sheets named MyTableIdTTOC, MyTableTTOC.

So finally, the global result is in the comparison between MyTable and MyTableIntTTOC...

The complete benchmark is in the xlsx file in the attached zip.

By William Fields - 4/16/2011

Hello Michel,

I've reviewed the results. Thanks for taking the time to look at this with me.

> tiny difference provided by the lack (or the presence) of the index

Do you think this is because there was no WHERE clause in the SELECT statements?

> benefit of a better choice of data type for numeric fields

Yes, this makes sense (in my mind anyway). Integer fields only use 4 bytes of storage, whereas Numeric fields use 8 bytes in memory and 1 to 20 bytes in the table. 

Since the query is returning all rows, I would expect that it would finish faster if there were less total bytes to be returned. Is this the right way to look at it?

> comparison between MyTable and MyTableIntTTOC

That IS unexpected. From my perspective, the TTOC field should have had a negative impact because of the larger field size in bytes.

However, what I find more amazing is the difference between MyTableIntTTOC and MyTableInt. Going by the "it takes more time to return more bytes" theory, this result should not be.

I did the same tests between N(X,x), Int and TTOC and came up with result similar to yours:

MyTable - 75 seconds

MyTableInt - 50 seconds

MyTableIntTTOC - 41 seconds

So perhaps the VFP DateTime data type is not as efficient to retrieve via OLE DB, and/or takes more time to convert into an equivalent SQL Server datatype..?

And while the results are interesting, I am not sure how this directly applies to my original issue...

Did you get similar results as I when querying with a WHERE on the he_tDtHearing field?

In my scenario, I will be using a WHERE clause in each query. For Integer and Numeric fields, the indexes are used and the results are returned as fast as I would expect. But, for DateTime fields, I cannot find a way to optimize the WHERE clause so that an index is used.

If I were able to figure this out, I would be able to set up a view in SQL Server on the linked VFP table, then run queries on it with normal SQL syntax with parameters. Without this, I'm forced to use OPENQUERY - which works, but will be more effort to assemble each query.

Thanks.
By Michel Levy - 4/16/2011

Hi William,
  • about the integers:

    The difference of length in memory is unsufficent to explain the difference of performance.
    More than the length of the storage in memory, I believe that there is an automatic cast (made by VFPOLEDB or by SQL Server, I don't know). When the VFP field is N(x,0), it seems that it is cast in Integer, and consequently, you benefit of a good initial data typing, not having to cast.
    I say "I believe...", I'm not sure nor I have any proof!
  • about TTOC:

    here again, we probably have an implicit cast, because we give the date value to SQL server as a string. When we query, the internal parser transforms the string in a datetime in the best way for a SQL native table. Does SQL transmit the query to VFPOLEDB before parsing (in the original string format)? It would explain why we get the best results with a TTOC datetime in VFP field, because no implicit cast would be required. But once more, it's only an assumption.
Now I need to test with a WHERE clause.