<$BlogRSDUrl$>

Friday, September 24, 2004

A throwaway comment on Oracle-L to the effect that the CBO will make better choices if you use an artificial value to indicate an unknown value (in this case a current date) led me to knock together this demonstration that this is in general probably not true.The test goes as followscreate a table with a pk and 4 columns - 2 of type date and one each of numeric and character.Then populate these columns with 20000 values that represent date data for (approx) the last 5 years. In the case of the character and numeric columns we use 'dates' of the format YYYYMMDD so that they sort properly.

then I have updated the last 30 days worth of history for records that have not yet had a date assigned (think payment date for invoices).the first column is updated to 31st December 9999 (obviously artificial), the second to NULL the last two to ' ' and 0 respectively. I also attempted to simulate data error/disputed payment or whatever by incorrectly updating 20% of the remaining data to the same value. Finally like a good dba I collect stats.I then explain two queries against the data. The first just tries to select those records with 'unassigned' dates - current records. This should return 1279 records out of 20000. The second query covers (and admittedly somewhat artificial) range and should return 104 records. This gives me the following results for the expected cardinalities of the queries

Query Actual Cardinality Artificial Date Date Artificial Char Artificial Number
Current 1279 11 1179 11 11
Access Path IDX FTS IDX IDX
Range Scan 104 21 124 11 20
IDX IDX IDX IDX
Current - Histograms 1279 1404 1331 1248 1248
FTS FTS FTS FTS
Range Scan - Histograms 104 103 42 195 195
IDX IDX IDX IDX

There are also two things to consider. The first is how does the query under test perform, but the second is how would the figures computed above affect the optimizer if the results from this query were to feed another operation in a more complex query plan. There are a number of interesting things that one can draw from this, but I'll limit myself to the following suggestions.

  • In the absence of histograms only using the correct combination of datatype and null gives the correct access path for query one or anything like the correct cardinality. Using the other 'strategies' would likely result in inappropriate hash (or less likely sort-merge) joins due to the artifically low cardinalities.
  • Histograms make life a lot easier (with the curious exception of range scans when the correct datatype and null value was used). but of course maintaining histograms is not a free operation.
  • the full results from my 10g system are here. If you were wondering about the last 2 columns, yes I have seen - indeed use - a third party system that uses characters and numbers to represent dates.


    1 Comments

    Tuesday, September 14, 2004

    HTMLDB

    If not exactly a best-kept secret, certainly one of the coolest new features of the 10g release is htmldb. This is a declarative programming environment for producing database driven web applications. Perhaps the most compelling example of it in use is Ask Tom which is the Q&A web site run by Tom Kyte of Oracle.

    We had a requirement in house just recently to produce a small app that would be used by appropriately authorised systems administrators to populate a couple of tables with small amounts of lookup data for a new system.

    Using HTMLDB we produced an app that

    1. Authenticates the user from the Enterprise wide LDAP compliant directory.
    2. Provides a consistent look and feel using CSS.
    3. Uses bind variables throughout for scalability.
    4. Could be hosted in an enterprise portal if necessary.

    In less than 2 days development time by a DBA (not a developer).

    Other fetures that htmldb makes available almost on a pick and choose basis are

    1. Full text searching.
    2. SSO Integration.
    3. Workflow.
    4. Email Notifications.
    5. File Upload/Download to database BLOB columns.
    6. SVG Charting.

    Finally, though the product is available on the 10g Companion CD (available FOC from OTN). It is supported (and works just fine) from 9.2.0.3 onwards - so you don't need to migrate a production database to take advantage of it.

    0 Comments

    Monday, September 13, 2004

    Quick admin note. I have enable comments here - unfortunately although I get a notification of them, I haven't got the code right to display them. This will be worked on this week.

    0 Comments