Skip to main content

Joe Celko's SQL for Smarties

Advanced SQL Programming Second Edition

  • 1st Edition - October 11, 1999
  • Latest edition
  • Author: Joe Celko
  • Language: English

SQL for Smarties was hailed as the first book devoted explicitly to the advanced techniques you need to transform yourself into an expert SQL programmer. Now, in this fully upd… Read more

Purchase options

Sorry, this title is not available for purchase in your country/region.

World Book Day celebration

Where learning shapes lives

Up to 25% off trusted resources that support research, study, and discovery.

Description

SQL for Smarties

was hailed as the first book devoted explicitly to the advanced techniques you need to transform yourself into an expert SQL programmer. Now, in this fully updated second edition, SQL mastermind Joe Celko keeps you moving forward, using his entertaining, conversational style to teach you the best solutions to old and new challenges and to convey the way you need to think if you really want to get the most out of your SQL programming efforts.

Inside, logic- and set-based analyses replace the traditional, procedural approach to problem-solving, helping you make the conceptual leap that separates an SQL guru from the rest of the pack. As you catch on to Celko's approach, you'll devour what he has to say about some of SQL's toughest topics: how aggregate functions really work, the best way to work with NULLs, how and why to fake array structures, and much more.

This book gives special emphasis to SQL-92 and product-independent techniques that let you optimize performance or achieve highly specialized behavior, regardless of the RDBMS with which you work. If you're serious about SQL, you won't let SQL for Smarties out of your sight.

Key features

* Presents all-new war stories that give you insight into real-world SQL programming challenges.
* Continues to cover SQL-89 but focuses heavily on the SQL-92 standard.
* Offers still more undocumented tips for working around system deficiencies.
* Teaches scores of advanced techniques that can be used with any product, in any SQL environment.
* Offers expert advice from a noted SQL authority and award-winning columnist.

Table of contents

Preface






Introduction

The Nature of the SQL Language

Programming Tips






1 Database Design





1.1 Schema and Table Creation


1.1.1 Schemas




1.1.2 Manipulating Tables

DROP TABLE

ALTER TABLE

CREATE TABLE






1.1.3 Column Definitions

DEFAULT Clause

Column Constraints

NOT NULL Constraint

CHECK( ) Constraint

UNIQUE and PRIMARY KEY Constraints

REFERENCES Clause




1.1.4 A Remark on Duplicate Rows




1.1.5 A Remark on UNIQUE Constraints versus

Unique Indexes






1.2 Generating Sequential Numbers




2 Normalization



2.1 Functional and Multivalued Dependencies




2.2 First Normal Form (1NF)


2.2.1 Note on Repeated Groups




2.3 Second Normal Form (2NF)


2.4 Third Normal Form (3NF)


2.5 Case Tools for Normalization


2.6 Boyce-Codd Normal Form (BCNF)


2.7 Fourth Normal Form (4NF)


2.8 Fifth Normal Form (5NF)


2.9 Domain-Key Normal Form (DKNF)


2.10 Practical Hints for Normalization


2.11 Practical Hints for Denormalization




3 Numeric Data in SQL



3.1 Numeric Types




3.2 Numeric Type Conversion


3.2.1 Rounding and Truncating


3.2.2 CAST( ) Function




3.3 Four Function Arithmetic


3.4 Arithmetic and NULLs


3.5 Converting Values to and from NULL


3.5.1 NULLIF( ) Function


3.5.2 COALESCE( ) Function




3.6 Vendor Math Functions


3.6.1 Number Theory Operators


3.6.2 Exponential Functions


3.6.3 Scaling Functions


3.6.4 Generator Functions








4 Temporal Datatypes in SQL



4.1 Tips for Handling Dates, Timestamps, and Times


4.1.1 Date Format Standards


4.1.2 Handling Timestamps


4.1.3 Handling Times




4.2 Queries with Dates


4.3 Personal Calendars


4.4 Time Series


4.4.1 Gaps in a Time Series


4.4.2 Continuous Time Periods


4.4.3 Locating Dates


4.4.4 First and Last Days of a Month




4.5 Julian Dates


4.6 Date and Time Extraction Functions


4.7 Other Temporal Functions




4.8 Problems with the Year 2000


4.8.1 The Zeros


4.8.2 Leap Year


4.8.3 The Millennium


4.8.4 Weird Dates in Legacy Data






5 Character Datatypes in SQL





5.1 Problems with SQL Strings


5.1.1 Problems of String Equality


5.1.2 Problems of String Ordering


5.1.3 Problems of String Grouping




5.2 Standard String Functions




5.3 Common Vendor Extensions




5.3.1 Phonetic Matching

Soundex Functions

The Original Soundex

An Improved Soundex

Metaphone

Other Pattern-Matching Predicates






5.4 Cutter Tables






6 NULLs-Missing Data in SQL



6.1 Empty and Missing Tables


6.2 Missing Values in Columns


6.3 Context and Missing Values


6.4 Comparing NULLs




6.5 NULLs and Logic


6.5.1 NULLS in Subquery Predicates


6.5.2 SQL-92 Solutions




6.6 Math and NULLs


6.7 Functions and NULLs


6.8 NULLs and Host Languages




6.9 Design Advice for NULLs


6.9.1 Avoiding NULLs from the Host Programs




6.10 A Note on Multiple NULL Values




7 Other Expressions





7.1 The CASE Expression


7.1.1 The COALESCE( ) and NULLIF( ) Functions


7.1.2 CASE Expressions with GROUP BY


7.1.3 CASE, CHECK( ) Clauses and Logical Implication


7.1.4 The Oracle DECODE( ) Function




7.2 Subquery Expressions and Constants


7.3 Rozenshtein Characteristic Functions






8 Other Schema Objects





8.1 Schema Creation


8.1.1 Schema Tables




8.2 Temporary Tables


8.3 CREATE ASSERTION


8.4 CREATE DOMAIN


8.5 TRIGGERs


8.6 CREATE PROCEDURE




9 Table Operations





9.1 DELETE FROM Statement


9.1.1 The DELETE FROM Clause


9.1.2 The WHERE Clause


9.1.3 Deleting Based on Data in a Second Table




9.1.4 Deleting within the Same Table

Redundant Duplicates in a Table

Redundant Duplicates Removal with ROWID




9.1.5 Deleting in Multiple Tables without Referential Integrity






9.2 INSERT INTO Statement


9.2.1 INSERT INTO Clause


9.2.2 The Nature of Inserts


9.2.3 Bulk Load and Unload Utilities






9.3 UPDATE Statement


9.3.1 The UPDATE Clause


9.3.2 The WHERE Clause


9.3.3 The SET Clause


9.3.4 Updating with a Second Table


9.3.5 Using the CASE Expression in UPDATEs


9.3.6 Updating within the Same Table


9.3.7 Updating a Primary Key




9.4 A Note on Flaws in a Common Vendor Extension




10 Comparison or Theta Operators



10.1 Converting Datatypes


10.2 Row Comparisons in SQL-92




11 Valued Predicates





11.1 IS NULL Predicate


11.1.1 Sources of NULLs








12 LIKE and SIMILAR Predicates



12.1 Tricks with Patterns


12.2 Results with NULL Values and Empty Strings


12.3 LIKE Is Not Equality


12.4 Avoiding the LIKE Predicate with a Join


12.5 Other Pattern-Matching Predicates






13 BETWEEN and OVERLAPS Predicates





13.1 BETWEEN Predicate


13.1.1 Results with NULL Values


13.1.2 Results with Empty Sets


13.1.3 Programming Tips






13.2 OVERLAPS Predicate


13.2.1 Time Periods and OVERLAPS Predicate








14 The [NOT] IN Predicate



14.1 Optimizing the IN Predicate


14.2 Replacing ORs with the IN Predicate


14.3 NULLs and the IN Predicate


14.4 IN Predicate and Referential Constraints






15 EXISTS ( ) Predicate



15.1 EXISTS and NULLs


15.2 EXISTS and JOINs


15.3 EXISTS and Quantifiers


15.4 EXISTS( ) and Referential Constraints




16 Quantified Subquery Predicate



16.1 Scalar Subquery Comparisons


16.2 Quantifiers and Missing Data


16.3 The ALL Predicate and Extrema Functions


16.4 UNIQUE Predicate






17 The SELECT Statement





17.1 SELECT and JOINs


17.1.1 One-Level SELECT Statement


17.1.2 Correlated Subqueries in a SELECT Statement


17.1.3 SQL-92 SELECT Statement


17.1.4 The ORDER BY Clause






17.2 OUTER JOINs


17.2.1 Vendor Syntax for OUTER JOINs


17.2.2 SQL-92 Syntax for JOINs


17.2.3 NULLs and OUTER JOINs


17.2.4 NATURAL versus Conditional OUTER JOINs


17.2.5 Self-OUTER JOINs


17.2.6 Two or More OUTER JOINs


17.2.7 OUTER JOINs and Aggregate Functions


17.2.8 FULL OUTER JOIN


17.2.9 WHERE Clause OUTER JOIN Operators




17.3 Old versus New JOIN Syntax




17.4 Exotic JOINs


17.4.1 Self Non-Equi-JOINs


17.4.2 Range JOINs


17.4.3 JOINs by Function Calls


17.4.4 The UNION JOIN




17.5 Dr. Codd's T-JOIN


17.5.1 The Croatian Solution


17.5.2 The Swedish Solution


17.5.3 The Columbian Solution








18 VIEWs and TEMPORARY TABLEs



18.1 VIEWs in Queries


18.2 Updatable and Read-Only VIEWs


18.3 Types of VIEWs


18.3.1 Single-Table Projection and Restriction


18.3.2 Calculated Columns


18.3.3 Translated Columns


18.3.4 Grouped VIEWs


18.3.5 UNION VIEWs


18.3.6 JOINs in VIEWs


18.3.7 Nested VIEWs






18.4 How VIEWs Are Handled in the Database System


18.4.1 View Column List


18.4.2 VIEW Materialization


18.4.3 In-Line Text Expansion


18.4.4 Pointer Structures


18.4.5 Indexing and Views




18.5 WITH CHECK OPTION Clause


18.6 Dropping VIEWs


18.7 TEMPORARY TABLEs




18.8 Hints on Using VIEWs and TEMPORARY TABLEs


18.8.1 Using VIEWs


18.8.2 Using TEMPORARY TABLEs


18.8.3 Flattening a Table with a VIEW






19 Partitioning Data



19.1 Coverings and Partitions


19.1.1 Partitioning by Ranges

Range Tables

Single-Column Range Tables




19.1.2 Partition by Functions




19.2 Relational Division


19.2.1 Division with a Remainder


19.2.2 Exact Division


19.2.3 Note on Performance


19.2.4 Todd's Division


19.2.5 Division with JOINs


19.2.6 Division with Set Operators






20 Grouping Operations





20.1 GROUP BY Clause


20.1.1 NULLs and Groups


20.1.2 GROUP BY and HAVING


20.1.3 Grouped VIEWs for Multiple Aggregation Levels


20.1.4 Sorting and GROUP BY


20.1.5 Grouped Subqueries for Multiple Aggregation Levels


20.1.6 Grouping on Computed Columns




20.2 Ungrouping a Table


20.2.1 Ungrouping by Splitting a Table


20.2.2 Ungrouping Using a Join








21 Aggregate Functions



21.1 COUNT Functions


21.2 SUM Functions




21.3 AVG Functions


21.3.1 Averages with Empty Groups






21.4 Extrema Functions


21.4.1 Simple Extrema Functions


21.4.2 Generalized Extrema Functions


21.4.3 Multiple-Criteria Extrema Functions






21.5 Other Aggregate Functions




21.5.1 The LIST() Aggregate Function

The LIST() Function with a Procedure

The LIST() Function by Crosstabs






21.5.2 The PROD() Aggregate Function

PROD() Function by Expressions

The PROD() Aggregate Function by Logorithms




22 Auxiliary Tables



22.1 The Sequence Table


22.1.1 An Example of the Sequence Table






22.2 The Calendar Table


22.2.1 An Example of the Calendar Table




22.3 Interpolation with Auxiliary Function Tables






23 Statistics in SQL



23.1 The Mode


23.2 The Median


23.2.1 Date's First Median


23.2.2 Celko's First Median


23.2.3 Date's Second Median


23.2.4 Murchison's Median


23.2.5 Celko's Second Median


23.2.6 Vaughan's Median with VIEWs


23.2.7 Median with Characteristic Function


23.2.8 Celko's Third Median




23.3 Variance and Standard Deviation


23.4 Average Deviation


23.5 Cumulative Statistics


23.5.1 Running Totals


23.5.2 Running Differences


23.5.3 Cumulative Percentages


23.5.4 Rankings and Related Statistics




23.6 Cross Tabulations


23.6.1 Crosstabs by Cross Join


23.6.2 Crosstabs by Outer Joins


23.6.3 Crosstabs by Subquery






24 Regions, Runs and Sequences



24.1 Finding Subregions of Size n


24.2 Finding Regions of Maximum Size


24.3 Bound Queries


24.4 Run and Sequence Queries




25 Array Structures in SQL



25.1 Representing Arrays in SQL




25.2 Matrix Operations in SQL


25.2.1 Matrix Equality


25.2.2 Matrix Addition


25.2.3 Matrix Multiplication


25.2.4 Other Matrix Operations




25.3 Flattening a Table into an Array


25.4 Comparing Arrays in Table Format




26 Set Operations





26.1 UNION and UNION ALL


26.1.1 Duplicates and Union Operators


26.1.2 Order of Execution


26.1.3 Mixed UNION and UNION ALL Operators




26.2 Set Difference Operator


26.2.1 Set Difference with Outer Join




26.3 Intersection


26.4 A Note on ALL and SELECT DISTINCT




27 Subsets



27.1 Every nth Item in a Table


27.2 Picking Random Rows from a Table


27.3 The Contains Operators


27.3.1 Proper Subset Operators


27.3.2 Set Equality






28 Adjacency List Model of Trees in SQL



28.1 Adjacency List in a Single Table


28.2 Finding the Root Node


28.3 Finding Leaf Nodes


28.4 Finding Levels in a Tree


28.5 Functions in the Adjacency List Model


28.6 Tree Operations


28.6.1 Subtree Deletion


28.6.2 Subtree Insertion




28.7 Vendor Tree Extensions


28.7.1 Oracle Tree Extensions


28.7.2 XDB Tree Extension


28.7.3 DB2's WITH Operator


28.7.4 Date's Explode Operator


28.7.5 Tillquist and Kuo's Proposals




28.8 The Transitive Closure Model


28.8.1 Estimating Table Size


28.8.2 Deleting Nodes


28.8.3 Subtree Insertion


28.8.4 Summary Functions


28.8.5 The Transitive Closure Model with Fixed Depth






29 Nested Set Model of Trees in SQL



29.1 Finding Root and Leaf Nodes


29.2 Finding Subtrees


29.3 Finding Levels and Paths in a Tree


29.3.1 Finding the Height of a Tree


29.3.2 Finding Immediate Subordinates


29.3.3 Finding Oldest and Youngest Subordinates


29.3.4 Finding a Path




29.4 Functions in the Nested Set Model




29.5 Deleting Nodes and Subtrees


29.5.1 Deleting Subtrees


29.5.2 Deleting a Single Node




29.6 Closing Gaps in the Tree


29.7 Summary Functions on Trees


29.8 Inserting and Updating Trees


29.9 The Linear Version of the Nested Set Model


29.10 Converting Adjacency List to Nested Set Model




30 Graphs in SQL



30.1 Two-Table Representation of a Graph


30.2 Path Enumeration in a Graph


30.3 Path Aggregation in a Graph


30.4 Node Splitting




31 Optimizing Code



31.1 Access Methods


31.1.1 Sequential Access


31.1.2 Indexed Access


31.1.3 Hashed Indexes


31.1.4 Bit Vector Indexes






31.2 Expressions and Unnested Queries


31.2.1 Use Simple Expressions


31.2.2 String Expressions




31.3 Give Extra Join Information in Queries


31.4 Index Tables Carefully


31.5 Watch the IN Predicate


31.6 Avoid UNIONs


31.7 Prefer Joins over Nested Queries


31.8 Avoid Expressions on Indexed Columns


31.9 Avoid Sorting


31.10 Avoid Cross Joins


31.11 Learn to Use Indexes Carefully


31.12 Order Indexes Carefully


31.13 Recompile Static SQL after Schema Changes


31.14 Temporary Tables Are Handy



Appendix: Readings and Resources


General References

Logic

Mathematical Techniques

Random Numbers

Scales and Measurements

Missing Values

Graph Theory

Introductory SQL Books

Optimizing Queries

Temporal Data and the Year 2000 Problem

Books

Newsletters

SQL Programming Techniques

Classics

Updatable Views

Theory, Normalization, and Advanced Database Topics

Books on SQL-92

Standards and Related Groups

Web Sites Related to SQL


References

Index


About the Author

Product details

  • Edition: 1
  • Latest edition
  • Published: October 11, 1999
  • Language: English

About the author

JC

Joe Celko

Joe Celko served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.

Mr. Celko is author a series of books on SQL and RDBMS for Elsevier/MKP. He is an independent consultant based in Austin, Texas.

He has written over 1200 columns in the computer trade and academic press, mostly dealing with data and databases.

Affiliations and expertise
Independent Consultant, Austin, Texas