Mimer SQL Documentation TOC PREV NEXT INDEX

Mimer SQL Developer Site


Retrieving Data from More than One Table


The examples presented up to now in this chapter have illustrated the essential features of simple SELECT statements with data retrieval from single tables. However, much of the power of SQL lies in the ability to perform joins through a single statement, i.e. to select data from two or more tables, using the search condition to link the tables in a meaningful way.

The Join Condition

In retrieving data from more than one table, the search condition or join condition specifies the way the tables are to be linked. For example:

List the product name in addition to the EAN and price for any items with a release date in the future:
 SELECT product, ean_code, price
    FROM items, products
    WHERE items.release_date > CURRENT_DATE
    AND products.product_id = items.product_id;
 

The join condition here is PRODUCTS.PRODUCT_ID = ITEMS.PRODUCT_ID, which relates the product identifier in table ITEMS (where codes are listed) to the product identifier in table PRODUCTS (where names are listed).

Returns:

 PRODUCT
EAN_CODE
PRICE
 Greatest Hits
7298976754871
13.98
 Pieces Of Fish
7464376662256
15.98
 The Sql Quiz Book
9781990789861
13.99
 The Future Foretold
9781993789639
6.99

Conceptually, the join first establishes a table containing all combinations of the rows in PRODUCTS with the rows in ITEMS, then selects those rows in which the two PRODUCT_ID values are equal. See Conceptual Description of the Selection Process for a fuller description of the conceptual SELECT process.

This does not necessarily represent the order in which the operations are actually performed; the order of evaluation of a complex SELECT statement is determined by the SQL optimizer, regardless of the order in which the component clauses are written.

Cross Products

Without the join condition, the result is a cross product of the columns in the tables in question, containing all possible combinations of the selected columns, for example:

 SELECT product, ean_code, price
    FROM items, products
    WHERE items.release_date > CURRENT_DATE;
Returns:

 PRODUCT
EAN_CODE
PRICE
 'Murder In The Cathedral'
7298976754871
13.98
 'Reave The Just' And Other Tales
7298976754871
13.98
 100 Anos
7298976754871
13.98
 12 Golden Country Greats
7298976754871
13.98
 12 Super Exitos
7298976754871
13.98
 1492: Conquest Of Paradise
7298976754871
13.98
 ...
...
...

It is easy to see that a carelessly formulated join query can produce a very large result table. Two tables of 100 rows each, for instance, give a cross product with 10,000 rows; three tables of 100 rows each give a cross product with 1,000,000 rows!

The risk of generating large (erroneous) result tables is particularly high in interactive SQL (e.g. when using Mimer BSQL), where queries are so easily written and submitted.

Simple Joins

In simple joins, all tables used in the join are listed in the FROM clause of the SELECT statement. This is in distinction to nested joins, where the search condition for one SELECT is expressed in terms of another SELECT, see Nested Selects.

An example of a simple join is the query described in The Join Condition.

Example:
 SELECT product, ean_code, price
    FROM items, products
    WHERE items.product_id = products.product_id
    AND release_date > CURRENT_DATE;
Returns:

 PRODUCT
EAN_CODE
PRICE
 Greatest Hits
7298976754871
13.98
 Pieces Of Fish
74643766622
15.98
 The Sql Quiz Book
9781990789861
13.99
 The Future Foretold
9781993789639
6.99

SELECT *

The form SELECT * may be used in a join query, but since this selects all columns in the result set, at least one column is usually duplicated.

Example:
 SELECT *
       FROM items
       JOIN products ON items.product_id = products.product_id
       AND release_date > CURRENT_DATE;
Returns:

From ITEMS:

 ITEM_ID
 PRODUCT_ID
 FORMAT_ID
 RELEASE_DATE
 STATUS
 PRICE
 STOCK
 REORDER_LEVEL
 EAN_CODE
 PRODUCER_ID
 IMAGE_ID
 

From PRODUCTS:

 PRODUCT
 PRODUCT_ID
 PRODUCT_SEARCH

Columns in the join query that are uniquely identified by the column name may be specified by name alone. Columns that have the same name in the joined tables must be qualified by their respective table names.

There is an alternative formulation of the query above:

 SELECT product, ean_code, price
    FROM items JOIN products ON items.product_id = products.product_id
    AND release_date > CURRENT_DATE;
 

All predicates that can be used in a where clause, except sub-selects, can be used in an on clause.

About Natural Joins

A natural join, joins the table on the condition of equality between any columns with the same name, in the two tables. In the first example, all columns from the two tables are present in the result. In the second example the join columns will only occur once. Thus, in the first case, the PRODUCT_ID column appears twice in the result, while there is only one occurrence of this column in the second result.

Nesting Join Clauses

It is possible to nest join-clauses, for example:

List the category in addition to the EAN and price for any items with a release date in the future:
 SELECT ean_code, price, category
    FROM items
    NATURAL JOIN formats
    JOIN categories ON categories.category_id = formats.category_id
    WHERE release_date > CURRENT_DATE;
Result:

 EAN_CODE
PRICE
 CATEGORY
 7298976754871
13.98
 Music
 7464376662256
15.98
 Music
 9781990789861
13.99
 Books
 9781993789639
6.99
 Books

Complex Search Conditions and Joins

A join query can join any number of tables using complex search conditions to select the relevant information from each table.

List the product for any items with a release date in the future along with the item price in both Swedish and Danish crowns (SEK and DKK respectively):
 SELECT product,
        CAST(price * exchange_rate AS DECIMAL(12,2)) AS cost,
        currency
    FROM items, products, currencies
    WHERE release_date > CURRENT_DATE
    AND products.product_id = items.product_id
    AND (currencies.code = 'SEK'
      OR currencies.code = 'DKK')
    ORDER BY product, currency;
Result:

 PRODUCT
COST
 CURRENCY
 Greatest Hits
99.42
 Danish Kronor
 Greatest Hits
125.61
 Swedish Kronor
 Pieces Of Fish
113.64
 Danish Kronor
 Pieces Of Fish
143.58
 Swedish Kronor
 The Future Foretold
49.71
 Danish Kronor
 The Future Foretold
62.80
 Swedish Kronor
 The Sql Quiz Book
99.49
 Danish Kronor
 The Sql Quiz Book
125.70
 Swedish Kronor

In formulating a search condition for a join query, it can help to write out the columns that would appear in a complete cross-product of the tables. The search condition is then formulated as though the query was a simple SELECT from the cross-product table.

Outer Joins

The joins in the previous sections were all inner joins. In an inner join between two tables, only rows that fulfill the join condition are present in the result.

An outer join, on the contrary, contains non-matching rows as well. The outer join has two options, LEFT and RIGHT.

Left Outer Join

Example:
 SELECT ean_code, release_date, producer
    FROM items
    LEFT OUTER JOIN producers
       ON items.producer_id = producers.producer_id
    WHERE ean_code >= 800000000000
    ORDER BY ean_code;
Result:

EAN_CODE
 RELEASE_DATE
 PRODUCER
800488327626
 1998-08-11
 Giants Of Jazz (Ita)
801061007720
 2000-10-31
 Warp Records
4988002364947
 1999-09-28   
 -
4988011353147
 1998-06-30
 -
5013145800423
 2000-03-14   
 Mint / Cherry Red
5013929112322
 1999-10-12   
 Cherry Red
5014438710221
 1994-12-27   
 Receiver Records
5019317001728
 1994-12-15   
 Receiver Records
7157761806273
 1996-01-18   
 Status Records
...
 ...
 ...

In the example above all rows from the table to the left in the join clause, i.e. ITEMS, are present in the result; non-matching rows from the PRODUCERS table are filled with null values in the result.

Observe the difference in result for the next statement and the previous one.

 SELECT ean_code, release_date, producer
    FROM items
    LEFT OUTER JOIN producers
       ON items.producer_id = producers.producer_id
       AND ean_code >= 800000000000
    ORDER BY ean_code;
Result:

EAN_CODE
 RELEASE_DATE
 PRODUCER
8811038120
 1991-08-27
 -
8811042127
 1991-10-22
 -
8811061326
 1992-05-19
 -
8811067021
 1992-12-22
 -
...
 ...
 ...
800488327626
 1998-08-11
 Giants Of Jazz (Ita)
801061007720
 2000-10-31
 Warp Records
4988002364947
 1999-09-28   
 -
4988011353147
 1998-06-30
 -
5013145800423
 2000-03-14   
 Mint / Cherry Red
...
 ...
 ...

The reason is that conditions in the where clause are applied to the result of the join-clause and not to the joined tables as is the case with the conditions in the on-clause.

Right Outer Join

A right outer join will take all records from the table to the right in the join-clause.

Nesting Outer Joins

As with inner joins, it is possible to nest join-clauses. Nested joins can be of different types, i.e. both inner and outer joins.

The result of nested outer joins can be somewhat unexpected though, as it is the result of the first join-clause that is the left table in the next join, and not the right table in the first join-clause.

Nested Selects

A form of SELECT, called a subselect, can be used in the search condition of a SELECT statement to form a nested query.

The main SELECT statement is then referred to as the outer select.

For example:

Select the products that have a release date in the future.
 SELECT product
    FROM products
    WHERE product_id IN (SELECT product_id
                            FROM items
                            WHERE release_date > CURRENT_DATE);
Result:

 PRODUCT
 Greatest Hits
 Pieces Of Fish
 The Future Foretold
 The Sql Quiz Book

To see how this works, evaluate the subselect first:

 SELECT product_id
    FROM items
    WHERE release_date > CURRENT_DATE;
Result:

 PRODUCT_ID
 30206
 30618
 31082
 31083

Then use the result of the subselect in the search condition of the outer select:

 SELECT product
    FROM products
    WHERE product_id IN (30206, 30618, 31082, 31083);
Result:

 PRODUCT
 Greatest Hits
 Pieces Of Fish
 The Future Foretold
 The Sql Quiz Book

Using Subselects

A subselect can be used in a search condition wherever the result of the subselect can provide the correct form of the data for the search condition.

Thus a subselect used with `=' must give a single value as a result.

A subselect used with IN, ALL or ANY must give a set of single values, see Retrieval with ALL, ANY, SOME.

A subselect used with EXISTS may give any result, see Retrieving Data Using EXISTS and NOT EXISTS.

Examples:
 WHERE column = (subselect)
 WHERE column IN (subselect)
 WHERE column = ALL (subselect)
 WHERE column = ANY (subselect)
 WHERE EXISTS (subselect)

Subselects and ORDER BY Clauses

Subselects cannot include ORDER BY clauses. The UNION operator can be used to combine two or more subselects in more complex statements, see Union Queries.

Nested Queries

Many nested queries can equally well be written as simple joins. For example:

Select the products that have a release date in the future.
 SELECT product
    FROM products
    WHERE product_id IN (SELECT product_id
                            FROM items
                            WHERE release_date > CURRENT_DATE);
 

or alternatively

 SELECT DISTINCT product
    FROM products, items
    WHERE products.product_id = items.product_id
    AND items.release_date > CURRENT_DATE;
 

Both these queries give exactly the same result. In most cases, the choice of which form to use is a matter of personal preference. Choose the form which you can understand most easily; the clearest formulation is least likely to cause problems.

Subselects in Queries

Queries may contain any number of subselects, for example:

List the producers (manufacturers) which have items that are more expensive than any of the items produced by Sony.
 SELECT producer
    FROM producers
    WHERE producer_id IN
          (SELECT producer_id
              FROM items
              WHERE price >
                       (SELECT MAX(price)
                           FROM items
                           WHERE producer_id =
                                    (SELECT producer_id
                                        FROM producers
                                        WHERE producer = 'SONY')));
 

Note the balanced parentheses for the nested levels.

It is particularly important at this level of complication to think carefully through the query to make sure that it is correctly formulated.

Often, writing some of the levels as simple joins can simplify the structure. The previous example may also be written:

 SELECT DISTINCT producer
    FROM producers, items
    WHERE producers.producer_id = items.producer_id
    AND price > (SELECT MAX(price)
                    FROM items, producers
                    WHERE items.producer_id = producers.producer_id
                    AND producer = 'SONY');

Correlation Names

A correlation name is a temporary name given to a table to represent a logical copy of the table within a query.

Correlation names can be up to a maximum of 128 characters long.

There are three uses for correlation names:

Simplifying Complex Queries Using Correlation Names

Using short correlation names into complicated queries can make the query easier to write and understand, particularly when qualified table names are used:

 SELECT mimer_store_music.artists.artist,
        mimer_store.product_details.*
    FROM mimer_store.product_details,
         mimer_store_music.titles,
         mimer_store_music.artists
    WHERE mimer_store_music.titles.item_id =
          mimer_store.product_details.item_id
    AND mimer_store_music.artists.artist_id =
          mimer_store_music.titles.artist_id
    ORDER BY mimer_store_music.artists.artist;
 

may be rewritten

 SELECT art.artist, pdt.*
    FROM mimer_store.product_details AS pdt,
         mimer_store_music.titles AS ttl,
         mimer_store_music.artists AS art
    WHERE ttl.item_id = pdt.item_id
    AND art.artist_id = ttl.artist_id
    ORDER BY art.artist;
 

The keyword AS in the FROM clause may be omitted, but is recommended for clarity. Do not confuse AS in the FROM clause (defining a correlation name) with AS in the select list, see Setting Column Labels, defining a label.

About Correlation Names

Correlation names are local to the query in which they are defined.

When a correlation name is introduced for a table name, all references to the table in the same query must use the correlation name.

The following expression is not accepted:

    ...
    FROM mimer_store.product_details AS pdt,
         mimer_store_music.titles AS ttl,
    ...
    WHERE ttl.item_id = mimer_store.product_details.item_id

Joining a Table with Itself Using a Correlation Name

Joining a table with itself allows you to compare information in a table with other information in the same table. This can be done with a correlation name.

Select all currencies with the same exchange rate:
 SELECT currencies.currency, currencies.code, currencies.exchange_rate
    FROM currencies, currencies AS copy
    WHERE currencies.exchange_rate = copy.exchange_rate
    AND currencies.currency <> copy.currency;
Result:

 CURRENCY
 CODE
EXCHANGE_RATE
 Croatian Kuna
 HRK
7.0820
 Gourdes
 HTQ
7.0820
 Iraqi Dina
 IQD
1551.0000
 Uganda Shillings
 UGX
1551.0000

Here, the table CURRENCIES is joined to a logical copy of itself called COPY.

The first search condition finds pairs of currencies with the same exchange rate against the euro, and the second eliminates 'pairs' with the same currency name. Without the second condition in the search criteria, all currencies would be selected!

Without correlation names, this kind of query cannot be formulated. The following query would select all the currencies from the table:

 SELECT currency, code, exchange_rate
    FROM currencies
    WHERE currencies.exchange_rate = currencies.exchange_rate;

Outer References in Subselects Using Correlation Names

In some constructions using subselects, a subselect at a lower level may refer to a value in a table addressed at a higher level. This kind of reference is called an outer reference.

 SELECT currency
    FROM currencies
    WHERE EXISTS (SELECT *
                     FROM countries
                     WHERE currency_code = currencies.code);
 

This kind of query processes the subselect for every row in the outer select, and the outer reference represents the value in the current outer select row. In descriptive terms, the query says 'For each row in CURRENCIES, select the CURRENCY column if there are rows in COUNTRIES containing the current CODE value'.

If the qualifying name in an outer reference is not unambiguous in the context of the subselect, a correlation name must be defined in the outer select.

A correlation name may always be used for clarity, as in the following example:

 SELECT currency
    FROM currencies AS c
    WHERE EXISTS (SELECT *
                     FROM countries
                     WHERE currency_code = c.code);

Retrieving Data Using EXISTS and NOT EXISTS

EXISTS is used to check for the existence of some row or rows which satisfy a specified condition. EXISTS differs from the other operators in that it does not compare specific values; instead, it tests whether a set of values is empty or not. The set of values is specified as a subselect.

The subselect following the EXISTS clause most often uses of `SELECT *' as opposed to `SELECT column-list' since EXISTS only searches to see if the set of values addressed by the subselect is empty or not - a specified column is seldom relevant in the subquery.

EXISTS (subselect) is true if the result set of the subselect is not empty

NOT EXISTS (subselect) is true if the result set of the subselect is empty

SELECT statements with EXISTS almost always include an outer reference linking the subselect to the outer select.

Examples of EXISTS

Find all currencies that are used in the COUNTRIES table:
 SELECT currency
    FROM currencies AS c
    WHERE EXISTS (SELECT *
                     FROM countries
                     WHERE currency_code = c.code);
 

Without the outer reference, the select becomes a conditional `all-or-nothing' statement: perform the outer select if the subselect result is not empty, otherwise select nothing.

List all products where the producer (manufacturer) is not known:
 SELECT product
    FROM products
    WHERE EXISTS (SELECT *
                     FROM items
                     WHERE producer_id IS NULL
                     AND product_id = products.product_id);

Examples of NOT EXISTS

The next example illustrates NOT EXISTS:

List all products where the producer (manufacturer) is not known:
 SELECT product
    FROM products
    WHERE NOT EXISTS (SELECT *
                         FROM items
                         NATURAL JOIN producers
                         WHERE product_id = products.product_id);
Result:

 PRODUCT
 Invictus
 Middle Of Nowhere

Negated EXISTS

Negated EXISTS clauses must be handled with care. There are two semantic `opposites' to EXISTS, with very different meanings:

 WHERE EXISTS (SELECT *
                  FROM artists
                  WHERE artist = 'Enigma')
 

is true if at least one artist is called Enigma.

 WHERE NOT EXISTS (SELECT *
                      FROM artists
                      WHERE artist = 'Enigma')

is true if no artist is called Enigma.

But

 WHERE EXISTS (SELECT *
                  FROM artists
                  WHERE artist <> 'Enigma')
 

is true if at least one artist is not called Enigma.

 WHERE NOT EXISTS (SELECT *
                      FROM artists
                      WHERE artist <> 'Enigma')
 

is true if no artist is not called Enigma, that is if every artist is called Enigma.

FORALL

The double negative in the previous example is an SQL implementation of the universal quantifier FORALL.

Retrieval with ALL, ANY, SOME

Subselects that return a set of values may be used in the quantified predicates ALL, ANY or SOME. Thus

 WHERE PRICE < ALL (subselect)
 

selects rows where the price is less than every value returned by the subselect

 WHERE PRICE < ANY (subselect)
 

selects rows where the price is less than at least one of the values returned by the subselect

Select countries that have an exchange rate of less than one:
 SELECT country
    FROM countries
    WHERE currency_code <> ALL (SELECT code
                                   FROM currencies
                                   WHERE exchange_rate >= 1.0);
 

If the result of the subselect is an empty set, ALL evaluates to true, while ANY or SOME evaluates to false.

An alternative to using ALL, ANY or SOME in a value comparison against a general sub-select, is to use EXISTS or NOT EXISTS to see if values are returned by a sub-select which only selects for specific values. For example:

Select countries where the associated currency code contains the letter 'E' as the middle character in the code:
 SELECT country
    FROM countries
    WHERE currency_code = ANY (SELECT code
                                  FROM currencies
                                  WHERE code LIKE '_E_');

is equivalent to:

 SELECT country
    FROM countries AS c
    WHERE EXISTS (SELECT *
                    FROM currencies
                    WHERE code LIKE '_E_'
                    AND code = c.currency_code);

Union Queries

The UNION operator combines the results of two or more subselect clauses.

UNION first merges the result tables specified by the separate subselects and then eliminates duplicate rows from the merged set.

Select the codes for currencies and countries that start with the letter 'D':
 SELECT code
    FROM currencies
    WHERE code LIKE 'D%'
 
 UNION
 
 SELECT currency_code
    FROM countries
    WHERE country LIKE 'D%';
 

The result is obtained by merging the results of the two subselects and eliminating duplicates:

 SELECT code                   SELECT currency_code
    FROM currencies               FROM currencies
    WHERE code LIKE 'D%;          WHERE country LIKE 'D%';
 

CODE
CODE
DJF
DJF
DKK
DKK
DOP
XCD
DZD
DOP

and the UNION gives the result table:

 CODE
 DJF
 DKK
 DOP
 DZD
 XCD

To retain duplicates in the result table, use UNION ALL in place of UNION, see the Mimer SQL Reference Manual, UNION or UNION ALL, for details.

Columns which are merged by UNION must have compatible data types (numerical with numerical, character with character).

Subselects addressing more than one result column are merged column by column in the order of selection. The number of columns addressed in each subselect must be the same.

About Column Names

The column names in the result of a UNION are taken from the names in the first subselect. Use labels in the first subselect to assign different column names to the result table:

Merge the codes and names of currencies where the code begins with 'D' with the codes and names of the countries where the country begins with 'D':
 SELECT code, currency AS currency_or_country
    FROM currencies
    WHERE code LIKE 'D%'
 
 UNION
 
 SELECT currency_code, country
    FROM countries
    WHERE country LIKE 'D%';
Result:

CODE
 CURRENCY_OR_COUNTRY
DJF
 Djibouti
DJF
 Djibouti Francs
DKK
 Danish Kronor
DKK
 Denmark
DOP
 Dominican Pesos
DOP
 Dominican Republic
DZD
 Algerian Dinars
XCD
 Dominica

UNION Restrictions

Subselects merged by UNION may not include an ORDER BY clause. However, the result of the UNION query may be ordered with an ORDER BY clause placed after the last query in the UNION.

UNION may not be used within a nested subselect. However, the results of nested queries may be joined by UNION.

More about Unions

Unions can also be used to combine information from the same table:

Find the lowest and highest exchange_rates:
 SELECT 'Highest', MAX(exchange_rate) AS rate
    FROM currencies
 
 UNION
 
 SELECT 'Lowest', MIN(exchange_rate)
    FROM currencies
 
 ORDER BY rate;
Result:

 
 RATE
 Lowest
 0.2644
 Highest
 1035000.0000

Outer Joins and Unions

Unions can also be used to perform outer joins, joining information in a table or tables with information not listed in those tables (i.e. information that is null). For example:

List the EAN, price and producer for each item where the EAN is 800000000000 or greater. Include a row for items where the producer is not known:
 SELECT ean_code, release_date, producer
    FROM items
    NATURAL JOIN producers
    WHERE ean_code >= 800000000000
 
 UNION
 
 SELECT ean_code, release_date, '*** undefined ***'
    FROM items
    WHERE ean_code >= 800000000000
    AND NOT EXISTS (SELECT *
                       FROM producers
                       WHERE producer_id = items.producer_id)
 
 ORDER BY ean_code;
Result:

 EAN_CODE
 RELEASE_DATE
 PRODUCER
 800488327626
 1998-08-11
 Giants Of Jazz (Ita)
   801061007720
 2000-10-31
 Warp Records
 4988002364947
 1999-09-28   
 *** undefined ***
 4988011353147
 1998-06-30
 *** undefined ***
 5013145800423
 2000-03-14   
 Mint / Cherry Red
 5013929112322
 1999-10-12   
 Cherry Red
 5014438710221
 1994-12-27   
 Receiver Records
 5019317001728
 1994-12-15   
 Receiver Records
 7157761806273
 1996-01-18   
 Status Records
 ...
 ...
 ...

Note: UNION statements including DISTINCT treat NULL values as duplicates.
Unions and NULL

In UNION queries, the keyword NULL can be included in the column list of one or both of the queries, so that columns not represented in all of the queries in the statement are retained in the result set.


Mimer
Mimer Information Technology AB
Voice: +46 18 780 92 00
Fax: +46 18 780 92 40
info@mimer.se
Mimer SQL Documentation TOC PREV NEXT INDEX