How to sort an SQL UNION query with special ORDER BY sequence

SQL expert Rudy Limeback explains how to sort an SQL UNION query using a special ORDERY BY sequence.

I have the following query:

SELECT * FROM mytable WHERE col1 = 'c'
UNION
SELECT * FROM mytable WHERE col1 = 'a'
UNION
SELECT * FROM mytable WHERE col1 = 'd'
UNION
SELECT * FROM mytable WHERE col1 = 'b'

How can I get the result in the same sequence of select queries rather than getting sorted by col1? i.e. I need the result to appear in this sequence: c,a,d,b.

In the execution of an SQL query, the ORDER BY clause is done last. Like the old saying goes, "It's all over but the sorting." The database server has retrieved table rows, joined them together as necessary, filtered and grouped them if required and the results are now to be sorted.

Some database developers believe that sorting should be done in the application server, not in the database server. They have a very good point. Sometimes, the presence of an ORDER BY clause requires the database server to write the results into a temporary table, and this temporary table is then read back and sorted. Managing temporary tables consumes very costly database server cycles, whereas letting the application server do it would be cheaper and probably faster. A centralized resource such as a database server should never be squandered on tasks best done elsewhere.

Now consider this UNION query. Whether we do the sorting with the ORDER BY clause or in the application, we still need an algorithm, a method, to determine the sort order. We need our logic to do the following: if the value in col1 is 'c', then these rows go first; if the value in col1 is 'a', then these go second; and so on. This is quite easy to do in a UNION query:

SELECT * , 1 FROM mytable WHERE col1 = 'c'
UNION
SELECT * , 2 FROM mytable WHERE col1 = 'a'
UNION
SELECT * , 3 FROM mytable WHERE col1 = 'd'
UNION
SELECT * , 4 FROM mytable WHERE col1 = 'b'

Can you see how that's going to work? Each row gets an extra column, a number, and the results will be sorted based on the values of these numbers. Thus the rows will be sorted into the desired, special sequence. We could now either add the ORDER BY clause to the SQL statement, or return the results to the application and let it do the sorting.

The application would need to know which column to sort on, so it's a good idea to name the column:

SELECT * , 1 AS sortcol
             FROM mytable WHERE col1 = 'c'
UNION
SELECT * , 2 FROM mytable WHERE col1 = 'a'
UNION
SELECT * , 3 FROM mytable WHERE col1 = 'd'
UNION
SELECT * , 4 FROM mytable WHERE col1 = 'b'

Result set column names for a UNION query are taken from the first SELECT in the UNION, so it's not necessary to assign the column alias in the other SELECTs.

And now for something completely different:

SELECT *
     , CASE WHEN col1 = 'c' THEN 'Curly'
            WHEN col1 = 'a' THEN 'Larry'
            WHEN col1 = 'd' THEN 'Moe'
            WHEN col1 = 'b' THEN 'Shemp'
        END AS sortcol
  FROM mytable
 WHERE col1 IN ( 'c','a','d','b' )

This is possible only, of course, because all four original SELECTs retrieved rows from the same table. If there were four different tables, then we would have to use the UNION. Here, we have only one SELECT against the table. The CASE expression takes only a few extra nanoseconds to evaluate, compared with no evaluation for hardcoded values. However, four SELECTs, even though it's the same table, might actually get executed as four passes against the data. Most optimizers would spot this, but why give them a chance to go wrong.

And yes, I could have assigned numbers as THEN values in the CASE expression. Sorting a VARCHAR(5) column might be marginally slower than sorting 1-byte numbers (although the difference is likely not noticeable), but I think the names are pretty neat.

Dig Deeper on Oracle development languages