Friday, July 25, 2008

The NULL comparison trap: Escape with NVL (Oracle) and Nz (Access)

This is disgusting. I’m sure I once knew better; maybe writing this up will help. A quick Google search tells me I’m not alone in my stupidity, so maybe this will help others who aren’t truly DBAs, but have to play with data.

I got caught be the NULL trap. I think this might be why some databases are designed to never allow NULL values, but to use, for example, an empty string.

The NULL trap is impeccably logical, but infuriating.

Say you compare the values of two fields, one of which is Non-Null and one is Null. Are they different?

I think so when I look at them, but I think the “proper” answer is “NULL”. Meaning – “I don’t know”.

That’s because NULL doesn’t mean “nothing”, it means “could be anything”.

Consider this Microsoft Access 2003 query:
SELECT pub_CC_qry.CG_CC_ID, pub_CC_prior_qry.[Abbreviation prior], pub_CC_qry.Abbreviation
FROM pub_CC_qry INNER JOIN pub_CC_prior_qry ON pub_CC_qry.CG_CC_ID = pub_CC_prior_qry.CG_CC_ID
WHERE (((pub_CC_prior_qry.[Abbreviation prior])<>[abbreviation]));
It returns 62 rows. It doesn’t return 28 rows where one of the two columns contains a null value.
This qeury returns 90 rows, but it’s a mess:
SELECT pub_CC_qry.CG_CC_ID, pub_CC_prior_qry.[Abbreviation prior], pub_CC_qry.Abbreviation
FROM pub_CC_qry INNER JOIN pub_CC_prior_qry ON pub_CC_qry.CG_CC_ID = pub_CC_prior_qry.CG_CC_ID
WHERE (((pub_CC_prior_qry.[Abbreviation prior]) Is Null) AND ((pub_CC_qry.Abbreviation) Is Not Null)) OR (((pub_CC_prior_qry.[Abbreviation prior]) Is Not Null) AND ((pub_CC_qry.Abbreviation) Is Null));
The secret is the Nz function. Wrap any column that might contain a NULL in Nz, and LOT of things, including iff comparison functions and SQL comparisons work a lot nicer:
SELECT pub_CC_qry.CG_CC_ID, Nz([Abbreviation prior]) AS Abbreviation_Prior, pub_CC_qry.Abbreviation
FROM pub_CC_qry INNER JOIN pub_CC_prior_qry ON pub_CC_qry.CG_CC_ID = pub_CC_prior_qry.CG_CC_ID
WHERE (((Nz([Abbreviation prior]))<>Nz([abbreviation])));
Arrgggh.

So what about Oracle?

I’m not precisely sure how Oracle is handling this situation in comparisons. I’m still researching Oracle’s behavior. It does, however, have a function similar to the Access Nz function; it’s the nvl function:
How to use the nvl function in Oracle
The NVL function is used to replace NULL values by another value.
The syntax for the NVL function is:
NVL( value_in, replace_with )
value_in if the function to test on null values. The value_in field can have a datatype char, varchar2, date or number datatype.
replace_with is the value that is returned if value_in has a null value. The NVL statement works like this pl/sql code:
if (value_in is NULL) thenreturn replace_with;elsereturn value_in;end if;Sample code: select nvl(salary, 0)from employees;select nvl(ref_code,'Unknown')from users;
Here’s an example, and a response from a friendly DBA:
Select * from A, B where A.ID = B.ID and A.name != B.nom

Imagine

A.ID = B.ID = 7
A.name is NULL
B.nom is not Null

How do you get Oracle to return the row in B where B.ID=7?
Answer (I think for our purposes this would work if we just used NVL([column_name],’’) for our string comparisons)
I think what you're after is the NVL function, which converts a NULL to a substitute value for comparison's sake.

So in your query, we would replace a null A.NAME with an outlandish value like 'A1B2C3', and replace a null B.NOM with same.

SELECT * FROM A, B WHERE A.ID = B.ID AND NVL(A.NAME, 'A1B2C3') != NVL(B.NOM, 'A1B2C3');

The above query will *not* return rows where A.ID = B.ID and A.NAME and B.NOM are *both* null. If you also want those rows, then you need to tag on that condition to the end of the query like this:

SELECT * FROM A, B WHERE A.ID = B.ID AND ((NVL(A.NAME, 'A1B2C3') != NVL(B.NOM, 'A1B2C3')) OR ((A.NAME IS NULL) AND (B.NOM IS NULL)));

No comments: