Thinking About NULL

Duncan Greaves on NULL:

NULL exists because the following general conditions apply:

Existence –The attribute does not exist in the domain, or domain understanding is wrong. This means there is a missing entity in our domain model or entites are mixed in a table. E.g table contains hair colour for a car entity, Number of pregnancies for male patients.

Missing – The information has not been given at the time a row was created. E.g. A customer may decline to give their age.

Not Yet – Data is contingent upon an unknown event in the future, E.g. Termination date or Date of death.

Does not apply– Is not applicable for this instance of a record. E.g. Hair colour for bald people.

Placeholders – Indicates that we know that a bit of data exists, but we don’t know what it is, in this case keeping a NULL is useful for CUBE or ROLLUP queries.

In the real world applications of data structures NULLs are often unavoidable. However, it confuses users, designers and DBA’s (generally) hate it. It complicates Reporting, ETL, Business Intelligence and Data Science initiatives. As such, users need to be aware of the design and query compromises they need to use.

I think there’s significance in what NULL represents, but it’s a concept with its fair share of complexity.  Read the whole thing.

Related Posts

Avoid Scalar Functions In Computed Columns

Daniel Hutmacher shows why you should not include scalar functions inside computed column definitions: Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you […]

Read More

Azure And The Kappa Architecture

Jared Zagelbaum describes the Kappa architecture and points out how there’s limited built-in support in Azure for it: You can’t support kappa architecture using native cloud services. Cloud providers, including Azure, didn’t design streaming services with kappa in mind. The cost of running streams with TTL greater than 24 hours is more expensive, and generally, […]

Read More

Categories

April 2017
MTWTFSS
« Mar May »
 12
3456789
10111213141516
17181920212223
24252627282930