Multiple kinds of BLANK

Discussions on the future of databases, computing , the meaning of life ........

Moderators: Phil Winkler, Graham Smith, Pete Tabord

Multiple kinds of BLANK

Postby Pete Tabord » Wed Jan 23, 2013 6:36 pm

Some SQL servers support the empty string as a separate thing to NULL. Ffenics does not. Here's my view on such a thing which arises out of some private correspondence. It's a long time since we involved ourselves in an abstract debate!

As a matter of interest, Oracle (I mean real Oracle) also treats empty strings automatically as NULL. Which explains a lot. Like, maybe why it works better than all the rest, and, very likely, why it seems to work better with Ffenics than any other SQL engine.

It really is a nonsense to permit what is in effect two types of null on a SQL server – and it is exactly one of the traps Chris Date warned about. I will try to explain why.

The misunderstanding seems to come about, having read some of the articles about it, that people have imagined all sorts of meanings for ‘NULL’ for example that the data is not applicable, there will be data some time, or that the data is not yet known, and having in their heads (but not in Codd or Date’s) assigned a meaning they want to use empty string for ‘we know there is nothing to go in here’. Or one of the other myriad possible reasons for there being no data that they can imagine.

Actually, NULL means or implies absolutely nothing except ‘nothing to see or process here’. It does not imply data never entered, or anything else. Just that there is no data there. It implies no reason whatsoever for there being no data there.

Which, incidentally, is exactly the same meaning – or lack of - as the null string. It will contain the value 0x00 in C/C++ and in Ffenics, but that isn't actually a value, its a terminator. It _is_ NULL. It tells you no more than 'there is nothing here'. It does NOT on its own mean ‘there is known to be no data here’. That would require some kind of separate state flag confirming that data - or the lack of - had definitely been entered for that field.

Chris Date, for once, may have added to this confusion by frequently referring to ‘NULL as ‘UNK’ which people have assumed means ‘an unknown value’. It does not, it means the result of comparing NULL with any value other than NULL is unknown. 1 = 1 is an understandable and verifiable statement. But nothing equals nothing is rather less so, although we do specifically allow it. (Try pronouncing it with the emphasis on the first nothing and you will see what I mean!) Nothing equals one is patently untrue, but so is 'nothing is less than 1' - there are lots of things less than one - or 'nothing is greater than 1'. Nothing = 0 is equally wrong because 0(zero) is something - it's a number.

And 'nothing not = a different kind of nothing that also equals nothing' is simply illogical at any level.

BLANK I prefer, because it seems to be less likely to set people off down a route of ‘why is it BLANK’ – its just blank!

As ever, when people go beyond, or simply never bothered to learn, about the theory they make nightmares for themselves. That’s the real reason why, if apps in them are constantly made more complex, all the database products suffer the fate that their design compromises eventually make them unworkable. Full size Oracle goes furthest because arguably it’s the most purist (and one of the oldest) designs. It doesn’t do all of the 40 relational rules, but the ones it does do are very close to spec..

Of course, the better rules you as a developer apply, the further you’ll get before it bites your bum.

To sum up, NULL is clearly a compromise that causes some uncertainty. But attempts to refine the concept beyond 'there is nothing here' are both illogical in themselves and are in breach of the basic rules of abstraction which make our relational databases work in the first place. We have to remember that underneath a computer is actually a calculator, and if you can't at some level express what you want as an mathematical equation it will not have predictable results.

A separate state flag for every field (or every non-required field) might be an answer, but it would be hideously complex to implement yourself. When I get funding for Ffenics Next Generation I might consider it. But of course it would then make us incompatible with everything else.
Peter J. Tabord
Head of Development
Database Software Ltd.
ptabord@ffenics.com
Pete Tabord
 
Posts: 1881
Joined: Fri Sep 07, 2007 12:48 pm
Location: Caernarfon, Gwynedd, UK
Has thanked: 0 time
Been thanked: 3 times
 

Re: Multiple kinds of BLANK

Postby Graham Smith » Thu Jan 24, 2013 2:26 pm

I'm going to beat Phil to the punch here and point out that, once again, we are debating nothing.

It's been quite a while since I had to deal with this but I have spent a fair amount of time on this issue and, in particular, how different front ends deal with SQL NULL. And why there can be a difference between a NULL column and a "blank" column. And why there is the absurdity of True/False/Null logic. And why some "experts" insist that all column should be required.

So, while I'm not sure why this came up, I'm going to take this as an opportunity to kvetch about blank vs 0 in Ffenics. Because, even though it's not the same thing, it has similar issues.

Prior to Ffenics, DataEase always treated blank and zero as the same thing in a comparison. But that's no longer the case so, as a developer I am stuck with the same conundrum as to how to deal with this.

One is to default all number fields to zero. But that's not exactly correct because zero is a numerical value and blank is not a numerical value.

The other is to use a more complex evaluation than just =0 when doing a comparison. The simplest being (0 + Field) = 0, which will be true regardless of whether the field is zero or blank.

Anyway, the only real problem this poses is for people, like me, that have been doing this for so long that we tend to forget differences between programs.

So, Pete, be careful where you throw stones because you might hit a glass wall of your own. :mrgreen:
Graham Smith
DataSmith, Delaware
"For every expert there is an equal and opposite expert.", Arthur C. Clarke (1917 - 2008)
"X-Clacks-Overhead: GNU Terry Pratchett"
User avatar
Graham Smith
 
Posts: 2501
Joined: Fri Sep 07, 2007 11:31 am
Location: Delaware, USA
Has thanked: 0 time
Been thanked: 1 time
 

Re: Multiple kinds of BLANK

Postby Adrian Jones » Thu Jan 24, 2013 3:26 pm

I presume I am the other correspondent in this private mailing.

My issue has nothing to do with this theory.

What I am looking for is a practical solution to two bugs/glitches in how Ff does SQL updates: namely, it insists on generating where clauses that references the values of all fields at the time the row was read, and, having done that, it checks that a value is null even if that value was in fact an empty string.

I have solved this issue by 'correcting' the data before Ffenics reads it. It's
a tail wagging the dog solution.
User avatar
Adrian Jones
 
Posts: 2000
Joined: Tue Sep 11, 2007 2:38 pm
Location: Cornwall, UK
Has thanked: 5 times
Been thanked: 4 times
 

Re: Multiple kinds of BLANK

Postby Pete Tabord » Sat Jan 26, 2013 9:05 am

That is why this is in 'philosophy' Adrian, rather than a technical issue. I replied to the technical part privately.

Graham, as I said, blank/null is a compromise which I reluctantly accept because all the alternatives look worse :) It isn't a very good compromise, which is why I suppose over the years its resulted in so much discussion and in so many problems in the code.

You actually raise one of the issues yourself which made it necessary, which was distinguishing between 'no data' and 0 in a numeric field. There is also the issue of calculations like average and count. I'd be surprised if we handle everything to do with blanks in a 'logical' way in Ff because once you even wander round the edges of this three value logic stuff you actually become, to a normal commonsensical human being, illogical :)

But a clear distinction can be made between zero (a number - implying you _know_ its Zero) and blank (no data entered, for whatever reason). Well, at least an understandable and mathematically acceptable distinction that can be made in the code and which also makes real world sense - that is, you can explain it to customers without their eyes glazing over.

Distinguishing between different kinds of 'no data' is where it gets too daft for me. And I would suggest certainly too bizarre for the customer. Even if it does make sense to programmers of a certain way of thinking. Some people _like_ to grapple with n-valued logic and bit hacking.

Chris Date years ago suggested using 'default values' to indicate the various reasons why data might be missing, but I think that is pretty unworkable as well, especially in a number field or, for example, in an address line (where you wouldn't want a blank line to print as , say, Not Applicable, and you wouldn't want the overhead of checking for it either).

I prefer the concept mentioned above, which I think I thought up myself years ago, a separate state flag for all non-required fields where you desperately want to know _why_ it is blank.

This could have 'not applicable' 'record not entered yet' and whatever other states were valid. It would be user-configurable and of course the various states could be application or even field specific. The extra disk space required would have been trivial 15 years ago, its hardly worth mentioning now. Actually using it wouldn't be compulsory, of course. I haven't implemented it for fairly obvious reasons.
Peter J. Tabord
Head of Development
Database Software Ltd.
ptabord@ffenics.com
Pete Tabord
 
Posts: 1881
Joined: Fri Sep 07, 2007 12:48 pm
Location: Caernarfon, Gwynedd, UK
Has thanked: 0 time
Been thanked: 3 times
 

Re: Multiple kinds of BLANK

Postby Graham Smith » Mon Jan 28, 2013 3:41 pm

Even though I don't like dealing with null or with blank vs zero (which I see as being roughly the same thing), I still appreciate the need for having the distinction in some circumstances.

As to the issue of null vs blank, I first ran across that when I took an Access class. As part of that, the instructor tried (and failed) to explain the issue of null vs blank. What he was trying to explain was that a column in MS SQL can only be null if it has never had anything in it at all. Once it has had something in it, it cannot ever be null again because there is no command you can use that will actually "null" the field, it will only remove any contents (e.g. make it blank) which results in the column being in a different "state" than it was when it was null.

For most people, this is pretty much an academic discussion with no relevance. But it does have some relevance in the context of the difference between blank and zero for those who are used to the way it was done in DataEase. This way was actually incorrect and did actually create some problems for some people. But for most people the problem never came up so they could safely use Field = 0 in a formula without having to be concerned with blank fields.

BUT, Ff does properly recognize that blank and zero are NOT the same thing. Blank is the lack of a value and zero is a value. All this is by way of saying that if you are doing comparisons of number fields where it is possible that some fields may be blank, you have to take that into consideration. It's really not difficult because, fortunately, Ff does not deal with blank the same way as SQL deals with null and there is only one kind of blank in Ff.

Just remember that Field =0 will only be true if the field actually is zero if the field is blank it will be false. And Field not=0 will be true if the field is blank OR contains a non-zero number.
Graham Smith
DataSmith, Delaware
"For every expert there is an equal and opposite expert.", Arthur C. Clarke (1917 - 2008)
"X-Clacks-Overhead: GNU Terry Pratchett"
User avatar
Graham Smith
 
Posts: 2501
Joined: Fri Sep 07, 2007 11:31 am
Location: Delaware, USA
Has thanked: 0 time
Been thanked: 1 time
 
 

Return to Philosophical Discussions

Who is online

Users browsing this forum: No registered users and 1 guest

cron