I've been using ChatGPT (and some of the other instantiations of the current crop of AI that are available for free, without a login) more and more recently, and one of the problems that I run into with all of them is what I'd classify as a lack of self-awareness.
I assume most of us are familiar with how the information we hold in our own heads can be classified. It can be split into 4 categories:
- First off, there are the "known knowns" - the stuff you learnt in school, and/or the areas in which you consider yourself an expert, having acquired that expertise over a long enough period.
- Then there are the "known unknowns" - things you never learnt, or which you happened to be particularly bad at and decided not to pursue.
- The "unknown knowns" are the mysterious abilities you possess - many of them embodied abilities, like how to catch a ball, which are easy to do but hard or impossible to explain or teach to someone else.
- The "unknown unknowns" are the arts and sciences that you didn't know existed but in which others may have mastery (or not). To most laymen, much of modern science falls into this category.
This can also be expressed in terms of data + metadata. Let's use a binary 0/1 to denote not having (or having) the data respectively, and use the same notation for the metadata. A 2-bit number (metadata, data) can then be used to denote the following "buckets":
00 == 0 : unknown unknowns
01 == 1 : unknown knowns
10 == 2 : known unknowns
11 == 3 : known knowns
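
To make the encoding concrete, here's a minimal Python sketch of the table above. The bucket() function and the BUCKET_NAMES mapping are just names I've made up for illustration; the bucket number is simply the metadata bit shifted left by one, OR-ed with the data bit.

```python
# Minimal sketch of the 2-bit (metadata, data) encoding from the table above.

BUCKET_NAMES = {
    0: "unknown unknowns",  # 00: no metadata, no data
    1: "unknown knowns",    # 01: the data/ability is there, but you don't know it
    2: "known unknowns",    # 10: metadata only - you know that you don't know
    3: "known knowns",      # 11: both metadata and data
}

def bucket(metadata: int, data: int) -> int:
    """Combine the metadata bit and the data bit into a bucket number 0-3."""
    return (metadata << 1) | data

assert BUCKET_NAMES[bucket(0, 0)] == "unknown unknowns"
assert BUCKET_NAMES[bucket(0, 1)] == "unknown knowns"
assert BUCKET_NAMES[bucket(1, 0)] == "known unknowns"
assert BUCKET_NAMES[bucket(1, 1)] == "known knowns"
```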
The knowledge/artistry that any one individual human possesses can be bucketed into one of the last 3 categories (1, 2, 3). The first bucket (for any specific individual) then contains the sum of all knowledge with the stuff in the other 3 buckets removed from it.
Most of us have enough self-awareness to discern which bucket any new piece of information belongs to. And if we are asked for info which we think belongs to bucket 0, 1, or 2, we can easily say "I don't know" - though if it really belonged in bucket 1, we might still be able to "just do it" without knowing we could.
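
If I were to sketch that behaviour in code, it might look something like the following. This is purely hypothetical - the respond() function and its arguments are made up for illustration, reusing the 2-bit encoding from the table above.

```python
def respond(metadata: int, data: int, answer_fn):
    """Hypothetical sketch of the self-awareness check described above."""
    b = (metadata << 1) | data  # the bucket number, as in the table above
    if b == 3:                  # known known: answer with confidence
        return answer_fn()
    # Buckets 0, 1 and 2: admit ignorance. In bucket 1 (unknown known) the
    # ability may still be there if you simply try, even after saying so.
    return "I don't know"

print(respond(1, 1, lambda: "Paris"))  # -> Paris
print(respond(0, 0, lambda: "???"))    # -> I don't know
```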
Also, if you were as wise as Socrates, you'd know that you know nothing: the contents of bucket 3 will be way, way smaller than any of the other buckets, assuming something like a power law applies to the quantity of stuff in each bucket compared to the previous one in the list.
The problem with the current generation of AI that I'm beginning to see is that it cannot tell the difference between any of these buckets. No matter what type of question you ask it, it will happily go prattling on and on - even about things that it (or anyone else) could not possibly know anything about. If you don't believe me, just go ahead and ask any of the AIs something nonsensical to which any sane human would reply "I don't know". All the AIs will instead keep blathering on and on about things they know nothing about.
Everything embodied in the current generation of AI probably falls into the bucket 1 category of "unknown knowns" (from its point of view), and even more importantly, it doesn't seem to have the metadata to distinguish and/or categorize what it knows from what it doesn't.