A lot of people are going to be writing blogs to grab your attention about ChatGPT (I mean, this is already our second attempt). As a civic tech innovation lab focused on empowering people through technology, we thought it might be useful to give a realistic explainer of what ChatGPT actually does, and what the limits (and opportunities) of that can be. This is the way we like to approach all technology at OpenUp - neither through rose-tinted nor emerald-tinted glasses.
An introduction to ChatGPT
A good place to start is to try to understand what ChatGPT is doing, based on what we can know given that it is proprietary technology. ChatGPT, like any machine learning application, has been 'trained' to look for patterns in large amounts of source data. In this case, the training involves techniques known as 'natural language processing' (NLP), which do exactly what the name suggests. Programming a traditional computer to recognise a 'cat', for example, would involve lines of code that essentially define cat as a variable with characteristics such as 'legs = 4, head = 1' and so on. ChatGPT, on the other hand, 'learns' what a cat is using techniques closer to human learning, by building associations and patterns between words, phrases, pictures and so on. ChatGPT has been trained on vast amounts of data sourced from the public internet, with some nudging from its programmers around certain topics. By learning how word patterns are associated with each other, it becomes excellent at predicting how words should be placed. It does this because its whole purpose is to generate relevant "content" (i.e. relevant words) when you prompt it to do so.
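To make that contrast concrete, here is a deliberately tiny sketch in Python. The rule-based function and the two-sentence 'training' text are invented for illustration, and this is nothing like ChatGPT's actual (proprietary) internals - but it captures the basic idea of predicting the next word from learned associations rather than hand-written rules:

```python
from collections import Counter, defaultdict

# The traditional approach: a "cat" is whatever matches our hand-coded definition.
def is_cat_rules(animal: dict) -> bool:
    return animal.get("legs") == 4 and animal.get("head") == 1

# The statistical approach: learn which word tends to follow which,
# then "generate" by picking the most common continuation.
corpus = "the cat sat on the mat . the cat chased the mouse .".split()

follows = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    follows[prev][word] += 1

def predict_next(word: str) -> str:
    # Return the word most often seen after `word` in the training text.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" - the most frequent continuation
print(predict_next("cat"))  # "sat" (equal counts resolve to the word seen first)
```

Notice that nowhere in the second approach is there any notion of whether the output is true - only of which words usually go together. That point matters for everything that follows.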
Liar, liar, pants on fire
This is impressive, but it's important to understand why that function is limited. Several writers have referred, I think very usefully, to Harry Frankfurt's seminal essay on truth, "On Bullshit". It is because of this essay that ChatGPT has been described as a "bullshit generator". Frankfurt noted in his essay that bullshit is not the same as lying. In fact, in some contexts it is worse, as the speaker does not care about the truth-value of what they say (unlike the liar, who has to care to make his lie effective), but only that - regardless of the truth - we view the speaker in a certain light. Commentators have noted that ChatGPT is not synthesising ideas, but finding patterns in language; it wants to provide you with the content it thinks you want - whether or not it's true - and this is why it has been likened to a bullshit generator.
Acknowledging the importance of Frankfurt's distinction, we will continue on the understanding that ChatGPT is essentially a liar (in part because it means leaving a swear word out of our title), inasmuch as it is not concerned with truth. Of course, truth is a pretty nebulous concept to be working with - but we think it's useful for understanding purpose, and for getting to the nub of how ChatGPT 'thinks'. Think about when you ask ChatGPT about yourself, like "tell me about Gabriella Razzano". It is able to do entity extraction - so it knows, from the words themselves, that "Gabriella Razzano" belongs to the category "person" (you can read more about the utility of natural language processing and entity extraction for the new tool we are developing, Dexi, here). But just because it has categorised a person or a place as different "things" doesn't mean it understands the relational and normative differences between them. It knows you are asking about the category "person" - so it gives you the words best associated with a person of that name. It is not recognising that you are seeking facts about this specific person. Have a look, for instance, at how it responds to my request for information about myself (and let's ignore for a second my tragically insignificant online profile in ChatGPT's world):
First note: always be polite to the machines. Second, you can see that it has no data, but generates content anyway - simply selecting words that are normally associated with civic technology in South Africa, rather than with my name.
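To make the entity extraction idea above a little more concrete, here is a minimal sketch using the open-source spaCy library. This is emphatically not ChatGPT's internal machinery (which, again, is proprietary); it just shows what 'recognising that a name belongs to the category person' looks like in practice. The example sentence is ours, and the output labels are approximate:

```python
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")  # a small pre-trained English model
doc = nlp("Tell me about Gabriella Razzano, who works in civic technology in South Africa.")

# Print each entity the model found, with its predicted category.
for ent in doc.ents:
    print(ent.text, "->", ent.label_)

# Expected output (approximately):
#   Gabriella Razzano -> PERSON
#   South Africa -> GPE  (a geopolitical entity)
```

The model assigns categories; it holds no facts about this particular person. That gap between 'categorised' and 'understood' is exactly the one described above.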
Admittedly, ChatGPT is not currently meant for search, although Bing is exploring using it for search with some amusingly disturbing results. This is already an important thing to note - it's not a great replacement for your search activities. It is also not a great replacement for research because, as its priority is to generate content, it may generate false URLs simply as part of its attempt to predict words (it also naturally prioritises online content over peer-reviewed journals, which are usually housed behind paywalls). URLs are also frequently out of date, as the main training datasets only run up to September 2021.
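If you do lean on it for research regardless, a basic hygiene step is to check that any link it gives you actually resolves. Here is a small sketch using the Python requests library - the URLs below are only examples, and a resolving link still tells you nothing about whether the content supports the claim:

```python
# Requires: pip install requests
import requests

def url_exists(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL responds with a non-error status."""
    try:
        response = requests.head(url, allow_redirects=True, timeout=timeout)
        return response.status_code < 400
    except requests.RequestException:
        # Connection failures, timeouts, unresolvable domains, etc.
        return False

print(url_exists("https://openup.org.za"))                    # a real site -> True
print(url_exists("https://made-up-source.invalid/paper-21"))  # invented -> False
```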
It would be better if ChatGPT distanced itself from the notion of "truth" much more clearly. In their FAQ, OpenAI say this:
When it asks for a thumbs up or thumbs down on the accuracy of content, this is the key public attempt to help train ChatGPT to distinguish between fact and lie. Yet accuracy and truth are not the same thing, because accuracy doesn't speak to intention. ChatGPT is meant to create content, not truth. And even the phrasing of that response is a bit disingenuous - it doesn't acknowledge that truth isn't really what it's there for. A better answer to the question would have been: "Probably not".
What is it good for?
If you understand that ChatGPT is essentially a liar, though, you can realistically assess its utility. Its potential for generating non-factual content that is nonetheless productive was demonstrated in our other article. Use it when you need a bunch of words, not a bunch of truth (or even fact).
Language is also about more than just rules, so there are stylistic limitations too. Use it to give you words, sure, but not to give you eloquence (although the better the guidelines you give it, the more accurately it can apply its rules - see the sketch below). When we presume ChatGPT is a liar, we can think about its limitations a bit more clearly. Let's hope those who may use the API to develop their own applications can recognise that fundamental truth.
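For those heading towards the API, here is a hedged sketch of what 'giving it better guidelines' can look like in code, using the openai Python package's chat interface as documented around the time of writing (the pre-1.0 ChatCompletion interface, which may well have changed since). The key, model name and prompts are placeholders:

```python
# Requires: pip install openai
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder - use your own key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # The clearer the guidelines, the more accurately it applies its rules.
        {"role": "system", "content": "You are a copywriter. Write in plain, warm "
                                      "South African English. Keep sentences short."},
        {"role": "user", "content": "Draft a two-sentence welcome for a civic tech newsletter."},
    ],
)

print(response["choices"][0]["message"]["content"])
# Remember: this buys you fluent words, not verified facts - check anything factual yourself.
```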
As we learn more about its limitations and opportunities, as with other existing and emerging tech, OpenUp will be sure to share our lessons with you - saving you precious time (and potential heartbreak, should the AI reject you). Sign up to our human-created newsletter to stay up to date.
Photo by Annie Spratt on Unsplash