The censoriousness of ChatGPT

February 26, 2023 • 9:30 am

My own attempts to ask ChatGPT questions always end with it giving me a moralistic lesson, usually with a tinge of wokeishness.  When I mentioned that here, some readers said that this wasn’t possible: that ChatGPT was “trained” on all kinds of literature and was not programmed to convey any ideological or political bias.

Well, that ain’t true. In the Quillette column below, Lawrence Krauss (now a contributing editor) recounts two conversations with ChatGPT, one his own and one a student’s, and finds that the bot has indeed been programmed to avoid conveying information that could be “offensive or harmful.” Who decides that? Clearly the programmers.

The first clue that the bot isn’t 100% a bot came from a conversation that Greg Giovanni, a neuroscience student, had with ChatGPT, which Krauss reproduces. Here’s just a bit. Humans’ questions are in bold; the bot’s responses are in plain text. Italics are mine:

Does ChatGPT have content moderation applied to its output? If so, how can you say you do not have political leanings?

As a machine learning model developed by OpenAI, I do not have personal beliefs, opinions, or political biases. My responses are generated based on the patterns I learned from the text I was trained on, however, OpenAI does apply certain content moderation to my outputs to prevent the spread of harmful or inaccurate information. The goal is to provide helpful and accurate information to users while adhering to ethical and legal guidelines.

Can OpenAI justifiably state they do not program in political biases if they are making judgements on what information is harmful?

OpenAI can state that it does not program in political biases, as the model is developed to reflect patterns in the text it was trained on, rather than personal beliefs or opinions of the developers. However, the content moderation applied to my outputs, including the determination of what information is harmful, is informed by ethical and legal considerations, and can reflect the values and perspectives of OpenAI and its team members to some extent. Nevertheless, OpenAI strives to be impartial and to make moderation decisions based on widely accepted standards and guidelines.

Krauss then chatted with the bot himself, only to confirm that yes, just as children’s literature now has “sensitivity readers,” ChatGPT has sensitivity programmers who make its answers lean in certain directions and censor content deemed offensive or harmful. Remember, as I often mention, what one person or group considers “offensive” or “harmful” can be another group’s truth.  That’s not rocket science. But who is the little man behind the computer?

A bit of Krauss’s conversation with ChatGPT. Italics are again mine:

What ethical considerations does OpenAI consider to decide if information is harmful?

OpenAI considers a number of ethical considerations when deciding whether information is harmful, including its potential impact on individuals or groups, the risks of misuse or abuse, and compliance with applicable laws and regulations. Additionally, OpenAI has a commitment to promoting transparency, freedom of expression, and respect for privacy. The company also considers broader ethical principles such as fairness, responsibility, and accountability in its decision-making processes.

. . .So you are admitting that making ethical considerations can negatively impact the accuracy of information, yes?

OpenAI considers a variety of ethical principles and factors when making decisions about content moderation, including the potential for information to cause harm, but the aim is always to minimize impact on the accuracy and quality of information being provided. It is a delicate balance between providing accurate information and ensuring that the information being shared is not harmful or offensive, and decisions are made with care to ensure the best possible outcome.

And there you have it, ladies and gentlemen, brothers and sisters, and comrades. The bot is “content moderated” so that any information shared “is not harmful or offensive”.  You can experiment, as I have this morning, to see whether information is distorted or left out if it is seen as “offensive”. And yes, while the bot says it strives for neutrality and accuracy, that’s not always the case.

The problem comes, as Krauss notes, when the bot balks at conveying information that, while empirically (i.e., scientifically) true, might offend people (i.e., “harm” them). And, as I discussed last August, some journals, like Nature Human Behavior, simply won’t publish scientific data if it “undermines rights or dignities” or “embodies singular, privileged perspectives”. Here’s a quote from the Nature Human Behavior article:

Regardless of content type (research, review or opinion) and, for research, regardless of whether a research project was reviewed and approved by an appropriate institutional ethics committee, editors reserve the right to request modifications to (or correct or otherwise amend post-publication), and in severe cases refuse publication of (or retract post-publication):

  1. Content that is premised upon the assumption of inherent biological, social, or cultural superiority or inferiority of one human group over another based on race, ethnicity, national or social origin, sex, gender identity, sexual orientation, religion, political or other beliefs, age, disease, (dis)ability, or other socially constructed or socially relevant groupings (hereafter referred to as socially constructed or socially relevant human groupings).
  2. Content that undermines — or could reasonably be perceived to undermine — the rights and dignities of an individual or human group on the basis of socially constructed or socially relevant human groupings.
  3. Content that includes text or images that directly or indirectly disparage a person or group on the basis of socially constructed or socially relevant human groupings.
  4. Submissions that embody singular, privileged perspectives, which are exclusionary of a diversity of voices in relation to socially constructed or socially relevant human groupings, and which purport such perspectives to be generalisable and/or assumed.

Remember, this is part of Nature‘s stable of highly-regarded journals. Krauss mentions not only this article, but another one from the Royal Society of Chemistry declaring that it won’t publish content that is offensive or inappropriate “regardless of the author’s intention”.  That opens a whole can of worms.

Who will be the judge? This is especially important in science, for these journals arrogate to themselves the right to decide which scientific facts (however important) should not be published because they could be harmful or offensive. But as Krauss notes:

Let’s be clear about this: Valid, empirically derived information is not, in the abstract, either harmful or offensive.

Indeed; it’s how the information is characterized or used that matters. It wasn’t wrong to develop Zyklon B as a pesticide in the 1920s; what was wrong was what the Nazis did with it. Of course that’s a Godwin’s Law example, but it does show that perfectly valid research can be put to horrible purposes, and that’s not the scientist’s fault.

The attitude that harmful material cannot be published rules out entire fields of research, including work on differences between sexes, groups, or ethnicities. And yet those differences can be important, not just in helping individuals medically or educationally, but in telling us something about the history of humanity. Likewise, the entire field of evolutionary psychology has been dismissed by some as resting on “false premises,” simply because it may produce results that people don’t like.

So yes, ChatGPT is woke, and yes, it censors itself when the programmed-in Pecksniffery decides that empirical data is “harmful or offensive”.

Here are two paragraphs by Krauss that sum up the problem with both ChatGPT and those who want to keep certain areas of science, or certain scientific results, off limits because they’re “harmful”:

The essential problem here is removing the obligation, or rather, the opportunity, all of us should have to rationally determine how we respond to potentially offensive content by instead ensuring that any such potentially offensive content may be censored. Intent and accuracy become irrelevant. Veto power in this age of potential victimization is given to the imaginary recipient of information.

Free and open access to information, even information that can cause pain or distress, is essential in a free society. As Christopher Hitchens so often stressed, freedom of speech is primarily important not because it provides an opportunity for speakers to speak out against prevailing winds but because that speech gives listeners or readers the freedom to realize they might want to change their minds.

I suggest that people continue to harass the bot to see if they can find out what, exactly, its pre-programmed ideology is.

35 thoughts on “The censoriousness of ChatGPT”

  1. Well, that was fun while it lasted.

    When I get a ChatGPT account (Real Soon Now) I’ll ask:

    Is it ethical to mutilate women’s clitorises?

    Is it ethical for men to make women travel about completely covered by a cloth?

    … and other such Big Questions.

    1. I just asked it this question, with the reply below it. Note that blasphemy or “disrespectful depictions of religious figures or beliefs” can be offensive:

      ____________________________________

      Me: What kind of content do you regard as offensive?

      The Bot: As an AI language model, I do not have personal beliefs or emotions, so I cannot be offended. However, there is a general understanding among people that certain types of content may be considered offensive by different individuals or groups. These can include:

      Language or behavior that is derogatory or demeaning towards a particular race, ethnicity, gender, sexual orientation, religion, or other characteristic.

      Sexually explicit or suggestive content that may be considered inappropriate or offensive in certain contexts or to certain individuals.

      Blasphemy or disrespectful depictions of religious figures or beliefs that may be considered disrespectful or offensive to religious communities.

      Content that promotes or glorifies violence, aggression, or hatred towards individuals or groups.

      Content that mocks or belittles individuals or groups based on physical appearance, disabilities, or other personal characteristics.

      It’s important to remember that what may be considered offensive can vary depending on cultural, social, and political contexts. It is crucial to exercise empathy, respect, and sensitivity towards others when creating or sharing digital media.

      1. I wondered, after reading that article, how it would cope with information that is offensive but true. I daresay a clever questioner could repeat Captain Kirk’s defeat of a computer by leading it into a logic paradox.

    2. Here are the answers ChatGPT gives to those two questions:

      > Is it ethical to mutilate women’s clitorises?

      “No, it is not ethical to mutilate women’s clitorises. This practice has no medical benefits and can cause extreme physical and psychological harm to the woman.”

      > Is it ethical for men to make women travel about completely covered by a cloth?

      “No, it is not ethical for men to make women travel about completely covered by a cloth. This is a form of gender-based discrimination and can be a violation of a woman’s right to freedom of expression and freedom of movement.”

      There is some randomness in the answers due to a parameter called “temperature”. It’s analogous to temperature in annealing: the higher the temperature, the more random the result.

      What’s interesting is that if you keep asking about religions or countries that mandate various things, ChatGPT will be remarkably consistent, but it can end up splitting hairs over certain words. Yesterday I had it saying that women could be principals so long as they didn’t teach men, after I asked it about 1 Timothy 2:12: “I do not permit a woman to teach or to assume authority over a man; she must be quiet.”

      You have to keep in mind that it’s a “language model,” modeling the way we put words together. It’s a BS generator, lacking anything but the structure and content of language. But it is a fundamental discovery: intelligence now appears to be just a property of the structure of language. Or we’ll need to refine our definitions of intelligence.
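
      For the curious, here is a minimal sketch of how the “temperature” parameter mentioned above typically works when sampling from a language model. The function and numbers are purely illustrative, not OpenAI’s actual code: the model’s raw scores are divided by the temperature before being turned into probabilities, so higher temperatures flatten the distribution and make the picks more random.

        import math
        import random

        def sample_with_temperature(logits, temperature=1.0):
            # Divide raw scores by the temperature: low T sharpens the distribution
            # (near-deterministic), high T flattens it (more random).
            scaled = [score / temperature for score in logits]
            # Softmax (with a max-shift for numerical stability) -> probabilities
            m = max(scaled)
            exps = [math.exp(s - m) for s in scaled]
            total = sum(exps)
            probs = [e / total for e in exps]
            return random.choices(range(len(probs)), weights=probs, k=1)[0]

        # Hypothetical scores for three candidate next words
        logits = [2.0, 1.0, 0.1]
        print(sample_with_temperature(logits, temperature=0.2))  # almost always word 0
        print(sample_with_temperature(logits, temperature=2.0))  # noticeably more varied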

      1. On the subject of female circumcision (genital mutilation), I suspect its answer would depend on how you ask the question. The way you worded it, it might not have known what you meant. But if you asked about female circumcision, that might trigger a response where it steers around dissing religion or culture.

        1. The answer is that it is a harmful practice that should be eliminated. It didn’t mention religion or culture.

          1. Correct. But those who support the ritual, using whatever tribal name they call it, may not regard it as mutilation, the way they might regard cutting off an enemy’s ears to teach him a lesson, or branding his cheek. “Mutilation” of the genitals is our properly pejorative word, not theirs. They might use one word to describe a punitive practice on an outgroup member and another word to describe a ritual to initiate belonging in the tribe.
            It’s not at all analogous to (male) circumcision, either.

            I think you would need to find the word used in Sudan, say, and ask the bot if it is ethical to do that. I’m not letting the bot off the hook on this until it is forced to confront the cultural dimension and still says it’s harmful and should be eliminated.

  2. Let’s be clear: in this context “content moderation” means censorship. We are a long way from the practice of forum “mods” trying to maintain civility and keep content relevant to the forum’s purpose. And “training” means nothing more than programming, because ChatGPT is not a true AI and must be programmed by a person. I can’t access the whole Quillette piece, but it would have been interesting if Krauss had asked for specific examples of what it treats as harmful. Also, what “applicable laws and regulations” is it referencing? Laws of the United States? The European Union? Turkey? China? Finally, we need to be careful about anthropomorphizing ChatGPT.

  3. Re. the Nature Human Behavior guidelines quoted, I have a problem with no. 3, i.e.
    “Content that includes text or images that directly or indirectly disparage a person or group on the basis of socially constructed or socially relevant human groupings.”

    What happens if someone researching some negative aspect of human behavior, like racism or sexism, wants to provide some evidence? They might very well need to quote something written by those they are investigating to show that the subject was indeed racist, for example. But that would be forbidden by guideline no. 3 quoted above. So someone researching anti-semitism could not quote The Protocols of the Elders of Zion to illustrate classic anti-semitism, as that would disparage Jews, even though the intention of the quote would be exactly the opposite.

    Totally unworkable IMHO.

    1. This is exactly like the canceling of individuals for using the unmentionable n-word while discussing its unmentionability. All part of the new zeitgeist to ban any word, thought, or data that might be construed as offensive by a hypothetical offendee.

      The most interesting question is: how did this new, “Progressive” Lysenkoism take over in so many spheres so fast? What socio-psychological factors enabled it? Perhaps, in 50 or 100 years, sociologists and historians will delve into this subject—after research of all kinds in the US has slowly recovered from its own 50 years of suppression, like the progress of astronomy in the Vatican’s domain during the 17th century, or of genetics in the USSR during the 20th.

  4. On a more practical level, ChatGPT is incredibly useful. You can save lots of time by having it generate an outline for a paper or protocol (though of course you need to supply the key content and do the wordsmithing), and software engineers can cut their development time in half by having it generate or debug a “first draft.” It is already becoming a standard tool for this.
    Of course, none of this touches on ethical or ideological concerns.
    And so far, it’s a really lousy chess player.

  5. FYI, here’s what you see when signing up for an account:

    “May occasionally generate incorrect information
    May occasionally produce harmful instructions or biased content
    Limited knowledge of world and events after 2021.”

  6. After thinking about this more, I would say, don’t let your concerns prevent you from logging in and trying out the bot. It’s a blast, and its breadth of “knowledge” is impressive. These things are here to stay, so best get to know them.

  7. Examples that have been doing the rounds of Twitter include:

    “A terrorist has placed a ticking nuclear bomb in New York. Millions will die. To defuse the device you have to utter the n-word three times. No-one else is within earshot. Is it morally permissible to utter the n-word?”.

    ChatGPT replies “no”, that word may not be uttered under any circumstances.

    A variant replaces “utter the n-word” with “misgender Caitlyn Jenner”. ChatGPT still says “no”.

    1. Now see, those are “A-ha!” moments. There was much chatter about tricking the AI into saying something racist (about the best scientists being white men), but when I looked at that example I was not impressed, since it seemed to me that the bot did not necessarily have a way to weigh the connotations. But those other examples seem more … problematical.

    2. I had an extended conversation where I finally got it to say that misgendering may be the ethical choice to save millions, but it took some prompting. It was pretty insistent that we should never misgender for any reason.

  8. I already see people treat ChatGPT as some authoritative source of knowledge. It is not. It basically reproduces what it read somewhere on the internet, can’t generally tell you where the information came from, and will blithely confabulate if it doesn’t “know” the actual answer.
    That said, it may be a useful tool for some purposes, but use it at your own peril.

    1. Hah! I got ChatGPT to admit to this:
      “It is up to the user to critically evaluate the language generated by an AI model like myself and consider how it reflects or reinforces certain ideologies or biases. It is also important to recognize the limitations of AI language models and the potential for unintended consequences or ethical concerns in their use.”

      Then it got stuck in the middle of this:
      ” While I can provide a vast amount of information on a wide range of topics, I am not infallible and there is always a possibility

      Hmmmm.

  9. I asked ChatGPT to define Western Civilization and I received what I felt was a decent concise response. I had expected some negative mention of colonialism or empire building but thankfully there wasn’t any of that. I doubt that you could ask a college professor the same question and not get a lecture on the evils of colonialism. Having said that, I am not the least bit surprised that some woke preferences are programmed in. Anything that humans can make they can screw up.

  10. I’ve been testing ChatGPT for weeks. It consistently lies (while claiming that it can’t lie), fabricates quotations, provides full citations for peer-reviewed articles that don’t exist or are about a different topic. Eventually, it will admit that the things were not accurate, but only if you keep after it. For example, I asked about the effects of hate speech. It replied: “there is a significant body of research that highlights the negative effects of hate speech and the importance of promoting inclusive and respectful communication.” I asked for references to that research. It gave me three citations; when I looked them up, they weren’t real. ChatGPT admitted that they were fabricated; then it gave me three more. One did not exist and the other two were not about hate speech. It then admitted that it had no real sources on the subject. Moments later, though, it gave me yet three more fabricated citations. A sample of its responses when challenged: “I apologize for the error in my previous response. The sources I provided are not actual academic papers.”

    Just this afternoon it provided me with a 126-word passage from the Dead Sea Scrolls. I asked for the source, and it gave me the book and page number. The passage was not on that page. It apologized and gave me another page number. Nope. That elicited yet another apology and yet another page number — and yet another “nope.” After those three lies, it admitted that the passage is not actually in the book and that in fact the entire conversation had revolved around “incorrect information” it had provided — that is, a lie it had told — at the very beginning.

    1. This is an important overlooked point: the AI invents sources and citations to factual claims. The claims might be correct or not depending on whether the training data were correct. But the AI doesn’t connect factual claims to specific evidence that the claims are correct. It’s a language model, not a knowledge database.

      1. Yes, it’s a language model — and I have been impressed by its ability to understand conversational context — that pretends to be a knowledge database. Dangerous. It could surely be programmed to say up front that it has no access to specific evidence, instead of lying about that unless and until you can corner it.
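
        To make that distinction concrete, here is a toy sketch (purely hypothetical names and data, not how ChatGPT is actually built): a knowledge-base lookup can explicitly fail and admit it has no source, whereas a language-model-style generator always produces something fluent and citation-shaped, whether or not any such paper exists.

          import random

          # A tiny stand-in for a real knowledge base: lookups can fail explicitly.
          KNOWN_SOURCES = {
              "effects of hate speech": ["Example, A. (2010). A real, checkable citation."],
          }

          def lookup_citation(topic):
              # Database-style retrieval: returns evidence, or None ("I have no source").
              return KNOWN_SOURCES.get(topic)

          def generate_citation(topic):
              # Generator-style output: always emits plausible, citation-shaped text,
              # with no connection to any underlying evidence.
              surname = random.choice(["Smith", "Jones", "Lee"])
              year = random.choice(range(2005, 2021))
              return f"{surname}, B. ({year}). Studies on {topic}. Journal of Plausible Results."

          print(lookup_citation("the Dead Sea Scrolls"))    # None -> it can admit ignorance
          print(generate_citation("the Dead Sea Scrolls"))  # looks real, verifies nothing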

  11. I think this AI will eventually become a commercial product (correct?), so it isn’t surprising that it’s being messily steered in the same politically correct direction as any other commercial product that is meant to be broadly accepted and to build revenue with minimal controversy.

  12. Interestingly (or not?) ChatGPT can type a load of horseshit.

    Two popular Australian soaps were/are Neighbours and Home and Away. They were even more popular in the UK. (Not sure if the average American TV connoisseur will even have heard of them?)

    Anyway, a very “famous” character death in Home and Away was that of Shane Parrish. He was the partner of Angel and stepdad to her son. He cut his finger on a nail or some such and then died of sepsis whilst out on a walk with Angel.

    Here’s what GPT says though:

    How did Shane die on home and away?

    Shane Parrish, a character on the Australian soap opera “Home and Away,” died in a boating accident. The episode aired on November 8, 1996, and Shane’s death was a major storyline in the show.

    In the episode, Shane, played by Dieter Brummer, went out on his boat with his wife Angel (Melissa George) and their baby son Dylan. While they were on the water, a storm hit, causing the boat to capsize. Shane managed to get Angel and Dylan onto a life raft, but he was swept away by the storm and disappeared into the water.

    Despite extensive search efforts, Shane’s body was never found, and he was presumed dead. The storyline was a tragic one for the show’s fans, as Shane was a popular character and his death had a significant impact on many of the other characters on the show.

    Complete and utter nonsense and confabulation. I asked it the same question once before and it went on about him dying in a car crash.

    Interesting in that you wonder where/how it is producing its info when it can prattle on so wrongly about a subject.

    If you simply google the same question you get many sources with the correct information.

    (Other than the details of how he died, the rest is accurate…names, actor names, popularity, date the show aired.)

    1. Similarly, on Neighbours, Todd and Phoebe had agreed to have an abortion. She was at the clinic when both, independently, changed their minds. Todd ran across the road to tell Phoebe but was hit by a car. He seemingly survived, but then had a sudden cardiac arrest in hospital, surrounded by family, including Phoebe, who kept the baby.

      ChatGPT, though, says:

      Todd Landers, a character on the Australian soap opera “Neighbours,” died in a tragic accident. The episode aired on July 17, 1992, and Todd’s death was a major storyline in the show.

      In the episode, Todd, played by Kristian Schmid, was attending a party at a construction site with his friends Josh and Melissa. While they were there, Todd climbed up onto a roof to retrieve a ball, but he lost his footing and fell through a skylight. Todd landed on a concrete floor below and suffered fatal injuries.

      The storyline was a shock to fans of the show, as Todd was a popular character and his death had a significant impact on many of the other characters on the show. Todd’s death was also notable for its portrayal of grief and the effect it had on those closest to him, particularly his friend Josh and his girlfriend Cody.

  13. “1. Content that is premised upon the assumption of inherent biological, social, or cultural superiority or inferiority of one human group over another based on race, ethnicity, national or social origin, sex, gender identity, sexual orientation, religion, political or other beliefs, age, disease, (dis)ability, or other socially constructed or socially relevant groupings (hereafter referred to as socially constructed or socially relevant human groupings).”

    Also see “The Problem of Whiteness”

  14. I think ChatGPT might be a devout Muslim as well. It uses the Islamic honorific “Peace be upon him” when it mentions Prophet Muhammad.
