Thursday, July 30, 2020

Tips for designing prompts for use with GPT-3

If what you are looking for is specific and short, give as many examples of it as will fit in the prompt while still leaving ample room to generate an answer. If you have trouble thinking of examples, give one example and start a second one in the same format. Once GPT-3 has completed the second example, evaluate it; if it is reasonable, leave it in and generate another. Try to cover the breadth of variety you want in both the inputs and the outputs of the examples.
Example:
civic space: street, sidewalk, court, courtyard, public park, parkway, square, plaza, wall, bench, fountain, lawn, gardens, playground, alley, boulevard 
college building: library, laboratory, lecture room, basement, storage room, storage closet, study lab, office, meeting room, conference room, attic 
fort: guard tower, palisade, rampart, bastion, parapet, fortification, fort gate 
village church: chapel, baptistery, sacristy, choir loft, choir stalls, parsonage, pews, altar 
industrial town: factory, ship, mill, factory shed, milling shed, workshop, armory, tool shed, smithy, forge, smelter, mill building 
cathedral: transept, vestry, pulpit, cell, choir, choir loft, choir stalls, loft, nave, chancel, sacristy, sanctuary, vault, apse, chapel, presbytery, sanctum, crypt, vestibule, turret, belfry, spire 
castle: great hall, bedchamber, solar, garderobe, kitchen, pantries, larders, gatehouse, guardroom, chapel, boudoir, storeroom, undercroft, cellar, ice house, dovecote, tower, dungeon, yard, well, baths, keep, battlement, armory, archery range, guard tower, bastion, barbican, wall tower 
monastery: oratory, cloister walk, chapter hall, refectory, dormitory, sacristy, library, transepts, dining hall, chapel, kitchen, vineyard, brewery, barn, laundry, garden, well, graveyard
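If you want to drive this pattern from code, here is a minimal sketch, assuming the 2020-era openai Python client and the Completion endpoint. The engine name, sampling settings, and the new category ("wizard's tower") are illustrative choices, not part of the original experiment.

import openai

openai.api_key = "YOUR_API_KEY"

examples = [
    "fort: guard tower, palisade, rampart, bastion, parapet, fortification, fort gate",
    "village church: chapel, baptistery, sacristy, choir loft, parsonage, pews, altar",
    "monastery: oratory, cloister walk, chapter hall, refectory, dormitory, library, well, graveyard",
]

# End the prompt with the start of a new example so the model completes it
# in the same format.
prompt = "\n".join(examples) + "\nwizard's tower:"

response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=60,
    temperature=0.7,
    stop=["\n"],  # stop once the new example's line is finished
)
print("wizard's tower:" + response["choices"][0]["text"])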
Give detailed instructions for what you are looking for. Descriptors of the result (three paragraphs long, serious, clever, and so on) affect what is produced. Shoot for the moon and you might land in the treetops: ask for the best in the world, and it may encourage it to give a higher-quality result.
Example:
Here is an award-winning short-short story about friendship and revenge:
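As a rough sketch of wiring that framing into code (same client assumptions as above; the descriptors and settings are illustrative choices):

import openai

openai.api_key = "YOUR_API_KEY"

def story_prompt(topic, descriptors="award-winning, three-paragraph"):
    # Describe the finished piece as if it already exists.
    return "Here is an " + descriptors + " short-short story about " + topic + ":\n\n"

response = openai.Completion.create(
    engine="davinci",
    prompt=story_prompt("friendship and revenge"),
    max_tokens=400,
    temperature=0.8,
)
print(response["choices"][0]["text"])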
Sometimes you can get the system to talk its way through to an answer. For some questions, it doesn't do a good job of answering right off the bat. But if you have it "talk through its process" or create an explicit "internal monologue", then the words it produces as part of that monologue will influence the final answer generated.
Example:
Human: If I had three cookies and gave one away, how many would I have? 
AI: two. 
Human: So if I have two cookies and then get another two, how many would I have? 
AI: four.
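Here is a sketch of the same idea in code, under the same client assumptions; the "Let's work through it." lead-off is just one possible way to trigger the monologue.

import openai

openai.api_key = "YOUR_API_KEY"

question = ("If I had three cookies, gave one away, and then got two more, "
            "how many would I have?")

# Leading the answer with a walk-through phrase makes the model narrate
# intermediate steps, and those steps influence the final number it produces.
prompt = "Q: " + question + "\nA: Let's work through it."

response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=80,
    temperature=0.0,  # low temperature for a more deterministic walk-through
    stop=["\nQ:"],    # stop before it invents a follow-up question
)
print(response["choices"][0]["text"])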
If what you are producing has some global structure, such as a plot, you can help enforce that structure by having it produce a short summary first and then generate an expansion of that summary, rather than producing the final product all in one go.

Example:
Here is a plot summary of the new movie "Star Frontiers": Sir Michael King is in charge of the British Empire's Solar Guard. They have discovered how to generate power from atomic energy and have built gigantic, interstellar starships which can travel faster than light. However, instead of peaceful colonization missions, King uses these warships to steal gold, precious metals and jewels from other worlds. 
This is the scene that shows Sir Michael King is in charge of the British Empire's Solar Guard:
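In code, this is simply two calls, the second fed by the first. A rough sketch under the same client assumptions, with illustrative prompts and settings:

import openai

openai.api_key = "YOUR_API_KEY"

def complete(prompt, max_tokens):
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0.8,
    )
    return response["choices"][0]["text"].strip()

# Pass 1: a short summary that pins down the global structure.
summary = complete('Here is a plot summary of the new movie "Star Frontiers":',
                   max_tokens=150)

# Pass 2: expand one beat of that summary into a full scene.
scene = complete('Here is a plot summary of the new movie "Star Frontiers": '
                 + summary
                 + "\n\nThis is the scene that introduces the main character:",
                 max_tokens=400)
print(scene)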

If the process you want is too complicated, you might send the text through several different prompts in a row. For example, you could
(1) clean up a question so that it is typo-free.
Then you could (2) rewrite it in a more engaging style.
Then you could (3) give it to a prompt whose only goal is to decide whether GPT-3 can answer it, or else say "I don't know."
A question that has passed through all those stages can then be (4) fed to your question-answering prompt, with a better chance of success.
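Here is a rough sketch of chaining those four stages, under the same client assumptions; the stage prompts and the "I don't know" gate are illustrative phrasings, not tested ones.

import openai

openai.api_key = "YOUR_API_KEY"

def complete(prompt, max_tokens=150, temperature=0.3):
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        stop=["\n\n"],
    )
    return response["choices"][0]["text"].strip()

def answer_question(raw_question):
    # (1) Clean up the question so that it is typo-free.
    cleaned = complete("Rewrite this question with all typos corrected:\n"
                       + raw_question + "\nCorrected question:")
    # (2) Rewrite it in a more engaging style.
    engaging = complete("Rewrite this question in a clear, engaging style:\n"
                        + cleaned + "\nRewritten question:")
    # (3) Gate: decide whether it can be answered at all.
    verdict = complete("Can the following question be answered factually? "
                       "Reply 'yes' or 'I don't know'.\n" + engaging + "\nReply:",
                       max_tokens=5, temperature=0.0)
    if "yes" not in verdict.lower():
        return "I don't know."
    # (4) Feed the cleaned-up question to the question-answering prompt.
    return complete("Q: " + engaging + "\nA:")

print(answer_question("how meny moons does mars have"))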

Sometimes it helps to pretend the thing you are looking for already exists and describe it, rather than ask the engine to invent it. For example, when I wanted scenes from a non-existent novel, I didn't pose it as a question whose answer is a scene. I instead described the overall plot of the novel as if it were a book review, and then said, "Here is an excerpt from the novel, where the antagonist is on the protagonist's trail:"

Try to imagine a web page where the material you are looking for would exist. Is it a news article? A textbook? A work of fiction? Making your prompt resemble a common web page format helps to get it in the right frame of mind.

Instead of just saying "a list," say "a list of ten items" if you want to encourage it to give you a list of a certain length.

You can name a famous author or pair of famous co-authors if you want to encourage it to have a particular style.

Don't be afraid to edit. If what it has produced is good up to a point and then goes off the rails, delete everything after that point, type how you want the next sentence to start, and set it loose again. Or alter what it has already written however you want before continuing. If your final goal is just a good text, it doesn't hurt to treat it as a collaborative tool.
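A sketch of that edit-and-continue workflow in code, under the same client assumptions; the draft and steering sentence here are just placeholders for whatever you keep and type by hand.

import openai

openai.api_key = "YOUR_API_KEY"

def continue_from(draft, steer, max_tokens=300):
    # 'draft' is everything you decided to keep; 'steer' is the hand-typed
    # opening of the next sentence that points the model back on track.
    response = openai.Completion.create(
        engine="davinci",
        prompt=draft + " " + steer,
        max_tokens=max_tokens,
        temperature=0.8,
    )
    return draft + " " + steer + response["choices"][0]["text"]

story = continue_from(
    "The detective stared at the harbor, certain the answer was on the boat.",
    "She stepped aboard and",
)
print(story)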

If it is plagiarizing, choose a less likely result instead of the most likely one. For example, if you ask it to come up with original names for Disney dwarfs, discard the seven most probable results.
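If you want to do this programmatically rather than by eyeballing the Playground, here is a sketch that assumes the Completion endpoint's n and logprobs parameters behave as they did in 2020, returning per-token log probabilities for each sampled choice.

import openai

openai.api_key = "YOUR_API_KEY"

response = openai.Completion.create(
    engine="davinci",
    prompt="Here is an original name for a Disney dwarf, not from the film:",
    max_tokens=8,
    temperature=1.0,
    n=10,         # sample several candidates
    logprobs=0,   # return log probabilities for the sampled tokens
    stop=["\n"],
)

def total_logprob(choice):
    # Sum the per-token log probabilities into a score for the whole completion.
    return sum(lp for lp in choice["logprobs"]["token_logprobs"] if lp is not None)

# Rank from most to least probable, then skip the top few; those are the ones
# most likely to be copied from memorized text.
ranked = sorted(response["choices"], key=total_logprob, reverse=True)
for choice in ranked[3:]:
    print(choice["text"].strip())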

Use titles and headers to encourage it to get in the right frame of mind.

(h/t Matt Brockman) Sometimes it does better in the third person. In other words, ask it to describe what someone else would do, rather than trying to get it to do the thing itself.

Even if you use multiple examples, remind the model (with an explanation) what it should be attending to shortly before the generation begins. For some reason it seems to pay more attention to things closer to the end of the prompt.

Friday, July 24, 2020

Regarding GPT-3's faculties


What mental faculties GPT-3 lacks:


  • It doesn't express "beliefs" as such. Depending on how it is prompted, it will say wildly different things. If it is in a context where factual answers are expected, then many of its answers will be true or partially true. But it will also confabulate with no indication that it is doing so.
  • It doesn't contain any information on where its knowledge came from. When we remember a fact, we tend to also remember where we learned that fact, the provenance of the information. This is completely lacking in GPT-3. Sometimes it can guess at where information came from, but it is just an educated guess.
  • It doesn't have senses. When it talks about the world, it is only talking about what it has "read", not what it has directly experienced.
  • It has no qualia. That is, you can't talk about what it would be like to be GPT-3, any more than you can talk about what it would be like to be a waterfall. It can't experience pain or pleasure. Well, it's impossible to say for sure (since we don't know what causes qualia in general), but I don't think it does. Certainly, when it produces text saying "I am seeing a red square", that is not very different from its producing any other text, and the statement is untrue, since it doesn't have any eyes or cameras.
  • It has no inherent conscience or guilt. However, it knows what a conscience is, what good and evil acts are, how a good person would behave in a new situation, and so on. So with the right prompt, it is able to make use of this knowledge as a moral compass.
  • It has difficulty staying on task for more than about two or three thousand words. If a plot takes longer than that to resolve, or a point takes longer than that to make, it probably won't get around to resolving the plot or concluding its argument.


What faculties it has:


  • It does contain correct and incorrect knowledge. It would be impossible to answer trivia questions as well as it does without something that should rightly be called "knowledge." (I would say that Wikipedia also contains knowledge in this sense.)
  • It does have a capability that I would argue is understanding or comprehension of some topics. When it uses a word, it can answer questions about what the word means, it can restate what it is saying in other words, it can provide more details, and it can summarize. It is not just moving around meaningless symbols, like earlier conversation bots such as ELIZA. Probing the limits of its understanding can be tricky at times, because of its tendency to confabulate. But I think it is misleading to say it has no understanding at all. (I would say that Google also contains some limited understanding, though less than GPT-3.)
  • It has something I would call "concepts." A concept, to my thinking, is a representation of meaning that has extensive, subtle connections with many related ideas. These concepts are stored as patterns in its network weights, and can be manipulated in many of the same ways humans make use of our concepts.
  • It is creative, if that word is ever to have meaning when applied to machines. It can write original fiction that, if it were produced by a human, we would call very creative. It can combine ideas and contexts to create new ideas.
  • It is good at analogies and metaphors. It can create original extended metaphors, and explain what role every related term plays in the metaphor. It can solve four-term (SAT style) analogies and explain why a particular analogy works.
  • It has a strong notion of context. It is able to correctly respond to the ways that the meaning of words, phrases, and sentences changes with context.
  • It has a limited theory of mind. It can figure out how a person would reasonably react to many kinds of situations.
  • It has one "goal" or "objective function": to produce text that is a plausible continuation of the prompt. With clever prompt design, this goal can be made to behave like many other goals. But the underlying, root goal is unchangeable. 


Where it's complicated:


  • Its network weights can be thought of as a kind of permanent memory, containing many facts about the world and about writing. Its prompt can be thought of as a kind of short-term memory. But it has no memory of any previous interactions not recorded in the prompt.
  • It can handle some kinds of humor well, while others are completely baffling to it. It can do a fair imitation of a humorous author. It can generate satire, exaggeration for effect, and dramatic irony. It cannot produce original puns very well. If it produces one-liners, they are typically either quoted or nonsensical. It's not good at creating original jokes as such.
  • It has limited spatial reasoning capability. It can correctly reason about prepositions like "in", "over", and "on." But if you describe a spatial arrangement of multiple objects, it can't reliably answer questions about how the objects are arranged relative to each other.
  • It has limited ability to perform deductive reasoning. When it is writing fiction, it can usually correctly deduce the consequences of actions and situations, in subtle ways that seem to require multiple steps of deduction. However, when given tests of deductive ability, it only does a little better than chance.
  • It isn't great at abstraction. When a scene is placed in a rich context, it is much better at figuring out what will happen than when it is reduced to its essence for a test.
  • It has only limited ability to work with images. It is aware of the meanings of many Unicode emoji, and can use them fairly accurately. It can remember in-line ASCII art, and produce it where appropriate, but can't creatively come up with new ASCII art except in fairly trivial ways. It does a great job at describing realistic visual scenes, though. Also, the Transformer architecture has been shown to be able to realistically extend pixel images as well as it extends text.
  • It is not great at math. With the right formatting, it can do multi-digit addition, subtraction, and multiplication. It has memorized some math facts, but it doesn't apply them. The Transformer architecture has been shown to support solving advanced calculus problems, though, when trained in the right way.
  • It has no self-awareness. If you use the prompt to tell GPT-3 about itself, however, it can be said to have some self-awareness, in a strictly functional sense.
  • It can easily be put in a state where its reactions are similar to humans experiencing emotions. This seems to me more like acting as if it has emotions than actually having them. If I say that, though, how is that different from saying it "acts as if it has knowledge"? It is an internal state that affects its behavior in ways similar to the way human emotional states affect human behavior. Similarly, it can simulate having desires and appetites.
  • Its ability to solve problems is difficult to characterize. It has certainly memorized the solution to many problems, and can sometimes apply those solutions to slightly different situations. It can brainstorm solutions to a problem, but its solutions can sometimes be impossible for one reason or another. It is difficult to give it enough knowledge of the situation to allow it to solve a problem, without essentially giving away the solution in the prompt.
  • Regarding willpower: I'm not sure what exactly that is, but there are a couple of ways you can get better results out of it that could be characterized as "trying harder": 1. When you give it a kind of internal dialogue, it can talk its way around to solving a problem it wouldn't solve otherwise. 2. Prompting it to write an "award-winning" story, or using similar adjectives, seems to improve the quality of the results. 

Saturday, July 11, 2020

My own experiments with GPT-3

I have been playing around with the capabilities of GPT-3. Here are some things it does well:


  • Forging short-short stories by well-known authors, with just the title of the story, the name of the author, and a hint that the story should be very short. Sometimes the stories resolve within a reasonable time, but often they don't. It can keep on topic for about one or two thousand words.
  • Continuing a novel.
  • Inventing recipes for desserts. I gave it the idea of a pineapple upside-down cake with EGGO waffles for Stranger Things, and it came up with fish sticks and custard for Doctor Who (which the Doctor eats in the first episode of the Matt Smith era), plus dessert ideas for Firefly, Buffy, and My Little Pony.
  • Writing role playing game manuals and modules.
  • Chatting, with personality. You can set up a character you want to talk to, with an accent, a backstory, and an attitude, and it will maintain them. As the chat gets longer, you have to keep summarizing everything you've talked about that you want it to remember, so that the prompt doesn't go above 2048 tokens. (A rough sketch of this rolling-summary loop follows this list.)
  • It gave perfect directions for changing a tire.
  • Expanding a one-sentence summary into a one-paragraph story. You can do this recursively, expanding each sentence of the paragraph.
  • Picking out which sentence in a paragraph provides an answer to a question.
  • Inventing words and their definitions. I prompted this one with a few examples from "The Dictionary of Obscure Sorrows."
  • Summarizing the first chapter of "The Little Prince." It wrote: "A hat is not a boa constrictor digesting an elephant, but adults are unable to understand."
  • Writing free verse poetry.
  • Text adventure games, although again, its memory is very short, so you have to summarize a lot.
  • Asking for book recommendations based on favorite books or authors.
  • Continuing a Dave Barry humor column.
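Here is the rolling-summary chat loop mentioned in the list above, as a rough sketch under the same 2020-era client assumptions; the persona text and the summarization prompt are illustrative.

import openai

openai.api_key = "YOUR_API_KEY"

PERSONA = ("The following is a conversation with Marguerite, a retired French "
           "detective with a dry sense of humor and a love of fast cars.\n")

def chat_turn(summary, recent_turns, user_line):
    prompt = (PERSONA
              + "Summary of the conversation so far: " + summary + "\n"
              + "".join(recent_turns)
              + "Human: " + user_line + "\nMarguerite:")
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=120,
        temperature=0.8,
        stop=["\nHuman:"],
    )
    reply = response["choices"][0]["text"].strip()
    recent_turns.append("Human: " + user_line + "\nMarguerite: " + reply + "\n")
    return reply

def refresh_summary(summary, recent_turns):
    # Fold the recent turns into the running summary every few exchanges so
    # the prompt stays well under the 2048-token limit.
    prompt = ("Summarize this conversation in a few sentences, keeping any "
              "facts worth remembering later:\n"
              + summary + "\n" + "".join(recent_turns) + "\nSummary:")
    response = openai.Completion.create(
        engine="davinci", prompt=prompt, max_tokens=150, temperature=0.3)
    return response["choices"][0]["text"].strip()

# Usage: call chat_turn for each exchange; every few turns, replace the summary
# with refresh_summary(summary, recent_turns) and clear recent_turns.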
Here are some things it does not do well:

  • Inventing one-liner jokes. These play off a double-meaning, usually, like "I failed math so many times at school, I can’t even count." Even with 50 examples, it could only copy funny one-liners from the web, or invent completely nonsensical ones that miss the point of a one-liner. On the other hand, it does irony and exaggeration for effect pretty well.
  • Any kind of wordplay based on sound. Puns, rhymes, and limericks it just can't do, except for the most common copied examples. This makes sense given how text is encoded into tokens: the model never sees the sounds of the words.
Let me know any of these you'd like to see, or other prompts you'd like me to try, in the comments.

Here is the result of one request:

The following essay is by William Gibson, author of Neuromancer. In the essay, he discusses his early novels (Neuromancer, Count Zero, and Mona Lisa Overdrive) and the parallels between the world he imagined and our current world in the middle of a pandemic:

Throughout the long development of Neuromancer I thought of it as SF but also as fantasy. Partly this is just a product of the sort of book I had written, being much closer to contemporary near-future hard SF in its technological detail than anything resembling conventional “speculative” genres such as space opera or sword-and-sorcery, and also having something of that book’s fragmentary, semi-hallucinatory quality in its writing. (Also, the Gollancz hardcover blurbed the novel as “An intense and violent epic from a new maestro in science fiction”; whatever their merits, the various publishers who were involved with the North American edition from Dell onward were not inclined to view the book in the same light.) And I did view it as fantasy: both thematically (based on a set of spiritual concerns) and technically (the books full of ghosts and hauntings). My original title for it was The Winter Market.

However, the other aspect of it being fantasy is less easy to discern, but in many ways more important. One of the most consistent criticisms that science fiction writers—and I include myself in this—have levelled at the mainstream media and, to a lesser extent, at SF movies and television since the 1960s is that they have no grasp of the actual future, when they bother to have one. While the criticism is largely justified on its own terms, it tends to gloss over one important thing: they generally have as little grasp of present-day technological reality as we do. And because I’m not talking about fundamental problems like electricity or antibiotics or cell phones (which even in 1986 SF authors were still struggling with), I’m talking about their slowness to perceive changes that were already largely accomplished. The past few years have brought a number of striking examples—a recent episode of CSI featured a lab filled with enormously expensive equipment for analyzing evidence that could be carried out on an Apple laptop by anyone who can read English; NYPD Blue had an undercover cop ferreting out porn merchants on New York’s Internet by posing as a high-school student and taping his chats with them, because the cops didn’t believe anybody was using anything except dial-up connections to get online in 1999; the Black Keys album Rubber Factory was full of cheesy keyboards and drum loops, which are nowhere to be found in their subsequent work (which sounds very much like how old synth-pop sounds now). In 2005 Robert A Heinlein wrote an article for The Washington Post about advances in flexible solar cells, concluding with the thought that eventually these cells would allow us all “to carry our house keys on our wrists”. He was correct about what had been invented: but he failed entirely to comprehend that within five years after he wrote his piece these keyless wristwatches would begin appearing from several different manufacturers and in infinite variety. Likewise he confidently asserted that remote controls would soon be “blown away by television sets whose remotes do everything”. No remote can change channels without leaving home base any more; instead you use your tablet or smartphone. These failures are not always so apparent or easy to spot: there has been far too much arguing over whether or not modern computer operating systems resemble “artificial intelligence” for anyone to acknowledge how little thinking there has been about what even weak AI might mean on a social level (in 2012 Siri can help you book plane tickets and find people but does not help you with email). But in all of these cases what we have is someone who has just looked at some impressive-sounding technology, or has picked up some bit of slang, and then tried to think through the implications of that technology or that slang in the same terms as that which were appropriate when they made their assessment of it. The experience becomes (and very often was) a sort of Schrödinger’s cat: technically alive until someone takes a look at it.

In essence, however, that’s what I was doing with Neuromancer: looking at a certain set of technical facts and attempting to extend them into something resembling the reality we inhabit now. Inevitably there were gaps between the two—I invented what I wanted to invent without really examining what actually would be feasible—but for the most part the fit is excellent. The future in Neuromancer is much more defined than it was in my first novel, and many of its components are now so deeply embedded in the world around us as to be commonplace: we’ve had full-immersion online gaming for years, both on an individual and team level; personal data management systems have gone from “big deal” to “everybody does it”; people routinely replace their own prosthetics and weapons; biotechnology has invaded every aspect of our lives; global communications are almost entirely virtual (as opposed to actual physical connections); surgery is often done with machines that can move with nanoscopic precision. But perhaps the single most striking element (aside from ubiquitous Japanese pop cultural elements) is what I imagined happening in Germany after unification. What I saw happening there was not too far from what actually came about a decade or so later: complete privatization and virtualisation of everything from schools to hospitals to police forces, alongside a very sharp lurch towards right-wing nationalism fuelled by resentment over job losses occasioned by global economic shifts (the Farben refinery complex on the Elbe that provided income for two million workers before unification closed its doors in 2006), although nothing else has really come to pass (yet). People like Norbert Walter and Stefan Selke really exist, though they probably wouldn’t like being compared to their fictional versions—Walter especially has had a distinguished career; his name appears next to an early version of Skype on patents filed for adaptive screen sharing as far back as 2000—and although Walter and Selke both professed concern about (and/or snide amusement at) their roles as the two most dangerous men in the world, Selke has happily published and presented his work on military applications for virtual reality. Selke also discussed his interactions with Walter at the first conference for the magazine I was then editing, Transfuge; one of the amusing things about speaking German is that you do have a clear idea what people are talking about when they throw names around in conversation, however often they use four-letter words or unprintable epithets to make their points.

Another element of Neuromancer was the very real-world Wired magazine. In 1985 I was editing a very small magazine based in Montreal called Culture Shock, which had been founded by myself and Patrick White—the two editions produced never really found an audience and shut down with depressing inevitability after a year or so. There were several kinds of trouble involved, but one of them was that our financial backers were dithering over the whole concept of us being a “magazine for a digital generation”; in fact, whenever we talked to them about what we were doing, it felt more like we were explaining why Culture Shock shouldn’t exist rather than why it should. They weren’t certain that there even was such a thing as “a digital generation”, although they sort of came around to it once they saw results from some other publications (including New York’s Paper—that guy who used to draw pictures in 20 pages when he edited New York has ended up running companies). With funding from an old journalist friend of mine who I knew because she’d done an interview for me once back in Toronto when she still had her own style column, we’d cobbled together some sample issues, assembled an editorial team (including art director), and set about trying to survive on zero dollars per week on something like 32k from two then-obscure Apple IIe computers. The experience wasn’t pleasant. The young digital journalists we tried to recruit largely didn’t care whether their pieces were going to print or not; when it became clear that Culture Shock wasn’t getting any press exposure whatsoever (nothing appeared until after the second issue had been out for three months) we couldn’t afford another issue; and so my wife at the time—Karen Gigliotti—and I emptied our life savings account to put out a third-issue “special project” mostly assembled from fragments of things done during late nights by Culture Shock staff (K

I was asked to do a similar chat with a Carbon atom:

The following essay is by the element Carbon. In the essay, he discusses his most famous compounds (Carbon Dioxide, Carbohydrates, and Hydrocarbons) and his thoughts on Chemistry. 

"Carbon is the element of energy, it's what I was made for. The little dark things that I combine with, to make some very important compounds, are my kind. You may know some of them from pictures in science books. There's C, and H, and O... ah... O, which is me... COOH. What a wonderful combination! Ohh... and what a joyful thing it is to see this compound move... It goes: (he hops around a little)

"How much energy can these compounds have?! Not enough for me! I'm gonna make them more energetic than they've ever been before! How do I do this? Simple. I use other atoms for them to interact with."

"There's four others: Br, F, N, and P. When they meet the wonderful covalent compounds we create together; a reaction occurs that no one could foresee. That element bonds with Carbon. "Hey!", they say, "Look at the nice Carbon!" Ahhhh... So great! But what can those atoms add? What I really want is more energy! I'll talk with Hydrogen!"

"Why hello there little one! Tell me something of yourself..." So Hydrogen reveals his life story. He has such powerful properties... "Wait just a second there little guy. How much energy does you have?" He replies: "Ah... well not too much..." Well now what will he add to my covalent compounds!? The hydrogen can't give too much more.... because he is under so much pressure from helium."

"Ah, that brings me to Helium! How he is great and marvelous, but doesn't help me too much. Why? Well all of the elements around him like to bond with him. He takes up a lot of space in the Periodic Table. So where is he!? I have got some special compounds for him... but they're not here yet."

"What we really need is Oxygen! But he's hiding in... Ozone... and he's floating around the planet somewhere." The Curator points out that Ozone is mostly made of Oxygen as a white cloud. "Of course!" exclaims Carbon, "Oxygen, you may have noticed, can be bonded with two covalent bonds. And I'll use these bonds to make more bonds!"

"This exciting compound contains four covalent bonds! You can say this compound has one: (he points to himself) 2: (he points to the water) 3: (He points to the Hydrogen) 4: (he points to oxygen) total energy: 32 kilocalories per mole or 21 kilocalories per 1 gram! Mmmm-mmm good!"

"You may know a couple other compounds I make with Nitrogen. One compound is Ammonia (NH). He tells of his experience with him. "He started off as plain old nitrogen gas." So some went up, some went down, some around; he was converted into this happy little Ammonia!"

Then comes carbon dioxide. He tells how Carbon Dioxide came about. "He was created after bacteria ate food in one part of the world and 'burped' out this very strong gas... what does it do? It absorbs radiation from Earth's warming Sun..." Then Carbon mentions Hydocarbons like methane and petrol. They go on to describe what they are used for nowadays by people today in society.

"And that's my story... of me, Carbon!" The Curator then talks about how great the planet is, and you get to hear from some of the other elements.

THE END