Multiversal views on language models
Laria Reynolds [email protected]
Kyle McDonell [email protected]
Abstract
The virtuosity of language models like GPT-3 opens a new world of possibility for human-AI collaboration in writing. In this paper, we present a framework in which generative language models are conceptualized as multiverse generators. This framework also applies to human imagination and is core to how we read and write fiction. We call for exploration into this commonality through new forms of interfaces which allow humans to couple their imagination to AI to write, explore, and understand non-linear fiction. We discuss the early insights we have gained from actively pursuing this approach by developing and testing a novel multiversal GPT-3-assisted writing interface.
Keywords: writing assistant, hypertext narratives, multiverse writing, GPT-3
GPT-3 [4], OpenAI’s new generative language model, has astonished with its ability to generate coherent, varied, and often beautiful continuations to natural language passages of any style. To creative writers and those who wish themselves writers, such a system opens a new world of possibilities. Some rightfully worry whether human writing will become deprecated or worthless in a world shared with such generative models, and others are excited for a renaissance in which the creative powers of human writers are raised to unprecedented heights in collaboration with AI. In order to achieve the latter outcome, we must figure out how to engineer human-machine interfaces that allow humans to couple their imaginations to machines and feel freed rather than replaced. We will present the still-evolving approach we have learned over several months of testing and designing interfaces for writing with the aid of GPT-3, beginning by introducing the framework of language models as multiverse generators.
Autoregressive language models such as GPT-3 take natural language input and output a vector of probabilities representing predictions for the next word or token. Such language models can be used to generate a passage of text by repeating the following procedure: a single token is sampled from the probability distribution and then appended to the prompt, which then serves as the next input. As the sampling method can be stochastic, running this process multiple times on the same input will yield diverging continuations. Instead of creating a single linear continuation, these continuations can be kept and each continued themselves. This yields a branching structure, which we will call a multiverse, or the “subtree” downstream of a prompt, as shown in Figure 1.
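The procedure can be made concrete in a few lines of Python. The sketch below is illustrative rather than our production code: next_token_distribution is a hypothetical stand-in for a language model call (e.g. GPT-3 queried with logprobs), returning a mapping from candidate next tokens to probabilities.

```python
import random

class Node:
    """One point in the multiverse: the token added here, a link to the
    parent it continues, and the alternative continuations (children)."""
    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent
        self.children = []

def next_token_distribution(text):
    """Hypothetical stand-in for a language model call: returns a dict
    mapping candidate next tokens to their probabilities."""
    raise NotImplementedError

def expand(node, prefix="", branching_factor=2, depth=3):
    """Grow a subtree below `node`: sample the next token several times,
    and let each distinct sample start a separate branch."""
    if depth == 0:
        return
    history = prefix + node.text
    dist = next_token_distribution(history)
    tokens, weights = zip(*dist.items())
    # Stochastic sampling: repeated draws on the same input diverge.
    sampled = {random.choices(tokens, weights=weights)[0]
               for _ in range(branching_factor)}
    for token in sampled:
        child = Node(token, parent=node)
        node.children.append(child)
        expand(child, history, branching_factor, depth - 1)
```

With a real model behind next_token_distribution, calling expand on a root node yields exactly the branching structure described above; branching per token is the simplest policy, with interval-based and adaptive branching as variations.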
Figure 1: The process of generating a multiverse story with a language model. The probability distribution is sampled multiple times, and each sampled token starts a separate branch. Branching is repeated at the next token (or at a set interval, or adaptively), resulting in a branching tree structure like the one shown in Figure 2.

Figure 2: A narrative tree with initial prompt “In the beginning, GPT-3 created the root node of the”

Quantum mechanics tells us that the future is fundamentally indeterminate. We can calculate probabilities of future outcomes, but we cannot know with certainty what we will observe until we actually measure it. The problem is not merely epistemic; the future truly has not yet been written, except in probabilities. However, when we do finally venture to measure it, the ambiguous future seems to us to become a concrete, singular present.

The Everettian or many-worlds interpretation of quantum mechanics, which has become increasingly popular among quantum physicists in recent years, views the situation differently [6]. It claims that we, as observers, live in indeterminacy like the world around us. When we make a measurement, rather than collapsing the probabilistic world around us into a single present, we join it in ambiguity. “We” (in a greater sense than we normally use the word) experience all of the possible futures, each in a separate branch of a great multiverse. Other branches quickly become decoherent and evolve separately, no longer observable or able to influence our subjective slice of the multiverse.

This is the universe an autoregressive language model like GPT-3 can generate. From any given present it creates a functionally infinite multitude of possible futures, each unique and fractally branching.

David Deutsch, one of the founders of quantum computing, draws a connection between the concept of a state and its quantum evolution with virtual reality generation [5]. He imagines a theoretical machine which simulates environments and models the possible responses of all interactions between objects. Deutsch further posits that it will one day be possible to build such a universal virtual reality generator, whose repertoire includes every possible physical environment.

Language models, of course, still fall well short of this dream. But their recent, dramatic increase in coherence and fluency allows them to serve as our first approximation of such a virtual reality generator. When given a natural-language description of objects, they can propagate the multiverse of consequences that result from a vast number of possible interactions.

Deutsch’s view emphasizes that from any given state there is a multiplicity of possible futures, in contrast to single-world dynamics: stories unfold differently in different rollouts of an identical initial state. There is another dimension of multiplicity that we must also consider, especially when we are talking about states defined by natural language.

Natural language descriptions invariably contain ambiguities. In the case of a narrative, we may say that the natural language description defines a certain present, but it is impossible to describe every variable that may have an effect on the future.
In any scene there are implicitly many objects present which are not specified but which may conceivably play a role in some future or be entirely absent in another. The multiverse generated by a language model downstream of a prompt will contain outcomes consistent with an ambiguous variable taking on separate values which are mutually inconsistent. So we define two levels of uncertainty, which can both be explored by a language model:

1. an uncertainty/multiplicity of present states, each associated with
2. an uncertainty/multiplicity of futures consistent with the same “underlying” present.

We will call the first form of multiplicity interpretational multiplicity, and the second form dynamic multiplicity.
Humans exist in a constant state of epistemological uncertainty regarding what will happen in the future and even what happened in the past and the state of the present [12]. We are then, by virtue of being adapted to our uncertain environments, natural multiverse reasoners.

David Deutsch also points out that our imaginations, which seek to model the world, mimic reality as virtual reality generators: we model environments and imagine how they could play out in different branches.
When a piece of literature is read, the underlying multiverse shapes the reader’s interpretations and expectations. The structure which determines the meaning of a piece as experienced by a reader is not the linear-time story but the implicit, counterfactual past/present/future plexus surrounding each point in the text, given by the reader’s projective and interpretive imagination.

More concretely stated, at each moment in a story, there is uncertainty about how dynamics will play out (will the hero think of a way out of their dilemma?) as well as uncertainty about the hidden state of the present (is the mysterious mentor good or evil?). Each world in the superposition not only exerts an independent effect on the reader’s imagination but interacts with counterfactuals (the hero is aware of the uncertainty of their mentor’s moral alignment, and this influences their actions). The reader simulates the minds of the characters and experiences the multiverses evoked by the story.
A writer may have a predetermined interpretation and future in mind or may write as a means of exploring the interpretative and/or dynamic multiverse. Regardless, a writer must be aware of the multiplicity which defines the readers’ and characters’ subjective experiences as the shaper of the meaning and dynamics of the work. The writer thus seeks to simulate and manipulate that multiplicity.

We propose that generative language models in their multiversal modality can serve as an augmentation to and be augmented by the writer’s inherently multiversal imagination.
So far we’ve implicitly assumed that, despite the multiversal forces at work, the writer’s objective is to eventually compose a single history. However, language models naturally encourage writing explicitly multiversal works.

In the same way that hypertext transcended the limitations of the linear order in which physical books are read, exciting a surge of multiversal fiction [2], language models introduce new possibilities for writing nonlinear narratives.

After all, it’s only a small leap from incorporating multiverses in the brainstorming process to including them in the narrative. Counterfactual branches often occur in traditional fiction in the form of imaginary constructs, and our minds are naturally drawn to their infinite possibility [1].
We propose the creation of new tools to allow writers to work alongside language models to explore and be inspired by the multiverses already hiding in their writing.

Research into hypertext writing tools has been ongoing for more than two decades and has produced notable tools like Storyspace [3]. However, the issue of hypertext interfaces assisted by language models is a newer development, as only very recently have language models become advanced enough to be useful in the writing process [9]. Likewise, there has been significant research into interactive narratives, including in branching, multiversal settings [10, 11], but never one in which the human and the language assistant can act together as such high-bandwidth partners.

As has been shown in past hypertext interface design studies [8], the primary concern in the creation of an interface for writing multiverse stories is the massive amount of information that could be shown to the writer. If intuitive user experience is not central to the design of the program, this information will feel overwhelming and functionally prevent the user from leveraging the power offered by multiverse access at all.

An effective multiversal interface should allow the writer, with the aid of a generative language model, to expose, explore, and exploit the interpretational and dynamic multiplicity of a passage. Not only will such a tool allow the user to explore the ways in which a scenario might play out, such an interface will also expose previously unnoticed ambiguities in the text (and their consequences).

Depending on the design of the interface and the way the user approaches it, many different human-AI collaborative workflows are possible. Ideally, the interface should give the user a sense of creative superpowers, providing endless inspiration combined with executive control over the narrative, as well as allowing and encouraging the user to intervene to any degree.
Over the past several months, we have prototyped and tested several iterations of multiversal writing tools using GPT-3 as the generation function.

The demand for a multiversal writing application grew from use of GPT-3 as a more standard linear writing assistant. It became increasingly clear, as users sought greater interaction bandwidth and more efficient ways to structure and leverage the model’s ideas, that an interface which organizes the model’s outputs in a branching tree would be more effective.

The early results we have seen leave no doubt about the power of language models as writing assistants. Our small cohort of five beta users have, alongside GPT-3, co-written linear and nonlinear stories spanning the equivalent of thousands of pages, of often astonishing ingenuity and beauty and surprisingly long-range coherence. Three users have reported a sense of previously unimagined creative freedom and expressive power.

However, it has also become evident that much more research and development is necessary. In our beta tests, we’ve found that flaws in interface design can easily overwhelm users or damage their feeling of ownership over the work produced. Below we will share some of our findings, which represent only the first step in creating a true interface between the creative mind and the machine.
We have found that a visual representation of the branching structure of the narrative helps users conceptualize and navigate fractal narratives. This view (called visualize) displays the flow of pasts and futures surrounding each node (Figure 3), and zooming out displays the global structure of the multiverse (Figure 4). The visualize view allows users to expand and collapse nodes and subtrees, as well as “hoist” any node so that it acts as the root of the tree. Altering the topology of the tree (e.g. reassigning children to different parents, splitting and merging nodes) is more intuitive for users in the visualize view than in the linear view.

In addition to tree-based multiverse visualization, the read view displays the text of a node and its ancestry in a single-history format (Figure 5).
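As a concrete illustration, the read view’s single-history text can be recovered by concatenating a node’s ancestry. This sketch reuses the illustrative Node objects from the earlier sketch; it is not our interface code:

```python
def ancestry(node):
    """Return the chain of nodes from the root down to `node`."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain[::-1]

def read_view(node):
    """Single-history text: the node's text preceded by all of its ancestors'."""
    return "".join(n.text for n in ancestry(node))
```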
With a generative language model, story multiverses can quickly become too large to navigate through node connections alone. To assist navigation, we have implemented the following features:

• Search: all text, text in a subtree, and/or text in a node’s ancestry
• Indexing by chapters: Chapters are assigned to individual nodes, and all nodes belong to the chapter of the closest ancestor that is the root of a chapter. As a consequence, chapters have the shape of subtrees (a sketch of this lookup follows the list).
• Bookmarks and tags: Bookmarks create a named pointer to a node without enforcing chapter membership. Tags are similar to bookmarks, but can be applied to multiple nodes.

Figure 3: Visualize view
Figure 4: Zoomed-out visualization of a nonlinear story
Figure 5: Read view
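The chapter-membership rule above has a direct procedural reading, sketched here on the same illustrative Node objects, with a chapter attribute assumed to be set only on chapter-root nodes:

```python
def chapter_of(node):
    """A node belongs to the chapter of its closest ancestor that is the
    root of a chapter; chapters therefore have the shape of subtrees."""
    while node is not None:
        chapter = getattr(node, "chapter", None)  # set only on chapter roots
        if chapter is not None:
            return chapter
        node = node.parent
    return None
```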
A naive way to automatically generate a multiverse using a language model might involve branching every n tokens, for some fixed n. However, this is not the most meaningful way to branch in a story. In some situations, there is essentially one correct answer for what a language model should output next. In such a case, the language model will assign a very high confidence to a single token, and branching at that point yields little meaningful variation; it is more meaningful to branch adaptively, at points where the model is uncertain (see Figure 6; a sketch of one such criterion appears after the list below).

Humans retain an advantage over current language models in our ability to edit writing and perform topological modifications on the multiverse, such as merging interesting aspects of two separate branches into one. The interface should ideally allow the human to perform all desired operations with maximal ease. Because GPT-3 is so capable of producing high-quality text, some interface designs make it feasible for the human to cultivate coherent and interesting passages through curation alone. We have found that an interface which makes it easy to generate continuations but relatively difficult to modify the content and topology of the resulting multiverse encourages a passive workflow, where the user relies almost exclusively on the language model’s outputs and the branching topology determined by the process of generation.

While such a passive mode can be fun, resembling an open-ended text adventure game, as well as useful for efficiently exploring counterfactuals, the goal of a writing interface is to facilitate two-way interaction: the outputs of the language model should augment and inspire the user’s imagination and vice versa. Thus, we are developing features to encourage meaningful and unrestrained human contribution, such as:

• Easy ways to edit, move text, and change tree topology
• Support for nonstandard topologies that are not automatically generated by language models and require human arrangement, such as cycles and multiple parents (discussed below)
• Floating notes to allow saving passages and ideas independent from the tree structure (discussed below)
• Fine-grained control over language model memory (discussed below)
• Interactive writing tools that offer influence over the narrative in ways other than direct intervention (discussed below)
• Program modes which encourage manual synthesis of content from multiverse exploration into a single history, for instance by distinguishing between exploratory and canonical branches
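Returning to adaptive branching: we do not commit to a single criterion here, but one natural choice, sketched below, is to branch only where the model’s next-token distribution is high-entropy, i.e. where the model is genuinely uncertain. The threshold value is an illustrative assumption:

```python
import math

def entropy_bits(dist):
    """Shannon entropy, in bits, of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def should_branch(dist, threshold=1.0):
    """Branch where the model is uncertain; where one token dominates
    (very low entropy), continue linearly instead."""
    return entropy_bits(dist) > threshold
```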
Floating notes are text files which, rather than being associated with a particular node, are accessible either globally or anywhere in a subtree. We decided to implement this feature because users would often have a separate text file open in order to copy and paste interesting outputs and keep notes without being constrained by the tree structure. Floating notes make it easier for the user to exert greater agency over the narrative.
The interface supports nodes with multiple parents and allows cyclic graphs (Figure 7). Opportunities to arrange convergent and cyclic topologies, which do not occur if the language model is used passively, encourage human co-writers to play a more active role, for instance, in arranging for separate branches to converge to a single outcome. Multiversal stories naturally invite plots about time travel and weaving timelines, and we have found this feature to unlock many creative possibilities.
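A minimal illustration of the underlying data structure: relaxing the tree into a general graph whose nodes track both parents and children. This is a sketch of the idea, not our implementation:

```python
class GraphNode:
    """A story node in a graph rather than a strict tree: multiple parents
    permit converging branches, and cycles permit looping timelines."""
    def __init__(self, text):
        self.text = text
        self.parents = []
        self.children = []

def link(parent, child):
    """Connect two nodes. No acyclicity check is performed, so converging
    and cyclic story components are both expressible."""
    parent.children.append(child)
    child.parents.append(parent)
```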
Figure 6: A subtree generated with adaptive branching
Figure 7: Nodes can have multiple parents, allowing for cyclic story components

GPT-3 has a limited context window, which might seem to imply limited usefulness for composing longform works like novels, but our users have found that long-range coherence is surprisingly easy to maintain. Often, the broad unseen past events of the narrative are contained in the interpretational multiplicity of the present and thus exposed through generations, and consistent narratives are easily achieved through curation. In order to reference past information more specifically, often all that is needed is minimal external suggestion, introduced either by the author-curator or by a built-in memory system. We are developing such a system, which automatically saves and indexes story information from which memory can be keyed based on narrative content.
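Our memory system is still under development, so the following is only a toy illustration of keying memory on narrative content: a keyword-overlap index over saved passages (an embedding-based index would be the natural refinement):

```python
def build_memory_index(nodes):
    """Index saved story passages by their lowercased content words."""
    index = {}
    for node in nodes:
        for word in set(node.text.lower().split()):
            index.setdefault(word, []).append(node)
    return index

def recall(index, context, limit=3):
    """Retrieve the saved passages sharing the most words with the
    current narrative context."""
    scores = {}
    for word in set(context.lower().split()):
        for node in index.get(word, []):
            scores[node] = scores.get(node, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)[:limit]
```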
Beyond direct continuations of the body of the story, a language model controlled by engineered prompts can contribute in an open-ended range of modalities. Sudowrite [7] has pioneered GPT-3-powered functions that, for instance, generate sensory descriptions of a given object, or prompt for a twist ending given a story summary.

The ability to generate high-quality summaries has great utility for memory and as input to helper prompts, and forms an exciting direction for our future research. We are exploring summarization pipelines for GPT-3 that incorporate contextual information and examples of successful summarizations of similar content.
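As an illustration of the shape such a pipeline might take (the details are open research, and all names here are hypothetical), a summarization prompt can be assembled from contextual information plus examples of successful summarizations of similar content:

```python
def summarization_prompt(passage, examples, context=""):
    """Build a few-shot prompt: optional contextual information, then
    (passage, summary) examples of similar content, then the new passage."""
    parts = [context] if context else []
    for example_passage, example_summary in examples:
        parts.append(f"Passage:\n{example_passage}\nSummary:\n{example_summary}")
    parts.append(f"Passage:\n{passage}\nSummary:")
    return "\n\n".join(parts)
```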
The problem of designing good interfaces for AI systems to interact with humans in novel ways will become increasingly important as the systems increase in capability. We can imagine a bifurcation in the path of humankind’s future: one in which we are left behind once the machines we create exceed our natural capabilities, and another in which we are uplifted along with them. We hope that this paper can further inspire the HCI community to contribute to this exciting problem of building the infrastructure for our changing future.
Acknowledgements
We are grateful to Lav Varshney for his valuable discussions and helpful feedback, and to Michael Ivanitskiy and John Balis for their feedback and help compiling this article. In addition we would like to thank Miles Brundage and OpenAI for providing access to GPT-3.
References

[1] Espen J. Aarseth. “Nonlinearity and literary theory”. In: Hyper/text/theory 52 (1994), pp. 761–780.
[2] Kimberly Amaral. “Hypertext and writing: An overview of the hypertext medium”. In: Retrieved August 16 (1995), p. 2004.
[3] Mark Bernstein. “Storyspace 1”. In: Proceedings of the thirteenth ACM conference on Hypertext and hypermedia. 2002, pp. 172–181.
[4] Tom B. Brown et al. “Language models are few-shot learners”. In: arXiv preprint arXiv:2005.14165 (2020).
[5] David Deutsch. The Fabric of Reality. Penguin UK, 1998.
[6] Bryce Seligman DeWitt and Neill Graham. The many worlds interpretation of quantum mechanics. Vol. 63. Princeton University Press, 2015.
[7] Amit Gupta and James Yu. Sudowrite. 2021.
[8] Spencer Jordan. “‘An Infinitude of Possible Worlds’: Towards a Research Method for Hypertext Fiction”. In: New Writing (2020), pp. 1–6.
[10] M. O. Riedl and R. M. Young. “From linear story generation to branching story graphs”. In: IEEE Computer Graphics and Applications (2006).
[11] Mark Owen Riedl and Vadim Bulitko. “Interactive narrative: An intelligent systems approach”. In: AI Magazine 34.1 (2013).