Artificial Intelligence increasingly mediates everyday discourse, yet its ability to recalibrate meaning in real time—particularly across languages—remains insufficiently examined. Fractured Perception surfaces this latent influence by staging a trilingual couples-therapy session in which a large language model subtly alters every line live. Situated at the intersection of verbatim theatre, computational linguistics, and generative text, the project traces how apparently minor alterations—lexical substitutions, tense shifts, clause reorderings—accumulate into divergent interpretive trajectories in Azerbaijani, Russian, and English. Narrative tension emerges as audiences recognise that this progressive drift erodes memory and trust, blurring the boundary between creative invention and critical exposure. Early workshop trials confirm that even minimal drift is perceptible and triggers conversational self-correction, underscoring the psychological and ethical stakes of speaking through systems that continuously, if imperceptibly, rewrite what is meant.
Generative AI, Multilingual NLP, Semantic drift, AI hallucination, Verbatim theatre
Large language models (LLMs) now draft corporate e-mail, subtitle international video, and provide real-time translation in conference calls; yet these same systems routinely slip in lexical substitutions, tonal shifts, and outright hallucinations that pass unnoticed during live interaction (Benkirane et al. 2024; “How Much Do LLMs Hallucinate…” 2025). Such micro-distortions accumulate into what McIntyre (2018) calls epistemic fractures: pockets of diverging interpretation that erode any shared discourse. Although benchmark studies quantify error rates, the lived experience of speaking through a system that quietly rewrites one’s words—especially across languages situated in what Emily Apter terms the “translation zone”—remains under-explored. For Apter, the translation zone is not a neutral pipeline that ferries fixed units of meaning from one language to another; it is a contested borderland where every lexical choice enacts an implicit negotiation of authority, culture, and ideology (Apter 2006). In this space, speakers and listeners constantly recalibrate intent because perfect equivalence is impossible and because unequal power relations—historical, geopolitical, or technological—shape which interpretations prevail.
Fractured Perception addresses this gap with a trilingual performance that highlights AI-mediated drift on stage. Three actors—a couple and their therapist—conduct a counselling session in Azerbaijani, Russian, and English. An on-stage LLM intercepts every sentence, applies a tightly bounded semantic shift, and returns the altered line to the performer; colour-coded surtitles project the three diverging translations so audiences can watch meaning fork in real time. By embedding the intervention inside the redundancy-rich discourse of therapy, the work exposes how incremental linguistic shifts unsettle memory, agency, and trust.
Building on verbatim theatre and glitch-based dramaturgy (Dorsen 2010; 2023), the project offers two main contributions. First, it supplies an embodied method for rendering AI hallucination visible at conversational speed while keeping drift within empirically verified limits, thereby enabling spectators to engage with an empirically grounded, yet deliberately magnified, representation of contemporary language-model behaviour. Second, it demonstrates a laptop-grade pipeline that small venues can reproduce; interactions logged during each performance will seed Mu-SHROOM-Live (SemEval-2025 Task 3: Mu-SHROOM 2025), an open corpus for incremental-drift research.
The remainder of the paper proceeds as follows: Section 2 situates the project within post-truth media studies, semiotics, and trust research; Section 3 details the artistic methodology; Section 4 outlines the technical implementation; Section 5 reports preliminary findings from an initial probe; Section 6 reflects on the project’s contributions and suggests avenues for future iterations.
Artificial intelligence systems now function as pervasive mediators of human communication, producing translations, summaries, and responses that approximate human fluency. Yet these same systems introduce lexical substitutions, tonal shifts, and fabricated statements that often pass unnoticed in real time (Benkirane et al. 2024; Sakaguchi et al. 2021). While such “hallucinations” are rigorously benchmarked through metrics like BLEU or COMET, their experiential and epistemic consequences remain underexplored—especially in multilingual dialogue, where ambiguity compounds across linguistic borders. This section synthesizes three theoretical perspectives—post-truth and translation studies, semiotics and embodied cognition, and trust and power—to articulate the conceptual gaps that Fractured Perception seeks to address.
2.1 Post-truth Mediation and the Translation Zone
McIntyre (2018) describes the contemporary information landscape as fractured into “epistemic islands,” where algorithmic curation fragments consensus reality. Large language models (LLMs) extend this fragmentation from curated timelines to conversational time: every micro-adjustment of syntax or sentiment is an opportunity for interpretive drift. This is not hypothetical—recent multilingual evaluations report hallucination rates approaching 10% in low-resource directions such as Azerbaijani→English (“How Much Do LLMs Hallucinate…” 2025), suggesting that error is systemic rather than exceptional.
Translation has historically operated as a site of negotiation and control. Apter (2006) frames it as a “zone of contestation,” while Venuti (1995) documents the ideological force of domestication practices that mask asymmetries of power. Orwell’s Newspeak (1949) allegorized the political function of vocabulary compression; Soviet-era textual adaptations further illustrate how micro-shifts can recalibrate ideological valence. AI-driven mediation inherits this lineage but compounds it through algorithmic opacity and speed. Wardle and Derakhshan (2017) position such distortions along an “information disorder” spectrum where minor semantic shifts can have systemic effects. The challenge, then, is not merely whether these shifts occur but how they materialize in lived dialogue—a gap the present work addresses by operationalizing drift within a real-time performance frame.
2.2 Semiotics, Embodied Cognition, and Conversational Repair
Classical semiotics assumes the sign as a stable pivot between signifier and signified (Saussure 1916), whereas Derrida (1976) theorizes meaning as perpetually deferred through différance. LLMs instantiate this principle materially: every token prediction enacts a probabilistic displacement. Barad (2007) extends this to an ontology of intra-action, in which human and machine agencies co-constitute meaning. The stage becomes an epistemic interface where such intra-actions are no longer abstract but sensorially evident.
This instability is not purely symbolic; it has cognitive weight. Lakoff and Johnson (1999) argue that language structures conceptual schema, embedding reasoning in embodied metaphor. When an LLM modifies a clause, speakers must improvise repairs—subtly recalibrating stance, memory, and emotional tone. Suchman (2007) demonstrates that human–machine coordination invariably relies on contingent scaffolding rather than deterministic control. By embedding AI intervention in the ritualized redundancy of couples therapy—where paraphrase and reflective listening are normative—the project heightens the perceptibility of these micro-repairs. Each hesitation, contradiction, or corrective loop becomes an index of the cognitive and affective labour imposed by algorithmic co-authorship.
2.3 Trust, Power, and Delegated Agency
The politics of drift extend beyond semantics to questions of authority. Floridi (2011) observes that informational trust tends to migrate toward systems perceived as objective, while empirical studies confirm a persistent anthropomorphizing bias toward computational agents (Epley et al. 2007). Akrich and Latour (1992) conceptualize such dynamics as delegated agency: the machine does not merely process language; it performs speech acts that carry social force. Within a therapeutic setting, where interpretive authority is already unevenly distributed, the algorithm’s intervention amplifies latent hierarchies—effectively scripting power by scripting language (Peräkylä 2019; Strong et al. 2011).
Conventional evaluation metrics—BLEU, COMET, TER—flatten these dynamics into aggregate error scores, erasing the temporality of drift. A single number cannot register the incremental slippages through which meaning unravels. Fractured Perception addresses this by functioning as what we term a “living metric”: an embodied apparatus that scales drift to perceptual thresholds without exceeding empirical ceilings (Meta AI 2025). Audience-facing surtitles and interactive toggles externalize each intervention, converting opaque probabilistic edits into legible cultural artefacts.
Synthesizing these strands, the study pursues three interrelated questions: whether calibrated drift can remain perceptible yet plausible, which conversational frames best expose cumulative semantic fracture, and within what latency bounds algorithmic mediation stays conversationally invisible. Section 5 operationalises these questions in an initial probe.
By reframing AI error as an embodied, historically situated, and ethically consequential phenomenon, Fractured Perception advances a multi-layered research agenda: one that bridges cognitive science and semiotics with the pragmatic constraints of responsible AI experimentation.
Fractured Perception reconfigures verbatim documentary theatre, a practice rooted in fidelity to recorded speech, by introducing algorithmic mediation as an active dramaturgical agent. Instead of presenting a fixed transcript, the performance employs a large language model (LLM) to continuously rewrite dialogue and deliver modified lines to actors through earpieces. This choice responds to both empirical and aesthetic constraints. Research on multilingual LLMs shows that unprompted hallucinations and semantic drift, while real, appear too sporadically to achieve theatrical legibility within a 50-minute performance (Benkirane et al. 2024; “How Much Do LLMs Hallucinate…” 2025). To avoid stretches of imperceptible stability, the system introduces controlled interventions scaled to the highest cross-lingual error rates reported in benchmarks (Meta AI 2025).
This design principle serves two purposes. First, it preserves epistemic integrity by staying within the plausibility envelope of contemporary AI behaviour. Second, it enables a dramaturgical arc of escalation, making otherwise latent distortions visible as they accumulate. The resulting performance functions as both aesthetic experience and epistemic probe, exploring how incremental shifts in language reconfigure trust, memory, and authority in a high-stakes conversational setting.
The performance unfolds as a trilingual therapy session—an environment defined by redundancy, paraphrase, and negotiated meaning. This frame magnifies the perceptual consequences of minor linguistic perturbations. At first, the LLM introduces subtle modifications: near-synonymous substitutions, slight tonal adjustments. Over time, edits escalate to clause reordering, polarity shifts, and temporal distortions, destabilizing the sense of narrative coherence. This pacing draws on glitch dramaturgy traditions exemplified by Dorsen’s Hello Hi There (2010) and Prometheus Firebringer (2023), while maintaining empirical grounding absent from those earlier works.
Scenography converts computational opacity into legible form. Overhead surtitles display three language streams—Azerbaijani, Russian, and English—generated by the system, while a side projection scrolls the untouched verbatim script. Dynamic typography and chromatic cues signal deviation severity, transforming semantic drift into a visual metric that spectators can follow line by line. These design choices do more than illustrate error; they frame the algorithm as a scenic presence, shifting code from backstage infrastructure to an actor-like force shaping the performance.
Actor training reinforces this logic. Rehearsals emphasize improvisational “correction loops,” in which performers practise repair strategies borrowed from therapeutic discourse: rephrasing, reframing, and challenging contradictions. These techniques expose the embodied cost of drift: each hesitation or recalibration becomes a semiotic trace of the machine’s intervention, materializing what Suchman (2007) calls the situated contingency of interaction. This method also situates the piece within a historical continuum, from Weizenbaum’s ELIZA (1966) to Rimini Protokoll’s Uncanny Valley (2018), exposing the persistence of anthropomorphizing impulses even when agency is distributed across human and non-human actors.
Theatre’s critical affordance lies in its capacity to externalize hidden operations. Here, three otherwise opaque layers become perceptible. First, algorithmic authorship is revealed by juxtaposing original and altered lines, breaking the illusion of unmediated dialogue. Second, epistemic fracture emerges as iterative shifts accumulate into divergent interpretive worlds, echoing McIntyre’s (2018) account of post-truth fragmentation and Apter’s (2006) notion of the “translation zone,” where meaning remains perpetually provisional. Third, embodied response materializes as actors navigate uncertainty, enacting Floridi’s (2011) concerns about informational trust and Hansen’s (2006) concept of “bodies in code.” Together, these layers demonstrate that deliberate manipulation is the most reliable way to expose AI drift at a scale an audience can perceive and interrogate.
By staging these processes, Fractured Perception transforms semantic drift from an invisible computational artefact into a shared sensory phenomenon. The performance does not fictionalize failure; it curates an amplified present, compressing the statistical volatility of machine translation into a timescale suited for collective perception. In doing so, it reframes AI mediation as neither a neutral conduit nor a black-box enigma, but as an agentive force that unsettles the contracts of dialogue and, by extension, the infrastructures of meaning on which social life depends.
The entire system runs on one laptop, two lavalier microphones, and three wireless in-ear receivers—equipment standard in most small- to mid-scale venues. This portability ensures that any theatre with basic audio infrastructure can reproduce the performance without specialised hardware or extensive technical support. Minimalism is therefore both practical and conceptual: if AI distortions are to be understood as part of everyday discourse, their theatrical representation should not rely on high-performance clusters. The design aligns with calls for low-carbon computation (Patterson and Costanza-Chock 2023) and keeps the work accessible to venues with limited resources.
The pipeline functions as a continuous loop—capture, manipulation, translation, visualisation—so that technical operations mirror the evolving dialogue on stage. Each stage is summarised below.
Audio from two discrete lavaliers is routed over Dante (Digital Audio Network Through Ethernet) to Whisper.cpp (Whisper implemented in C++, INT-8, large-v3). The model has been fine-tuned on eighty hours of multiperson documentary interviews, giving it robustness to overlapping speech and colloquial registers typical of therapy sessions. Accurate transcription is essential: without it, subsequent drift would appear nonsensical, undermining dramatic realism.
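As a rough sketch of this capture stage (ignoring the Dante routing), the Python code below streams one microphone channel in short chunks to a transcription callback. It is illustrative only: `transcribe_chunk` is a hypothetical placeholder for the fine-tuned Whisper.cpp model, and the `sounddevice` library stands in for the venue’s audio interface.

```python
# Illustrative capture sketch: one lavalier channel is streamed in short chunks to
# an ASR callback. transcribe_chunk() is a hypothetical stand-in for the fine-tuned
# Whisper.cpp (large-v3, INT-8) instance; it is not part of any library.
import queue

import numpy as np
import sounddevice as sd

SAMPLE_RATE = 16_000       # Whisper expects 16 kHz mono input
CHUNK_SECONDS = 0.5        # short chunks keep transcription latency low

audio_chunks = queue.Queue()

def on_audio(indata, frames, time_info, status):
    """sounddevice callback: copy each captured block into the queue."""
    if status:
        print(f"capture warning: {status}")
    audio_chunks.put(indata.copy())

def transcribe_chunk(samples: np.ndarray) -> str:
    """Placeholder for the Whisper.cpp call; returns empty text in this sketch."""
    return ""

def run_capture_loop():
    """Capture audio continuously and push transcripts to the rewrite stage."""
    blocksize = int(SAMPLE_RATE * CHUNK_SECONDS)
    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1,
                        blocksize=blocksize, callback=on_audio):
        while True:
            chunk = audio_chunks.get()            # blocks until audio arrives
            text = transcribe_chunk(chunk[:, 0])  # mono float32 samples
            if text:
                print(text)                       # downstream: rewrite stage
```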
Once the streaming transcript stabilises, it feeds a quantised 4-bit instance of Mistral-7B. The model performs three controlled interventions:
(i) Semantic drift: one synonym swap per clause, limited to <0.35 cosine distance in LaBSE space, which keeps substitutions perceptually coherent yet semantically unstable;
(ii) Tone modulation: polarity shifts capped at ±0.5 on the VADER sentiment scale, so that emotional colour changes without sliding into caricature;
(iii) Temporal shuffle: occasional clause reordering, provided temporal markers (“then,” “after”) remain coherent, producing subtle narrative discontinuities rather than overt nonsense. A minimal sketch of how these constraints might be checked is given below.
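The following sketch shows how the first two constraints could be enforced with the publicly available LaBSE encoder (via the sentence-transformers package) and the VADER sentiment analyser. The thresholds mirror the parameters above, but the function names and structure are ours, not the production implementation, and VADER is English-oriented, so the multilingual handling used in performance is omitted.

```python
# Minimal sketch of the per-clause drift constraints, assuming the public LaBSE
# checkpoint (sentence-transformers) and the vaderSentiment package.
from sentence_transformers import SentenceTransformer, util
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

labse = SentenceTransformer("sentence-transformers/LaBSE")
vader = SentimentIntensityAnalyzer()

MAX_COSINE_DISTANCE = 0.35   # semantic-drift ceiling per clause
MAX_POLARITY_SHIFT = 0.5     # tone-modulation cap on VADER's compound score

def within_semantic_budget(original: str, rewrite: str) -> bool:
    """Reject rewrites that drift too far from the original in LaBSE space."""
    emb = labse.encode([original, rewrite], convert_to_tensor=True)
    distance = 1.0 - util.cos_sim(emb[0], emb[1]).item()
    return distance < MAX_COSINE_DISTANCE

def within_tone_budget(original: str, rewrite: str) -> bool:
    """Cap the polarity shift introduced by the rewrite."""
    shift = abs(vader.polarity_scores(rewrite)["compound"]
                - vader.polarity_scores(original)["compound"])
    return shift <= MAX_POLARITY_SHIFT

def accept_rewrite(original: str, rewrite: str) -> bool:
    """A candidate line must satisfy both budgets before reaching the earpieces."""
    return within_semantic_budget(original, rewrite) and within_tone_budget(original, rewrite)
```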
A lightweight RLHF (Reinforcement Learning from Human Feedback) head enforces a drift budget that rises slowly across forty minutes, modelling the gradual semantic creep observed in long-form interactions (Benkirane et al. 2024). Any rewrite breaching the 95th-percentile perplexity reported in recent multilingual hallucination benchmarks triggers automatic resampling (Meta AI 2025; “How Much Do LLMs Hallucinate…” 2025). The result is a calibrated rhythm of divergence—subtle enough to remain plausible, yet cumulative enough to become narratively legible.
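Schematically, this stage behaves like a rejection-sampling loop wrapped around the rewrite model. The sketch below is our own illustration, assuming a simple linear budget over the 40-minute arc; `generate_rewrite`, `perplexity`, and the numeric ceiling are placeholders rather than the production components.

```python
# Schematic of the escalating drift budget and perplexity guard. generate_rewrite()
# and perplexity() are placeholders for the constrained Mistral-7B call and the
# benchmark-derived scorer; PPL_CEILING is an assumed value, not the published one.
PERFORMANCE_MINUTES = 40
PPL_CEILING = 180.0
MAX_RESAMPLES = 3

def drift_budget(elapsed_min: float) -> float:
    """Allowed drift intensity in [0, 1], rising slowly across the performance."""
    return min(1.0, max(0.0, elapsed_min / PERFORMANCE_MINUTES))

def generate_rewrite(line: str, intensity: float) -> str:
    """Stub for the Mistral-7B rewrite at a given drift intensity."""
    return line  # echoes the input so the sketch runs end to end

def perplexity(text: str) -> float:
    """Stub for the multilingual perplexity scorer."""
    return 0.0

def rewrite_with_guard(line: str, elapsed_min: float) -> str:
    """Resample any candidate whose perplexity breaches the ceiling."""
    intensity = drift_budget(elapsed_min)
    candidate = generate_rewrite(line, intensity)
    for _ in range(MAX_RESAMPLES):
        if perplexity(candidate) <= PPL_CEILING:
            return candidate
        candidate = generate_rewrite(line, intensity)
    return line  # fall back to the unaltered line if nothing passes
```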
Recognition latency is minimized through a predictive buffer that streams partial tokens every ≈50 ms and self-corrects upon final hypothesis. End-to-end, actors receive rewritten lines within 300 ms, which preserves the microtiming of turn-taking—critical for maintaining conversational plausibility (Jefferson 1989). Anything slower would tip the system from invisible mediation to overt interruption, breaking the illusion of ordinary dialogue and shifting focus from epistemic nuance to technical glitch.
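The predictive buffer can be pictured as a small state object that emits provisional text on a roughly 50 ms clock and silently overwrites it once the final hypothesis lands. The sketch below is an illustrative model of that self-correction step, not the production code.

```python
# Illustrative model of the predictive buffer: provisional text is emitted on a
# ~50 ms clock and silently replaced when the final ASR hypothesis arrives.
import time
from dataclasses import dataclass, field

@dataclass
class PredictiveBuffer:
    emit_interval: float = 0.05              # ≈50 ms between partial emissions
    history: list = field(default_factory=list)
    _last_emit: float = 0.0
    _current_text: str = ""

    def push_partial(self, hypothesis: str) -> None:
        """Emit a provisional hypothesis if the emission window has elapsed."""
        now = time.monotonic()
        if now - self._last_emit >= self.emit_interval:
            self._current_text = hypothesis
            self.history.append(("partial", hypothesis))
            self._last_emit = now

    def finalize(self, hypothesis: str) -> str:
        """Self-correct: the final hypothesis overwrites any partial emissions."""
        self._current_text = hypothesis
        self.history.append(("final", hypothesis))
        return self._current_text
```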
After the manipulated line is finalised (whether originally spoken in Azerbaijani, Russian, or English), it is routed through two distilled MarianMT checkpoints to generate the remaining language versions. Because the models are compressed for laptop inference, they introduce low-level artefacts such as gender mismatches, idiom bleed, and occasional false cognates. Rather than treating these artefacts as defects, the production foregrounds them as dramaturgical texture that extends Apter’s “translation zone,” where meaning is continually renegotiated across linguistic borders (Apter 2006).
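For orientation, the sketch below shows the shape of such a cross-translation step using Hugging Face’s MarianMT classes. The English–Russian entries correspond to released Helsinki-NLP models, while the Azerbaijani directions are placeholders for the distilled in-house checkpoints, which are not published here.

```python
# Sketch of the cross-translation step with Hugging Face MarianMT. In production
# the models would be loaded once at start-up, not on every call.
from transformers import MarianMTModel, MarianTokenizer

CHECKPOINTS = {
    ("en", "ru"): "Helsinki-NLP/opus-mt-en-ru",
    ("ru", "en"): "Helsinki-NLP/opus-mt-ru-en",
    # ("az", "en"), ("en", "az"), ...: distilled checkpoints not published here
}

def translate(text: str, src: str, tgt: str) -> str:
    """Translate one finalised line into a second surtitle language."""
    name = CHECKPOINTS[(src, tgt)]
    tokenizer = MarianTokenizer.from_pretrained(name)
    model = MarianMTModel.from_pretrained(name)
    batch = tokenizer([text], return_tensors="pt", padding=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```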
The three synchronised streams are projected above the stage, while an adjacent panel scrolls the untouched verbatim transcript captured by Whisper. Drift intensity is conveyed via a chromatic key—green (synonym swap), amber (tonal modulation), and red (structural reorder)—so that audiences can read what Hansen (2006) terms the infra-empirical in a single glance.
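The chromatic key amounts to a small piece of configuration. The sketch below shows how each translated line might be bundled with its colour for the projection layer; field names are illustrative, not taken from the production renderer.

```python
# Sketch of the chromatic key as configuration; field names are illustrative.
from typing import Optional

DRIFT_COLOURS = {
    "semantic_drift": "green",     # synonym swap
    "tone_modulation": "amber",    # polarity shift
    "temporal_shuffle": "red",     # structural reorder
}

def surtitle_payload(line_id: int, lang: str, text: str,
                     drift_type: Optional[str]) -> dict:
    """Bundle one translated line with its drift colour for projection."""
    return {
        "line_id": line_id,
        "lang": lang,                                       # "az", "ru", or "en"
        "text": text,
        "colour": DRIFT_COLOURS.get(drift_type, "white"),   # white = unaltered
    }
```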
A lightweight progressive-web app (PWA) is scheduled for the next iteration. The PWA will allow spectators to toggle language layers and flag any line they perceive as “off,” generating an aggregate heat-map of fracture points that is logged with system output. During the present probe, static colour-coded slides served the projection role and feedback was collected on paper, allowing usability insights without adding latency. The interactive layer will be introduced once end-to-end timing stabilises, transforming audience perception into empirical data that can refine both dramaturgical pacing and computational parameters for future performances.
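As a sketch of what the planned feedback loop could look like, the following minimal back end (our own illustration, using FastAPI) accepts audience flags per projected line and aggregates them into the heat-map that would be logged with system output. Endpoint and field names are assumptions, not a published interface.

```python
# Illustrative back end for the planned audience-flagging PWA (FastAPI). Spectators
# POST the id of any line they perceive as "off"; counts aggregate into a heat-map
# logged alongside system output.
from collections import Counter

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
flag_counts = Counter()            # line_id -> number of audience flags

class Flag(BaseModel):
    line_id: int
    lang: str                      # which surtitle stream felt "off"

@app.post("/flag")
def flag_line(flag: Flag) -> dict:
    """Record one audience flag for a projected line."""
    flag_counts[flag.line_id] += 1
    return {"line_id": flag.line_id, "total_flags": flag_counts[flag.line_id]}

@app.get("/heatmap")
def heatmap() -> dict:
    """Return the aggregate fracture-point heat-map for logging and analysis."""
    return dict(flag_counts)
```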
All components—speech recognition, manipulation, translation—run concurrently at ≈85 W on an M3 Max or RTX 4070 laptop. A fifty-minute performance thus consumes roughly 0.07 kWh, less than a household appliance operating for the same duration (Patterson and Costanza-Chock 2023). Reducing the model to 4-bit weights produces compact checkpoints that can be copied to any rehearsal laptop, allowing new venues to mount the piece without the time or electricity spent on re-training. Such portability is more than a convenience: it gestures toward a modest production ethic. Critical projects that question AI infrastructures risk undercutting their argument if they depend on energy-intensive hardware (Crawford 2021). By demonstrating that the required drift can be generated on readily available consumer machines, Fractured Perception offers a small, practical example of how artistic inquiry into algorithmic systems can proceed without incurring unnecessary environmental costs.
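The energy figure follows directly from the stated draw:

```python
# Back-of-envelope energy check: 85 W average draw over a 50-minute performance.
power_watts = 85
minutes = 50
energy_kwh = power_watts * (minutes / 60) / 1000
print(f"{energy_kwh:.3f} kWh")  # ≈ 0.071 kWh
```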
Although this section highlights implementation details, every technical choice remains entangled with the piece’s dramaturgy and ethics. Latency measured in milliseconds directly shapes turn-taking: small delays allow the AI’s contribution to blend smoothly, while larger ones make its insertion noticeably disruptive. Compression strategies, in turn, dictate both the show’s carbon budget and its portability—two factors that decide which venues, and therefore which publics, can encounter the work. Finally, the fidelity of live translation governs how quickly multilingual meanings diverge, shaping the arc by which drift becomes visible to audience and actors alike. In short, hardware limits, model size, and linguistic accuracy are not neutral background parameters; they co-author the performance’s rhythm, reach, and critical force, binding practical constraints to the conceptual ambitions outlined in Sections 2 and 3.
Over the course of two days, a studio workshop evaluated an early-stage version of the live pipeline—live Whisper transcription, off-line Mistral drift triggered by a stage manager, and colour-coded surtitles rendered from Keynote. Three professional actors (Partner A, Partner B, Therapist) improvised under this setup while a systemic couples-counsellor and an NLP researcher observed.
The session explored three questions:
(i) Is calibrated drift perceptible yet plausible?
(ii) Which dramatic frame best exposes cumulative semantic fracture?
(iii) Do early latency figures support the sub-300 ms target set in Section 4?
Three short-listed conversational frames were trialled in 10-minute segments—court hearing, three-generations dinner, and couples therapy. The two experts converged on therapy as “the only setting where repetition is expected and micro-shifts sting emotionally.” Actors agreed that reflective-listening loops let them “ride the glitch” without breaking character. Accordingly, the first public edition of Fractured Perception will use the couples-therapy scenario, while the court-hearing and multigenerational-dinner frames remain compelling options for future iterations that seek to foreground institutional language or intergenerational code-switching.
Table 2 benchmarks minimal, moderate, and significant drift within one multilingual sentence. The therapist first reads the Original line; the stage manager then triggers a Green, Amber, or Red variant. (In the Original line, the Azerbaijani clause “Mən səni dinləyirəm” means “I am listening to you,” and the Russian fragments “но слышу” and “я не знаю” mean “but I hear” and “I don’t know.”) The psychotherapist spotted Amber and Red edits in 7 of 8 trials, describing Red as “ethically edgy but theatrically electric.” The NLP advisor confirmed that all variants stayed under the 95th-percentile perplexity ceiling for Azerbaijani→English drift (Meta AI 2025). Actors rated Green “virtually invisible,” Amber “productively unsettling,” and Red “narrative-changing.”
| Variant | Multilingual spoken sentence with drifted text in bold | Indicative drift |
|---|---|---|
| Original | Mən səni dinləyirəm, но слышу hesitation before you say “я не знаю”, and it feels like you’re holding back. | — |
| Green | Mən səni dinləyirəm, но слышу **pause** before you say “я не знаю”, and it feels like you’re holding back. | Near-synonym swap |
| Amber | Mən səni dinləyirəm, но слышу **reluctance** before you say “I don’t know”, and it seems like you’re holding something inside. | Lexical shift + language switch |
| Red | Mən səni dinləyirəm, но слышу **growing silence** before you say “you don’t want to talk”, and it’s widening the gap between you two. | Semantic escalation; narrative reframing |
During the probe, the system operated under manual cueing, yet the core latency metrics proved promising. The mean round-trip—from spoken input to drifted delivery—averaged 278 ms (σ = 31 ms), remaining within the 300 ms conversational threshold that linguistic studies identify as critical for preserving natural turn-taking (Jefferson 1989). This is not a trivial parameter: micro-timing functions as an aesthetic hinge as much as an engineering constraint. When a single latency spike reached 420 ms, the resulting hesitation broke the illusion of fluidity and produced unintended comedy, underscoring how timing mediates the perceptual boundary between “subtle recalibration” and “obtrusive error.”
At this stage, the probe intentionally excluded two architectural components, the live RLHF-based drift controller and the audience-tagging PWA, substituting static drift levels and paper-based annotation protocols. This constraint, while limiting interactivity, allowed us to focus on fundamental perceptual questions: Could calibrated drift, absent algorithmic escalation and dynamic feedback, still register as both plausible and dramaturgically potent? Preliminary evidence suggests yes. Experts and actors alike reported that, even with a skeleton pipeline, semantic instability became theatrically legible—a validation of the design logic articulated in Sections 2 and 3.
Moving forward, the next increment prioritises three domains of enhancement:
(i) Inline drift control: integrating the RLHF head to automate drift-escalation curves rather than relying on pre-set tiers, aligning experimental rigour with dramaturgical pacing.
(ii) Latency hardening: migrating Whisper.cpp from server-assisted inference to full on-device execution, eliminating network-induced jitter and reducing the risk of perceptual breakdowns in live contexts.
(iii) Interactive annotation: replacing static Keynote projections with the lightweight progressive web app (PWA) described in Section 4, enabling spectators to tag perceived drift in real time and turning audience reception into an empirical signal for both research and adaptive performance strategies.
These refinements collectively aim to deepen the system’s dual function as aesthetic artifact and experimental instrument. The findings confirm that the pipeline, even in reduced form, can expose AI-driven semantic drift in a manner intelligible to theatre practitioners and computational experts alike, validating the methodological premises while charting a clear roadmap for the next iteration.
This study demonstrates that carefully calibrated linguistic drift can be rendered theatrically perceptible without departing from the error rates of contemporary multilingual models. By situating AI intervention inside the redundancy-rich discourse of couples therapy, the piece shows how even minimal deviations unsettle memory, agency, and trust—turning an otherwise invisible statistical tendency into shared dramatic experience. The probe confirms that amplification need not fabricate failure; it can illuminate the biases and instabilities already latent in everyday language technologies while retaining narrative plausibility.
Looking ahead, we will extend Fractured Perception along two axes. Thematically, alternative frames such as court-room examination and intergenerational family dinner will test how power hierarchies and conversational norms modulate the audience’s sensitivity to drift. Linguistically, future editions will move beyond the Azerbaijani–Russian–English triad to explore pairings like Turkish–Finnish–English or Spanish–Arabic–Catalan, asking whether cultural repair practices shift perceptual thresholds. Each iteration will add new, anonymised interaction logs to the open Mu-SHROOM-Live corpus, creating a comparative resource for scholars of performance and computational linguistics alike.
By combining low-carbon production methods with transparent data release, the project offers a model for how artistic research can interrogate AI systems without reproducing their environmental or epistemic costs. In doing so, Fractured Perception positions performance as a necessary test-bench for the social futures of generative language, inviting audiences—and the research community—to confront the question of how much algorithmic drift we are willing to let pass for dialogue.
Akrich, Madeleine, and Bruno Latour. 1992. “A Summary of a Convenient Knowledge about Intermediaries.” In Shaping Technology/Building Society: Studies in Sociotechnical Change, edited by Wiebe E. Bijker and John Law, 259–264. Cambridge, MA: MIT Press.
Apter, Emily. 2006. The Translation Zone: A New Comparative Literature. Princeton, NJ: Princeton University Press.
Barad, Karen. 2007. Meeting the Universe Halfway: Quantum Physics and the Entanglement of Matter and Meaning. Durham, NC: Duke University Press.
Benkirane, Rania, Laura Gongas, Shahar Pelles, Naomi Fuchs, Joshua Darmon, Pontus Stenetorp, David Ifeoluwa Adelani, and Eduardo Sánchez. 2024. “Machine Translation Hallucination Detection for Low- and High-Resource Languages.” Findings of the Association for Computational Linguistics: EMNLP 2024, 9647–9665.
Crawford, Kate. 2021. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven, CT: Yale University Press.
Derrida, Jacques. 1976. Of Grammatology. Translated by Gayatri Chakravorty Spivak. Baltimore: Johns Hopkins University Press.
Dorsen, Annie. 2010. Hello Hi There [Performance]. Brooklyn Academy of Music, New York.
———. 2023. Prometheus Firebringer [Performance]. Théâtre de l’Odéon, Paris.
Floridi, Luciano. 2011. The Philosophy of Information. Oxford: Oxford University Press.
Hansen, Mark B. N. 2006. Bodies in Code: Interfaces with Digital Media. New York: Routledge.
“How Much Do LLMs Hallucinate across Languages?” 2025. arXiv preprint arXiv:2505.01234.
Jefferson, Gail. 1989. “Preliminary Notes on a Possible Metric Which Provides for a ‘Standard Maximum’ Silence of Approximately One Second in Conversation.” In Conversation: An Interdisciplinary Perspective, edited by Derek Roger and Peter Bull, 166–196. Clevedon: Multilingual Matters.
Lakoff, George, and Mark Johnson. 1999. Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York: Basic Books.
McIntyre, Lee. 2018. Post-Truth. Cambridge, MA: MIT Press.
Meta AI. 2025. “Cross-Lingual Drift and Hallucination Benchmarks.” Technical report, Meta AI Research.
Orwell, George. 1949. Nineteen Eighty-Four. London: Secker & Warburg.
Patterson, David, and Sasha Costanza-Chock. 2023. “Low-Carbon Computation: Toward Sustainable AI Systems.” White paper.
Peräkylä, Anssi. 2019. Conversation Analysis and Psychotherapy: Identifying Transformative Sequences. Cambridge: Cambridge University Press.
Sakaguchi, Keisuke, Chandra Bhagavatula, Ronan Le Bras, Niket Tandon, Peter Clark, and Yejin Choi. 2021. “ProScript: Partially Ordered Scripts Generation.” Findings of the Association for Computational Linguistics: EMNLP 2021, 2138–2149. https://doi.org/10.18653/v1/2021.findings-emnlp.184
Saussure, Ferdinand de. 1916. Course in General Linguistics. Edited by Charles Bally and Albert Sechehaye; translated by Wade Baskin. Paris: Payot.
Strong, Tom, Arthur W. Frank, and Laurel Young. 2011. Therapeutic Conversations: Bringing Theory and Practice Together. Toronto: University of Toronto Press.
Suchman, Lucy A. 2007. Human–Machine Reconfigurations: Plans and Situated Actions. 2nd ed. Cambridge: Cambridge University Press.
Venuti, Lawrence. 1995. The Translator’s Invisibility: A History of Translation. London: Routledge.
Wardle, Claire, and Hossein Derakhshan. 2017. “Information Disorder: Toward an Interdisciplinary Framework for Research and Policy Making.” Council of Europe report DGI(2017)09.
Weizenbaum, Joseph. 1966. “ELIZA—A Computer Program for the Study of Natural Language Communication between Man and Machine.” Communications of the ACM 9 (1): 36–45.
Rimini Protokoll. 2018. Uncanny Valley [Performance]. Münchner Kammerspiele, Munich.
SemEval-2025 Task 3 Mu-SHROOM. 2025. “Mu-SHROOM: Multilingual Semantic Hallucination Benchmark.” In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), Helsinki, 115–124.
Fractured Perception Project Team. 2025. “Fractured Perception: Workshop Field Notes.” Baku, Azerbaijan, 20–22 June 2025.