Search | arXiv e-print repository

The Drama Machine: Simulating Character Development with LLM Agents

Authors: Liam Magee, Vanicka Arora, Gus Gollings, Norma Lam-Saw

Abstract: This paper explores use of multiple large language model (LLM) agents to simulate complex, dynamic characters in dramatic scenarios. We introduce a 'drama machine' framework that coordinates interactions between LLM agents playing different 'Ego' and 'Superego' psychological roles. In roleplay simulations, this design allows intersubjective dialogue and intra-subjective internal monologue to develop in parallel. We apply this framework to two dramatic scenarios - an interview and a detective story - and compare character development with and without the Superego's influence. Though exploratory, results suggest this multi-agent approach can produce more nuanced, adaptive narratives that evolve over a sequence of dialogical turns. We discuss different modalities of LLM-based roleplay and character development, along with what this might mean for conceptualization of AI subjectivity. The paper concludes by considering how this approach opens possibilities for thinking of the roles of internal conflict and social performativity in AI-based simulation.

Submitted 3 August, 2024; originally announced August 2024.

Comments: 28 pages, 2 figures

ACM Class: J.4; J.5; K.4.2

arXiv:2405.09734 [pdf]

Attention is All You Want: Machinic Gaze and the Anthropocene

Authors: Liam Magee, Vanicka Arora

Abstract: This chapter experiments with ways computational vision interprets and synthesises representations of the Anthropocene. Text-to-image systems such as MidJourney and StableDiffusion, trained on large data sets of harvested images and captions, yield often striking compositions that serve, alternately, as banal reproduction, alien imaginary and refracted commentary on the preoccupations of Internet visual culture. While the effects of AI on visual culture may themselves be transformative or catastrophic, we are more interested here in how it has been trained to imagine shared human, technical and ecological futures. Through a series of textual prompts that marry elements of the Anthropocenic and Australian environmental vernacular, we examine how this emergent machinic gaze both looks out, through its compositions of futuristic landscapes, and looks back, towards an observing and observed human subject. In its varied assistive, surveillant and generative roles, computational vision not only mirrors human desire but articulates oblique demands of its own.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 19 pages

ACM Class: K.4.2; J.5

arXiv:2405.06919 [pdf, other]

Automating Thematic Analysis: How LLMs Analyse Controversial Topics

Authors: Awais Hameed Khan, Hiruni Kegalle, Rhea D'Silva, Ned Watt, Daniel Whelan-Shamy, Lida Ghahremanlou, Liam Magee

Abstract: Large Language Models (LLMs) are promising analytical tools. They can augment human epistemic, cognitive and reasoning abilities, and support 'sensemaking', making sense of a complex environment or subject by analysing large volumes of data with a sensitivity to context and nuance absent in earlier text processing systems. This paper presents a pilot experiment that explores how LLMs can support thematic analysis of controversial topics. We compare how human researchers and two LLMs, GPT-4 and Llama 2, categorise excerpts from media coverage of the controversial Australian Robodebt scandal. Our findings highlight intriguing overlaps and variances in thematic categorisation between human and machine agents, and suggest where LLMs can be effective in supporting forms of discourse and thematic analysis. We argue LLMs should be used to augment, and not replace, human interpretation, and we add further methodological insights and reflections to existing research on the application of automation to qualitative research methods. We also introduce a novel card-based design toolkit for both researchers and practitioners to further interrogate LLMs as analytical tools.

Submitted 11 May, 2024; originally announced May 2024.

Comments: 18 pages, 6 figures

ACM Class: K.4.2

arXiv:2401.00210 [pdf, other]

The Problem of Alignment

Authors: Tsvetelina Hristova, Liam Magee, Karen Soldatic

Abstract: Large Language Models produce sequences learned as statistical patterns from large corpora. In order not to reproduce corpus biases, after initial training models must be aligned with human values, preferencing certain continuations over others. Alignment, which can be viewed as the superimposition of normative structure onto a statistical model, reveals a conflicted and complex interrelationship between language and technology. This relationship shapes theories of language, linguistic practice and subjectivity, which are especially relevant to the current sophistication in artificially produced text. We examine this practice of structuration as a two-way interaction between users and models by analysing how ChatGPT4 redacts perceived 'anomalous' language in fragments of Joyce's Ulysses and the new linguistic practice of prompt engineering. We then situate this alignment problem historically, revisiting earlier postwar linguistic debates which counterposed two views of meaning: as discrete structures, and as continuous probability distributions. We discuss the largely occluded work of the Moscow Linguistic School, which sought to reconcile this opposition. Our attention to the Moscow School and later related arguments by Searle and Kristeva casts the problem of alignment in a new light: as one involving attention to the social structuration of linguistic practice, including structuration of anomalies that, like the Joycean text, exist in defiance of expressive conventions. These debates around the communicative orientation toward language can help explain some of the contemporary behaviours and interdependencies that take place between users and LLMs.

Submitted 30 December, 2023; originally announced January 2024.

Comments: 23 pages, 1 figure

ACM Class: K.4.2

arXiv:2312.14424 [pdf, other]

Lost in the Logistical Funhouse: Speculative Design as Synthetic Media Enterprise

Authors: Zoe Horn, Liam Magee, Anna Munster

Abstract: From the deployment of chatbots as procurement negotiators by corporations such as Walmart to autonomous agents providing 'differentiated chat' for managing overbooked flights, synthetic media are making the world of logistics their 'natural' habitat. Here the coordination of commodities, parts and labour design the problems and produce the training sets from which 'solutions' can be synthesised. But to what extent might synthetic media, surfacing via proto-platforms such as MidJourney and OpenAI and apps such as Eleven Labs and D:ID, be understood as logistical media? This paper details synthetic media experiments with 'ChatFOS', a GPT-based bot tasked with developing a logistics design business. Using its prompt-generated media outputs, we assemble a simulation and parody of AI's emerging functionalities within logistical worlds. In the process, and with clunky 'human-in-the-loop' stitching, we illustrate how large language models become media routers or switches, governing production of image prompts, website code, promotional copy, and investor pitch scenarios. Together these elements become links chained together in media ensembles such as the corporate website or the promotional video, fuelling the fictive logistics visualisation company we have 'founded'. The processes and methods of producing speculative scenarios via ChatFOS lead us to consider how synthetic media might be re-positioned as logistical media. Our experiments probe the ways in which the media of logistics and the logistics of media are increasingly enfolded. We ask: what can a (practice-based) articulation of this double-becoming of logistics and synthetic mediality tell us about the politics and aesthetics of contemporary computation and capital?

Submitted 21 December, 2023; originally announced December 2023.

Comments: 16 pages, 5 figures

ACM Class: K.4.2; K.4.3; J.5

arXiv:2312.04777 [pdf]

Inclusive Online Learning in Australia: Barriers and Enablers

Authors: Linda Marsden, Luke Munn, Liam Magee, Matthew Ferrinda, Justin St. Pierre, Amanda Third

Abstract: While the pandemic highlighted the critical role technology plays in children's lives, not all Australian children have reliable access to technology. This situation exacerbates educational disadvantage for children who are already amongst our nation's most vulnerable. In this research project, we carried out a pilot project with three schools in Western Australia, conducting a series of workshops and interviews with students, parents, school staff members, and teachers. Drawing on rich empirical material, we identify key barriers and enablers for digitally inclusive online learning at the individual, interpersonal, organizational, and infrastructural levels. Of particular importance is that technology is only part of this story - an array of social, environmental, and skills "infrastructure" is needed to facilitate inclusive online learning. Building on this finding, we ran a Digital Inclusion Studio to address this holistic set of issues with strongly positive feedback from participants. We conclude with a set of recommendations for stakeholders (parents, schools, government agencies) who wish to support more digitally inclusive learning.

Submitted 7 December, 2023; originally announced December 2023.

Comments: 22 pages, 1 figure

arXiv:2310.04628 [pdf, other]

(Re)framing Built Heritage through the Machinic Gaze

Authors: Vanicka Arora, Liam Magee, Luke Munn

Abstract: Built heritage has been both subject and product of a gaze that has been sustained through moments of colonial fixation on ruins and monuments, technocratic examination and representation, and fetishisation by a global tourist industry. We argue that the recent proliferation of machine learning and vision technologies creates new scopic regimes for heritage: storing and retrieving existing images from vast digital archives, and further imparting their own distortions upon its visual representation. We introduce the term 'machinic gaze' to conceptualise the reconfiguration of heritage representation via AI models. To explore how this gaze reframes heritage, we deploy an image-text-image pipeline that reads, interprets, and resynthesizes images of several UNESCO World Heritage Sites. Employing two concepts from media studies -- heteroscopia and anamorphosis -- we describe the reoriented perspective that machine vision systems introduce. We propose that the machinic gaze highlights the artifice of the human gaze and its underlying assumptions and practices that combine to form established notions of heritage.

Submitted 6 October, 2023; originally announced October 2023.

Comments: 18 pages, 5 figures

ACM Class: J.5; K.4.2

arXiv:2309.16045 [pdf, other]

Minimum Monotone Tree Decomposition of Density Functions Defined on Graphs

Authors: Lucas Magee, Yusu Wang

Abstract: Monotone trees, trees with a function defined on their vertices that decreases the further away from a root node one travels, are a natural model for a process that weakens the further one gets from its source. Given an aggregation of monotone trees, one may wish to reconstruct the individual monotone components. A natural representation of such an aggregation would be a graph. While many methods have been developed for extracting hidden graph structure from datasets, which makes obtaining such an aggregation possible, decomposing such graphs into the original monotone trees is algorithmically challenging. Recently, a polynomial time algorithm has been developed to extract a minimum cardinality collection of monotone trees (M-Tree Set) from a given density tree - but no such algorithm exists for density graphs that may contain cycles. In this work, we prove that extracting such minimum M-Tree Sets of density graphs is NP-Complete. We additionally prove that three variations of the problem - such as the minimum M-Tree Set such that the intersection between any two monotone trees is either empty or contractible (SM-Tree Set) - are also NP-Complete. We conclude by providing some approximation algorithms, highlighted by a 3-approximation algorithm for computing the minimum SM-Tree Set for density cactus graphs.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2308.15668 [pdf]

doi 10.1177/10778004221099560

Intersectional Inquiry, on the Ground and in the Algorithm

Authors: Shanthi Robertson, Liam Magee, Karen Soldatić

Abstract: This article makes two key contributions to methodological debates in automation research. First, we argue for and demonstrate how methods in this field must account for intersections of social difference, such as race, class, ethnicity, culture, and disability, in more nuanced ways. Second, we consider the complexities of bringing together computational and qualitative methods in an intersectional methodological approach while also arguing that in their respective subjects (machines and human subjects) and conceptual scope they enable a specific dialogue on intersectionality and automation to be articulated. We draw on field reflections from a project that combines an analysis of intersectional bias in language models with findings from a community workshop on the frustrations and aspirations produced through engagement with everyday AI-driven technologies in the context of care.

Submitted 29 August, 2023; originally announced August 2023.

ACM Class: K.4.2

Journal ref: Qualitative Inquiry, 28(7), 814-826 (2022)

arXiv:2307.09753 [pdf, other]

Unmaking AI Imagemaking: A Methodological Toolkit for Critical Investigation

Authors: Luke Munn, Liam Magee, Vanicka Arora

Abstract: AI image models are rapidly evolving, disrupting aesthetic production in many industries. However, understanding of their underlying archives, their logic of image reproduction, and their persistent biases remains limited. What kind of methods and approaches could open up these black boxes? In this paper, we provide three methodological approaches for investigating AI image models and apply them to Stable Diffusion as a case study. Unmaking the ecosystem analyzes the values, structures, and incentives surrounding the model's production. Unmaking the data analyzes the images and text the model draws upon, with their attendant particularities and biases. Unmaking the output analyzes the model's generative results, revealing its logics through prompting, reflection, and iteration. Each mode of inquiry highlights particular ways in which the image model captures, "understands," and recreates the world. This accessible framework supports the work of critically investigating generative AI image models and paves the way for more socially and politically attuned analyses of their impacts in the world.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 14 pages, 4 figures

ACM Class: K.4.1; K.2; J.5

arXiv:2301.12347 [pdf, other]

Academic Institutions in Multilateral Data Governance: Emerging Arrangements for Negotiating Risk, Value and Ethics in the Big Data Economy

Authors: Tsvetelina Hristova, Liam Magee, Emma Kearney

Abstract: Data sharing partnerships are increasingly an imperative for research institutions and, at the same time, a challenge for established models of data governance and ethical research oversight. We analyse four cases of data partnership involving academic institutions and examine the role afforded to the research partner in negotiating the relationship between risk, value, trust and ethics. Within this terrain, far from being a restraint on financialisation, the instrumentation of ethics forms part of the wider mobilisation of infrastructure for the realisation of profit in the big data economy. Under what we term 'combinatorial data governance' academic structures for the management of research ethics are instrumentalised as organisational functions that serve to mitigate reputational damage and societal distrust. In the alternative model of 'experimental data governance' researchers propose frameworks and instruments for the rethinking of data ethics and the risks associated with it - a model that is promising but limited in its practical application.

Submitted 28 January, 2023; originally announced January 2023.

Comments: 21 pages, 5 figures

arXiv:2301.12066 [pdf, other]

Truth Machines: Synthesizing Veracity in AI Language Models

Authors: Luke Munn, Liam Magee, Vanicka Arora

Abstract: As AI technologies are rolled out into healthcare, academia, human resources, law, and a multitude of other domains, they become de-facto arbiters of truth. But truth is highly contested, with many different definitions and approaches. This article discusses the struggle for truth in AI systems and the general responses to date. It then investigates the production of truth in InstructGPT, a large language model, highlighting how data harvesting, model architectures, and social feedback mechanisms weave together disparate understandings of veracity. It conceptualizes this performance as an operationalization of truth, where distinct, often conflicting claims are smoothly synthesized and confidently presented into truth-statements. We argue that these same logics and inconsistencies play out in Instruct's successor, ChatGPT, reiterating truth as a non-trivial problem. We suggest that enriching sociality and thickening "reality" are two promising vectors for enhancing the truth-evaluating capacities of future language models. We conclude, however, by stepping back to consider AI truth-telling as a social practice: what kind of "truth" do we as listeners desire?

Submitted 27 January, 2023; originally announced January 2023.

Comments: 20 pages, 3 figures

arXiv:2212.05058 [pdf, other]

Structured Like a Language Model: Analysing AI as an Automated Subject

Authors: Liam Magee, Vanicka Arora, Luke Munn

Abstract: Drawing from the resources of psychoanalysis and critical media studies, in this paper we develop an analysis of Large Language Models (LLMs) as automated subjects. We argue the intentional fictional projection of subjectivity onto LLMs can yield an alternate frame through which AI behaviour, including its productions of bias and harm, can be analysed. First, we introduce language models, discuss their significance and risks, and outline our case for interpreting model design and outputs with support from psychoanalytic concepts. We trace a brief history of language models, culminating with the releases, in 2022, of systems that realise state-of-the-art natural language processing performance. We engage with one such system, OpenAI's InstructGPT, as a case study, detailing the layers of its construction and conducting exploratory and semi-structured interviews with chatbots. These interviews probe the model's moral imperatives to be helpful, truthful and harmless by design. The model acts, we argue, as the condensation of often competing social desires, articulated through the internet and harvested into training data, which must then be regulated and repressed. This foundational structure can however be redirected via prompting, so that the model comes to identify with, and transfer, its commitments to the immediate human subject before it. In turn, these automated productions of language can lead to the human subject projecting agency upon the model, effecting occasionally further forms of countertransference. We conclude that critical media methods and psychoanalytic theory together offer a productive frame for grasping the powerful new capacities of AI-driven language systems.

Submitted 8 December, 2022; originally announced December 2022.

arXiv:2109.07606 [pdf, other]

Graph skeletonization of high-dimensional point cloud data via topological method

Authors: Lucas Magee, Yusu Wang

Abstract: Geometric graphs form an important family of hidden structures behind data. In this paper, we develop an efficient and robust algorithm to infer a graph skeleton of a high-dimensional point cloud dataset (PCD). Previously, there has been much work to recover a hidden graph from a low-dimensional density field, or from a relatively clean high-dimensional PCD. Our proposed approach builds upon the recent line of work on using a persistence-guided discrete Morse (DM) theory based approach to reconstruct a geometric graph from a density field defined over a low-dimensional triangulation. In particular, we first give a very simple generalization of this DM-based algorithm from a density-function perspective to a general filtration perspective. On the theoretical front, we show that the output of the generalized algorithm contains a so-called lexicographic-optimal persistent cycle basis w.r.t the input filtration, justifying that the output is indeed meaningful. On the algorithmic front, the generalization allows us to combine sparsified weighted Rips filtration to develop a new graph reconstruction algorithm for noisy point cloud data. The new algorithm is robust to background noise and non-uniform distribution of input points, and we provide various experimental results to show its effectiveness.

Submitted 13 October, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

arXiv:2107.07691 [pdf, other]

Intersectional Bias in Causal Language Models

Authors: Liam Magee, Lida Ghahremanlou, Karen Soldatic, Shanthi Robertson

Abstract: To examine whether intersectional bias can be observed in language generation, we examine GPT-2 and GPT-NEO models, ranging in size from 124 million to ~2.7 billion parameters. We conduct an experiment combining up to three social categories - gender, religion and disability - into unconditional or zero-shot prompts used to generate sentences that are then analysed for sentiment. Our results confirm earlier tests conducted with auto-regressive causal models, including the GPT family of models. We also illustrate why bias may be resistant to techniques that target single categories (e.g. gender, religion and race), as it can also manifest, in often subtle ways, in texts prompted by concatenated social categories. To address these difficulties, we suggest technical and community-based approaches need to combine to acknowledge and address complex and intersectional language model bias.

Submitted 15 July, 2021; originally announced July 2021.

Comments: 18 pages, 4 figures

arXiv:2106.14269 [pdf]

doi 10.1109/TEM.2022.3152216

Deep Learning for Technical Document Classification

Authors: Shuo Jiang, Jie Hu, Christopher L. Magee, Jianxi Luo

Abstract: In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and automated document classification. Prior studies have only focused on processing text for classification, whereas technical documents often contain multimodal information. To leverage multimodal information for document classification to improve the model performance, this paper presents a novel multimodal deep learning architecture, TechDoc, which utilizes three types of information, including natural language texts and descriptive images within documents and the associations among the documents. The architecture synthesizes the convolutional neural network, recurrent neural network, and graph neural network through an integrated training process. We applied the architecture to a large multimodal technical document database and trained the model for classifying documents based on the hierarchical International Patent Classification system. Our results show that TechDoc presents a greater classification accuracy than the unimodal methods and other state-of-the-art benchmarks. The trained model can potentially be scaled to millions of real-world multimodal technical documents, which is useful for data and knowledge management in large technology companies and organizations.

Submitted 19 February, 2022; v1 submitted 27 June, 2021; originally announced June 2021.

Comments: 16 pages, 8 figures, 9 tables

Journal ref: IEEE Transactions on Engineering Management. Published Online. 2022

arXiv:2104.05608 [pdf, other]

doi 10.1615/IntJMultCompEng.2022042266

Equivariant geometric learning for digital rock physics: estimating formation factor and effective permeability tensors from Morse graph

Authors: Chen Cai, Nikolaos Vlassis, Lucas Magee, Ran Ma, Zeyu Xiong, Bahador Bahmani, Teng-Fong Wong, Yusu Wang, WaiChing Sun

Abstract: We present an SE(3)-equivariant graph neural network (GNN) approach that directly predicts the formation factor and effective permeability from micro-CT images. FFT solvers are established to compute both the formation factor and effective permeability, while the topology and geometry of the pore space are represented by a persistence-based Morse graph. Together, they constitute the database for training, validating, and testing the neural networks. While the graph and Euclidean convolutional approaches both employ neural networks to generate low-dimensional latent space to represent the features of the micro-structures for forward predictions, the SE(3) equivariant neural network is found to generate more accurate predictions, especially when the training data is limited. Numerical experiments have also shown that the new SE(3) approach leads to predictions that fulfill the material frame indifference whereas the predictions from classical convolutional neural networks (CNN) may suffer from spurious dependence on the coordinate system of the training data. Comparisons among predictions inferred from training the CNN and those from graph convolutional neural networks (GNN) with and without the equivariant constraint indicate that the equivariant graph neural network seems to perform better than the CNN and GNN without enforcing equivariant constraints.

Submitted 12 October, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

arXiv:2004.13919 [pdf]

doi 10.1016/j.respol.2021.104294

Technological improvement rate estimates for all technologies: Use of patent data and an extended domain description

Authors: Anuraag Singh, Giorgio Triulzi, Christopher L. Magee

Abstract: In this work, we attempt to provide a comprehensive granular account of the pace of technological change. More specifically, we survey estimated yearly performance improvement rates for nearly all definable technologies for the first time. We do this by creating a correspondence of all patents within the US patent system to a set of technology domains. A technology domain is a body of patented inventions achieving the same technological function using the same knowledge and scientific principles. We obtain a set of 1757 domains using an extension of the previously defined classification overlap method (COM). These domains contain 97.14% of all patents within the entire US patent system. From the identified patent sets, we calculated the average centrality of the patents in each domain to estimate their improvement rates, following a methodology tested in prior work. The estimated improvement rates vary from a low of 1.9% per year for the Mechanical Skin treatment - Hair Removal and wrinkles domain to a high of 228.8% per year for the Network management - client-server applications domain. We developed a one-line descriptor identifying the technological function achieved and the underlying knowledge base for the largest 50, fastest 20 as well as slowest 20 of these domains, which cover more than forty percent of the patent system. In general, the rates of improvement were not a strong function of the patent set size and the fastest improving domains are predominantly software-based. We make available an online system that allows for automated searching for domains and improvement rates corresponding to any technology of interest to researchers, strategists and policy formulators.

Submitted 28 April, 2020; originally announced April 2020.

Journal ref: Technological Improvement Rate Predictions for All Technologies: Use of Patent Data and an Extended Domain Description. Research Policy 50 (9): 104294. 2021

arXiv:2004.02755 [pdf, other]

Detection and skeletonization of single neurons and tracer injections using topological methods

Authors: Dingkang Wang, Lucas Magee, Bing-Xing Huo, Samik Banerjee, Xu Li, Jaikishan Jayakumar, Meng Kuan Lin, Keerthi Ram, Suyi Wang, Yusu Wang, Partha P. Mitra

Abstract: Neuroscientific data analysis has traditionally relied on linear algebra and stochastic process theory. However, the tree-like shapes of neurons cannot be described easily as points in a vector space (the subtraction of two neuronal shapes is not a meaningful operation), and methods from computational topology are better suited to their analysis. Here we introduce methods from Discrete Morse (DM) Theory to extract the tree-skeletons of individual neurons from volumetric brain image data, and to summarize collections of neurons labelled by tracer injections. Since individual neurons are topologically trees, it is sensible to summarize the collection of neurons using a consensus tree-shape that provides a richer information summary than the traditional regional 'connectivity matrix' approach. The conceptually elegant DM approach lacks hand-tuned parameters and captures global properties of the data as opposed to previous approaches which are inherently local. For individual skeletonization of sparsely labelled neurons we obtain substantial performance gains over state-of-the-art non-topological methods (over 10% improvements in precision and faster proofreading). The consensus-tree summary of tracer injections incorporates the regional connectivity matrix information, but in addition captures the collective collateral branching patterns of the set of neurons connected to the injection site, and provides a bridge between single-neuron morphology and tracer-injection data.

Submitted 20 March, 2020; originally announced April 2020.

Comments: 20 pages (14 pages main-text and 6 pages supplementary information). 5 main-text figures. 5 supplementary figures. 2 supplementary tables

arXiv:2003.08741 [pdf]

doi 10.1115/DETC2020-22048

A Convolutional Neural Network-based Patent Image Retrieval Method for Design Ideation

Authors: Shuo Jiang, Jianxi Luo, Guillermo Ruiz Pava, Jie Hu, Christopher L. Magee

Abstract: The patent database is often used in searches of inspirational stimuli for innovative design opportunities because of its large size, extensive variety and rich design information in patent documents. However, most patent mining research only focuses on textual information and ignores visual information. Herein, we propose a convolutional neural network (CNN)-based patent image retrieval method. The core of this approach is a novel neural network architecture named Dual-VGG that is designed to accomplish two tasks: visual material type prediction and international patent classification (IPC) class label prediction. In turn, the trained neural network provides the deep features in the image embedding vectors that can be utilized for patent image retrieval and visual mapping. The accuracy of both training tasks and the patent image embedding space are evaluated to show the performance of our model. This approach is also illustrated in a case study of robot arm design retrieval. Compared to traditional keyword-based searching and Google image searching, the proposed method discovers more useful visual information for engineering design.

Submitted 19 May, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

Comments: 11 pages, 11 figures

Journal ref: ASME 2020 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference

arXiv:1806.06947 [pdf]

Forecasting the value of battery electric vehicles compared to internal combustion engine vehicles: the influence of driving range and battery technology

Authors: JongRoul Woo, Christopher L. Magee

Abstract: Battery electric vehicles (BEVs) are now clearly a promising candidate in addressing the environmental problems associated with conventional internal combustion engine vehicles (ICEVs). However, BEVs, unlike ICEVs, are still not widely accepted in the automobile market but continuing technological change could overcome this barrier. The aim of this study is to assess and forecast whether and when design changes and technological improvements related to major challenges in driving range and battery cost will make the user value of BEVs greater than the user value of ICEVs. Specifically, we estimate the relative user value of BEVs and ICEVs resulting after design modifications to achieve different driving ranges by considering the engineering trade-offs based on a vehicle simulation. Then, we analyze when the relative user value of BEVs is expected to exceed that of ICEVs as the energy density and cost of batteries improve because of ongoing technological change. Our analysis demonstrates that the relative value of BEVs is lower than that of ICEVs because BEVs have high battery cost and high cost of time spent recharging despite high torque, high fuel efficiency, and low fuel cost. Moreover, we found that the relative value differences between BEVs and ICEVs are smaller in high performance large cars than in low performance compact cars because BEVs can achieve high acceleration performance more easily than ICEVs. In addition, this study predicts that in approximately 2050, high performance large BEVs could have higher relative value than high performance large ICEVs because of technological improvements in batteries; however, low performance compact BEVs are still very likely to have significantly lower user value than comparable ICEVs until well beyond 2050.

Submitted 31 May, 2018; originally announced June 2018.

arXiv:1706.07140 [pdf]

Dynamic patterns of knowledge flows across technological domains: empirical results and link prediction

Authors: Jieun Kim, Christopher L. Magee

Abstract: The purpose of this study is to investigate the structure and evolution of knowledge spillovers across technological domains. Specifically, dynamic patterns of knowledge flow among 29 technological domains, measured by patent citations for eight distinct periods, are identified, and link prediction is tested for its capability to forecast the evolution of these cross-domain patent networks. The overall success of the predictions using the Katz metric implies that there is a tendency to generate increased knowledge flows mostly within the set of previously linked technological domains. This study contributes to innovation studies by characterizing the structural change and evolutionary behaviors in dynamic technology networks and by offering the basis for predicting the emergence of future technological knowledge flows.

Submitted 21 June, 2017; originally announced June 2017.

arXiv:1705.00258 [pdf]

Testing the science/technology relationship by analysis of patent citations of scientific papers after decomposition of both science and technology

Authors: Fang Han, Christopher L. Magee

Abstract: The relationship of scientific knowledge development to technological development is widely recognized as one of the most important and complex aspects of technological evolution. This paper adds to our understanding of the relationship through use of a more rigorous structure for differentiating among technologies based upon technological domains (defined as consisting of the artifacts over time that fulfill a specific generic function using a specific body of technical knowledge).

Submitted 29 April, 2017; originally announced May 2017.

Comments: 32 pages, 6 figures, 7 tables, the paper was presented at the 16th ISS conference

arXiv:1609.03806 [pdf]

Quantitative identification of technological discontinuities using simulation modeling

Authors: Hyunseok Park, Christopher L. Magee

Abstract: The aim of this paper is to develop and test metrics to quantitatively identify technological discontinuities in a knowledge network. We developed five metrics based on innovation theories and tested the metrics by a simulation model-based knowledge network and hypothetically designed discontinuity. The designed discontinuity is modeled as a node which combines two different knowledge streams and whose knowledge is dominantly persistent in the knowledge network. The performances of the proposed metrics were evaluated by how well the metrics can distinguish the designed discontinuity from other nodes on the knowledge network. The simulation results show that the persistence times # of converging main paths provides the best performance in identifying the designed discontinuity: the designed discontinuity was identified as one of the top 3 patents with 96~99% probability by Metric 5 and it is, according to the size of a domain, 12~34% better than the performance of the second best metric. Beyond the simulation analysis, we tested the metrics using a patent set representative of the Magnetic information storage domain. The three representative patents associated with a well-known breakthrough technology in the domain, the giant magneto-resistance (GMR) spin valve sensor, were selected based on the qualitative studies, and the metrics were tested by how well the metrics identify the selected patents as top-ranked patents. The empirical results fully support the simulation results and therefore the persistence times # of converging main paths is recommended for identifying technological discontinuities for any technology.

Submitted 13 September, 2016; originally announced September 2016.

Comments: 25 pages, 13 figures

arXiv:1608.07371 [pdf]

doi 10.1371/journal.pone.0170895

Tracing technological development trajectories: A genetic knowledge persistence-based main path approach

Authors: Hyunseok Park, Christopher L. Magee

Abstract: The aim of this paper is to propose a new method to identify main paths in a technological domain using patent citations. Previous approaches for using main path analysis have greatly improved our understanding of actual technological trajectories but nonetheless have some limitations. They have high potential to miss some dominant patents from the identified main paths; moreover, the high network complexity of their main paths makes qualitative tracing of trajectories problematic. The proposed method searches backward and forward paths from the high-persistence patents which are identified based on a standard genetic knowledge persistence algorithm. We tested the new method by applying it to the desalination and the solar photovoltaic domains and compared the results to output from the same domains using a prior method. The empirical results show that the proposed method overcomes the aforementioned drawbacks, defining main paths that are almost 10x less complex while containing more of the relevant important knowledge than the main path networks defined by the existing method.

Submitted 26 August, 2016; originally announced August 2016.

Comments: 20 pages, 7 figures

arXiv:1604.06053 [pdf]

Decomposition and Analysis of Technological domains for better understanding of Technological Structure

Authors: Xin Guo, Hyunseok Park, Christopher L. Magee

Abstract: Patents represent one of the most complete sources of information related to technological change. This paper presents three months of research on U.S. patents in the field of patent analysis. The methodology consists of using search terms to locate the most representative international and US patent classes, determining the overlap of those classes to arrive at the final set of patents, and using the prediction model developed by Benson and Magee to calculate the technological improvement rate for the technological domains. My research focused on the Biochemical Pharmacology technological area and on selecting relevant patents for technological domains and sub-domains within this area. The goal is to better understand the structure of technology domains and how fast the domains and their sub-domains progress. The method I used, developed by Benson and Magee and called the Classification Overlap Method, provides a reliable and largely automated way to break the patent database into understandable technological domains where progress can be measured.

Submitted 19 April, 2016; originally announced April 2016.

Comments: 17 pages, 5 figures

arXiv:1602.04713 [pdf]

Modeling of technological performance trends using design theory

Authors: Subarna Basnet, Christopher L. Magee

Abstract: Functional technical performance usually follows an exponential dependence on time but the rate of change (the exponent) varies greatly among technological domains. This paper presents a simple model that provides an explanatory foundation for these phenomena based upon the inventive design process. The model assumes that invention - novel and useful design - arises through probabilistic analogical transfers that combine existing knowledge by combining existing individual operational ideas to arrive at new individual operating ideas. The continuing production of individual operating ideas relies upon injection of new basic individual operating ideas that occurs through coupling of science and technology simulations. The individual operational ideas that result from this process are then modeled as being assimilated in components of artifacts characteristic of a technological domain. According to the model, two effects (differences in interactions among components for different domains and differences in scaling laws for different domains) account for the differences found in improvement rates among domains whereas the analogical transfer process is the source of the exponential behavior. The model is supported by a number of known empirical facts; further empirical research is suggested to independently assess further predictions made by the model.

Submitted 11 February, 2016; originally announced February 2016.

Comments: 43 pages, 10 figures

Showing 1–27 of 27 results for author: Magee, L