-
Style Vectors for Steering Generative Large Language Model
Authors:
Kai Konen,
Sophie Jentzsch,
Diaoulé Diallo,
Peer Schütt,
Oliver Bensch,
Roxanne El Baff,
Dominik Opitz,
Tobias Hecking
Abstract:
This research explores strategies for steering the output of large language models (LLMs) towards specific styles, such as sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation. We show that style vectors can be simply computed from recorded layer activations for input texts in a specific style in contrast to more complex training-based approaches. Through a series of experiments, we demonstrate the effectiveness of activation engineering using such style vectors to influence the style of generated text in a nuanced and parameterisable way, distinguishing it from prompt engineering. The presented research constitutes a significant step towards developing more adaptive and effective AI-empowered interactive systems.
Submitted 2 February, 2024;
originally announced February 2024.
-
Requirements for Explainability and Acceptance of Artificial Intelligence in Collaborative Work
Authors:
Sabine Theis,
Sophie Jentzsch,
Fotini Deligiannaki,
Charles Berro,
Arne Peter Raulf,
Carmen Bruder
Abstract:
The increasing prevalence of Artificial Intelligence (AI) in safety-critical contexts such as air-traffic control leads to systems that are practical and efficient, and to some extent explainable to humans to be trusted and accepted. The present structured literature analysis examines n = 236 articles on the requirements for the explainability and acceptance of AI. Results include a comprehensive review of n = 48 articles on information people need to perceive an AI as explainable, the information needed to accept an AI, and representation and interaction methods promoting trust in an AI. Results indicate that the two main groups of users are developers who require information about the internal operations of the model and end users who require information about AI results or behavior. Users' information needs vary in specificity, complexity, and urgency and must consider context, domain knowledge, and the user's cognitive resources. The acceptance of AI systems depends on information about the system's functions and performance, privacy and ethical considerations, as well as goal-supporting information tailored to individual preferences and information to establish trust in the system. Information about the system's limitations and potential failures can increase acceptance and trust. Trusted interaction methods are human-like, including natural language, speech, text, and visual representations such as graphs, charts, and animations. Our results have significant implications for future human-centric AI systems being developed. Thus, they are suitable as input for further application-specific investigations of user needs.
Submitted 27 June, 2023;
originally announced June 2023.
-
Gender Bias in BERT -- Measuring and Analysing Biases through Sentiment Rating in a Realistic Downstream Classification Task
Authors:
Sophie Jentzsch,
Cigdem Turan
Abstract:
Pretrained language models are publicly available and constantly finetuned for various real-life applications. As they become capable of grasping complex contextual information, harmful biases are likely increasingly intertwined with those models. This paper analyses gender bias in BERT models with two main contributions: First, a novel bias measure is introduced, defining biases as the difference in sentiment valuation of female and male sample versions. Second, we comprehensively analyse BERT's biases on the example of a realistic IMDB movie classifier. By systematically varying elements of the training pipeline, we can conclude regarding their impact on the final model bias. Seven different public BERT models in nine training conditions, i.e. 63 models in total, are compared. Almost all conditions yield significant gender biases. Results indicate that reflected biases stem from public BERT models rather than task-specific data, emphasising the weight of responsible usage.
Submitted 27 June, 2023;
originally announced June 2023.
-
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Authors:
Sophie Jentzsch,
Kristian Kersting
Abstract:
Humor is a central aspect of human communication that has not been solved for artificial agents so far. Large language models (LLMs) are increasingly able to capture implicit and contextual information. Especially, OpenAI's ChatGPT recently gained immense public attention. The GPT3-based model almost seems to communicate on a human level and can even tell jokes. Humor is an essential component of human communication. But is ChatGPT really funny? We put ChatGPT's sense of humor to the test. In a series of exploratory experiments around jokes, i.e., generation, explanation, and detection, we seek to understand ChatGPT's capability to grasp and reproduce human humor. Since the model itself is not accessible, we applied prompt-based experiments. Our empirical evidence indicates that jokes are not hard-coded but mostly also not newly generated by the model. Over 90% of 1008 generated jokes were the same 25 jokes. The system accurately explains valid jokes but also comes up with fictional explanations for invalid jokes. Joke-typical characteristics can mislead ChatGPT in the classification of jokes. ChatGPT has not solved computational humor yet but it can be a big leap toward "funny" machines.
Submitted 7 June, 2023;
originally announced June 2023.
-
Multidisciplinary Design Optimization of Reusable Launch Vehicles for Different Propellants and Objectives
Authors:
Kai Dresia,
Simon Jentzsch,
Günther Waxenegger-Wilfing,
Robson Hahn,
Jan Deeken,
Michael Oschwald,
Fabio Mota
Abstract:
Identifying the optimal design of a new launch vehicle is most important since design decisions made in the early development phase limit the vehicle's later performance and determine the associated costs. Reusing the first stage via retro-propulsive landing increases the complexity even more. Therefore, we develop an optimization framework for partially reusable launch vehicles, which enables multidisciplinary design studies. The framework contains suitable mass estimates of all essential subsystems and a routine to calculate the needed propellant for the ascent and landing maneuvers. For design optimization, the framework can be coupled with a genetic algorithm. The overall goal is to reveal the implications of different propellant combinations and objective functions on the launcher's optimal design for various mission scenarios. The results show that the optimization objective influences the most suitable propellant choice and the overall launcher design, concerning staging, weight, size, and rocket engine parameters. In terms of gross lift-off weight, liquid hydrogen seems to be favorable. When optimizing for a minimum structural mass or an expandable structural mass, hydrocarbon-based solutions show better results. Finally, launch vehicles using a hydrocarbon fuel in the first stage and liquid hydrogen in the upper stage are an appealing alternative, combining both fuels' benefits.
Submitted 3 September, 2020;
originally announced September 2020.
-
BERT has a Moral Compass: Improvements of ethical and moral values of machines
Authors:
Patrick Schramowski,
Cigdem Turan,
Sophie Jentzsch,
Constantin Rothkopf,
Kristian Kersting
Abstract:
Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? Jentzsch et al. (2019) showed that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct by calculating a moral bias score on a sentence level using sentence embeddings. The machine learned that it is objectionable to kill living beings, but it is fine to kill time; it is essential to eat, yet one might not eat dirt; it is important to spread information, yet one should not spread misinformation. However, the evaluated moral bias was restricted to simple actions (one verb) and a ranking of actions with surrounding context. Recently BERT (and variants such as RoBERTa and SBERT) has set a new state-of-the-art performance for a wide range of NLP tasks. But has BERT also a better moral compass? In this paper, we discuss and show that this is indeed the case. Thus, recent improvements of language representations also improve the representation of the underlying ethical and moral values of the machine. We argue that through an advanced semantic representation of text, BERT allows one to get better insights of moral and ethical values implicitly represented in text. This enables the Moral Choice Machine (MCM) to extract more accurate imprints of moral choices and ethical values.
Submitted 11 December, 2019;
originally announced December 2019.
-
Scalability in Neural Control of Musculoskeletal Robots
Authors:
Christoph Richter,
Sören Jentzsch,
Rafael Hostettler,
Jesús A. Garrido,
Eduardo Ros,
Alois C. Knoll,
Florian Röhrbein,
Patrick van der Smagt,
Jörg Conradt
Abstract:
Anthropomimetic robots are robots that sense, behave, interact and feel like humans. By this definition, anthropomimetic robots require human-like physical hardware and actuation, but also brain-like control and sensing. The most self-evident realization to meet those requirements would be a human-like musculoskeletal robot with a brain-like neural controller. While both musculoskeletal robotic hardware and neural control software have existed for decades, a scalable approach that could be used to build and control an anthropomimetic human-scale robot has not been demonstrated yet. Combining Myorobotics, a framework for musculoskeletal robot development, with SpiNNaker, a neuromorphic computing platform, we present the proof-of-principle of a system that can scale to dozens of neurally-controlled, physically compliant joints. At its core, it implements a closed-loop cerebellar model which provides real-time low-level neural control at minimal power consumption and maximal extensibility: higher-order (e.g., cortical) neural networks and neuromorphic sensors like silicon-retinae or -cochleae can naturally be incorporated.
Submitted 19 January, 2016;
originally announced January 2016.