-
"If the Machine Is As Good As Me, Then What Use Am I?" -- How the Use of ChatGPT Changes Young Professionals' Perception of Productivity and Accomplishment
Authors:
Charlotte Kobiella,
Yarhy Said Flores López,
Fiona Draxler,
Albrecht Schmidt
Abstract:
Large language models (LLMs) like ChatGPT have been widely adopted in work contexts. We explore the impact of ChatGPT on young professionals' perception of productivity and sense of accomplishment. We collected LLMs' main use cases in knowledge work through a preliminary study, which served as the basis for a two-week diary study with 21 young professionals reflecting on their ChatGPT use. Findings indicate that ChatGPT enhanced some participants' perceptions of productivity and accomplishment by enabling greater creative output and satisfaction from efficient tool utilization. Others experienced decreased perceived productivity and accomplishment, driven by a diminished sense of ownership, perceived lack of challenge, and mediocre results. We found that the suitability of task delegation to ChatGPT varies strongly depending on the task nature. It's especially suitable for comprehending broad subject domains, generating creative solutions, and uncovering new information. It's less suitable for research tasks due to hallucinations, which necessitate extensive validation.
Submitted 18 April, 2024;
originally announced April 2024.
-
On the Universality of Coupling-based Normalizing Flows
Authors:
Felix Draxler,
Stefan Wahl,
Christoph Schnörr,
Ullrich Köthe
Abstract:
We present a novel theoretical framework for understanding the expressive power of normalizing flows. Despite their prevalence in scientific applications, a comprehensive understanding of flows remains elusive due to their restricted architectures. Existing theorems fall short as they require the use of arbitrarily ill-conditioned neural networks, limiting practical applicability. We propose a distributional universality theorem for well-conditioned coupling-based normalizing flows such as RealNVP. In addition, we show that volume-preserving normalizing flows are not universal, what distribution they learn instead, and how to fix their expressivity. Our results support the general wisdom that affine and related couplings are expressive and in general outperform volume-preserving flows, bridging a gap between empirical results and theoretical understanding.
Submitted 5 June, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
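For readers unfamiliar with the architecture named in the abstract above, the following is a minimal sketch of a single affine coupling layer of the kind used in RealNVP. It is not the authors' code: the function names and the `toy_conditioner` (a fixed function standing in for a trained neural network) are hypothetical, and the input dimension is assumed to be even.

```python
import numpy as np

def affine_coupling_forward(x, scale_shift):
    """Affine coupling: keep the first half, transform the second half.

    `scale_shift` maps the first half to (log_scale, shift); in a real
    flow it would be a neural network (assumption for illustration).
    """
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    log_s, t = scale_shift(x1)
    y2 = x2 * np.exp(log_s) + t
    # The Jacobian is triangular, so its log-determinant is the sum of log_s.
    log_det = log_s.sum(axis=-1)
    return np.concatenate([x1, y2], axis=-1), log_det

def affine_coupling_inverse(y, scale_shift):
    """Exact inverse: the untouched half lets us recompute log_s and t."""
    d = y.shape[-1] // 2
    y1, y2 = y[..., :d], y[..., d:]
    log_s, t = scale_shift(y1)
    x2 = (y2 - t) * np.exp(-log_s)
    return np.concatenate([y1, x2], axis=-1)

def toy_conditioner(x1):
    # Hypothetical stand-in for a learned network: bounded log-scale, linear shift.
    return np.tanh(x1), 0.5 * x1

x = np.random.default_rng(0).normal(size=(5, 4))
y, log_det = affine_coupling_forward(x, toy_conditioner)
x_rec = affine_coupling_inverse(y, toy_conditioner)
```

A volume-preserving variant would fix the log-determinant to zero (for example by using a shift only), which is the restricted class the abstract reports as not universal.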
-
Learning Distributions on Manifolds with Free-form Flows
Authors:
Peter Sorrenson,
Felix Draxler,
Armand Rousselot,
Sander Hummerich,
Ullrich Köthe
Abstract:
Many real world data, particularly in the natural sciences and computer vision, lie on known Riemannian manifolds such as spheres, tori or the group of rotation matrices. The predominant approaches to learning a distribution on such a manifold require solving a differential equation in order to sample from the model and evaluate densities. The resulting sampling times are slowed down by a high number of function evaluations. In this work, we propose an alternative approach which only requires a single function evaluation followed by a projection to the manifold. Training is achieved by an adaptation of the recently proposed free-form flow framework to Riemannian manifolds. The central idea is to estimate the gradient of the negative log-likelihood via a trace evaluated in the tangent space. We evaluate our method on various manifolds, and find significantly faster inference at competitive performance compared to previous work. We make our code public at https://github.com/vislearn/FFF.
Submitted 15 December, 2023;
originally announced December 2023.
-
Free-form Flows: Make Any Architecture a Normalizing Flow
Authors:
Felix Draxler,
Peter Sorrenson,
Lea Zimmermann,
Armand Rousselot,
Ullrich Köthe
Abstract:
Normalizing Flows are generative models that directly maximize the likelihood. Previously, the design of normalizing flows was largely constrained by the need for analytical invertibility. We overcome this constraint by a training procedure that uses an efficient estimator for the gradient of the change of variables formula. This enables any dimension-preserving neural network to serve as a generative model through maximum likelihood training. Our approach allows placing the emphasis on tailoring inductive biases precisely to the task at hand. Specifically, we achieve excellent results in molecule generation benchmarks utilizing $E(n)$-equivariant networks. Moreover, our method is competitive in an inverse problem benchmark, while employing off-the-shelf ResNet architectures.
Submitted 24 April, 2024; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Gender, Age, and Technology Education Influence the Adoption and Appropriation of LLMs
Authors:
Fiona Draxler,
Daniel Buschek,
Mikke Tavast,
Perttu Hämäläinen,
Albrecht Schmidt,
Juhi Kulshrestha,
Robin Welsch
Abstract:
Large Language Models (LLMs) such as ChatGPT have become increasingly integrated into critical activities of daily life, raising concerns about equitable access and utilization across diverse demographics. This study investigates the usage of LLMs among 1,500 representative US citizens. Remarkably, 42% of participants reported utilizing an LLM. Our findings reveal a gender gap in LLM technology adoption (more male users than female users) with complex interaction patterns regarding age. Technology-related education eliminates the gender gap in our sample. Moreover, expert users are more likely than novices to list professional tasks as typical application scenarios, suggesting discrepancies in effective usage at the workplace. These results underscore the importance of providing education in artificial intelligence in our technology-driven society to promote equitable access to and benefits from LLMs. We urge for both international replication beyond the US and longitudinal observation of adoption.
Submitted 10 October, 2023;
originally announced October 2023.
-
Useful but Distracting: Keyword Highlights and Time-Synchronization in Captions for Language Learning
Authors:
Fiona Draxler,
Henrike Weingärtner,
Maximiliane Windl,
Albrecht Schmidt,
Lewis L. Chuang
Abstract:
Captions provide language learners with a scaffold for comprehension and vocabulary acquisition. Past work has proposed several enhancements such as keyword highlights for increased learning gains. However, little is known about learners' experience with enhanced captions, although this is critical for adoption in everyday life. We conducted a survey and focus group to elicit learner preferences and requirements and implemented a processing pipeline for enhanced captions with keyword highlights, time-synchronized keyword highlights, and keyword captions. A subsequent online study (n = 49) showed that time-synchronized keyword highlights were the preferred design for learning but were perceived as too distracting to replace standard captions in everyday viewing scenarios. We conclude that keyword highlights and time-synchronization are suitable for integrating learning into an entertaining everyday-life activity, but the design should be optimized to provide a more seamless experience.
Submitted 11 July, 2023;
originally announced July 2023.
-
On the Convergence Rate of Gaussianization with Random Rotations
Authors:
Felix Draxler,
Lars Kühmichel,
Armand Rousselot,
Jens Müller,
Christoph Schnörr,
Ullrich Köthe
Abstract:
Gaussianization is a simple generative model that can be trained without backpropagation. It has shown compelling performance on low dimensional data. As the dimension increases, however, it has been observed that the convergence speed slows down. We show analytically that the number of required layers scales linearly with the dimension for Gaussian input. We argue that this is because the model is unable to capture dependencies between dimensions. Empirically, we find the same linear increase in cost for arbitrary input $p(x)$, but observe favorable scaling for some distributions. We explore potential speed-ups and formulate challenges for further research.
Submitted 23 June, 2023;
originally announced June 2023.
-
Lifting Architectural Constraints of Injective Flows
Authors:
Peter Sorrenson,
Felix Draxler,
Armand Rousselot,
Sander Hummerich,
Lea Zimmermann,
Ullrich Köthe
Abstract:
Normalizing Flows explicitly maximize a full-dimensional likelihood on the training data. However, real data is typically only supported on a lower-dimensional manifold leading the model to expend significant compute on modeling noise. Injective Flows fix this by jointly learning a manifold and the distribution on it. So far, they have been limited by restrictive architectures and/or high computational cost. We lift both constraints by a new efficient estimator for the maximum likelihood loss, compatible with free-form bottleneck architectures. We further show that naively learning both the data manifold and the distribution on it can lead to divergent solutions, and use this insight to motivate a stable maximum likelihood training objective. We perform extensive experiments on toy, tabular and image data, demonstrating the competitive performance of the resulting model.
Submitted 27 June, 2024; v1 submitted 2 June, 2023;
originally announced June 2023.
-
Bose Einstein condensate as nonlinear block of a Machine Learning pipeline
Authors:
Maurus Hans,
Elinor Kath,
Marius Sparn,
Nikolas Liebster,
Felix Draxler,
Christoph Schnörr,
Helmut Strobel,
Markus K. Oberthaler
Abstract:
Physical systems can be used as an information processing substrate and with that extend traditional computing architectures. For such an application the experimental platform must guarantee pristine control of the initial state, the temporal evolution and readout. All these ingredients are provided by modern experimental realizations of atomic Bose Einstein condensates. By embedding the nonlinear evolution of a quantum gas in a Machine Learning pipeline, one can represent nonlinear functions while only linear operations on classical computing of the pipeline are necessary. We demonstrate successful regression and interpolation of a nonlinear function using a quasi one-dimensional cloud of potassium atoms and characterize the performance of our system.
Submitted 28 April, 2023;
originally announced April 2023.
-
Finding Competence Regions in Domain Generalization
Authors:
Jens Müller,
Stefan T. Radev,
Robert Schmier,
Felix Draxler,
Carsten Rother,
Ullrich Köthe
Abstract:
We investigate a "learning to reject" framework to address the problem of silent failures in Domain Generalization (DG), where the test distribution differs from the training distribution. Assuming a mild distribution shift, we wish to accept out-of-distribution (OOD) data from a new domain whenever a model's estimated competence foresees trustworthy responses, instead of rejecting OOD data outright. Trustworthiness is then predicted via a proxy incompetence score that is tightly linked to the performance of a classifier. We present a comprehensive experimental evaluation of existing proxy scores as incompetence scores for classification and highlight the resulting trade-offs between rejection rate and accuracy gain. For comparability with prior work, we focus on standard DG benchmarks and consider the effect of measuring incompetence via different learned representations in a closed versus an open world setting. Our results suggest that increasing incompetence scores are indeed predictive of reduced accuracy, leading to significant improvements of the average accuracy below a suitable incompetence threshold. However, the scores are not yet good enough to allow for a favorable accuracy/rejection trade-off in all tested domains. Surprisingly, our results also indicate that classifiers optimized for DG robustness do not outperform a naive Empirical Risk Minimization (ERM) baseline in the competence region, that is, where test samples elicit low incompetence scores.
Submitted 21 June, 2023; v1 submitted 17 March, 2023;
originally announced March 2023.
-
The AI Ghostwriter Effect: When Users Do Not Perceive Ownership of AI-Generated Text But Self-Declare as Authors
Authors:
Fiona Draxler,
Anna Werner,
Florian Lehmann,
Matthias Hoppe,
Albrecht Schmidt,
Daniel Buschek,
Robin Welsch
Abstract:
Human-AI interaction in text production increases complexity in authorship. In two empirical studies (n1 = 30 & n2 = 96), we investigate authorship and ownership in human-AI collaboration for personalized language generation. We show an AI Ghostwriter Effect: Users do not consider themselves the owners and authors of AI-generated text but refrain from publicly declaring AI authorship. Personalization of AI-generated texts did not impact the AI Ghostwriter Effect, and higher levels of participants' influence on texts increased their sense of ownership. Participants were more likely to attribute ownership to supposedly human ghostwriters than AI ghostwriters, resulting in a higher ownership-authorship discrepancy for human ghostwriters. Rationalizations for authorship in AI ghostwriters and human ghostwriters were similar. We discuss how our findings relate to psychological ownership and human-AI interaction to lay the foundations for adapting authorship frameworks and user interfaces in AI in text-generation tasks.
Submitted 7 November, 2023; v1 submitted 6 March, 2023;
originally announced March 2023.
-
Whitening Convergence Rate of Coupling-based Normalizing Flows
Authors:
Felix Draxler,
Christoph Schnörr,
Ullrich Köthe
Abstract:
Coupling-based normalizing flows (e.g. RealNVP) are a popular family of normalizing flow architectures that work surprisingly well in practice. This calls for theoretical understanding. Existing work shows that such flows weakly converge to arbitrary data distributions. However, they make no statement about the stricter convergence criterion used in practice, the maximum likelihood loss. For the first time, we make a quantitative statement about this kind of convergence: We prove that all coupling-based normalizing flows perform whitening of the data distribution (i.e. diagonalize the covariance matrix) and derive corresponding convergence bounds that show a linear convergence rate in the depth of the flow. Numerical experiments demonstrate the implications of our theory and point at open questions.
Submitted 25 October, 2022;
originally announced October 2022.
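To make the notion of whitening in the abstract above concrete, here is a small sketch of classical PCA whitening: the data are centered, rotated into the eigenbasis of the sample covariance, and rescaled to unit variance. This illustrates the target state only and is not the layer-by-layer procedure analyzed in the paper. The function name `whiten` and the mixing matrix `mix` are assumptions for illustration.

```python
import numpy as np

def whiten(x):
    """Center x, then rotate and rescale so its sample covariance is the identity."""
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is appropriate because the covariance matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Project onto the eigenvectors and divide each direction by its standard deviation.
    return centered @ eigvecs / np.sqrt(eigvals)

rng = np.random.default_rng(0)
# Hypothetical full-rank mixing matrix that correlates the three dimensions.
mix = np.array([[2.0, 0.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.5, -0.3, 0.7]])
data = rng.normal(size=(2000, 3)) @ mix.T
white = whiten(data)
```

The sketch assumes a full-rank covariance; with a vanishing eigenvalue the division would need regularization.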
-
On the Spectral Bias of Neural Networks
Authors:
Nasim Rahaman,
Aristide Baratin,
Devansh Arpit,
Felix Draxler,
Min Lin,
Fred A. Hamprecht,
Yoshua Bengio,
Aaron Courville
Abstract:
Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy. In this work, we present properties of neural networks that complement this aspect of expressivity. By using tools from Fourier analysis, we show that deep ReLU networks are biased towards low frequency functions, meaning that they cannot have local fluctuations without affecting their global behavior. Intuitively, this property is in line with the observation that over-parameterized networks find simple patterns that generalize across data samples. We also investigate how the shape of the data manifold affects expressivity by showing evidence that learning high frequencies gets \emph{easier} with increasing manifold complexity, and present a theoretical understanding of this behavior. Finally, we study the robustness of the frequency components with respect to parameter perturbation, to develop the intuition that the parameters must be finely tuned to express high frequency functions.
Submitted 31 May, 2019; v1 submitted 22 June, 2018;
originally announced June 2018.
-
Essentially No Barriers in Neural Network Energy Landscape
Authors:
Felix Draxler,
Kambis Veschgini,
Manfred Salmhofer,
Fred A. Hamprecht
Abstract:
Training neural networks involves finding minima of a high-dimensional non-convex loss function. Knowledge of the structure of this energy landscape is sparse. Relaxing from linear interpolations, we construct continuous paths between minima of recent neural network architectures on CIFAR10 and CIFAR100. Surprisingly, the paths are essentially flat in both the training and test landscapes. This implies that neural networks have enough capacity for structural changes, or that these changes are small between minima. Also, each minimum has at least one vanishing Hessian eigenvalue in addition to those resulting from trivial invariance.
Submitted 22 February, 2019; v1 submitted 2 March, 2018;
originally announced March 2018.
-
Equiangular Lines and Spherical Codes in Euclidean Space
Authors:
Igor Balla,
Felix Dräxler,
Peter Keevash,
Benny Sudakov
Abstract:
A family of lines through the origin in Euclidean space is called equiangular if any pair of lines defines the same angle. The problem of estimating the maximum cardinality of such a family in $\mathbb{R}^n$ was extensively studied for the last 70 years. Motivated by a question of Lemmens and Seidel from 1973, in this paper we prove that for every fixed angle $θ$ and sufficiently large $n$ there are at most $2n-2$ lines in $\mathbb{R}^n$ with common angle $θ$. Moreover, this is achievable only for $θ= \arccos(1/3)$. We also show that for any set of $k$ fixed angles, one can find at most $O(n^k)$ lines in $\mathbb{R}^n$ having these angles. This bound, conjectured by Bukh, substantially improves the estimate of Delsarte, Goethals and Seidel from 1975. Various extensions of these results to the more general setting of spherical codes will be discussed as well.
Submitted 28 June, 2017; v1 submitted 21 June, 2016;
originally announced June 2016.