-
Correctness Comparison of ChatGPT-4, Gemini, Claude-3, and Copilot for Spatial Tasks
Authors:
Hartwig H. Hochmair,
Levente Juhasz,
Takoda Kemp
Abstract:
Generative AI including large language models (LLMs) has recently gained significant interest in the geo-science community through its versatile task-solving capabilities including programming, arithmetic reasoning, generation of sample data, time-series forecasting, toponym recognition, or image classification. Most existing performance assessments of LLMs for spatial tasks have primarily focused…
▽ More
Generative AI including large language models (LLMs) has recently gained significant interest in the geo-science community through its versatile task-solving capabilities including programming, arithmetic reasoning, generation of sample data, time-series forecasting, toponym recognition, or image classification. Most existing performance assessments of LLMs for spatial tasks have primarily focused on ChatGPT, whereas other chatbots received less attention. To narrow this research gap, this study conducts a zero-shot correctness evaluation for a set of 76 spatial tasks across seven task categories assigned to four prominent chatbots, i.e., ChatGPT-4, Gemini, Claude-3, and Copilot. The chatbots generally performed well on tasks related to spatial literacy, GIS theory, and interpretation of programming code and functions, but revealed weaknesses in map**, code writing, and spatial reasoning. Furthermore, there was a significant difference in correctness of results between the four chatbots. Responses from repeated tasks assigned to each chatbot showed a high level of consistency in responses with matching rates of over 80% for most task categories in the four chatbots.
△ Less
Submitted 1 June, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
Constructing High-Order Signed Distance Maps from Computed Tomography Data with Application to Bone Morphometry
Authors:
Bryce A. Besler,
Tannis D. Kemp,
Nils D. Forkert,
Steven K. Boyd
Abstract:
An algorithm is presented for constructing high-order signed distance fields for two phase materials imaged with computed tomography. The signed distance field is high-order in that it is free of the quantization artifact associated with the distance transform of sampled signals. The narrowband is solved using a closest point algorithm extended for implicit embeddings that are not a signed distanc…
▽ More
An algorithm is presented for constructing high-order signed distance fields for two phase materials imaged with computed tomography. The signed distance field is high-order in that it is free of the quantization artifact associated with the distance transform of sampled signals. The narrowband is solved using a closest point algorithm extended for implicit embeddings that are not a signed distance field. The high-order fast swee** algorithm is used to extend the narrowband to the remainder of the domain. The order of accuracy of the narrowband and extension methods are verified on ideal implicit surfaces. The method is applied to ten excised cubes of bovine trabecular bone. Localization of the surface, estimation of phase densities, and local morphometry is validated with these subjects. Since the embedding is high-order, gradients and thus curvatures can be accurately estimated locally in the image data.
△ Less
Submitted 1 November, 2021;
originally announced November 2021.
-
Local Morphometry of Closed, Implicit Surfaces
Authors:
Bryce A Besler,
Tannis D. Kemp,
Andrew S. Michalski,
Nils D. Forkert,
Steven K. Boyd
Abstract:
Anatomical structures such as the hippocampus, liver, and bones can be analyzed as orientable, closed surfaces. This permits the computation of volume, surface area, mean curvature, Gaussian curvature, and the Euler-Poincaré characteristic as well as comparison of these morphometrics between structures of different topology. The structures are commonly represented implicitly in curve evolution pro…
▽ More
Anatomical structures such as the hippocampus, liver, and bones can be analyzed as orientable, closed surfaces. This permits the computation of volume, surface area, mean curvature, Gaussian curvature, and the Euler-Poincaré characteristic as well as comparison of these morphometrics between structures of different topology. The structures are commonly represented implicitly in curve evolution problems as the zero level set of an embedding. Practically, binary images of anatomical structures are embedded using a signed distance transform. However, quantization prevents the accurate computation of curvatures, leading to considerable errors in morphometry. This paper presents a fast, simple embedding procedure for accurate local morphometry as the zero crossing of the Gaussian blurred binary image. The proposed method was validated based on the femur and fourth lumbar vertebrae of 50 clinical computed tomography datasets. The results show that the signed distance transform leads to large quantization errors in the computed local curvature. Global validation of morphometry using regression and Bland-Altman analysis revealed that the coefficient of determination for the average mean curvature is improved from 93.8% with the signed distance transform to 100% with the proposed method. For the surface area, the proportional bias is improved from -5.0% for the signed distance transform to +0.6% for the proposed method. The Euler-Poincaré characteristic is improved from unusable in the signed distance transform to 98% accuracy for the proposed method. The proposed method enables an improved local and global evaluation of curvature for purposes of morphometry on closed, implicit surfaces.
△ Less
Submitted 29 July, 2021;
originally announced August 2021.
-
Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives
Authors:
Takuya Narihira,
Javier Alonsogarcia,
Fabien Cardinaux,
Akio Hayakawa,
Masato Ishii,
Kazunori Iwaki,
Thomas Kemp,
Yoshiyuki Kobayashi,
Lukas Mauch,
Akira Nakamura,
Yukio Obuchi,
Andrew Shin,
Kenji Suzuki,
Stephen Tiedmann,
Stefan Uhlich,
Takuya Yashima,
Kazuki Yoshiyama
Abstract:
While there exist a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, speedy computation on distributed setting, and compatibility between different tools. In this paper, we introduce Neural Network Libraries (https://nnabla.org), a deep learning framework designed from engineer's perspe…
▽ More
While there exist a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, speedy computation on distributed setting, and compatibility between different tools. In this paper, we introduce Neural Network Libraries (https://nnabla.org), a deep learning framework designed from engineer's perspective, with emphasis on usability and compatibility as its core design principles. We elaborate on each of our design principles and its merits, and validate our attempts via experiments.
△ Less
Submitted 21 June, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
Efficient Sampling for Predictor-Based Neural Architecture Search
Authors:
Lukas Mauch,
Stephen Tiedemann,
Javier Alonso Garcia,
Bac Nguyen Cong,
Kazuki Yoshiyama,
Fabien Cardinaux,
Thomas Kemp
Abstract:
Recently, predictor-based algorithms emerged as a promising approach for neural architecture search (NAS). For NAS, we typically have to calculate the validation accuracy of a large number of Deep Neural Networks (DNNs), what is computationally complex. Predictor-based NAS algorithms address this problem. They train a proxy model that can infer the validation accuracy of DNNs directly from their n…
▽ More
Recently, predictor-based algorithms emerged as a promising approach for neural architecture search (NAS). For NAS, we typically have to calculate the validation accuracy of a large number of Deep Neural Networks (DNNs), what is computationally complex. Predictor-based NAS algorithms address this problem. They train a proxy model that can infer the validation accuracy of DNNs directly from their network structure. During optimization, the proxy can be used to narrow down the number of architectures for which the true validation accuracy must be computed, what makes predictor-based algorithms sample efficient. Usually, we compute the proxy for all DNNs in the network search space and pick those that maximize the proxy as candidates for optimization. However, that is intractable in practice, because the search spaces are often very large and contain billions of network architectures. The contributions of this paper are threefold: 1) We define a sample efficiency gain to compare different predictor-based NAS algorithms. 2) We conduct experiments on the NASBench-101 dataset and show that the sample efficiency of predictor-based algorithms decreases dramatically if the proxy is only computed for a subset of the search space. 3) We show that if we choose the subset of the search space on which the proxy is evaluated in a smart way, the sample efficiency of the original predictor-based algorithm that has access to the full search space can be regained. This is an important step to make predictor-based NAS algorithms useful, in practice.
△ Less
Submitted 24 November, 2020;
originally announced November 2020.
-
Speech Synthesis and Control Using Differentiable DSP
Authors:
Giorgio Fabbro,
Vladimir Golkov,
Thomas Kemp,
Daniel Cremers
Abstract:
Modern text-to-speech systems are able to produce natural and high-quality speech, but speech contains factors of variation (e.g. pitch, rhythm, loudness, timbre)\ that text alone cannot contain. In this work we move towards a speech synthesis system that can produce diverse speech renditions of a text by allowing (but not requiring) explicit control over the various factors of variation. We propo…
▽ More
Modern text-to-speech systems are able to produce natural and high-quality speech, but speech contains factors of variation (e.g. pitch, rhythm, loudness, timbre)\ that text alone cannot contain. In this work we move towards a speech synthesis system that can produce diverse speech renditions of a text by allowing (but not requiring) explicit control over the various factors of variation. We propose a new neural vocoder that offers control of such factors of variation. This is achieved by employing differentiable digital signal processing (DDSP) (previously used only for music rather than speech), which exposes these factors of variation. The results show that the proposed approach can produce natural speech with realistic timbre, and individual factors of variation can be freely controlled.
△ Less
Submitted 28 October, 2020;
originally announced October 2020.
-
Iteratively Training Look-Up Tables for Network Quantization
Authors:
Fabien Cardinaux,
Stefan Uhlich,
Kazuki Yoshiyama,
Javier Alonso Garcia,
Lukas Mauch,
Stephen Tiedemann,
Thomas Kemp,
Akira Nakamura
Abstract:
Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word length of the network parameters or remove weights from the network if they are not needed. In this article we discuss a general framework for network reduction…
▽ More
Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word length of the network parameters or remove weights from the network if they are not needed. In this article we discuss a general framework for network reduction which we call `Look-Up Table Quantization` (LUT-Q). For each layer, we learn a value dictionary and an assignment matrix to represent the network weights. We propose a special solver which combines gradient descent and a one-step k-means update to learn both the value dictionaries and assignment matrices iteratively. This method is very flexible: by constraining the value dictionary, many different reduction problems such as non-uniform network quantization, training of multiplierless networks, network pruning or simultaneous quantization and pruning can be implemented without changing the solver. This flexibility of the LUT-Q method allows us to use the same method to train networks for different hardware capabilities.
△ Less
Submitted 12 November, 2019;
originally announced November 2019.
-
Mixed Precision DNNs: All you need is a good parametrization
Authors:
Stefan Uhlich,
Lukas Mauch,
Fabien Cardinaux,
Kazuki Yoshiyama,
Javier Alonso Garcia,
Stephen Tiedemann,
Thomas Kemp,
Akira Nakamura
Abstract:
Efficient deep neural network (DNN) inference on mobile or embedded devices typically involves quantization of the network parameters and activations. In particular, mixed precision networks achieve better performance than networks with homogeneous bitwidth for the same size constraint. Since choosing the optimal bitwidths is not straight forward, training methods, which can learn them, are desira…
▽ More
Efficient deep neural network (DNN) inference on mobile or embedded devices typically involves quantization of the network parameters and activations. In particular, mixed precision networks achieve better performance than networks with homogeneous bitwidth for the same size constraint. Since choosing the optimal bitwidths is not straight forward, training methods, which can learn them, are desirable. Differentiable quantization with straight-through gradients allows to learn the quantizer's parameters using gradient methods. We show that a suited parametrization of the quantizer is the key to achieve a stable training and a good final performance. Specifically, we propose to parametrize the quantizer with the step size and dynamic range. The bitwidth can then be inferred from them. Other parametrizations, which explicitly use the bitwidth, consistently perform worse. We confirm our findings with experiments on CIFAR-10 and ImageNet and we obtain mixed precision DNNs with learned quantization parameters, achieving state-of-the-art performance.
△ Less
Submitted 22 May, 2020; v1 submitted 27 May, 2019;
originally announced May 2019.
-
Iteratively Training Look-Up Tables for Network Quantization
Authors:
Fabien Cardinaux,
Stefan Uhlich,
Kazuki Yoshiyama,
Javier Alonso García,
Stephen Tiedemann,
Thomas Kemp,
Akira Nakamura
Abstract:
Operating deep neural networks on devices with limited resources requires the reduction of their memory footprints and computational requirements. In this paper we introduce a training method, called look-up table quantization, LUT-Q, which learns a dictionary and assigns each weight to one of the dictionary's values. We show that this method is very flexible and that many other techniques can be…
▽ More
Operating deep neural networks on devices with limited resources requires the reduction of their memory footprints and computational requirements. In this paper we introduce a training method, called look-up table quantization, LUT-Q, which learns a dictionary and assigns each weight to one of the dictionary's values. We show that this method is very flexible and that many other techniques can be seen as special cases of LUT-Q. For example, we can constrain the dictionary trained with LUT-Q to generate networks with pruned weight matrices or restrict the dictionary to powers-of-two to avoid the need for multiplications. In order to obtain fully multiplier-less networks, we also introduce a multiplier-less version of batch normalization. Extensive experiments on image recognition and object detection tasks show that LUT-Q consistently achieves better performance than other methods with the same quantization bitwidth.
△ Less
Submitted 13 November, 2018;
originally announced November 2018.
-
Improving DNN-based Music Source Separation using Phase Features
Authors:
Joachim Muth,
Stefan Uhlich,
Nathanael Perraudin,
Thomas Kemp,
Fabien Cardinaux,
Yuki Mitsufuji
Abstract:
Music source separation with deep neural networks typically relies only on amplitude features. In this paper we show that additional phase features can improve the separation performance. Using the theoretical relationship between STFT phase and amplitude, we conjecture that derivatives of the phase are a good feature representation opposed to the raw phase. We verify this conjecture experimentall…
▽ More
Music source separation with deep neural networks typically relies only on amplitude features. In this paper we show that additional phase features can improve the separation performance. Using the theoretical relationship between STFT phase and amplitude, we conjecture that derivatives of the phase are a good feature representation opposed to the raw phase. We verify this conjecture experimentally and propose a new DNN architecture which combines amplitude and phase. This joint approach achieves a better signal-to distortion ratio on the DSD100 dataset for all instruments compared to a network that uses only amplitude features. Especially, the bass instrument benefits from the phase information.
△ Less
Submitted 16 July, 2018; v1 submitted 7 July, 2018;
originally announced July 2018.