-
Micromechanically motivated finite-strain phase-field fracture model to investigate damage in crosslinked elastomers
Authors:
S. P. Josyula,
M. Brede,
O. Hesebeck,
K. Koschek,
W. Possart,
A. Wulf,
B. Zimmer,
S. Diebels
Abstract:
A micromechanically motivated phase-field damage model is proposed to investigate the fracture behaviour of crosslinked polyurethane adhesive. Crosslinked polyurethane adhesives typically show viscoelastic behaviour with geometric nonlinearity. The finite-strain viscoelastic behaviour is modelled using a micromechanical network model that considers a distribution of shorter and longer chain lengths. The micromechanical viscoelastic network model also considers the softening due to breakage/debonding of the short chains with increasing deformation. The micromechanical model is coupled with the phase-field damage model to investigate crack initiation and propagation. The critical energy release rate is needed as a material property to solve the phase-field equation. The energy release rate is formulated based on the polymer chain network. The numerical investigation is performed using the finite element method. The force-displacement curves from the numerical analysis and experiments are compared to validate the proposed material model.
Submitted 8 June, 2024;
originally announced June 2024.
-
Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training
Authors:
Charbel Sakr,
Steve Dai,
Rangharajan Venkatesan,
Brian Zimmer,
William J. Dally,
Brucek Khailany
Abstract:
Data clipping is crucial in reducing noise in quantization operations and improving the achievable accuracy of quantization-aware training (QAT). Current practices rely on heuristics to set clipping threshold scalars and cannot be shown to be optimal. We propose Optimally Clipped Tensors And Vectors (OCTAV), a recursive algorithm to determine MSE-optimal clipping scalars. Derived from the fast Newton-Raphson method, OCTAV finds optimal clipping scalars on the fly, for every tensor, at every iteration of the QAT routine. Thus, the QAT algorithm is formulated with provably minimum quantization noise at each step. In addition, we reveal limitations in common gradient estimation techniques in QAT and propose magnitude-aware differentiation as a remedy to further improve accuracy. Experimentally, OCTAV-enabled QAT achieves state-of-the-art accuracy on multiple tasks. These include training-from-scratch and retraining ResNets and MobileNets on ImageNet, and SQuAD fine-tuning using BERT models, where OCTAV-enabled QAT consistently preserves accuracy at low precision (4 to 6 bits). Our results require no modifications to the baseline training recipe, except for the insertion of quantization operations where appropriate.
Submitted 13 June, 2022;
originally announced June 2022.
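The abstract above describes a Newton-Raphson-derived recursion for an MSE-optimal clipping scalar but does not state it. The following is only a sketch under an assumed noise model, not the authors' published algorithm: values with |x| <= s incur uniform rounding noise of variance s^2 * 4^(-B) / 3, and values with |x| > s incur clipping error (|x| - s)^2. Applying one Newton step to that objective (treating the clipped set as fixed) gives the update used below. The function names, the starting guess, and the iteration count are hypothetical.

```python
import numpy as np

def noise_model(x, s, bits=4):
    """Assumed objective: rounding noise inside [-s, s] plus clipping error outside."""
    mag = np.abs(np.asarray(x, dtype=np.float64)).ravel()
    rounding = 4.0 ** (-bits) / 3.0 * s ** 2 * np.count_nonzero(mag <= s)
    clipping = ((mag[mag > s] - s) ** 2).sum()
    return (rounding + clipping) / mag.size

def clipping_scalar(x, bits=4, iters=20):
    """Newton-style recursion for a clipping scalar under the assumed objective."""
    mag = np.abs(np.asarray(x, dtype=np.float64)).ravel()
    s = mag.mean()                    # starting guess (an assumption)
    noise = 4.0 ** (-bits) / 3.0      # rounding-noise weight for B bits
    for _ in range(iters):
        clipped = mag > s
        if not clipped.any():         # nothing is clipped; keep the current s
            break
        # s <- sum(|x| over clipped) / (noise * #unclipped + #clipped)
        s = mag[clipped].sum() / (noise * np.count_nonzero(~clipped)
                                  + np.count_nonzero(clipped))
    return s

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
s = clipping_scalar(x, bits=4)
print("clipping scalar:", s, "assumed noise:", noise_model(x, s, bits=4))
```

Under this model the returned scalar trades clipping error against rounding noise, so for bell-shaped data it lands below the maximum magnitude, which is the usual heuristic choice.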
-
LNS-Madam: Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update
Authors:
Jiawei Zhao,
Steve Dai,
Rangharajan Venkatesan,
Brian Zimmer,
Mustafa Ali,
Ming-Yu Liu,
Brucek Khailany,
Bill Dally,
Anima Anandkumar
Abstract:
Representing deep neural networks (DNNs) in low-precision is a promising approach to enable efficient acceleration and memory reduction. Previous methods that train DNNs in low-precision typically keep a copy of weights in high-precision during the weight updates. Directly training with low-precision weights leads to accuracy degradation due to complex interactions between the low-precision number systems and the learning algorithms. To address this issue, we develop a co-designed low-precision training framework, termed LNS-Madam, in which we jointly design a logarithmic number system (LNS) and a multiplicative weight update algorithm (Madam). We prove that LNS-Madam results in low quantization error during weight updates, leading to stable performance even if the precision is limited. We further propose a hardware design of LNS-Madam that resolves practical challenges in implementing an efficient datapath for LNS computations. Our implementation effectively reduces energy overhead incurred by LNS-to-integer conversion and partial sum accumulation. Experimental results show that LNS-Madam achieves comparable accuracy to full-precision counterparts with only 8 bits on popular computer vision and natural language tasks. Compared to FP32 and FP8, LNS-Madam reduces the energy consumption by over 90% and 55%, respectively.
Submitted 23 August, 2022; v1 submitted 25 June, 2021;
originally announced June 2021.
-
VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Authors:
Steve Dai,
Rangharajan Venkatesan,
Haoxing Ren,
Brian Zimmer,
William J. Dally,
Brucek Khailany
Abstract:
Quantization enables efficient acceleration of deep neural networks by reducing model memory footprint and exploiting low-cost integer math hardware units. Quantization maps floating-point weights and activations in a trained model to low-bitwidth integer values using scale factors. Excessive quantization, reducing precision too aggressively, results in accuracy degradation. When scale factors are shared at a coarse granularity across many dimensions of each tensor, the effective precision of individual elements within the tensor is limited. To reduce quantization-related accuracy loss, we propose using a separate scale factor for each small vector of ($\approx$16-64) elements within a single dimension of a tensor. To achieve an efficient hardware implementation, the per-vector scale factors can be implemented with low-bitwidth integers when calibrated using a two-level quantization scheme. We find that per-vector scaling consistently achieves better inference accuracy at low precision compared to conventional scaling techniques for popular neural networks without requiring retraining. We also modify a deep learning accelerator hardware design to study the area and energy overheads of per-vector scaling support. Our evaluation demonstrates that per-vector scaled quantization with 4-bit weights and activations achieves 37% area saving and 24% energy saving while maintaining over 75% accuracy for ResNet50 on ImageNet. 4-bit weights and 8-bit activations achieve near-full-precision accuracy for both BERT-base and BERT-large on SQuAD while reducing area by 26% compared to an 8-bit baseline.
Submitted 8 February, 2021;
originally announced February 2021.
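To illustrate the idea of per-vector scale factors described in the abstract above, here is a minimal sketch that compares one scale per tensor with one scale per short vector along the last axis. It is an assumption-laden illustration, not the paper's implementation: it uses symmetric max-based scaling with floating-point scale factors, and it omits the two-level scheme that quantizes the scale factors themselves. The function name `quantize_scaled` and the test tensor are hypothetical.

```python
import numpy as np

def quantize_scaled(x, bits, vec_len=None):
    """Symmetric fake quantization with one scale per `vec_len` elements of the
    last axis, or a single scale for the whole tensor when `vec_len` is None."""
    x = np.asarray(x, dtype=np.float64)
    qmax = 2 ** (bits - 1) - 1
    if vec_len is None:
        groups = x.reshape(1, -1)
    else:
        if x.shape[-1] % vec_len != 0:
            raise ValueError("last axis must be a multiple of vec_len")
        groups = x.reshape(-1, vec_len)   # contiguous runs along the last axis
    scale = np.abs(groups).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)   # avoid dividing by zero
    q = np.clip(np.round(groups / scale), -qmax, qmax)
    return (q * scale).reshape(x.shape)

# Rows with very different magnitudes, where a shared scale wastes precision.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 64)) * np.logspace(-2, 0, 8)[:, None]

err_tensor = np.mean((w - quantize_scaled(w, bits=4)) ** 2)
err_vector = np.mean((w - quantize_scaled(w, bits=4, vec_len=16)) ** 2)
print("per-tensor MSE:", err_tensor, "per-vector MSE:", err_vector)
```

With a shared scale, the small-magnitude rows round almost entirely to zero, while per-vector scales adapt to each group of 16 elements, which is the effect the abstract attributes to finer-grained scaling.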
-
Closed-form detector for solid sub-pixel targets in multivariate t-distributed background clutter
Authors:
James Theiler,
Beate Zimmer,
Amanda Ziemann
Abstract:
The generalized likelihood ratio test (GLRT) is used to derive a detector for solid sub-pixel targets in hyperspectral imagery. A closed-form solution is obtained that optimizes the replacement target model when the background is a fat-tailed elliptically-contoured multivariate t-distribution. This generalizes GLRT-based detectors that have previously been derived for the replacement target model with Gaussian background, and for the additive target model with an elliptically-contoured background. Experiments with simulated hyperspectral data illustrate the performance of this detector in various parameter regimes.
Submitted 28 April, 2018; v1 submitted 5 April, 2018;
originally announced April 2018.