Search | arXiv e-print repository

Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models

Authors: Dohyun Lee, Daniel Rim, Minseok Choi, Jaegul Choo

Abstract: Although language models (LMs) demonstrate exceptional capabilities on various tasks, they are potentially vulnerable to extraction attacks, which represent a significant privacy risk. To mitigate the privacy concerns of LMs, machine unlearning has emerged as an important research area, which is utilized to induce the LM to selectively forget about some of its training data. While completely retraining the model will guarantee successful unlearning and privacy assurance, it is impractical for LMs, as it would be time-consuming and resource-intensive. Prior works efficiently unlearn the target token sequences, but upon subsequent iterations, the LM displays significant degradation in performance. In this work, we propose Privacy Protection via Optimal Parameters (POP), a novel unlearning method that effectively forgets the target token sequences from the pretrained LM by applying optimal gradient updates to the parameters. Inspired by the gradient derivation of complete retraining, we approximate the optimal training objective that successfully unlearns the target sequence while retaining the knowledge from the rest of the training data. Experimental results demonstrate that POP exhibits remarkable retention performance post-unlearning across 9 classification and 4 dialogue benchmarks, outperforming the state-of-the-art by a large margin. Furthermore, we introduce Remnant Memorization Accuracy that quantifies privacy risks based on token likelihood and validate its effectiveness through both qualitative and quantitative analyses.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Accepted to ACL2024 findings

arXiv:2406.12329 [pdf, other]

SNAP: Unlearning Selective Knowledge in Large Language Models with Negative Instructions

Authors: Minseok Choi, Daniel Rim, Dohyun Lee, Jaegul Choo

Abstract: Instruction-following large language models (LLMs), such as ChatGPT, have become increasingly popular with the general audience, many of whom are incorporating them into their daily routines. However, these LLMs inadvertently disclose personal or copyrighted information, which calls for a machine unlearning method to remove selective knowledge. Previous attempts sought to forget the link between the target information and its associated entities, but it rather led to generating undesirable responses about the target, compromising the end-user experience. In this work, we propose SNAP, an innovative framework designed to selectively unlearn information by 1) training an LLM with negative instructions to generate obliterated responses, 2) augmenting hard positives to retain the original LLM performance, and 3) applying the novel Wasserstein regularization to ensure adequate deviation from the initial weights of the LLM. We evaluate our framework on various NLP benchmarks and demonstrate that our approach retains the original LLM capabilities, while successfully unlearning the specified information.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 16 pages, 5 figures

arXiv:2406.05694 [pdf, other]

A Low Rank Neural Representation of Entropy Solutions

Authors: Donsub Rim, Gerrit Welper

Abstract: We construct a new representation of entropy solutions to nonlinear scalar conservation laws with a smooth convex flux function in a single spatial dimension. The representation is a generalization of the method of characteristics and possesses a compositional form. While it is a nonlinear representation, the embedded dynamics of the solution in the time variable is linear. This representation is then discretized as a manifold of implicit neural representations where the feedforward neural network architecture has a low rank structure. Finally, we show that the low rank neural representation with a fixed number of layers and a small number of coefficients can approximate any entropy solution regardless of the complexity of the shock topology, while retaining the linearity of the embedded dynamics.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 42 pages, 9 figures

MSC Class: 68T07; 41A46; 41A25; 65N15; 35L65

arXiv:2401.14240 [pdf, other]

doi 10.1109/ICTC58733.2023.10393433

Enhanced Labeling Technique for Reddit Text and Fine-Tuned Longformer Models for Classifying Depression Severity in English and Luganda

Authors: Richard Kimera, Daniela N. Rim, Joseph Kirabira, Ubong Godwin Udomah, Heeyoul Choi

Abstract: Depression is a global burden and one of the most challenging mental health conditions to control. Experts can detect its severity early using the Beck Depression Inventory (BDI) questionnaire, administer appropriate medication to patients, and impede its progression. Due to the fear of potential stigmatization, many patients turn to social media platforms like Reddit for advice and assistance at various stages of their journey. This research extracts text from Reddit to facilitate the diagnostic process. It employs a proposed labeling approach to categorize the text and subsequently fine-tunes the Longformer model. The model's performance is compared against baseline models, including Naive Bayes, Random Forest, Support Vector Machines, and Gradient Boosting. Our findings reveal that the Longformer model outperforms the baseline models in both English (48%) and Luganda (45%) languages on a custom-made dataset.

Submitted 25 January, 2024; originally announced January 2024.

Comments: In IEEE Proceedings of the 14th International Conference on ICT Convergence (ICTC), Jeju, Korea, October 2023

arXiv:2312.08553 [pdf, other]

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Authors: Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal

Abstract: End-to-end automatic speech recognition (ASR) models have seen revolutionary quality gains with the recent development of large-scale universal speech models (USM). However, deploying these massive USMs is extremely expensive due to the enormous memory usage and computational cost. Therefore, model compression is an important research topic to fit USM-based ASR under budget in real-world scenarios. In this study, we propose a USM fine-tuning approach for ASR, with a low-bit quantization and N:M structured sparsity aware paradigm on the model weights, reducing the model complexity from parameter precision and matrix topology perspectives. We conducted extensive experiments with a 2-billion parameter USM on a large-scale voice search dataset to evaluate our proposed method. A series of ablation studies validate the effectiveness of up to int4 quantization and 2:4 sparsity. However, a single compression technique fails to recover the performance well under extreme setups including int2 quantization and 1:4 sparsity. By contrast, our proposed method can compress the model to have 9.4% of the size, at the cost of only 7.3% relative word error rate (WER) regressions. We also provided in-depth analyses on the results and discussions on the limitations and potential solutions, which would be valuable for future studies.

Submitted 16 January, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: Accepted by ICASSP 2024. Preprint
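The N:M structured sparsity named in the USM-Lite abstract above means that at most N of every M consecutive weights are non-zero (for example 2:4). A minimal sketch of building such a pattern follows. It assumes a simple magnitude-based selection, which the abstract does not specify, and the function name `nm_prune` is hypothetical; the paper's sparsity-aware fine-tuning itself is not reproduced here.

```python
def nm_prune(weights, n=2, m=4):
    """Zero all but the n largest-magnitude values in each group of m weights.

    Assumption: magnitude-based selection on a flat list of floats. This only
    illustrates the N:M pattern, not the fine-tuning procedure of any paper.
    """
    if len(weights) % m != 0:
        raise ValueError("length must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n entries with the largest absolute value in this group.
        keep = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.1, -0.9, 0.5, 0.2, 1.0, 0.0, -0.3, 0.4]
print(nm_prune(row))  # [0.0, -0.9, 0.5, 0.0, 1.0, 0.0, 0.0, 0.4]
```

With `n=1, m=4` the same function produces the more aggressive 1:4 pattern that the abstract describes as an extreme setup.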

arXiv:2310.14391 [pdf, ps, other]

Performance bounds for Reduced Order Models with Application to Parametric Transport

Authors: D. Rim, G. Welper

Abstract: The Kolmogorov $n$-width is an established benchmark to judge the performance of reduced basis and similar methods that produce linear reduced spaces. Although immensely successful in the elliptic regime, this width shows unsatisfactorily slow convergence rates for transport dominated problems. While this has triggered a large amount of work on nonlinear model reduction techniques, we are lacking a benchmark to evaluate their optimal performance. Nonlinear benchmarks like manifold/stable/Lipschitz width applied to the solution manifold are often trivial if the degrees of freedom exceed the parameter dimension and ignore desirable structure such as offline/online decompositions. In this paper, we show that the same benchmarks applied to the full reduced order model pipeline from PDE to parametric quantity of interest provide non-trivial benchmarks and we prove lower bounds for transport equations.

Submitted 22 October, 2023; originally announced October 2023.

MSC Class: 41A46; 41A25; 65N15

arXiv:2310.09528 [pdf, other]

Hypernetwork-based Meta-Learning for Low-Rank Physics-Informed Neural Networks

Authors: Woojin Cho, Kookjin Lee, Donsub Rim, Noseong Park

Abstract: In various engineering and applied science applications, repetitive numerical simulations of partial differential equations (PDEs) for varying input parameters are often required (e.g., aircraft shape optimization over many design parameters) and solvers are required to perform rapid execution. In this study, we suggest a path that potentially opens up a possibility for physics-informed neural networks (PINNs), emerging deep-learning-based solvers, to be considered as one such solver. Although PINNs have pioneered a proper integration of deep-learning and scientific computing, they require repetitive time-consuming training of neural networks, which is not suitable for many-query scenarios. To address this issue, we propose lightweight low-rank PINNs containing only hundreds of model parameters and an associated hypernetwork-based meta-learning algorithm, which allows efficient approximation of solutions of PDEs for varying ranges of PDE input parameters. Moreover, we show that the proposed method is effective in overcoming a challenging issue, known as "failure modes" of PINNs.

Submitted 14 October, 2023; originally announced October 2023.

arXiv:2308.08153 [pdf, other]

Fast Training of NMT Model with Data Sorting

Authors: Daniela N. Rim, Kimera Richard, Heeyoul Choi

Abstract: The Transformer model has revolutionized Natural Language Processing tasks such as Neural Machine Translation, and many efforts have been made to study the Transformer architecture, which increased its efficiency and accuracy. One potential area for improvement is to address the computation of empty tokens that the Transformer computes only to discard them later, leading to an unnecessary computational burden. To tackle this, we propose an algorithm that sorts translation sentence pairs based on their length before batching, minimizing the waste of computing power. Since the amount of sorting could violate the independent and identically distributed (i.i.d.) data assumption, we sort the data partially. In experiments, we apply the proposed method to English-Korean and English-Luganda language pairs for machine translation and show that there are gains in computational time while maintaining the performance. Our method is independent of architectures, so that it can be easily integrated into any training process with flexible data lengths.

Submitted 16 August, 2023; originally announced August 2023.
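The idea in the "Fast Training of NMT Model with Data Sorting" abstract above, sorting sentence pairs by length before batching so that fewer padding tokens are computed, can be illustrated with a small sketch. The abstract does not say how the partial sort is carried out, so the chunk-wise scheme below is an assumption, and the names `partially_sorted_batches`, `padding_tokens`, and `chunk_size` are hypothetical.

```python
import random


def partially_sorted_batches(pairs, batch_size, chunk_size, seed=0):
    """Shuffle the data, then sort by length only inside fixed-size chunks.

    Assumption: chunk_size controls how partial the sort is. A chunk equal to
    the batch size leaves batches effectively random; a chunk covering the
    whole dataset is a full sort.
    """
    rng = random.Random(seed)
    pairs = list(pairs)
    rng.shuffle(pairs)
    batches = []
    for start in range(0, len(pairs), chunk_size):
        chunk = sorted(pairs[start:start + chunk_size],
                       key=lambda p: max(len(p[0]), len(p[1])))
        for b in range(0, len(chunk), batch_size):
            batches.append(chunk[b:b + batch_size])
    rng.shuffle(batches)  # keep the order of batches random across an epoch
    return batches


def padding_tokens(batches):
    """Count pad positions when each side is padded to the batch's longest sequence."""
    total = 0
    for batch in batches:
        for side in (0, 1):
            longest = max(len(p[side]) for p in batch)
            total += sum(longest - len(p[side]) for p in batch)
    return total


# Toy corpus: source and target share the same length in each pair.
pairs = [([0] * n, [0] * n) for n in [5, 1, 9, 3, 7, 2, 8, 4]]
full = partially_sorted_batches(pairs, batch_size=2, chunk_size=8)
print(padding_tokens(full))  # 10
```

Smaller values of `chunk_size` keep batches closer to a random sample of the data at the price of more padding, which is the trade-off the abstract points to when it mentions the i.i.d. assumption.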

arXiv:2305.16619 [pdf, other]

2-bit Conformer quantization for automatic speech recognition

Authors: Oleg Rybakov, Phoenix Meadowlark, Shaojin Ding, David Qiu, Jian Li, David Rim, Yanzhang He

Abstract: Large speech models are rapidly gaining traction in the research community. As a result, model compression has become an important topic, so that these models can fit in memory and be served with reduced cost. Practical approaches for compressing automatic speech recognition (ASR) models use int8 or int4 weight quantization. In this study, we propose to develop 2-bit ASR models. We explore the impact of symmetric and asymmetric quantization combined with sub-channel quantization and clipping on both LibriSpeech dataset and large-scale training data. We obtain a lossless 2-bit Conformer model with 32% model size reduction when compared to state of the art 4-bit Conformer model for LibriSpeech. With the large-scale training data, we obtain a 2-bit Conformer model with over 40% model size reduction against the 4-bit version at the cost of 17% relative word error rate degradation.

Submitted 26 May, 2023; originally announced May 2023.

Comments: submitted to Interspeech

arXiv:2305.15536 [pdf, other]

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models

Authors: David Qiu, David Rim, Shaojin Ding, Oleg Rybakov, Yanzhang He

Abstract: With the rapid increase in the size of neural networks, model compression has become an important area of research. Quantization is an effective technique for decreasing the model size, memory access, and compute load of large models. Despite recent advances in quantization aware training (QAT) technique, most papers present evaluations that are focused on computer vision tasks, which have different training dynamics compared to sequence tasks. In this paper, we first benchmark the impact of popular techniques such as straight through estimator, pseudo-quantization noise, learnable scale parameter, clipping, etc. on 4-bit seq2seq models across a suite of speech recognition datasets ranging from 1,000 hours to 1 million hours, as well as one machine translation dataset to illustrate its applicability outside of speech. Through the experiments, we report that noise based QAT suffers when there is insufficient regularization signal flowing back to the quantization scale. We propose low complexity changes to the QAT process to improve model accuracy (outperforming popular learnable scale and clipping methods). With the improved accuracy, it opens up the possibility to exploit some of the other benefits of noise based QAT: 1) training a single model that performs well in mixed precision mode and 2) improved generalization on long form speech recognition.

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2305.04720 [pdf, other]

DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation

Authors: ChaeHun Park, Seungil Chad Lee, Daniel Rim, Jaegul Choo

Abstract: Despite the recent advances in open-domain dialogue systems, building a reliable evaluation metric is still a challenging problem. Recent studies proposed learnable metrics based on classification models trained to distinguish the correct response. However, neural classifiers are known to make overly confident predictions for examples from unseen distributions. We propose DEnsity, which evaluates a response by utilizing density estimation on the feature space derived from a neural classifier. Our metric measures how likely a response would appear in the distribution of human conversations. Moreover, to improve the performance of DEnsity, we utilize contrastive learning to further compress the feature space. Experiments on multiple response evaluation datasets show that DEnsity correlates better with human evaluations than the existing metrics. Our code is available at https://github.com/ddehun/DEnsity.

Submitted 25 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: Findings of ACL2023. 13 pages

arXiv:2301.02773 [pdf]

doi 10.5626/JOK.2022.49.11.1009

Building a Parallel Corpus and Training Translation Models Between Luganda and English

Authors: Richard Kimera, Daniela N. Rim, Heeyoul Choi

Abstract: Neural machine translation (NMT) has achieved great successes with large datasets, so NMT is more premised on high-resource languages. This continuously underpins the low resource languages such as Luganda due to the lack of high-quality parallel corpora, so even 'Google translate' does not serve Luganda at the time of this writing. In this paper, we build a parallel corpus with 41,070 pairwise sentences for Luganda and English which is based on three different open-sourced corpora. Then, we train NMT models with hyper-parameter search on the dataset. Experiments gave us a BLEU score of 21.28 from Luganda to English and 17.47 from English to Luganda. Some translation examples show high quality of the translation. We believe that our model is the first Luganda-English NMT model. The bilingual dataset we built will be available to the public.

Submitted 6 January, 2023; originally announced January 2023.

Journal ref: Journal of KIISE, Vol. 49, No. 11, pp. 1009-1016, 2022. 11

arXiv:2210.04958 [pdf, other]

Mining Causality from Continuous-time Dynamics Models: An Application to Tsunami Forecasting

Authors: Fan Wu, Sanghyun Hong, Donsub Rim, Noseong Park, Kookjin Lee

Abstract: Continuous-time dynamics models, such as neural ordinary differential equations, have enabled the modeling of underlying dynamics in time-series data and accurate forecasting. However, parameterization of dynamics using a neural network makes it difficult for humans to identify causal structures in the data. In consequence, this opaqueness hinders the use of these models in the domains where capturing causal relationships carries the same importance as accurate predictions, e.g., tsunami forecasting. In this paper, we address this challenge by proposing a mechanism for mining causal structures from continuous-time models. We train models to capture the causal structure by enforcing sparsity in the weights of the input layers of the dynamics models. We first verify the effectiveness of our method in the scenario where the exact causal structures of time-series are known a priori. We next apply our method to a real-world problem, namely tsunami forecasting, where the exact causal structures are difficult to characterize. Experimental results show that the proposed method is effective in learning physically-consistent causal relationships while achieving high forecasting accuracy.

Submitted 13 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

arXiv:2109.09075 [pdf, other]

Adversarial Training with Contrastive Learning in NLP

Authors: Daniela N. Rim, DongNyeong Heo, Heeyoul Choi

Abstract: For years, adversarial training has been extensively studied in natural language processing (NLP) settings. The main goal is to make models robust so that similar inputs result in semantically similar outcomes, which is not a trivial problem since there is no objective measure of semantic similarity in language. Previous works use an external pre-trained NLP model to tackle this challenge, introducing an extra training stage with huge memory consumption during training. However, the recent popular approach of contrastive learning in language processing hints at a convenient way of obtaining such similarity restrictions. The main advantage of the contrastive learning approach is that it aims for similar data points to be mapped close to each other and further from different ones in the representation space. In this work, we propose adversarial training with contrastive learning (ATCL) to adversarially train a language processing task using the benefits of contrastive learning. The core idea is to make linear perturbations in the embedding space of the input via fast gradient methods (FGM) and train the model to keep the original and perturbed representations close via contrastive learning. In NLP experiments, we applied ATCL to language modeling and neural machine translation tasks. The results show not only an improvement in the quantitative (perplexity and BLEU) scores when compared to the baselines, but ATCL also achieves good qualitative results in the semantic level for both tasks without using a pre-trained model.

Submitted 19 September, 2021; originally announced September 2021.
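The fast gradient method (FGM) mentioned in the ATCL abstract above is commonly written as a step of fixed L2 length along the loss gradient, r = epsilon * g / ||g||_2, added to the input embedding. The sketch below shows only that perturbation step on a single embedding vector. It assumes the gradient has already been computed elsewhere (for example by an autodiff framework), the name `fgm_perturb` and the default `epsilon` are hypothetical, and the contrastive objective of the paper is not reproduced.

```python
import math


def fgm_perturb(embedding, grad, epsilon=1.0):
    """Return embedding + epsilon * grad / ||grad||_2.

    Assumption: `grad` is the gradient of the training loss with respect to
    `embedding`, supplied by the caller. A zero gradient leaves the embedding
    unchanged to avoid dividing by zero.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(embedding)
    return [e + epsilon * g / norm for e, g in zip(embedding, grad)]


perturbed = fgm_perturb([1.0, 2.0], [3.0, 4.0], epsilon=0.5)
# The perturbation has L2 norm epsilon regardless of the gradient's scale.
shift = math.sqrt(sum((p - e) ** 2 for p, e in zip(perturbed, [1.0, 2.0])))
print(round(shift, 6))  # 0.5
```

In a training loop of the kind the abstract describes, the original and perturbed embeddings would both be passed through the model and pulled together by a contrastive loss.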

arXiv:2105.11681 [pdf, other]

Deep Neural Networks and End-to-End Learning for Audio Compression

Authors: Daniela N. Rim, Inseon Jang, Heeyoul Choi

Abstract: Recent achievements in end-to-end deep learning have encouraged the exploration of tasks dealing with highly structured data with unified deep network models. Having such models for compressing audio signals has been challenging since it requires discrete representations that are not easy to train with end-to-end backpropagation. In this paper, we present an end-to-end deep learning approach that combines recurrent neural networks (RNNs) within the training strategy of variational autoencoders (VAEs) with a binary representation of the latent space. We apply a reparametrization trick for the Bernoulli distribution for the discrete representations, which allows smooth backpropagation. In addition, our approach allows the separation of the encoder and decoder, which is necessary for compression tasks. To the best of our knowledge, this is the first end-to-end learning for a single audio compression model with RNNs, and our model achieves a Signal to Distortion Ratio (SDR) of 20.54.

Submitted 13 July, 2021; v1 submitted 25 May, 2021; originally announced May 2021.

arXiv:2010.05360 [pdf, other]

A range characterization of the single-quadrant ADRT

Authors: Weilin Li, Kui Ren, Donsub Rim

Abstract: This work characterizes the range of the single-quadrant approximate discrete Radon transform (ADRT) of square images. The characterization follows from a set of linear constraints on the codomain. We show that for data satisfying these constraints, the exact and fast inversion formula [Rim, Appl. Math. Lett. 102 106159, 2020] yields a square image in a stable manner. The range characterization is obtained by first showing that the ADRT is a bijection between images supported on infinite half-strips, then identifying the linear subspaces that stay finitely supported under the inversion formula.

Submitted 22 March, 2022; v1 submitted 11 October, 2020; originally announced October 2020.

MSC Class: 44A12; 65R10; 92C55; 68U05; 15A04

arXiv:2007.13977 [pdf, other]

Depth separation for reduced deep networks in nonlinear model reduction: Distilling shock waves in nonlinear hyperbolic problems

Authors: Donsub Rim, Luca Venturi, Joan Bruna, Benjamin Peherstorfer

Abstract: Classical reduced models are low-rank approximations using a fixed basis designed to achieve dimensionality reduction of large-scale systems. In this work, we introduce reduced deep networks, a generalization of classical reduced models formulated as deep neural networks. We prove depth separation results showing that reduced deep networks approximate solutions of parametrized hyperbolic partial differential equations with approximation error $ε$ with $\mathcal{O}(|\log(ε)|)$ degrees of freedom, even in the nonlinear setting where solutions exhibit shock waves. We also show that classical reduced models achieve exponentially worse approximation rates by establishing lower bounds on the relevant Kolmogorov $N$-widths.

Submitted 27 July, 2020; originally announced July 2020.

MSC Class: 68T07; 65M22; 41A46

arXiv:1912.13024 [pdf, other]

Manifold Approximations via Transported Subspaces: Model reduction for transport-dominated problems

Authors: Donsub Rim, Benjamin Peherstorfer, Kyle T. Mandli

Abstract: This work presents a method for constructing online-efficient reduced models of large-scale systems governed by parametrized nonlinear scalar conservation laws. The solution manifolds induced by transport-dominated problems such as hyperbolic conservation laws typically exhibit nonlinear structures, which means that traditional model reduction methods based on linear approximations are inefficient when applied to these problems. In contrast, the approach introduced in this work derives reduced approximations that are nonlinear by explicitly composing global transport dynamics with locally linear approximations of the solution manifolds. A time-stepping scheme evolves the nonlinear reduced models by transporting local approximation spaces along the characteristic curves of the governing equations. The proposed computational procedure allows an offline/online decomposition and is online-efficient in the sense that the complexity of accurately time-stepping the nonlinear reduced model is independent of that of the full model. Numerical experiments with transport through heterogeneous media and the Burgers' equation show orders of magnitude speedups of the proposed nonlinear reduced models based on transported subspaces compared to traditional linear reduced models and full models.

Submitted 30 December, 2020; v1 submitted 30 December, 2019; originally announced December 2019.

MSC Class: 78M34; 41A46; 35F20; 78M12

arXiv:1908.00887 [pdf, ps, other]

doi 10.1016/j.aml.2019.106159

Exact and fast inversion of the approximate discrete Radon transform from partial data

Authors: Donsub Rim

Abstract: We give an exact inversion formula for the approximate discrete Radon transform introduced in [Brady, SIAM J. Comput., 27(1), 107--119] that is of cost $O(N \log N)$ for a square 2D image with $N$ pixels and requires only partial data.

Submitted 18 May, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

Comments: 4 pages, 1 figure

MSC Class: 44A12; 65R10; 65F05; 65Q30

Journal ref: Appl. Math. Lett. 102 106159 (2020)

arXiv:1901.09893 [pdf, ps, other]

A simple electronic device to experiment with the Hopf bifurcation

Authors: Daniela N. Rim, Pablo Cremades, Pablo Kaluza

Abstract: We present a simple low-cost electronic circuit that can exhibit two different dynamical regimes: one with oscillating voltages and one with constant voltages. This device is designed as a negative-feedback three-node network inspired by the genetic repressilator. The circuit's behavior is modeled by a system of differential equations, which is studied in several ways: by applying the dynamical systems formalism, by running numerical simulations, and by constructing the circuit and measuring it experimentally. We find that the most important characteristics of the Hopf bifurcation can be observed and controlled. In particular, a resistor value plays the role of the bifurcation parameter, which can be easily varied experimentally. As a result, this system can be used to introduce many aspects of research on a real physical system, and it enables us to study one of the most important kinds of bifurcation.

Submitted 26 January, 2019; originally announced January 2019.

Journal ref: Rev. Mex. de Física E 65 (2019) 58-63

arXiv:1805.05938 [pdf, other]

Model reduction of a parametrized scalar hyperbolic conservation law using displacement interpolation

Authors: Donsub Rim, Kyle T. Mandli

Abstract: We propose a model reduction technique for parametrized partial differential equations arising from scalar hyperbolic conservation laws. The key idea of the technique is to construct basis functions that are local in parameter and time space via displacement interpolation. The construction is motivated by the observation that the derivative of solutions to hyperbolic conservation laws satisfies a contractive property with respect to the Wasserstein metric [Bolley et al. J. Hyperbolic Differ. Equ. 02 (2005), pp. 91-107]. We will discuss the approximation properties of the displacement interpolation, and show that it can naturally complement linear interpolation. Numerical experiments illustrate that we can successfully achieve the model reduction of a parametrized Burgers' equation, and that the reduced order model is suitable for performing typical tasks in uncertainty quantification.

Submitted 15 May, 2018; originally announced May 2018.

arXiv:1712.04028 [pdf, other]

doi 10.1137/18M1168315

Displacement interpolation using monotone rearrangement

Authors: Donsub Rim, Kyle T. Mandli

Abstract: When approximating a function that depends on a parameter, one encounters many practical examples where linear interpolation or linear approximation with respect to the parameters proves ineffective. This is particularly true for responses from hyperbolic partial differential equations (PDEs) where linear, low-dimensional bases are difficult to construct. We propose the use of displacement interpolation, where the interpolation is done on the optimal transport map between the functions at nearby parameters, to achieve an effective dimensionality reduction of hyperbolic phenomena. We further propose a multi-dimensional extension by using the intertwining property of the Radon transform. This extension is a generalization of the classical translational representation of Lax-Phillips [Lax and Phillips, Bull. Amer. Math. Soc. 70 (1964), pp.130--142].

Submitted 3 September, 2018; v1 submitted 11 December, 2017; originally announced December 2017.

MSC Class: 65D05; 65D15; 65K10

Journal ref: SIAM/ASA J. Uncertainty Quantification, 6(4) (2018) 1503-1531

arXiv:1711.03137 [pdf, other]

doi 10.1088/1361-6420/aabe5a

Imaging of isotropic and anisotropic conductivities from power densities in three dimensions

Authors: François Monard, Donsub Rim

Abstract: We present numerical reconstructions of anisotropic conductivity tensors in three dimensions, from knowledge of a finite family of power density functionals. Such a problem arises in the coupled-physics imaging modality Ultrasound Modulated Electrical Impedance Tomography for instance. We improve on the algorithms previously derived in [Bal et al, Inverse Probl Imaging (2013), pp.353-375, Monard and Bal, Comm. PDE (2013), pp.1183-1207] for both isotropic and anisotropic cases, and we address the well-known issue of vanishing determinants in particular. The algorithm is implemented and we provide numerical results that illustrate the improvements.

Submitted 19 March, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

MSC Class: 65M32; 35R30; 35J15

arXiv:1705.03609 [pdf, other]

doi 10.1137/17M1135633

Dimensional splitting of hyperbolic partial differential equations using the Radon transform

Authors: Donsub Rim

Abstract: We introduce a dimensional splitting method based on the intertwining property of the Radon transform, with a particular focus on its applications related to hyperbolic partial differential equations (PDEs). This dimensional splitting has remarkable properties that make it useful in a variety of contexts, including multi-dimensional extension of large time-step (LTS) methods, absorbing boundary conditions, displacement interpolation, and multi-dimensional generalization of transport reversal.

Submitted 6 December, 2018; v1 submitted 10 May, 2017; originally announced May 2017.

Comments: 25 pages

MSC Class: 65N08; 35L60; 35L65; 65R32

Journal ref: SIAM J. Sci. Comput., 40(6) (2018), A4184-A4207

arXiv:1701.07529 [pdf, other]

Transport reversal for model reduction of hyperbolic partial differential equations

Authors: Donsub Rim, Scott Moe, Randall J. LeVeque

Abstract: Snapshot matrices built from solutions to hyperbolic partial differential equations exhibit slow decay in singular values, whereas fast decay is crucial for the success of projection-based model reduction methods. To overcome this problem, we build on previous work in symmetry reduction [Rowley and Marsden, Physica D (2000), pp. 1-19] and propose an iterative algorithm that decomposes the snapshot matrix into multiple shifting profiles, each with a corresponding speed. Its applicability to typical hyperbolic problems is demonstrated through numerical examples, and other natural extensions that modify the shift operator are considered. Finally, we give a geometric interpretation of the algorithm.

Submitted 25 January, 2017; originally announced January 2017.

arXiv:1605.02863 [pdf, other]

doi 10.1137/17M1113679

Generating Random Earthquake Events for PTHA

Authors: Randall J. LeVeque, Knut Waagan, Frank I. González, Donsub Rim, Guang Lin

Abstract: In order to perform probabilistic tsunami hazard assessment (PTHA) based on subduction zone earthquakes, it is necessary to start with a catalog of possible future events along with the annual probability of occurrence, or a probability distribution of such events that can be easily sampled. For nearfield events, the distribution of slip on the fault can have a significant effect on the resulting tsunami. We present an approach to defining a probability distribution based on subdividing the fault geometry into many subfaults and prescribing a desired covariance matrix relating slip on one subfault to slip on any other subfault. The eigenvalues and eigenvectors of this matrix are then used to define a Karhunen-Loève expansion for random slip patterns. This is similar to a spectral representation of random slip based on Fourier series but conforms to a general fault geometry. We show that only a few terms in this series are needed to represent the features of the slip distribution that are most important in tsunami generation, first with a simple one-dimensional example where slip varies only in the down-dip direction and then on a portion of the Cascadia Subduction Zone.

Submitted 10 May, 2016; originally announced May 2016.

Comments: 24 pages, 12 figures, code provided at https://github.com/rjleveque/KLslip-paper

MSC Class: 86-08; 65

Journal ref: SIAM/ASA J. Uncertainty Quantification, 6(1), (2018) 118-150

arXiv:1512.08212 [pdf, other]

Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression

Authors: David Rim, Sina Honari, Md Kamrul Hasan, Chris Pal

Abstract: We present techniques for improving performance driven facial animation, emotion recognition, and facial key-point or landmark prediction using learned identity invariant representations. Established approaches to these problems can work well if sufficient examples and labels for a particular identity are available and factors of variation are highly controlled. However, labeled examples of facial expressions, emotions and key-points for new individuals are difficult and costly to obtain. In this paper we improve the ability of techniques to generalize to new and unseen individuals by explicitly modeling previously seen variations related to identity and expression. We use a weakly-supervised approach in which identity labels are used to learn the different factors of variation linked to identity separately from factors related to expression. We show how probabilistic modeling of these sources of variation allows one to learn identity-invariant representations for expressions which can then be used to identity-normalize various procedures for facial expression analysis and animation control. We also show how to extend the widely used techniques of active appearance models and constrained local models through replacing the underlying point distribution models which are typically constructed using principal component analysis with identity-expression factorized representations. We present a wide variety of experiments in which we consistently improve performance on emotion recognition, markerless performance-driven facial animation and facial key-point tracking.

Submitted 22 May, 2016; v1 submitted 27 December, 2015; originally announced December 2015.

Comments: to appear in Image and Vision Computing Journal (IMAVIS)

arXiv:1505.04240 [pdf, ps, other]

An Elementary Proof That Symplectic Matrices Have Determinant One

Authors: Donsub Rim

Abstract: We give one more proof of the fact that symplectic matrices over real and complex fields have determinant one. While this has already been proved many times, there has been lasting interest in finding an elementary proof. Our result is restricted to the real and complex case due to its reliance on field-dependent spectral theory; however, in this setting we obtain a proof which is more elementary in the sense that it is direct and requires only well-known facts. Finally, an explicit formula for the determinant of conjugate symplectic matrices in terms of its square subblocks is given.

Submitted 23 March, 2018; v1 submitted 15 May, 2015; originally announced May 2015.

MSC Class: 15A15; 15A42; 37J10

Journal ref: Adv. Dyn. Syst. Appl. (2017) 12 (1) 15-20

Showing 1–28 of 28 results for author: Rim, D