Search | arXiv e-print repository

doi 10.1007/978-3-031-21743-2_43

A Combination of BERT and Transformer for Vietnamese Spelling Correction

Authors: Hieu Ngo Trung, Duong Tran Ham, Tin Huynh, Kiem Hoang

Abstract: Recently, many studies have shown the efficiency of using Bidirectional Encoder Representations from Transformers (BERT) in various Natural Language Processing (NLP) tasks. Specifically, an English spelling correction approach that uses an Encoder-Decoder architecture and takes advantage of BERT has achieved state-of-the-art results. However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of the Transformer architecture (state-of-the-art for Encoder-Decoder models) and BERT was proposed to deal with Vietnamese spelling correction. The experimental results show that our model outperforms other approaches as well as the Google Docs Spell Checking tool, achieving an 86.24 BLEU score on this task.

Submitted 4 May, 2024; originally announced May 2024.

Comments: 13 pages

Journal ref: ACIIDS 2022, LNCS, vol 13757, Springer, Cham

arXiv:2404.01954 [pdf, other]

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2403.19917 [pdf, other]

Doubly robust estimation and inference for a log-concave counterfactual density

Authors: Daeyoung Ham, Ted Westling, Charles R. Doss

Abstract: We consider the problem of causal inference based on observational data (or the related missing data problem) with a binary or discrete treatment variable. In that context we study counterfactual density estimation, which provides more nuanced information than counterfactual mean estimation (i.e., the average treatment effect). We impose the shape-constraint of log-concavity (a unimodality constraint) on the counterfactual densities, and then develop doubly robust estimators of the log-concave counterfactual density (based on an augmented inverse-probability weighted pseudo-outcome), and show the consistency in various global metrics of that estimator. Based on that estimator we also develop asymptotically valid pointwise confidence intervals for the counterfactual density.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2401.05868 [pdf, other]

Efficient N-to-M Checkpointing Algorithm for Finite Element Simulations

Authors: David A. Ham, Vaclav Hapla, Matthew G. Knepley, Lawrence Mitchell, Koki Sagiyama

Abstract: In this work, we introduce a new algorithm for N-to-M checkpointing in finite element simulations. This new algorithm allows efficient saving/loading of functions representing physical quantities associated with the mesh representing the physical domain. Specifically, the algorithm allows for using different numbers of parallel processes for saving and loading, allowing for restarting and post-processing on the process count appropriate to the given phase of the simulation and other conditions. For demonstration, we implemented this algorithm in PETSc, the Portable, Extensible Toolkit for Scientific Computation, and added a convenient high-level interface into Firedrake, a system for solving partial differential equations using finite element methods. We evaluated our new implementation by saving and loading data involving 8.2 billion finite element degrees of freedom using 8,192 parallel processes on ARCHER2, the UK National Supercomputing Service.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2312.05912 [pdf, other]

Circular objects do not melt the slowest in water

Authors: Rui Yang, Thijs van den Ham, Roberto Verzicco, Detlef Lohse, Sander G. Huisman

Abstract: We report on the melting dynamics of ice suspended in fresh water and subject to natural convective flows. Using direct numerical simulations we investigate the melt rate of ellipsoidal objects for $2.32\times 10^4 \leq \text{Ra} \leq 7.61\times 10^8$, where $\text{Ra}$ is the Rayleigh number defined with the temperature difference between the ice and the surrounding water. We reveal that the system exhibits non-monotonic behavior in three control parameters. As a function of the aspect ratio of the ellipsoid, the melting time shows a distinct minimum that is different from a disk which has the minimum perimeter. Furthermore, also with $\text{Ra}$ the system shows a non-monotonic trend, since for large $\text{Ra}$ and large aspect ratio the flow separates, leading to distinctly different dynamics. Lastly, since the density of water is non-monotonic with temperature, the melt rate depends non-monotonically also on the ambient temperature, as for intermediate temperatures ($4\,^{\circ}\mathrm{C}$--$7\,^{\circ}\mathrm{C}$) the flow is (partially) reversed. In general, the shape which melts the slowest is quite distinct from that of a disk.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2307.03317 [pdf, other]

Fitted value shrinkage

Authors: Daeyoung Ham, Adam J. Rothman

Abstract: We propose a penalized least-squares method to fit the linear regression model with fitted values that are invariant to invertible linear transformations of the design matrix. This invariance is important, for example, when practitioners have categorical predictors and interactions. Our method has the same computational cost as ridge-penalized least squares, which lacks this invariance. We derive the expected squared distance between the vector of population fitted values and its shrinkage estimator as well as the tuning parameter value that minimizes this expectation. In addition to using cross validation, we construct two estimators of this optimal tuning parameter value and study their asymptotic properties. Our numerical experiments and data examples show that our method performs similarly to ridge-penalized least-squares.

Submitted 7 May, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

arXiv:2304.06058 [pdf, other]

Consistent Point Data Assimilation in Firedrake and Icepack

Authors: Reuben W. Nixon-Hill, Daniel Shapero, Colin J. Cotter, David A. Ham

Abstract: When estimating quantities and fields that are difficult to measure directly, such as the fluidity of ice, from point data sources, such as satellite altimetry, it is important to solve a numerical inverse problem that is formulated with Bayesian consistency. Otherwise, the resultant probability density function for the difficult-to-measure quantity or field will not be appropriately clustered around the truth. In particular, the inverse problem should be formulated by evaluating the numerical solution at the true point locations for direct comparison with the point data source. If the data are first fitted to a gridded or meshed field on the computational grid or mesh, and the inverse problem formulated by comparing the numerical solution to the fitted field, the benefits of additional point data values below the grid density will be lost. We demonstrate, with examples in the fields of groundwater hydrology and glaciology, that a consistent formulation can increase the accuracy of results and aid discourse between modellers and observationalists. To do this, we bring point data into the finite element method ecosystem as discontinuous fields on meshes of disconnected vertices. Point evaluation can then be formulated as a finite element interpolation operation (dual-evaluation). This new abstraction is well-suited to automation, including automatic differentiation. We demonstrate this through implementation in Firedrake, which generates highly optimised code for solving Partial Differential Equations (PDEs) with the finite element method. Our solution integrates with dolfin-adjoint/pyadjoint, allowing PDE-constrained optimisation problems, such as data assimilation, to be solved through forward and adjoint mode automatic differentiation.

Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

Comments: This version: Added missing affiliation

arXiv:2303.06871 [pdf, other]

Physics-driven machine learning models coupling PyTorch and Firedrake

Authors: Nacime Bouziani, David A. Ham

Abstract: Partial differential equations (PDEs) are central to describing and modelling complex physical systems that arise in many disciplines across science and engineering. However, in many realistic applications PDE modelling provides an incomplete description of the physics of interest. PDE-based machine learning techniques are designed to address this limitation. In this approach, the PDE is used as an inductive bias enabling the coupled model to rely on fundamental physical laws while requiring less training data. The deployment of high-performance simulations coupling PDEs and machine learning to complex problems necessitates the composition of capabilities provided by machine learning and PDE-based frameworks. We present a simple yet effective coupling between the machine learning framework PyTorch and the PDE system Firedrake that provides researchers, engineers and domain specialists with a highly productive way of specifying coupled models while only requiring trivial changes to existing code.

Submitted 1 April, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

Comments: Accepted at the ICLR 2023 Workshop on Physics for Machine Learning

arXiv:2303.02735 [pdf, other]

Scalable Object Detection on Embedded Devices Using Weight Pruning and Singular Value Decomposition

Authors: Dohyun Ham, Jaeyeop Jeong, June-Kyoo Park, Raehyeon Jeong, Seungmin Jeon, Hyeongjun Jeon, Yewon Lim

Abstract: This paper presents a method for optimizing object detection models by combining weight pruning and singular value decomposition (SVD). The proposed method was evaluated on a custom dataset of street work images obtained from https://universe.roboflow.com/roboflow-100/street-work. The dataset consists of 611 training images, 175 validation images, and 87 test images with 7 classes. We compared the performance of the optimized models with the original unoptimized model in terms of frame rate, mean average precision (mAP@50), and weight size. The results show that the weight pruning + SVD model achieved a 0.724 mAP@50 with a frame rate of 1.48 FPS and a weight size of 12.1 MB, outperforming the original model (0.717 mAP@50, 1.50 FPS, and 12.3 MB). Precision-recall curves were also plotted for all models. Our work demonstrates that the proposed method can effectively optimize object detection models while balancing accuracy, speed, and model size.

Submitted 17 March, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

Comments: 8 pages, 3 figures. A report of the project done as part of the Yonsei-Roboin project for the 2nd semester, 2022

arXiv:2302.14136 [pdf, other]

Design-Based Inference for Multi-arm Bandits

Authors: Dae Woong Ham, Iavor Bojinov, Michael Lindon, Martin Tingley

Abstract: Multi-arm bandits are gaining popularity as they enable real-world sequential decision-making across application areas, including clinical trials, recommender systems, and online decision-making. Consequently, there is an increased desire to use the available adaptively collected datasets to distinguish whether one arm was more effective than the other, e.g., which product or treatment was more effective. Unfortunately, existing tools fail to provide valid inference when data is collected adaptively or require many untestable and technical assumptions, e.g., stationarity, iid rewards, bounded random variables, etc. Our paper introduces the design-based approach to inference for multi-arm bandits, where we condition on the full set of potential outcomes and perform inference on the obtained sample. Our paper constructs valid confidence intervals for both the reward mean of any arm and the mean reward difference between any arms in an assumption-light manner, allowing the rewards to be arbitrarily distributed, non-iid, and from non-stationary distributions. In addition to confidence intervals, we also provide valid design-based confidence sequences, sequences of confidence intervals that have uniform type-1 error guarantees over time. Confidence sequences allow the agent to perform a hypothesis test as the data arrives sequentially and stop the experiment as soon as the agent is satisfied with the inference, e.g., the mean reward of an arm is statistically significantly higher than a desired threshold.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2212.10504 [pdf, other]

Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

Authors: Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak, Hyungsuk Noh, Woomyoung Park

Abstract: Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework achieved remarkable success on various TOD benchmarks. However, we argue that the current TOD benchmarks are limited as surrogates for real-world scenarios and that the current TOD models are still a long way from covering those scenarios. In this position paper, we first identify the current status and limitations of SF-TOD systems. After that, we explore the WebTOD framework, the alternative direction for building a scalable TOD system when a web/mobile interface is available. In WebTOD, the dialogue system learns how to understand the web/mobile interface that the human agent interacts with, powered by a large-scale language model.

Submitted 24 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

arXiv:2210.08639 [pdf, other]

Design-Based Confidence Sequences: A General Approach to Risk Mitigation in Online Experimentation

Authors: Dae Woong Ham, Iavor Bojinov, Michael Lindon, Martin Tingley

Abstract: Randomized experiments have become the standard method for companies to evaluate the performance of new products or services. In addition to augmenting managers' decision-making, experimentation mitigates risk by limiting the proportion of customers exposed to innovation. Since many experiments are on customers arriving sequentially, a potential solution is to allow managers to "peek" at the results when new data becomes available and stop the test if the results are statistically significant. Unfortunately, peeking invalidates the statistical guarantees for standard statistical analysis and leads to uncontrolled type-1 error. Our paper provides valid design-based confidence sequences, sequences of confidence intervals with uniform type-1 error guarantees over time for various sequential experiments in an assumption-light manner. In particular, we focus on finite-sample estimands defined on the study participants as a direct measure of the incurred risks by companies. Our proposed confidence sequences are valid for a large class of experiments, including multi-arm bandits, time series, and panel experiments. We further provide a variance reduction technique incorporating modeling assumptions and covariates. Finally, we demonstrate the effectiveness of our proposed approach through a simulation study and three real-world applications from Netflix. Our results show that by using our confidence sequence, harmful experiments could be stopped after only observing a handful of units; for instance, an experiment that Netflix ran on its sign-up page on 30,000 potential customers would have been stopped by our method on the first day before 100 observations.

Submitted 24 May, 2023; v1 submitted 16 October, 2022; originally announced October 2022.

arXiv:2210.08589 [pdf, other]

Anytime-Valid Linear Models and Regression Adjusted Causal Inference in Randomized Experiments

Authors: Michael Lindon, Dae Woong Ham, Martin Tingley, Iavor Bojinov

Abstract: Linear regression adjustment is commonly used to analyse randomised controlled experiments due to its efficiency and robustness against model misspecification. Current testing and interval estimation procedures leverage the asymptotic distribution of such estimators to provide Type-I error and coverage guarantees that hold only at a single sample size. Here, we develop the theory for the anytime-valid analogues of such procedures, enabling linear regression adjustment in the sequential analysis of randomised experiments. We first provide sequential $F$-tests and confidence sequences for the parametric linear model, which provide time-uniform Type-I error and coverage guarantees that hold for all sample sizes. We then relax all linear model parametric assumptions in randomised designs and provide nonparametric model-free sequential tests and confidence sequences for treatment effects. This formally allows experiments to be continuously monitored for significance and stopped early, and it safeguards against statistical malpractices in data collection. A particular feature of our results is their simplicity. Our test statistics and confidence sequences all admit closed-form expressions, which are functions of statistics directly available from a standard linear regression table. We illustrate our methodology with the sequential analysis of software A/B experiments at Netflix, performing regression adjustment with pre-treatment outcomes.

Submitted 7 February, 2024; v1 submitted 16 October, 2022; originally announced October 2022.

arXiv:2205.08644 [pdf, other]

Benefits and costs of matching prior to a Difference in Differences analysis when parallel trends does not hold

Authors: Dae Woong Ham, Luke Miratrix

Abstract: The Difference in Difference (DiD) estimator is a popular estimator built on the "parallel trends" assumption, which is an assertion that the treatment group, absent treatment, would change "similarly" to the control group over time. To bolster such a claim, one might generate a comparison group, via matching, that is similar to the treated group with respect to pre-treatment outcomes and/or pre-treatment covariates. Unfortunately, as has been previously pointed out, this intuitively appealing approach also has a cost in terms of bias. To assess the trade-offs of matching in our application, we first characterize the bias of matching prior to a DiD analysis under a linear structural model that allows for time-invariant observed and unobserved confounders with time-varying effects on the outcome. Given our framework, we verify that matching on baseline covariates generally reduces bias. We further show how additionally matching on pre-treatment outcomes has both cost and benefit. First, matching on pre-treatment outcomes partially balances unobserved confounders, which mitigates some bias. This reduction is proportional to the outcome's reliability, a measure of how coupled the outcomes are with the latent covariates. Offsetting these gains, matching also injects bias into the final estimate by undermining the second difference in the DiD via a regression-to-the-mean effect. Consequently, we provide heuristic guidelines for determining to what degree the bias reduction of matching is likely to outweigh the bias cost. We illustrate our guidelines by reanalyzing a principal turnover study that used matching prior to a DiD analysis and find that matching on both the pre-treatment outcomes and observed covariates makes the estimated treatment effect more credible.

Submitted 7 February, 2024; v1 submitted 17 May, 2022; originally announced May 2022.
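
Illustrative note: the basic two-group, two-period DiD contrast that the abstract above builds on can be sketched in a few lines of Python. This is only a minimal sketch of the textbook estimator, not the authors' matching procedure or bias analysis; the function name `did_estimate` and the toy numbers are hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Under parallel trends, the control group's average change stands in
    for the change the treated group would have seen without treatment.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical toy outcomes, for illustration only.
estimate = did_estimate(
    treated_pre=[10.0, 12.0],
    treated_post=[15.0, 17.0],
    control_pre=[8.0, 10.0],
    control_post=[10.0, 12.0],
)
print(estimate)
```

Subtracting the control group's change is the "second difference" the abstract refers to; matching on pre-treatment outcomes changes which control units enter that second difference.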

arXiv:2205.02430 [pdf, other]

doi 10.1007/s11749-023-00861-2

Hypothesis Testing in Sequentially Sampled Data: AdapRT to Maximize Power Beyond iid Sampling

Authors: Dae Woong Ham, Jiaze Qiu

Abstract: Testing whether a variable of interest affects the outcome is one of the most fundamental problems in statistics and is often the main scientific question of interest. To tackle this problem, the conditional randomization test (CRT) is widely used to test the independence of variable(s) of interest (X) with an outcome (Y) holding other variable(s) (Z) fixed. The CRT uses randomization or design-based inference that relies solely on the iid sampling of (X,Z) to produce exact finite-sample p-values that are constructed using any test statistic. We propose a new method, the adaptive randomization test (ART), that tackles the independence problem while allowing the data to be adaptively sampled. We first showcase the ART in a particular multi-arm bandit problem known as the normal-mean model. Under this setting, we theoretically characterize the powers of both the iid sampling procedure and the adaptive sampling procedure and empirically find that the ART can uniformly outperform the CRT that pulls all arms independently with equal probability. We also surprisingly find that the ART can be more powerful than even the CRT that uses an oracle iid sampling procedure when the signal is relatively strong. We believe that the proposed adaptive procedure is successful because it takes arms that may initially look like "fake" signals due to random chance and stabilizes them closer to "null" signals. We additionally apply the ART to a popular factorial survey design setting known as conjoint analysis. We find similar results through simulations and a recent application concerning the role of gender discrimination in political candidate evaluation.

Submitted 27 August, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

arXiv:2205.00176 [pdf, other]

Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models

Authors: Sanghwan Bae, Donghyun Kwak, Sungdong Kim, Donghoon Ham, Soyoung Kang, Sang-Woo Lee, Woomyoung Park

Abstract: Recent open-domain dialogue models have brought numerous breakthroughs. However, building a chat system is not scalable since it often requires a considerable volume of human-human dialogue data, especially when enforcing features such as persona, style, or safety. In this work, we study the challenge of imposing roles on open-domain dialogue systems, with the goal of making the systems maintain consistent roles while conversing naturally with humans. To accomplish this, the system must satisfy a role specification that includes certain conditions on the stated features as well as a system policy on whether or not certain types of utterances are allowed. For this, we propose an efficient data collection framework leveraging in-context few-shot learning of large-scale language models for building a role-satisfying dialogue dataset from scratch. We then compare various architectures for open-domain dialogue systems in terms of meeting role specifications while maintaining conversational abilities. Automatic and human evaluations show that our models return few out-of-bounds utterances while keeping competitive performance on general metrics. We release a Korean dialogue dataset we built for further research.

Submitted 30 April, 2022; originally announced May 2022.

Comments: Accepted to NAACL2022 as a long paper

arXiv:2203.12501 [pdf, other]

Quantum Logic Enhanced Sensing in Solid-State Spin Ensembles

Authors: Nithya Arunkumar, Kevin S. Olsson, Jner Tzern Oon, Connor Hart, Dominik B. Bucher, David Glenn, Mikhail D. Lukin, Hongkun Park, Donhee Ham, Ronald L. Walsworth

Abstract: We demonstrate quantum logic enhanced sensitivity for a macroscopic ensemble of solid-state, hybrid two-qubit sensors. We achieve a factor of 30 improvement in signal-to-noise ratio, translating to a sensitivity enhancement exceeding an order of magnitude. Using the electronic spins of nitrogen vacancy (NV) centers in diamond as sensors, we leverage the on-site nitrogen nuclear spins of the NV centers as memory qubits, in combination with homogeneous bias and control fields, ensuring that all of the ${\sim}10^9$ two-qubit sensors are sufficiently identical to permit global control of the NV ensemble spin states. We find quantum logic sensitivity enhancement for multiple measurement protocols with varying optimal sensing intervals, including XY8 dynamical decoupling and correlation spectroscopy, using a synthetic AC magnetic field. The results are independent of the nature of the target signal and broadly applicable to metrology using NV centers and other solid-state ensembles. This work provides a benchmark for macroscopic ensembles of quantum sensors that employ quantum logic or quantum error correction algorithms for enhanced sensitivity.

Submitted 23 March, 2022; originally announced March 2022.

Comments: 7 pages, 5 figures; Supplemental: 4 pages, 3 figures

arXiv:2201.08343 [pdf, other]

doi 10.1017/pan.2023.41

Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Authors: Dae Woong Ham, Kosuke Imai, Lucas Janson

Abstract: Conjoint analysis is a popular experimental design used to measure multidimensional preferences. Researchers examine how varying a factor of interest, while controlling for other relevant factors, influences decision-making. Currently, there exist two methodological approaches to analyzing data from a conjoint experiment. The first focuses on estimating the average marginal effects of each factor while averaging over the other factors. Although this allows for straightforward design-based estimation, the results critically depend on the distribution of other factors and how interaction effects are aggregated. An alternative model-based approach can compute various quantities of interest, but requires researchers to correctly specify the model, a challenging task for conjoint analysis with many factors and possible interactions. In addition, a commonly used logistic regression has poor statistical properties even with a moderate number of factors when incorporating interactions. We propose a new hypothesis testing approach based on the conditional randomization test to answer the most fundamental question of conjoint analysis: Does a factor of interest matter in any way given the other factors? Our methodology is solely based on the randomization of factors, and hence is free from assumptions. Yet, it allows researchers to use any test statistic, including those based on complex machine learning algorithms. As a result, we are able to combine the strengths of the existing design-based and model-based approaches.
We illustrate the proposed methodology through conjoint analysis of immigration preferences and political candidate evaluation. We also extend the proposed approach to test for regularity assumptions commonly used in conjoint analysis. An open-source software package is available for implementing the proposed methodology.

Submitted 17 August, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Journal ref: Political Analysis, pg. 1-16, 2024
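
The abstract above rests on a randomization test: because the factor of interest is randomized by design, its values can be redrawn from the known design and the test statistic recomputed to obtain a p-value. The following is a minimal sketch of that general idea, not the authors' software. It assumes a single binary factor drawn independently and uniformly, so that redrawing it needs no conditioning on the other factors; the names `crt_p_value`, `diff_in_means`, and `draw_uniform_binary` are hypothetical.

```python
import random


def diff_in_means(x, y):
    """Test statistic: absolute difference in mean outcome between x == 1 and x == 0."""
    ones = [yi for xi, yi in zip(x, y) if xi == 1]
    zeros = [yi for xi, yi in zip(x, y) if xi == 0]
    if not ones or not zeros:
        return 0.0  # degenerate draw: one group is empty
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))


def draw_uniform_binary(rng, n):
    """Redraw the factor from its (assumed) design: independent uniform on {0, 1}."""
    return [rng.randint(0, 1) for _ in range(n)]


def crt_p_value(x, y, statistic, draw_x, num_draws=199, seed=0):
    """Randomization-test p-value: share of redrawn statistics at least as large
    as the observed one, with the usual +1 correction."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    exceed = sum(
        statistic(draw_x(rng, len(x)), y) >= observed for _ in range(num_draws)
    )
    return (1 + exceed) / (1 + num_draws)


if __name__ == "__main__":
    # Synthetic data (illustrative only): the outcome depends strongly on the factor.
    data_rng = random.Random(1)
    x = [data_rng.randint(0, 1) for _ in range(200)]
    y = [2.0 * xi + data_rng.gauss(0, 1) for xi in x]
    print(crt_p_value(x, y, diff_in_means, draw_uniform_binary))
```

Any function of the data can be passed as `statistic`, including one built from a fitted machine learning model, because validity comes from the redraw step rather than from the statistic.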

arXiv:2111.00945 [pdf, other]

Escaping the abstraction: a foreign function interface for the Unified Form Language [UFL]

Authors: Nacime Bouziani, David A. Ham

Abstract: High level domain specific languages for the finite element method underpin high productivity programming environments for simulations based on partial differential equations (PDE) while employing automatic code generation to achieve high performance. However, a limitation of this approach is that it does not support operators that are not directly expressible in the vector calculus. This is critical in applications where PDEs are not enough to accurately describe the physical problem of interest. The use of deep learning techniques have become increasingly popular in filling this knowledge gap, for example to include features not represented in the differential equations, or closures for unresolved spatiotemporal scales. We introduce an interface within the Firedrake finite element system that enables a seamless interface with deep learning models. This new feature composes with the automatic differentiation capabilities of Firedrake, enabling the automated solution of inverse problems. Our implementation interfaces with PyTorch and can be extended to other machine learning libraries. The resulting framework supports complex models coupling PDEs and deep learning whilst maintaining separation of concerns between application scientists and software experts.

Submitted 1 November, 2021; originally announced November 2021.

Comments: First Workshop on Differentiable Programming (NeurIPS 2021)

arXiv:2109.04650 [pdf, other]

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

Authors: Boseop Kim, HyoungSeok Kim, Sang-Woo Lee, Gichang Lee, Donghyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, Suk Hyun Ko, Seokhun Kim, Taeyong Park, Jinuk Kim, Soyoung Kang, Na-Hyeon Ryu, Kang Min Yoo, Minsuk Chang, Soobin Suh, Sookyo In, Jinseong Park, et al. (12 additional authors not shown)

Abstract: GPT-3 shows remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billion scale data. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performances on various downstream tasks in Korean. Also, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. Then we discuss the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts of ML by introducing HyperCLOVA studio, an interactive prompt engineering interface. Lastly, we demonstrate the potential of our methods with three successful in-house applications.

Submitted 28 November, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: Accepted to EMNLP2021 as a long paper. Fixed some typos

arXiv:2104.12986 [pdf, other]

doi 10.1145/3490485

Bringing Trimmed Serendipity Methods to Computational Practice in Firedrake

Authors: Justin Crum, Cyrus Cheng, David A. Ham, Lawrence Mitchell, Robert C. Kirby, Joshua A. Levine, Andrew Gillette

Abstract: We present an implementation of the trimmed serendipity finite element family, using the open source finite element package Firedrake. The new elements can be used seamlessly within the software suite for problems requiring $H^1$, $H(\mathrm{curl})$, or $H(\mathrm{div})$-conforming elements on meshes of squares or cubes. To test how well trimmed serendipity elements perform in comparison to traditional tensor product elements, we perform a sequence of numerical experiments including the primal Poisson, mixed Poisson, and Maxwell cavity eigenvalue problems. Overall, we find that the trimmed serendipity elements converge, as expected, at the same rate as the respective tensor product elements while being able to offer significant savings in the time or memory required to solve certain problems.

Submitted 8 October, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

Comments: 19 pages, 7 figures, 3 tables, 2 listings

Journal ref: ACM Transactions on Mathematical Software 48(1):8:1-8:19 (2022)

arXiv:2104.08012 [pdf, other]

Code generation for productive portable scalable finite element simulation in Firedrake

Authors: Jack D. Betteridge, Patrick E. Farrell, David A. Ham

Abstract: Creating scalable, high performance PDE-based simulations requires a suitable combination of discretizations, differential operators, preconditioners and solvers. The required combination changes with the application and with the available hardware, yet software development time is a severely limited resource for most scientists and engineers. Here we demonstrate that generating simulation code from a high-level Python interface provides an effective mechanism for creating high performance simulations from very few lines of user code. We demonstrate that moving from one supercomputer to another can require significant algorithmic changes to achieve scalable performance, but that the code generation approach enables these algorithmic changes to be achieved with minimal development effort.

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2101.05158 [pdf, ps, other]

UFL Dual Spaces, a proposal

Authors: David A. Ham

Abstract: This white paper highlights current limitations in the algebraic closure Unified Form Language (UFL). UFL currently represents forms over finite element spaces, however finite element problems naturally result in objects in the dual to a finite element space, and operators mapping between primal and dual finite element spaces. This document sketches the relevant mathematical areas and proposes changes to the UFL language to support dual spaces as first class types in UFL.

Submitted 13 January, 2021; originally announced January 2021.

arXiv:1903.08243 [pdf, other]

doi 10.1177/1094342020945005

A study of vectorization for matrix-free finite element methods

Authors: Tianjiao Sun, Lawrence Mitchell, Kaushik Kulkarni, Andreas Klöckner, David A. Ham, Paul H. J. Kelly

Abstract: Vectorization is increasingly important to achieve high performance on modern hardware with SIMD instructions. Assembly of matrices and vectors in the finite element method, which is characterized by iterating a local assembly kernel over unstructured meshes, poses difficulties to effective vectorization. Maintaining a user-friendly high-level interface with a suitable degree of abstraction while generating efficient, vectorized code for the finite element method is a challenge for numerical software systems and libraries. In this work, we study cross-element vectorization in the finite element framework Firedrake via code transformation and demonstrate the efficacy of such an approach by evaluating a wide range of matrix-free operators spanning different polynomial degrees and discretizations on two recent CPUs using three mainstream compilers. Our experiments show that our approaches for cross-element vectorization achieve 30% of theoretical peak performance for many examples of practical significance, and exceed 50% for cases with high arithmetic intensities, with consistent speed-up over (intra-element) vectorization restricted to the local assembly kernels.

Submitted 19 May, 2020; v1 submitted 19 March, 2019; originally announced March 2019.

Journal ref: International Journal of High Performance Computing Applications (2020)

arXiv:1901.06219 [pdf, other]

Red blood cell image generation for data augmentation using Conditional Generative Adversarial Networks

Authors: Oleksandr Bailo, DongShik Ham, Young Min Shin

Abstract: In this paper, we describe how to apply image-to-image translation techniques to medical blood smear data to generate new data samples and meaningfully increase small datasets. Specifically, given the segmentation mask of the microscopy image, we are able to generate photorealistic images of blood cells which are further used alongside real data during the network training for segmentation and object detection tasks. This image data generation approach is based on conditional generative adversarial networks which have proven capabilities to high-quality image synthesis. In addition to synthesizing blood images, we synthesize segmentation mask as well which leads to a diverse variety of generated samples. The effectiveness of the technique is thoroughly analyzed and quantified through a number of experiments on a manually collected and annotated dataset of blood smear taken under a microscope.

Submitted 8 March, 2019; v1 submitted 18 January, 2019; originally announced January 2019.

arXiv:1808.08083 [pdf, other]

doi 10.1007/s00158-019-02281-z

Automated shape differentiation in the Unified Form Language

Authors: David A. Ham, Lawrence Mitchell, Alberto Paganini, Florian Wechsung

Abstract: We discuss automating the calculation of weak shape derivatives in the Unified Form Language (Alnæs et al., ACM Trans. Math. Softw., 2014) by introducing an appropriate additional step in the pullback from physical to reference space that computes Gâteaux derivatives with respect to the coordinate field. We illustrate the ease of use with several examples.

Submitted 4 April, 2019; v1 submitted 24 August, 2018; originally announced August 2018.

Journal ref: Structural and Multidisciplinary Optimization 60:1813-1820 (2019)

arXiv:1802.00303 [pdf, other]

doi 10.5194/gmd-13-735-2020

Slate: extending Firedrake's domain-specific abstraction to hybridized solvers for geoscience and beyond

Authors: Thomas H. Gibson, Lawrence Mitchell, David A. Ham, Colin J. Cotter

Abstract: Within the finite element community, discontinuous Galerkin (DG) and mixed finite element methods have become increasingly popular in simulating geophysical flows. However, robust and efficient solvers for the resulting saddle-point and elliptic systems arising from these discretizations continue to be an on-going challenge. One possible approach for addressing this issue is to employ a method known as hybridization, where the discrete equations are transformed such that classic static condensation and local post-processing methods can be employed. However, it is challenging to implement hybridization as performant parallel code within complex models, whilst maintaining separation of concerns between applications scientists and software experts. In this paper, we introduce a domain-specific abstraction within the Firedrake finite element library that permits the rapid execution of these hybridization techniques within a code-generating framework. The resulting framework composes naturally with Firedrake's solver environment, allowing for the implementation of hybridization and static condensation as runtime-configurable preconditioners via the Python interface to PETSc, petsc4py. We provide examples derived from second order elliptic problems and geophysical fluid dynamics. In addition, we demonstrate that hybridization shows great promise for improving the performance of solvers for mixed finite element discretizations of equations related to large-scale geophysical flows.

Submitted 1 April, 2019; v1 submitted 1 February, 2018; originally announced February 2018.

Comments: Revisions for submission to GMD

ACM Class: G.4; G.1.8; I.1

Journal ref: Geoscientific Model Development 13:735-761 (2020)

arXiv:1711.08552 [pdf, other]

doi 10.5194/gmd-11-4359-2018

Thetis coastal ocean model: discontinuous Galerkin discretization for the three-dimensional hydrostatic equations

Authors: Tuomas Kärnä, Stephan C. Kramer, Lawrence Mitchell, David A. Ham, Matthew D. Piggott, António M. Baptista

Abstract: Unstructured grid ocean models are advantageous for simulating the coastal ocean and river-estuary-plume systems. However, unstructured grid models tend to be diffusive and/or computationally expensive which limits their applicability to real life problems. In this paper, we describe a novel discontinuous Galerkin (DG) finite element discretization for the hydrostatic equations. The formulation is fully conservative and second-order accurate in space and time. Monotonicity of the advection scheme is ensured by using a strong stability preserving time integration method and slope limiters. Compared to previous DG models advantages include a more accurate mode splitting method, revised viscosity formulation, and new second-order time integration scheme. We demonstrate that the model is capable of simulating baroclinic flows in the eddying regime with a suite of test cases. Numerical dissipation is well-controlled, being comparable or lower than in existing state-of-the-art structured grid models.

Submitted 18 October, 2018; v1 submitted 22 November, 2017; originally announced November 2017.

Comments: Submitted to Geoscientific Model Development

Journal ref: Geoscientific Model Development 11:4359-4382 (2018)

arXiv:1711.02473 [pdf, other]

Exposing and exploiting structure: optimal code generation for high-order finite element methods

Authors: Miklós Homolya, Robert C. Kirby, David A. Ham

Abstract: Code generation based software platforms, such as Firedrake, have become popular tools for developing complicated finite element discretisations of partial differential equations. We extended the code generation infrastructure in Firedrake with optimisations that can exploit the structure inherent to some finite elements. This includes sum factorisation on cuboid cells for continuous, discontinuous, H(div) and H(curl) conforming elements. Our experiments confirm optimal algorithmic complexity for high-order finite element assembly. This is achieved through several novel contributions: the introduction of a more powerful interface between the form compiler and the library providing the finite elements; a more abstract, smarter library of finite elements called FInAT that explicitly communicates the structure of elements; and form compiler algorithms to automatically exploit this exposed structure.

Submitted 7 November, 2017; originally announced November 2017.

Comments: Submitted to ACM Transactions on Mathematical Software

arXiv:1705.03667 [pdf, other]

doi 10.1137/17M1130642

TSFC: a structure-preserving form compiler

Authors: Miklós Homolya, Lawrence Mitchell, Fabio Luporini, David A. Ham

Abstract: A form compiler takes a high-level description of the weak form of partial differential equations and produces low-level code that carries out the finite element assembly. In this paper we present the Two-Stage Form Compiler (TSFC), a new form compiler with the main motivation to maintain the structure of the input expression as long as possible. This facilitates the application of optimizations at the highest possible level of abstraction. TSFC features a novel, structure-preserving method for separating the contributions of a form to the subblocks of the local tensor in discontinuous Galerkin problems. This enables us to preserve the tensor structure of expressions longer through the compilation process than other form compilers. This is also achieved in part by a two-stage approach that cleanly separates the lowering of finite element constructs to tensor algebra in the first stage, from the scheduling of those tensor operations in the second stage. TSFC also efficiently traverses complicated expressions, and experimental evaluation demonstrates good compile-time performance even for highly complex forms.

Submitted 9 April, 2018; v1 submitted 10 May, 2017; originally announced May 2017.

Comments: Accepted version. 28 pages plus 5 pages supplement

MSC Class: 68N20; 65M60; 65N30

Journal ref: SIAM Journal on Scientific Computing, 40 (2018), pp. C401-C428

arXiv:1606.08069 [pdf, other]

An iteration count estimate for a mesh-dependent steepest descent method based on finite elements and Riesz inner product representation

Authors: Tobias Schwedes, Simon W. Funke, David A. Ham

Abstract: Existing implementations of gradient-based optimisation methods typically assume that the problem is posed in Euclidean space. When solving optimality problems on function spaces, the functional derivative is then inaccurately represented with respect to $\ell^2$ instead of the inner product induced by the function space. This error manifests as a mesh dependence in the number of iterations required to solve the optimisation problem. In this paper, an analytic estimate is derived for this iteration count in the case of a simple and generic discretised optimisation problem. The system analysed is the steepest descent method applied to a finite element problem. The estimate is based on Kantorovich's inequality and on an upper bound for the condition number of Galerkin mass matrices. Computer simulations validate the iteration number estimate. Similar numerical results are found for a more complex optimisation problem constrained by a partial differential equation. Representing the functional derivative with respect to the inner product induced by the continuous control space leads to mesh independent convergence.

Submitted 26 June, 2016; originally announced June 2016.

Comments: 13 pages, 3 figures, 3 tables

arXiv:1604.05937 [pdf, other]

doi 10.5194/gmd-9-3803-2016

A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake

Authors: Gheorghe-Teodor Bercea, Andrew T. T. McRae, David A. Ham, Lawrence Mitchell, Florian Rathgeber, Luigi Nardi, Fabio Luporini, Paul H. J. Kelly

Abstract: We present a generic algorithm for numbering and then efficiently iterating over the data values attached to an extruded mesh. An extruded mesh is formed by replicating an existing mesh, assumed to be unstructured, to form layers of prismatic cells. Applications of extruded meshes include, but are not limited to, the representation of 3D high aspect ratio domains employed by geophysical finite element simulations. These meshes are structured in the extruded direction. The algorithm presented here exploits this structure to avoid the performance penalty traditionally associated with unstructured meshes. We evaluate the implementation of this algorithm in the Firedrake finite element system on a range of low compute intensity operations which constitute worst cases for data layout performance exploration. The experiments show that having structure along the extruded direction enables the cost of the indirect data accesses to be amortized after 10-20 layers as long as the underlying mesh is well-ordered. We characterise the resulting spatial and temporal reuse in a representative set of both continuous-Galerkin and discontinuous-Galerkin discretisations. On meshes with realistic numbers of layers the performance achieved is between 70% and 90% of a theoretical hardware-specific limit.

Submitted 28 October, 2016; v1 submitted 20 April, 2016; originally announced April 2016.

Comments: Bibliography fixes, 23 pages

Journal ref: Geoscientific Model Development 9:3803-3815 (2016)

arXiv:1604.05872 [pdf, other]

doi 10.1145/3054944

An algorithm for the optimization of finite element integration loops

Authors: Fabio Luporini, David A. Ham, Paul H. J. Kelly

Abstract: We present an algorithm for the optimization of a class of finite element integration loop nests. This algorithm, which exploits fundamental mathematical properties of finite element operators, is proven to achieve a locally optimal operation count. In specified circumstances the optimum achieved is global. Extensive numerical experiments demonstrate significant performance improvements over the state of the art in finite element code generation in almost all cases. This validates the effectiveness of the algorithm presented here, and illustrates its limitations.

Submitted 20 April, 2016; originally announced April 2016.

ACM Class: G.1.8; G.4

arXiv:1505.03357 [pdf, other]

doi 10.1137/15M1021325

A parallel edge orientation algorithm for quadrilateral meshes

Authors: Miklós Homolya, David A. Ham

Abstract: One approach to achieving correct finite element assembly is to ensure that the local orientation of facets relative to each cell in the mesh is consistent with the global orientation of that facet. Rognes et al. have shown how to achieve this for any mesh composed of simplex elements, and deal.II contains a serial algorithm to construct a consistent orientation of any quadrilateral mesh of an orientable manifold. The core contribution of this paper is the extension of this algorithm for distributed memory parallel computers, which facilitates its seamless application as part of a parallel simulation system. Furthermore, our analysis establishes a link between the well-known Union-Find algorithm and the construction of a consistent orientation of a quadrilateral mesh. As a result, existing work on the parallelisation of the Union-Find algorithm can be easily adapted to construct further parallel algorithms for mesh orientations.

Submitted 11 December, 2015; v1 submitted 13 May, 2015; originally announced May 2015.

Comments: Second revision: minor changes

Journal ref: SIAM Journal on Scientific Computing, 38 (2016), pp. S48-S61
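
The abstract above connects edge orientation on quadrilateral meshes to the Union-Find algorithm. As a rough serial illustration only, and not the paper's parallel algorithm, the sketch below uses Union-Find to group edges into classes. It assumes, for the sake of the example, that opposite edges of each quadrilateral belong in the same class and that each cell is given as four edge ids in cyclic order; `UnionFind` and `edge_classes` are hypothetical names.

```python
class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same class
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger
        self.size[ra] += self.size[rb]
        return True


def edge_classes(cells, num_edges):
    """Return one class representative per edge.

    cells: iterable of 4-tuples (e0, e1, e2, e3) of edge ids in cyclic order,
    so that (e0, e2) and (e1, e3) are the opposite pairs (an assumption of
    this sketch).
    """
    uf = UnionFind(num_edges)
    for e0, e1, e2, e3 in cells:
        uf.union(e0, e2)
        uf.union(e1, e3)
    return [uf.find(e) for e in range(num_edges)]


if __name__ == "__main__":
    # Two quadrilaterals side by side, sharing edge 1.
    cells = [(0, 1, 2, 3), (4, 5, 6, 1)]
    print(edge_classes(cells, 7))
```

Orienting every edge within a class consistently would additionally require tracking the relative direction of each edge with respect to its class representative, which this sketch leaves out.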

arXiv:1501.01809 [pdf, other]

doi 10.1145/2998441

Firedrake: automating the finite element method by composing abstractions

Authors: Florian Rathgeber, David A. Ham, Lawrence Mitchell, Michael Lange, Fabio Luporini, Andrew T. T. McRae, Gheorghe-Teodor Bercea, Graham R. Markall, Paul H. J. Kelly

Abstract: Firedrake is a new tool for automating the numerical solution of partial differential equations. Firedrake adopts the domain-specific language for the finite element method of the FEniCS project, but with a pure Python runtime-only implementation centred on the composition of several existing and new abstractions for particular aspects of scientific computing. The result is a more complete separation of concerns which eases the incorporation of separate contributions from computer scientists, numerical analysts and application specialists. These contributions may add functionality, or improve performance. Firedrake benefits from automatically applying new optimisations. This includes factorising mixed function spaces, transforming and vectorising inner loops, and intrinsically supporting block matrix operations. Importantly, Firedrake presents a simple public API for escaping the UFL abstraction. This allows users to implement common operations that fall outside pure variational formulations, such as flux-limiters.

Submitted 1 July, 2016; v1 submitted 8 January, 2015; originally announced January 2015.

Comments: Minor revisions to v2

ACM Class: G.1.8; G.4

Journal ref: ACM Transactions on Mathematical Software 43(3):24:1--24:27 (2016)

arXiv:1411.2940 [pdf, other]

doi 10.1137/15M1021167

Automated generation and symbolic manipulation of tensor product finite elements

Authors: Andrew T. T. McRae, Gheorghe-Teodor Bercea, Lawrence Mitchell, David A. Ham, Colin J. Cotter

Abstract: We describe and implement a symbolic algebra for scalar and vector-valued finite elements, enabling the computer generation of elements with tensor product structure on quadrilateral, hexahedral and triangular prismatic cells. The algebra is implemented as an extension to the domain-specific language UFL, the Unified Form Language. This allows users to construct many finite element spaces beyond those supported by existing software packages. We have made corresponding extensions to FIAT, the FInite element Automatic Tabulator, to enable numerical tabulation of such spaces. This tabulation is consequently used during the automatic generation of low-level code that carries out local assembly operations, within the wider context of solving finite element problems posed over such function spaces. We have done this work within the code-generation pipeline of the software package Firedrake; we make use of the full Firedrake package to present numerical examples.

Submitted 24 February, 2016; v1 submitted 11 November, 2014; originally announced November 2014.

Comments: Submitted to SISC special issue on CSE Software. Updated version, following reviewer comments

ACM Class: G.1.8; G.4

Journal ref: SIAM Journal on Scientific Computing 38(5):S25-S47 (2016)

arXiv:1410.3069 [pdf, other]

On the shallow atmosphere approximation in finite element dynamical cores

Authors: C. J. Cotter, D. A. Ham, A. T. T. McRae, L. Mitchell, A. Natale

Abstract: We provide an approach to implementing the shallow atmosphere approximation in three dimensional finite element discretisations for dynamical cores. The approach makes use of the fact that the shallow atmosphere approximation metric can be obtained by writing equations on a three-dimensional manifold embedded in $\mathbb{R}^4$ with a restriction of the Euclidean metric. We show that finite element discretisations constructed this way are equivalent to the use of a modified three dimensional mesh for the construction of metric terms. We demonstrate our approach via a convergence test for a prototypical elliptic problem.

Submitted 12 October, 2014; originally announced October 2014.

Comments: Distributing this version for comments before submission

arXiv:1407.0904 [pdf, other]

doi 10.1145/2687415

COFFEE: an Optimizing Compiler for Finite Element Local Assembly

Authors: Fabio Luporini, Ana Lucia Varbanescu, Florian Rathgeber, Gheorghe-Teodor Bercea, J. Ramanujam, David A. Ham, Paul H. J. Kelly

Abstract: The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be huge, executing efficient kernels is fundamental. Their optimization is, however, a challenging issue. Even though affine loop nests are generally present, the short trip counts and the complexity of mathematical expressions make it hard to determine a single or unique sequence of successful transformations. Therefore, we present the design and systematic evaluation of COFFEE, a domain-specific compiler for local assembly kernels. COFFEE manipulates abstract syntax trees generated from a high-level domain-specific language for PDEs by introducing domain-aware composable optimizations aimed at improving instruction-level parallelism, especially SIMD vectorization, and register locality. It then generates C code including vector intrinsics. Experiments using a range of finite-element forms of increasing complexity show that significant performance improvement is achieved.

Submitted 4 July, 2014; v1 submitted 3 July, 2014; originally announced July 2014.

Comments: Remove volume metadata

ACM Class: G.1.8; G.4

arXiv:1405.2356 [pdf, other]

Massive thermal fluctuation of massless graphene electrons

Authors: Hosang Yoon, Donhee Ham

Abstract: Whereas thermal current noise $\langle I^2 \rangle$ in typical conductors is proportional to temperature $T$, $\langle I^2 \rangle$ in graphene exhibits a nonlinear $T$ dependence due to the massless nature of individual electrons. This unique $\langle I^2 \rangle$ arising from individually massless electrons is intimately linked to the non-zero collective mass of graphene electrons; namely, $\langle I^2 \rangle$ is set by the equipartition theorem applied to the collective mass's kinetic energy, with the nonlinear $T$-dependence arising from the $T$-dependence of the collective mass. This link between thermal fluctuation and collective dynamics unifies $\langle I^2 \rangle$ in graphene and typical conductors, while elucidating the uniqueness of the former at the same time.

Submitted 6 July, 2014; v1 submitted 9 May, 2014; originally announced May 2014.

Comments: 5 pages, 3 figures

arXiv:1401.4240 [pdf]

doi 10.1038/nnano.2014.112

Measurement of Collective Dynamical Mass of Dirac Fermions in Graphene

Authors: Hosang Yoon, Carlos Forsythe, Lei Wang, Nikolaos Tombros, Kenji Watanabe, Takashi Taniguchi, James Hone, Philip Kim, Donhee Ham

Abstract: Individual electrons in graphene behave as massless quasiparticles. In a surprising twist, it is inferred from plasmonic investigations that collectively excited graphene electrons must exhibit non-zero mass and that its inertial acceleration is essential for graphene plasmonics. Despite such importance, this collective mass has defied direct unequivocal measurement. It may be directly measured by accelerating it with a time-varying voltage and quantifying the phase delay of the resulting current; this voltage-current phase relation would manifest as kinetic inductance, representing the collective inertia's reluctance to accelerate. However, at optical (infrared) frequencies phase measurement of current is generally difficult and at microwave frequencies the inertial phase delay has been buried under electron scattering. Here we directly and precisely measure the kinetic inductance, and thus the collective mass, by combining innovative device engineering that reduces electron scattering and delicate microwave phase measurements. In particular, encapsulation of graphene between hexagonal-boron-nitride layers, one-dimensional edge contacts, and a proximate top gate configured as microwave ground together enable resolving the inertial phase delay from the electron scattering. Besides its fundamental importance, the kinetic inductance, demonstrated here to be orders of magnitude larger than magnetic inductance, can dramatically miniaturize radio-frequency integrated circuits. Moreover, its bias dependency heralds a solid-state voltage-controlled inductor to complement the prevalent voltage-controlled capacitor.

Submitted 17 January, 2014; originally announced January 2014.

arXiv:1212.4170 [pdf]

doi 10.1063/1.4775668

Two-Path Solid-State Interferometry Using Ultra-Subwavelength 2D Plasmonic Waves

Authors: Kitty Y. M. Yeung, Hosang Yoon, William Andress, Ken West, Loren Pfeiffer, Donhee Ham

Abstract: We report an on-chip solid-state Mach-Zehnder interferometer operating on two-dimensional (2D) plasmonic waves at microwave frequencies. Two plasmonic paths are defined with GaAs/AlGaAs 2D electron gas 80 nm below a metallic gate. The gated 2D plasmonic waves achieve a velocity of ~c/300 (c: free-space light speed). Due to this ultra-subwavelength confinement, the resolution of the 2D plasmonic interferometer is two orders of magnitude higher than that of its electromagnetic counterpart at a given frequency. This GHz proof-of-concept at cryogenic temperatures can be scaled to the THz IR range for room temperature operation, while maintaining the benefits of the ultra-subwavelength confinement.

Submitted 17 December, 2012; originally announced December 2012.

Comments: 18 pages, 11 figures. The article has been submitted to Applied Physics Letters. After it is published, it will be found at http://apl.aip.org/

arXiv:1204.5577 [pdf, other]

doi 10.1137/120873558

Automated derivation of the adjoint of high-level transient finite element programs

Authors: Patrick E. Farrell, David A. Ham, Simon F. Funke, Marie E. Rognes

Abstract: In this paper we demonstrate a new technique for deriving discrete adjoint and tangent linear models of finite element models. The technique is significantly more efficient and automatic than standard algorithmic differentiation techniques. The approach relies on a high-level symbolic representation of the forward problem. In contrast to developing a model directly in Fortran or C++, high-level systems allow the developer to express the variational problems to be solved in near-mathematical notation. As such, these systems have a key advantage: since the mathematical structure of the problem is preserved, they are more amenable to automated analysis and manipulation. The framework introduced here is implemented in a freely available software package named dolfin-adjoint, based on the FEniCS Project. Our approach to automated adjoint derivation relies on run-time annotation of the temporal structure of the model, and employs the FEniCS finite element form compiler to automatically generate the low-level code for the derived models. The approach requires only trivial changes to a large class of forward models, including complicated time-dependent nonlinear models. The adjoint model automatically employs optimal checkpointing schemes to mitigate storage requirements for nonlinear models, without any user management or intervention. Furthermore, both the tangent linear and adjoint models naturally work in parallel, without any need to differentiate through calls to MPI or to parse OpenMP directives. The generality, applicability and efficiency of the approach are demonstrated with examples from a wide range of scientific applications.

Submitted 16 October, 2013; v1 submitted 25 April, 2012; originally announced April 2012.

MSC Class: 65N30; 68N20; 49M29

Journal ref: SIAM Journal on Scientific Computing 2013 35:4, C369-C393

arXiv:0908.2214 [pdf, ps, other]

Phase Diffusion and Lamb-Shift-Like Spectrum Shift in Classical Oscillators

Authors: Xiaofeng Li, Wenjiang Zhu, Donhee Ham

Abstract: The phase diffusion in a self-sustained oscillator, which produces the oscillator's spectral linewidth, is inherently governed by a nonlinear Langevin equation. Over the past 40 years, the equation has been treated with a linear approximation, rendering the nonlinearity's effects unknown. Here we solve the nonlinear Langevin equation using the perturbation method borrowed from quantum mechanics, and reveal the physics of the nonlinearity: slower phase diffusion (linewidth narrowing) and a surprising oscillation frequency shift that formally corresponds to the Lamb shift in quantum electrodynamics.

Submitted 24 June, 2010; v1 submitted 16 August, 2009; originally announced August 2009.

Comments: 4 pages, 4 figures, 1 supplemental material

arXiv:0903.4385 [pdf, other]

doi 10.1364/OE.17.012929

Stable mode-locked pulses from mid-infrared semiconductor lasers

Authors: Christine Y. Wang, L. Kuznetsova, V. M. Gkortsas, L. Diehl, F. X. Kaertner, M. A. Belkin, A. Belyanin, X. Li, D. Ham, H. Schneider, P. Grant, C. Y. Song, S. Haffouz, Z. R. Wasilewski, H. C. Liu, Federico Capasso

Abstract: We report the unequivocal demonstration of mid-infrared mode-locked pulses from a semiconductor laser. The train of short pulses was generated by actively modulating the current and hence the optical gain in a small section of an edge-emitting quantum cascade laser (QCL). Pulses with pulse duration at full-width-at-half-maximum of about 3 ps and energy of 0.5 pJ were characterized using a second-order interferometric autocorrelation technique based on a nonlinear quantum well infrared photodetector. The mode-locking dynamics in the QCLs was modelled and simulated based on Maxwell-Bloch equations in an open two-level system. We anticipate our results to be a significant step toward a compact, electrically-pumped source generating ultrashort light pulses in the mid-infrared and terahertz spectral ranges.

Submitted 25 March, 2009; originally announced March 2009.

Comments: 26 pages, 4 figures

Journal ref: Optics Express, Vol. 17, Issue 15, pp. 12929-12943 (2009)

arXiv:0805.4380 [pdf, other]

doi 10.1016/j.ocemod.2008.09.002

A mixed discontinuous/continuous finite element pair for shallow-water ocean modelling

Authors: C. J. Cotter, D. A. Ham, C. C. Pain

Abstract: We introduce a mixed discontinuous/continuous finite element pair for ocean modelling, with continuous quadratic pressure/layer depth and discontinuous velocity. We investigate the finite element pair applied to the linear shallow-water equations on an f-plane. The element pair has the property that all geostrophically balanced states which strongly satisfy the boundary conditions have discrete divergence equal to exactly zero and hence are exactly steady states of the discretised equations. This means that the finite element pair has excellent geostrophic balance properties. We illustrate these properties using numerical tests and provide convergence calculations which show that the discretisation has quadratic errors, indicating that the element pair is stable.

Submitted 30 April, 2009; v1 submitted 28 May, 2008; originally announced May 2008.

arXiv:0707.4607 [pdf, other]

LBB Stability of a Mixed Discontinuous/Continuous Galerkin Finite Element Pair

Authors: C. J. Cotter, D. A. Ham, C. C. Pain, S. Reich

Abstract: We introduce a new mixed discontinuous/continuous Galerkin finite element for solving the 2- and 3-dimensional wave equations and equations of incompressible flow. The element, which we refer to as P1dg-P2, uses discontinuous piecewise linear functions for velocity and continuous piecewise quadratic functions for pressure. The aim of introducing the mixed formulation is to produce a new flexible element choice for triangular and tetrahedral meshes which satisfies the LBB stability condition and hence has no spurious zero-energy modes. We illustrate this property with numerical integrations of the wave equation in two dimensions, an analysis of the resultant discrete Laplace operator in two and three dimensions, and a normal mode analysis of the semi-discrete wave equation in one dimension.

Submitted 31 July, 2007; originally announced July 2007.

MSC Class: 65M60

arXiv:q-bio/0610043 [pdf]

Integrated Cell Manipulation Systems

Authors: Hakho Lee, Yong Liu, Donhee Ham, Robert M. Westervelt

Abstract: A new type of microfluidic system for biological cell manipulation, a CMOS/microfluidic hybrid, is demonstrated. The hybrid system starts with a custom-designed CMOS (complementary metal-oxide semiconductor) chip fabricated in a semiconductor foundry using standard integrated circuit technology. A microfluidic channel is post-fabricated on top of the CMOS chip to provide a biocompatible environment. The motion of individual biological cells that are tagged with magnetic beads is directly controlled by the CMOS chip, which generates localized magnetic field patterns using an on-chip array of micro-electromagnets. The speed and the programmability of the CMOS chip further allow for the dynamic reconfiguration of the magnetic fields, substantially increasing the manipulation capability of the hybrid system. The concept of a hybrid system is verified by simultaneously manipulating individual biological cells with microscopic resolution. A new operation protocol that exploits the fast speed of electronics to trap and move a large number of cells with less power consumption is also demonstrated. Combining the advantages of microelectronics, the CMOS/microfluidic hybrid approach presents a new model for a multifunctional lab-on-a-chip for biological and medical applications.

Submitted 23 October, 2006; originally announced October 2006.

Comments: 7 pages, 7 figures

Showing 1–47 of 47 results for author: Ham, D