-
Lee-Wave Energy Sinks in Bottom-Intensified Flow: Reabsorption, Dissipation and Nonlinear Spectral Transfer
Authors:
Yue Cynthia Wu,
Eric Kunze,
Amit Tandon,
Amala Mahadevan
Abstract:
Idealized numerical simulation is used to explore energy sinks for lee waves trapped in their bottom-intensified generating flow. In addition to the loss to explicit dissipation and reabsorption predicted by linear wave action conservation, indirect dissipation due to a nonlinear forward cascade by parametric subharmonic instability represents a significant sink that substantially reduces reabsorption. The partition of lee-wave energy loss between reabsorption and (explicit plus indirect) dissipation is independent of subgridscale damping parameterization. Remote dissipation of freely propagating internal waves generated by shear instability at the lee-wave critical layer proves to be small. A general parameterization for lee-wave dissipation of the balanced flow requires a more complete exploration of the parameter space.
Submitted 7 March, 2024;
originally announced March 2024.
-
Optimizing a Data Science System for Text Reuse Analysis
Authors:
Ananth Mahadevan,
Michael Mathioudakis,
Eetu Mäkelä,
Mikko Tolonen
Abstract:
Text reuse is a methodological element of fundamental importance in humanities research: pieces of text that re-appear across different documents, verbatim or paraphrased, provide invaluable information about the historical spread and evolution of ideas. Large modern digitized corpora enable the joint analysis of text collections that span entire centuries and the detection of large-scale patterns, impossible to detect with traditional small-scale analysis. For this opportunity to materialize, it is necessary to develop efficient data science systems that perform the corresponding analysis tasks.
In this paper, we share insights from ReceptionReader, a system for analyzing text reuse in large historical corpora. The system is built upon billions of instances of text reuses from large digitized corpora of 18th-century texts. Its main functionality is to perform downstream text reuse analysis tasks, such as finding reuses that stem from a given article or identifying the most reused quotes from a set of documents, with each task expressed as a database query. For the purposes of the paper, we discuss the related design choices including various database normalization levels and query execution frameworks, such as distributed data processing (Apache Spark), indexed row store engine (MariaDB Aria), and compressed column store engine (MariaDB Columnstore). Moreover, we present an extensive evaluation with various metrics of interest (latency, storage size, and computing costs) for varying workloads, and we offer insights from the trade-offs we observed and the choices that emerged as optimal in our setting. In summary, our results show that (1) for the workloads that are most relevant to text-reuse analysis, the MariaDB Aria framework emerges as the overall optimal choice, (2) big data processing (Apache Spark) is irreplaceable for all processing stages of the system's pipeline.
Submitted 14 January, 2024;
originally announced January 2024.
-
Cost-Effective Retraining of Machine Learning Models
Authors:
Ananth Mahadevan,
Michael Mathioudakis
Abstract:
It is important to retrain a machine learning (ML) model in order to maintain its performance as the data changes over time. However, this can be costly as it usually requires processing the entire dataset again. This creates a trade-off between retraining too frequently, which leads to unnecessary computing costs, and not retraining often enough, which results in stale and inaccurate ML models. To address this challenge, we propose ML systems that make automated and cost-effective decisions about when to retrain an ML model. We aim to optimize the trade-off by considering the costs associated with each decision. Our research focuses on determining whether to retrain or keep an existing ML model based on various factors, including the data, the model, and the predictive queries answered by the model. Our main contribution is a Cost-Aware Retraining Algorithm called Cara, which optimizes the trade-off over streams of data and queries. To evaluate the performance of Cara, we analyzed synthetic datasets and demonstrated that Cara can adapt to different data drifts and retraining costs while performing similarly to an optimal retrospective algorithm. We also conducted experiments with real-world datasets and showed that Cara achieves better accuracy than drift detection baselines while making fewer retraining decisions, ultimately resulting in lower total costs.
Submitted 6 October, 2023;
originally announced October 2023.
-
Random geometry at an infinite-randomness fixed point
Authors:
Akshat Pandey,
Aditya Mahadevan,
Aditya Cowsik
Abstract:
We study the low-energy physics of the critical (2+1)-dimensional random transverse-field Ising model. The one-dimensional version of the model is a paradigmatic example of a system governed by an infinite-randomness fixed point, for which many results on the distributions of observables are known via an asymptotically exact renormalization group (RG) approach. In two dimensions, the same RG rules have been implemented numerically, and demonstrate a flow to infinite randomness. However, analytical understanding of the structure of this RG has remained elusive due to the development of geometrical structure in the graph of interacting spins. To understand the character of the fixed point, we consider the RG flow acting on a joint ensemble of graphs and couplings. We propose that the RG effectively occurs in two stages: (1) randomization of the interaction graph until it belongs to a certain ensemble of random triangulations of the plane, and (2) a flow of the distributions of couplings to infinite randomness while the graph ensemble remains invariant. This picture is substantiated by a numerical RG in which one obtains a steady-state graph degree distribution and subsequently infinite-randomness scaling distributions of the couplings. Both of these aspects of the RG flow can be approximately reproduced in simplified analytical models.
Submitted 2 August, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
Reception Reader: Exploring Text Reuse in Early Modern British Publications
Authors:
David Rosson,
Eetu Mäkelä,
Ville Vaara,
Ananth Mahadevan,
Yann Ryan,
Mikko Tolonen
Abstract:
The Reception Reader is a web tool for studying text reuse in the Early English Books Online (EEBO-TCP) and Eighteenth Century Collections Online (ECCO) data. Users can: 1) explore a visual overview of the reception of a work, or its incoming connections, across time based on shared text segments, 2) interactively survey the details of connected documents, and 3) examine the context of reused text for "close reading". We show examples of how the tool streamlines research and exploration tasks, and discuss the utility and limitations of the user interface along with its current data sources.
Submitted 18 April, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Lagrangian surface signatures reveal upper-ocean vertical displacement conduits near oceanic density fronts
Authors:
H. M. Aravind,
Vicky Verma,
Sutanu Sarkar,
Mara A. Freilich,
Amala Mahadevan,
Patrick J. Haley,
Pierre F. J. Lermusiaux,
Michael R. Allshouse
Abstract:
Vertical transport in the ocean plays a critical role in the exchange of freshwater, heat, nutrients, and other biogeochemical tracers. While there are situations where vertical fluxes are important, studying the vertical transport and displacement of material requires analysis over a finite interval of time. One such example is the subduction of fluid from the mixed layer into the pycnocline, which is known to occur near density fronts. Divergence has been used to estimate vertical velocities indicating that surface measurements, where observational data is most widely available, can be used to locate these vertical transport conduits. We evaluate the correlation between surface signatures derived from Eulerian (horizontal divergence, density gradient, and vertical velocity) and Lagrangian (dilation rate and finite time Lyapunov exponent) metrics and vertical displacement conduits. Two submesoscale resolving models of density fronts and a data-assimilative model of the western Mediterranean were analyzed. The Lagrangian surface signatures locate significantly more of the strongest displacement features and the difference in the expected displacements relative to Eulerian ones increases with the length of the time interval considered. Ensemble analysis of forecasts from the Mediterranean model demonstrates that the Lagrangian surface signatures can be used to identify regions of strongest downward vertical displacement even without knowledge of the true ocean state.
Submitted 21 September, 2022;
originally announced September 2022.
-
Seagrass deformation affects fluid instability and tracer exchange in canopy flow
Authors:
Guilherme S. Vieira,
Michael R. Allshouse,
Amala Mahadevan
Abstract:
Monami is the synchronous waving of a submerged seagrass bed in response to unidirectional fluid flow. Here we develop a multiphase model for the dynamical instabilities and flow-driven collective motions of buoyant, deformable seagrass. We show that the impedance to flow due to the seagrass results in an unstable velocity shear layer at the canopy interface, leading to a periodic array of vortices that propagate downstream. Each passing vortex locally weakens the along-stream velocity at the canopy top, reducing the drag and allowing the deformed grass to straighten up just beneath it. This causes the grass to oscillate periodically. Crucially, the maximal grass deflection is out of phase with the vortices. A phase diagram for the onset of instability shows its dependence on the fluid Reynolds number and an effective buoyancy parameter. Less buoyant grass is more easily deformed by the flow and forms a weaker shear layer, with smaller vortices and less material exchange across the canopy top. While higher Reynolds number leads to stronger vortices and larger waving amplitudes of the seagrass, waving is maximized at intermediate grass buoyancy. Altogether, our theory and computations correct some misconceptions in interpretation of the mechanism and provide a robust explanation consistent with a number of experimental observations.
Submitted 25 February, 2022;
originally announced February 2022.
-
Oceanic frontal divergence alters phytoplankton competition and distribution
Authors:
Abigail Plummer,
Mara Freilich,
Roberto Benzi,
Chang Jae Choi,
Lisa Sudek,
Alexandra Z. Worden,
Federico Toschi,
Amala Mahadevan
Abstract:
Ecological interactions among phytoplankton occur in a moving fluid environment. Oceanic flows can modulate the competition and coexistence between phytoplankton populations, which in turn can affect ecosystem function and biogeochemical cycling. We explore the impact of submesoscale velocity gradients on phytoplankton ecology using observations, simulations, and theory. Observations reveal that the relative abundance of Synechococcus oligotypes varies on 1-10 km scales at an ocean front with submesoscale velocity gradients at the same scale. Simulations in realistic flow fields demonstrate that regions of divergence in the horizontal flow field can substantially modify ecological competition and dispersal on timescales of hours to days. Regions of positive (negative) divergence provide an advantage (disadvantage) to local populations, resulting in up to ~20% variation in community composition in our model. We propose that submesoscale divergence is a plausible contributor to observed taxonomic variability at oceanic fronts, and can lead to regional variability in community composition.
Submitted 25 September, 2023; v1 submitted 23 February, 2022;
originally announced February 2022.
-
Diversity of growth rates maximizes phytoplankton productivity in an eddying ocean
Authors:
Mara Freilich,
Glenn Flierl,
Amala Mahadevan
Abstract:
In the subtropical gyres, phytoplankton rely on eddies for transporting nutrients from depth to the euphotic zone. But, what controls the rate of nutrient supply for new production? We show that vertical nutrient flux depends both on the vertical motion within the eddying flow and varies nonlinearly with the growth rate of the phytoplankton itself. Flux is maximized when the growth rate matches the inverse of the decorrelation timescale for vertical motion. Using a three-dimensional ocean model and a linear nutrient uptake model, we find that phytoplankton productivity is maximized for a growth rate of 1/3 day$^{-1}$, which corresponds to the timescale of submesoscale dynamics. Variability in the frequency of vertical motion across different physical features of the flow favors phytoplankton production with different growth rates. Such a growth-transport feedback can generate diversity in the phytoplankton community structure at submesoscales and higher net productivity in the presence of community diversity.
Submitted 27 September, 2021;
originally announced September 2021.
-
Certifiable Machine Unlearning for Linear Models
Authors:
Ananth Mahadevan,
Michael Mathioudakis
Abstract:
Machine unlearning is the task of updating machine learning (ML) models after a subset of the training data they were trained on is deleted. Methods for the task are desired to combine effectiveness and efficiency, i.e., they should effectively "unlearn" deleted data, but in a way that does not require excessive computation effort (e.g., a full retraining) for a small amount of deletions. Such a combination is typically achieved by tolerating some amount of approximation in the unlearning. In addition, laws and regulations in the spirit of "the right to be forgotten" have given rise to requirements for certifiability, i.e., the ability to demonstrate that the deleted data has indeed been unlearned by the ML model.
In this paper, we present an experimental study of the three state-of-the-art approximate unlearning methods for linear models and demonstrate the trade-offs between efficiency, effectiveness and certifiability offered by each method. In implementing the study, we extend some of the existing works and describe a common ML pipeline to compare and evaluate the unlearning methods on six real-world datasets and a variety of settings. We provide insights into the effect of the quantity and distribution of the deleted data on ML models and the performance of each unlearning method in different settings. We also propose a practical online strategy to determine when the accumulated error from approximate unlearning is large enough to warrant a full retrain of the ML model.
Submitted 16 August, 2021; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Is the brain macroscopically linear? A system identification of resting state dynamics
Authors:
Erfan Nozari,
Maxwell A. Bertolero,
Jennifer Stiso,
Lorenzo Caciagli,
Eli J. Cornblath,
Xiaosong He,
Arun S. Mahadevan,
George J. Pappas,
Dani Smith Bassett
Abstract:
A central challenge in the computational modeling of neural dynamics is the trade-off between accuracy and simplicity. At the level of individual neurons, nonlinear dynamics are both experimentally established and essential for neuronal functioning. An implicit assumption has thus formed that an accurate computational model of whole-brain dynamics must also be highly nonlinear, whereas linear models may provide a first-order approximation. Here, we provide a rigorous and data-driven investigation of this hypothesis at the level of whole-brain blood-oxygen-level-dependent (BOLD) and macroscopic field potential dynamics by leveraging the theory of system identification. Using functional MRI (fMRI) and intracranial EEG (iEEG), we model the resting state activity of 700 subjects in the Human Connectome Project (HCP) and 122 subjects from the Restoring Active Memory (RAM) project using state-of-the-art linear and nonlinear model families. We assess relative model fit using predictive power, computational complexity, and the extent of residual dynamics unexplained by the model. Contrary to our expectations, linear auto-regressive models achieve the best measures across all three metrics, eliminating the trade-off between accuracy and simplicity. To understand and explain this linearity, we highlight four properties of macroscopic neurodynamics which can counteract or mask microscopic nonlinear dynamics: averaging over space, averaging over time, observation noise, and limited data samples. Whereas the latter two are technological limitations and can improve in the future, the former two are inherent to aggregated macroscopic brain activity. Our results, together with the unparalleled interpretability of linear models, can greatly facilitate our understanding of macroscopic neural dynamics and the principled design of model-based interventions for the treatment of neuropsychiatric disorders.
Submitted 11 August, 2021; v1 submitted 22 December, 2020;
originally announced December 2020.
-
Collision Avoidance Robotics Via Meta-Learning (CARML)
Authors:
Abhiram Iyer,
Aravind Mahadevan
Abstract:
This paper presents an approach to exploring a multi-objective reinforcement learning problem with Model-Agnostic Meta-Learning. The environment we used consists of a 2D vehicle equipped with a LIDAR sensor. The goal of the environment is to reach some pre-determined target location but also effectively avoid any obstacles it may find along its path. We also compare this approach against a baseline TD3 solution that attempts to solve the same problem.
Submitted 16 July, 2020;
originally announced July 2020.
-
Flow-driven branching in a frangible porous medium
Authors:
Nicholas J. Derr,
David C. Fronk,
Christoph A. Weber,
Amala Mahadevan,
Chris H. Rycroft,
L. Mahadevan
Abstract:
Channel formation and branching is widely seen in physical systems where movement of fluid through a porous structure causes the spatiotemporal evolution of the medium in response to the flow, in turn causing flow pathways to evolve. We provide a simple theoretical framework that embodies this feedback mechanism in a multi-phase model for flow through a fragile porous medium with a dynamic permeability. Numerical simulations of the model show the emergence of branched networks whose topology is determined by the geometry of external flow forcing. This allows us to delineate the conditions under which splitting and/or coalescing branched network formation is favored, with potential implications for both understanding and controlling branching in soft frangible media.
Submitted 6 July, 2020;
originally announced July 2020.
-
Sustenance of phytoplankton in the subpolar North Atlantic during the winter through patchiness
Authors:
Farid Karimpour,
Amit Tandon,
Amala Mahadevan
Abstract:
This study investigates the influence of two factors that change the mixed layer depth and can potentially contribute to the phytoplankton sustenance over winter: 1) variability of air-sea fluxes and 2) three-dimensional processes arising from strong fronts. To study the role of these factors, we perform several three-dimensional numerical simulations forced with air-sea fluxes at different temporal averaging frequencies as well as different spatial resolutions. Results show that in the winter, when the average mixed layer is much deeper than the euphotic layer and the days are short, phytoplankton production is relatively insensitive to the high-frequency variability in air-sea fluxes. The duration of upper ocean stratification due to high-frequency variability in air-sea fluxes is short and hence has a small impact on phytoplankton production. On the other hand, slumping of fronts creates patchy, stratified, shallow regions that persist considerably longer than stratification caused by changes in air-sea fluxes. Simulations show that before spring warming, the average MLD with fronts is about 700 m shallower than the average MLD without fronts. Therefore, fronts increase the residence time of phytoplankton in the euphotic layer and contribute to phytoplankton growth. Results show that before the spring warming, the depth-integrated phytoplankton concentration is about twice as large as phytoplankton concentration when there are no fronts. Hence, fronts are important for setting the MLD and sustaining phytoplankton in the winter. Model results also show that higher numerical resolution leads to stronger restratification, shallower mixed layers, greater variability in the MLD and higher production of phytoplankton.
Submitted 18 November, 2017;
originally announced November 2017.
-
Monami as an oscillatory hydrodynamic instability in a submerged sea grass bed
Authors:
Ravi Singh,
M. M. Bandi,
Amala Mahadevan,
Shreyas Mandre
Abstract:
The onset of monami (the synchronous waving of sea grass beds driven by a steady flow) is modeled as a linear instability of the flow. Unlike previous works, our model considers the drag exerted by the grass in establishing the steady flow profile, and in damping out perturbations to it. We find two distinct modes of instability, which we label Mode 1 and Mode 2. Mode 1 is closely related to Kelvin-Helmholtz instability modified by vegetation drag, whereas Mode 2 is unrelated to Kelvin-Helmholtz and arises from an interaction between the flow in the vegetated and unvegetated layers. The vegetation damping, according to our model, leads to a finite threshold flow for both these modes. Experimental observations for the onset and frequency of waving compare well with model predictions for the instability onset criteria and the imaginary part of the complex growth rate respectively, but experiments lie in a parameter regime where the two modes cannot be distinguished. The inclusion of vegetation drag differentiates our mechanism from the previous linear stability analyses of monami.
Submitted 17 September, 2015; v1 submitted 3 November, 2014;
originally announced November 2014.
-
Flow-induced channelization in a porous medium
Authors:
Amala Mahadevan,
L. Mahadevan
Abstract:
We propose a theory for erosional channelization induced by fluid flow in a saturated granular porous medium. When the local fluid flow-induced stress is larger than a critical threshold, grains are dislodged and carried away so that the porosity of the medium is altered by erosion. This in turn affects the local hydraulic conductivity and pressure in the medium and results in the growth and development of channels that preferentially conduct the flow. Our multiphase model involves a dynamical porosity field that evolves along with the volume fraction of the mobile and immobile grains in response to fluid flow that couples the spatiotemporal dynamics of the three phases. Numerical solutions of the resulting initial boundary value problem show how channels form in porous media and highlight how heterogeneity in the erosion threshold dictates the form of the patterns and thus the ability to control them.
Submitted 2 September, 2010;
originally announced September 2010.