-
Ultrafast Laser Ablation, Intrinsic Threshold, and Nanopatterning of Monolayer Molybdenum Disulfide
Authors:
Joel M. Solomon,
Sabeeh Irfan Ahmad,
Arpit Dave,
Li-Syuan Lu,
Fatemeh HadavandMirzaee,
Shih-Chu Lin,
Sih-Hua Chen,
Chih-Wei Luo,
Wen-Hao Chang,
Tsing-Hua Her
Abstract:
Laser direct writing is an attractive method for patterning 2D materials without contamination. Literature shows that the femtosecond ablation threshold of graphene across substrates varies by an order of magnitude. Some attribute it to the thermal coupling to the substrates, but it remains by and large an open question. For the first time the effect of substrates on femtosecond ablation of 2D materials is studied using MoS$_{2}$ as an example. We show unambiguously that femtosecond ablation of MoS$_{2}$ is an adiabatic process with negligible heat transfer to the substrates. The observed threshold variation is due to the etalon effect which was not identified before for the laser ablation of 2D materials. Subsequently, an intrinsic ablation threshold is proposed as a true threshold parameter for 2D materials. Additionally, we demonstrate for the first time femtosecond laser patterning of monolayer MoS$_{2}$ with sub-micron resolution and mm/s speed. Moreover, engineered substrates are shown to enhance the ablation efficiency, enabling patterning with low-power femtosecond oscillators. Finally, a zero-thickness approximation is introduced to predict the field enhancement with simple analytical expressions. Our work clarifies the role of substrates on ablation and firmly establishes femtosecond laser ablation as a viable route to pattern 2D materials.
Submitted 1 November, 2021;
originally announced November 2021.
-
Decentralized Control of Two Agents with Nested Accessible Information
Authors:
Aditya Dave,
Nishanth Venkatesh,
Andreas A. Malikopoulos
Abstract:
In this paper, we investigate a decentralized stochastic control problem with two agents, where a part of the memory of the second agent is also available to the first agent at each instant of time. We derive a structural form for optimal control strategies which allows us to restrict their domain to a set which does not grow in size with time. We also present a dynamic programming (DP) decomposition which can utilize our results to derive optimal strategies for arbitrarily long time horizons. Since obtaining optimal control strategies by solving this DP decomposition is computationally intensive, we present potential resolutions in the form of simplified strategies by imposing additional conditions on our model, and an approximation technique which can be used to implement our results with a bounded loss of optimality.
Submitted 9 March, 2022; v1 submitted 23 September, 2021;
originally announced September 2021.
-
On Decentralized Minimax Control with Nested Subsystems
Authors:
Aditya Dave,
Nishanth Venkatesh,
Andreas A. Malikopoulos
Abstract:
In this paper, we investigate a decentralized control problem with nested subsystems, which is a general model for one-directional communication amongst many subsystems. The noises in our dynamics are modelled as uncertain variables which take values in finite sets. The objective is to minimize a worst-case shared cost. We demonstrate how the prescription approach can simplify the information structure and derive a structural form for optimal control strategies. The structural form allows us to restrict attention to control strategies whose domains do not grow in size with time, and thus, this form can be utilized in systems with long time horizons. Finally, we present a dynamic program to derive the optimal control strategies and validate our results with a numerical example.
Submitted 20 March, 2022; v1 submitted 13 September, 2021;
originally announced September 2021.
-
Thermal Image Processing via Physics-Inspired Deep Networks
Authors:
Vishwanath Saragadam,
Akshat Dave,
Ashok Veeraraghavan,
Richard Baraniuk
Abstract:
We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observations are that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (that can be accurately modeled using physics) and a scene-specific radiance flux (that is well-represented using a deep network-based regularizer). DeepIR requires neither training data nor periodic ground-truth calibration with a known black body target, making it well suited for practical computer vision tasks. We demonstrate the power of going DeepIR by developing new denoising and super-resolution algorithms that exploit multiple images of the scene captured with camera jitter. Simulated and real data experiments demonstrate that DeepIR can perform high-quality non-uniformity correction with as few as three images, achieving a 10 dB PSNR improvement over competing approaches.
Submitted 25 August, 2021; v1 submitted 18 August, 2021;
originally announced August 2021.
-
Empirical Models for Multidimensional Regression of Fission Systems
Authors:
Akshay J. Dave,
Jiankai Yu,
Jarod Wilson,
Bren Phillips,
Kaichao Sun,
Benoit Forget
Abstract:
The development of next-generation autonomous control of fission systems, such as nuclear power plants, will require leveraging advancements in machine learning. For fission systems, accurate prediction of nuclear transport is important to quantify the safety margin and optimize performance. The state-of-the-art approach to this problem is costly Monte Carlo (MC) simulations to approximate solutions of the neutron transport equation. Such an approach is feasible for offline calculations, e.g., for design or licensing, but is precluded from use as a model-based controller. In this work, we explore the use of Artificial Neural Networks (ANN), Gradient Boosting Regression (GBR), Gaussian Process Regression (GPR) and Support Vector Regression (SVR) to generate empirical models. The empirical model can then be deployed, e.g., in a model predictive controller. Two fission systems are explored: the subcritical MIT Graphite Exponential Pile (MGEP), and the critical MIT Research Reactor (MITR).
Findings from this work establish guidelines for developing empirical models for multidimensional regression of neutron transport. An assessment of the accuracy and precision finds that the SVR, followed closely by ANN, performs the best. For both MGEP and MITR, the optimized SVR model exhibited a domain-averaged, test, mean absolute percentage error of 0.17 %. A spatial distribution of performance metrics indicates that physical regions of poor performance coincide with locations of largest neutron flux perturbation; this outcome is mitigated by ANN and SVR. Even at local maxima, ANN and SVR bias is within experimental uncertainty bounds. A comparison of the performance vs. training dataset size found that SVR is more data-efficient than ANN. Both ANN and SVR achieve a reduction of more than 7 orders of magnitude in evaluation time vs. an MC simulation.
Submitted 30 May, 2021;
originally announced May 2021.
-
Opening up Open-World Tracking
Authors:
Yang Liu,
Idil Esen Zulfikar,
Jonathon Luiten,
Achal Dave,
Deva Ramanan,
Bastian Leibe,
Aljoša Ošep,
Laura Leal-Taixé
Abstract:
Tracking and detecting any object, including ones never-seen-before during model training, is a crucial but elusive capability of autonomous systems. An autonomous agent that is blind to never-seen-before objects poses a safety hazard when operating in the real world, and yet this is how almost all current systems work. One of the main obstacles towards advancing tracking any object is that this task is notoriously difficult to evaluate. A benchmark that would allow us to perform an apples-to-apples comparison of existing efforts is a crucial first step towards advancing this important research field. This paper addresses this evaluation deficit and lays out the landscape and evaluation methodology for detecting and tracking both known and unknown objects in the open-world setting. We propose a new benchmark, TAO-OW: Tracking Any Object in an Open World, analyze existing efforts in multi-object tracking, and construct a baseline for this task while highlighting future challenges. We hope to open a new front in multi-object tracking research that will bring us a step closer to intelligent systems that can operate safely in the real world. https://openworldtracking.github.io/
Submitted 28 March, 2022; v1 submitted 22 April, 2021;
originally announced April 2021.
-
Media Cloud: Massive Open Source Collection of Global News on the Open Web
Authors:
Hal Roberts,
Rahul Bhargava,
Linas Valiukas,
Dennis Jen,
Momin M. Malik,
Cindy Bishop,
Emily Ndulue,
Aashka Dave,
Justin Clark,
Bruce Etling,
Rob Faris,
Anushka Shah,
Jasmin Rubinovitz,
Alexis Hope,
Catherine D'Ignazio,
Fernando Bermejo,
Yochai Benkler,
Ethan Zuckerman
Abstract:
We present the first full description of Media Cloud, an open source platform based on crawling hyperlink structure in operation for over 10 years, that for many uses will be the best way to collect data for studying the media ecosystem on the open web. We document the key choices behind what data Media Cloud collects and stores, how it processes and organizes these data, and its open API access as well as user-facing tools. We also highlight the strengths and limitations of the Media Cloud collection strategy compared to relevant alternatives. We give an overview of two sample datasets generated using Media Cloud and discuss how researchers can use the platform to create their own datasets.
Submitted 1 May, 2021; v1 submitted 8 April, 2021;
originally announced April 2021.
-
A Dynamic Program for a Team of Two Agents with Nested Information
Authors:
Aditya Dave,
Andreas A. Malikopoulos
Abstract:
In this paper, we investigate a sequential dynamic team problem consisting of two agents with a nested information structure. We use a combination of the person-by-person and prescription approach to derive structural results for optimal control strategies for the team. We then use these structural results to present a dynamic programming (DP) decomposition to derive the optimal control strategies for a finite time horizon. We show that our DP utilizes the nested information structure to simplify the computation of the optimal control laws for the team at the final time step.
Submitted 12 September, 2021; v1 submitted 18 March, 2021;
originally announced March 2021.
-
Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details
Authors:
Achal Dave,
Piotr Dollár,
Deva Ramanan,
Alexander Kirillov,
Ross Girshick
Abstract:
By design, average precision (AP) for object detection aims to treat all classes independently: AP is computed independently per category and averaged. On one hand, this is desirable as it treats all classes equally. On the other hand, it ignores cross-category confidence calibration, a key property in real-world use cases. Unfortunately, under important conditions (i.e., large vocabulary, high instance counts) the default implementation of AP is neither category independent, nor does it directly reward properly calibrated detectors. In fact, we show that on LVIS the default implementation produces a gameable metric, where a simple, unintuitive re-ranking policy can improve AP by a large margin. To address these limitations, we introduce two complementary metrics. First, we present a simple fix to the default AP implementation, ensuring that it is independent across categories as originally intended. We benchmark recent LVIS detection advances and find that many reported gains do not translate to improvements under our new evaluation, suggesting recent improvements may arise from difficult-to-interpret changes to cross-category rankings. Given the importance of reliably benchmarking cross-category rankings, we consider a pooled version of AP (AP-Pool) that rewards properly calibrated detectors by directly comparing cross-category rankings. Finally, we revisit classical approaches for calibration and find that explicitly calibrating detectors improves state-of-the-art on AP-Pool by 1.7 points.
Submitted 15 March, 2022; v1 submitted 1 February, 2021;
originally announced February 2021.
-
SEDAT: Security Enhanced Device Attestation with TPM2.0
Authors:
Avani Dave,
Monty Wiseman,
David Safford
Abstract:
Remote attestation is one of the ways to verify the state of an untrusted device. Earlier research has attempted remote verification of a device's state using hardware, software, or hybrid approaches. The majority of them have used an Attestation Key as a hardware root of trust, which does not detect hardware modification or counterfeit issues. In addition, they do not have a secure communication channel between verifier and prover, which makes them susceptible to modern security attacks. This paper presents SEDAT, a novel methodology for remote attestation of the device via a security-enhanced communication channel. SEDAT performs hardware, firmware, and software attestation. SEDAT enhances the communication protocol security between verifier and prover by using the Single Packet Authorization (SPA) technique, which provides replay and Denial of Service (DoS) protection. SEDAT provides a way for the verifier to get on-demand device integrity and authenticity status via a secure channel. It also enables the verifier to detect counterfeit hardware and changes in firmware and software code on the device. SEDAT validates the manufacturer's root CA certificate, platform certificate, endorsement certificate (EK), and attribute certificates to perform platform hardware attestation. SEDAT is the first known tool that represents firmware and Integrity Measurement Architecture (IMA) event logs in the Canonical Event Logs (CEL) format (recommended by Trusted Computing Group). SEDAT is the first implementation, to the best of our knowledge, that showcases end-to-end hardware, firmware, and software remote attestation using Trusted Platform Module (TPM2.0) which is resilient to DoS and replay attacks. SEDAT is the first remote verifier that is capable of retrieving a TPM2.0 quote from the prover and validating it after regeneration, using a software TPM2.0 quote check.
Submitted 15 January, 2021;
originally announced January 2021.
-
CARE: Lightweight Attack Resilient Secure Boot Architecture with Onboard Recovery for RISC-V based SOC
Authors:
Avani Dave,
Nilanjan Banerjee,
Chintan Patel
Abstract:
Recent technological advancements have proliferated the use of small embedded devices for collecting, processing, and transferring security-critical information. The Internet of Things (IoT) has enabled remote access and control of these network-connected devices. Consequently, an attacker can exploit security vulnerabilities and compromise these devices. In this context, secure boot becomes a useful security mechanism to verify the integrity and authenticity of the software state of the devices. However, the current secure boot schemes focus on detecting the presence of potential malware on the device but not on disinfecting and restoring the software to a benign state. This manuscript presents CARE, the first secure boot framework that provides detection, resilience, and an onboard recovery mechanism for compromised devices. The framework uses a prototype hybrid Code Authentication and Resilience Engine (CARE) to verify the software state and restore it to a benign state. It uses Physical Memory Protection (PMP) and other security-enhancing techniques of the RISC-V processor to provide resilience from modern attacks. The state-of-the-art comparison and performance analysis results indicate that the proposed secure boot framework provides a promising resilience and recovery mechanism with a small 8 % performance and resource overhead.
Submitted 15 January, 2021;
originally announced January 2021.
-
SRACARE: Secure Remote Attestation with Code Authentication and Resilience Engine
Authors:
Avani Dave,
Nilanjan Banerjee,
Chintan Patel
Abstract:
Recent technological advancements have enabled the proliferation of small embedded and IoT devices for collecting, processing, and transferring security-critical information and user data. This exponential growth in use has acted as a catalyst in the recent growth of sophisticated attacks such as replay, man-in-the-middle, and malicious code modification to slink, leak, tweak or exploit the security-critical information in malevolent activities. Therefore, secure communication and software state assurance (at run-time and boot-time) of the device have emerged as open security problems. Furthermore, these devices need to have an appropriate recovery mechanism to bring them back to a known-good operational state. Previous researchers have demonstrated independent methods for attack detection and safeguarding. However, the majority of them do not provide onboard system recovery and secure communication techniques. To bridge this gap, this manuscript proposes SRACARE, a framework that uses a custom lightweight, secure communication protocol that performs remote/local attestation and secure boot with an onboard resilience recovery mechanism to protect the devices from the above-mentioned attacks. The prototype employs an efficient lightweight, low-power 32-bit RISC-V processor, secure communication protocol, code authentication, and resilience engine running on the Artix 7 Field Programmable Gate Array (FPGA) board. This work presents the performance evaluation and state-of-the-art comparison results, which show promising resilience to attacks and demonstrate the novel protection mechanism with onboard recovery. The framework achieves these with only 8 % performance overhead and a very small increase in hardware-software footprint.
Submitted 15 January, 2021;
originally announced January 2021.
-
Detecting Invisible People
Authors:
Tarasha Khurana,
Achal Dave,
Deva Ramanan
Abstract:
Monocular object detection and tracking have improved drastically in recent years, but rely on a key assumption: that objects are visible to the camera. Many offline tracking approaches reason about occluded objects post-hoc, by linking together tracklets after the object re-appears, making use of reidentification (ReID). However, online tracking in embodied robotic agents (such as a self-driving vehicle) fundamentally requires object permanence, which is the ability to reason about occluded objects before they re-appear. In this work, we re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects, focusing on the illustrative case of people. We demonstrate that current detection and tracking systems perform dramatically worse on this task. We introduce two key innovations to recover much of this performance drop. First, we treat occluded object detection in temporal sequences as a short-term forecasting challenge, bringing to bear tools from dynamic sequence prediction. Second, we build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks. To our knowledge, ours is the first work to demonstrate the effectiveness of monocular depth estimation for the task of tracking and detecting occluded objects. Our approach strongly improves by 11.4% over the baseline in ablations and by 5.0% over the state-of-the-art in F1 score.
Submitted 15 December, 2020;
originally announced December 2020.
-
AutoMat: Accelerated Computational Electrochemical systems Discovery
Authors:
Emil Annevelink,
Rachel Kurchin,
Eric Muckley,
Lance Kavalsky,
Vinay I. Hegde,
Valentin Sulzer,
Shang Zhu,
Jiankun Pu,
David Farina,
Matthew Johnson,
Dhairya Gandhi,
Adarsh Dave,
Hongyi Lin,
Alan Edelman,
Bharath Ramsundar,
James Saal,
Christopher Rackauckas,
Viral Shah,
Bryce Meredig,
Venkatasubramanian Viswanathan
Abstract:
Large-scale electrification is vital to addressing the climate crisis, but several scientific and technological challenges remain to fully electrify both the chemical industry and transportation. In both of these areas, new electrochemical materials will be critical, but their development currently relies heavily on human-time-intensive experimental trial and error and computationally expensive first-principles, meso-scale and continuum simulations. We present an automated workflow, AutoMat, that accelerates these computational steps by introducing both automated input generation and management of simulations across scales from first principles to continuum device modeling. Furthermore, we show how to seamlessly integrate multi-fidelity predictions such as machine learning surrogates or automated robotic experiments "in-the-loop". The automated framework is implemented with design space search techniques to dramatically accelerate the overall materials discovery pipeline by implicitly learning design features that optimize device performance across several metrics. We discuss the benefits of AutoMat using examples in electrocatalysis and energy storage and highlight lessons learned.
Submitted 13 May, 2022; v1 submitted 3 November, 2020;
originally announced November 2020.
-
Deep Surrogate Models for Multi-dimensional Regression of Reactor Power
Authors:
Akshay J. Dave,
Jarod Wilson,
Kaichao Sun
Abstract:
There is renewed interest in developing small modular reactors and micro-reactors. Innovation is necessary in both construction and operation methods of these reactors to be financially attractive. For operation, an area of interest is the development of fully autonomous reactor control. Significant efforts are necessary to demonstrate an autonomous control framework for a nuclear system, while adhering to established safety criteria. Our group has proposed and received support for demonstration of an autonomous framework on a subcritical system: the MIT Graphite Exponential Pile. In order to have a fast response (on the order of milliseconds), we must extract specific capabilities of general-purpose system codes to a surrogate model. Thus, we have adopted current state-of-the-art neural network libraries to build surrogate models.
This work focuses on establishing the capability of neural networks to provide an accurate and precise multi-dimensional regression of a nuclear reactor's power distribution. We assess using a neural network surrogate against a previously validated model: an MCNP5 model of the MIT reactor. The results indicate that neural networks are an appropriate choice for surrogate models to implement in an autonomous reactor control framework. The MAPE across all test datasets was < 1.16 % with a corresponding standard deviation of < 0.77 %. The error is low, considering that the node-wise fission power can vary from 7 kW to 30 kW across the core.
Submitted 13 July, 2020; v1 submitted 10 July, 2020;
originally announced July 2020.
-
Measuring Mars Atmospheric Winds From Orbit
Authors:
Scott Guzewich,
J. B. Abshire,
M. M. Baker,
J. M. Battalio,
T. Bertrand,
A. J. Brown,
A. Colaprete,
A. M. Cook,
D. R. Cremons,
M. M. Crismani,
A. I. Dave,
M. Day,
M. -C. Desjean,
M. Elrod,
L. K. Fenton,
J. Fisher,
L. L. Gordley,
P. O. Hayne,
N. G. Heavens,
J. L. Hollingsworth,
D. Jha,
V. Jha,
M. A. Kahre,
A. SJ. Khayat,
A. M. Kling,
S. R. Lewis,
B. T. Marshall, et al. (16 additional authors not shown)
Abstract:
Wind is the process that connects Mars' climate system. Measurements of Mars atmospheric winds from orbit would dramatically advance our understanding of Mars and help prepare for human exploration of the Red Planet. Multiple instrument candidates are in development and will be ready for flight in the next decade. We urge the Decadal Survey to make these measurements a priority for 2023-2032.
Submitted 10 July, 2020;
originally announced July 2020.
-
Measuring Robustness to Natural Distribution Shifts in Image Classification
Authors:
Rohan Taori,
Achal Dave,
Vaishaal Shankar,
Nicholas Carlini,
Benjamin Recht,
Ludwig Schmidt
Abstract:
We study how robust current ImageNet models are to distribution shifts arising from natural variations in datasets. Most research on robustness focuses on synthetic image perturbations (noise, simulated weather artifacts, adversarial examples, etc.), which leaves open how robustness on synthetic distribution shift relates to distribution shift arising in real data. Informed by an evaluation of 204 ImageNet models in 213 different test conditions, we find that there is often little to no transfer of robustness from current synthetic to natural distribution shift. Moreover, most current techniques provide no robustness to the natural distribution shifts in our testbed. The main exception is training on larger and more diverse datasets, which in multiple cases increases robustness, but is still far from closing the performance gaps. Our results indicate that distribution shifts arising in real data are currently an open research problem. We provide our testbed and data as a resource for future work at https://modestyachts.github.io/imagenet-testbed/ .
Submitted 14 September, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
TAO: A Large-Scale Benchmark for Tracking Any Object
Authors:
Achal Dave,
Tarasha Khurana,
Pavel Tokmakov,
Cordelia Schmid,
Deva Ramanan
Abstract:
For many years, multi-object tracking benchmarks have focused on a handful of categories. Motivated primarily by surveillance and self-driving applications, these datasets provide tracks for people, vehicles, and animals, ignoring the vast majority of objects in the world. By contrast, in the related field of object detection, the introduction of large-scale, diverse datasets (e.g., COCO) has fostered significant progress in developing highly robust solutions. To bridge this gap, we introduce a similarly diverse dataset for Tracking Any Object (TAO). It consists of 2,907 high resolution videos, captured in diverse environments, which are half a minute long on average. Importantly, we adopt a bottom-up approach for discovering a large vocabulary of 833 categories, an order of magnitude more than prior tracking benchmarks. To this end, we ask annotators to label objects that move at any point in the video, and give names to them post factum. Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets. To ensure scalability of annotation, we employ a federated approach that focuses manual effort on labeling tracks for those relevant objects in a video (e.g., those that move). We perform an extensive evaluation of state-of-the-art trackers and make a number of important discoveries regarding large-vocabulary tracking in an open-world. In particular, we show that existing single- and multi-object trackers struggle when applied to this scenario in the wild, and that detection-based, multi-object trackers are in fact competitive with user-initialized ones. We hope that our dataset and analysis will boost further progress in the tracking community.
Submitted 20 May, 2020;
originally announced May 2020.
-
Inference of Gas-liquid Flowrate using Neural Networks
Authors:
Akshay J. Dave,
Annalisa Manera
Abstract:
The metering of gas-liquid flows is difficult due to the non-linear relationship between flow regimes and fluid properties, flow orientation, channel geometry, etc. In fact, a majority of commercial multiphase flow meters have a low accuracy, limited range of operation or require a physical separation of the phases. We introduce the inference of gas-liquid flowrates using a neural network model that is trained by wire-mesh sensor (WMS) experimental data. The WMS is an experimental tool that records high-resolution high-frequency 3D void fraction distributions in gas-liquid flows. The experimental database utilized spans over two orders of superficial velocity magnitude and multiple flow regimes for a vertical small-diameter pipe. Our findings indicate that a single network can provide accurate and precise inference with below a 7.5% MAP error across all flow regimes. The best performing networks have a combination of a 3D-Convolution head, and an LSTM tail. The finding indicates that the spatiotemporal features observed in gas-liquid flows can be systematically decomposed and used for inferring phase-wise flowrate. Our method does not involve any complex pre-processing of the void fraction matrices, resulting in an evaluation time that is negligible when contrasted to the input time-span. The efficiency of the model manifests in a response time two orders of magnitude lower than the current state-of-the-art.
Submitted 25 May, 2020; v1 submitted 15 March, 2020;
originally announced March 2020.
-
Social Media and Misleading Information in a Democracy: A Mechanism Design Approach
Authors:
Aditya Dave,
Ioannis Vasileios Chremos,
Andreas A. Malikopoulos
Abstract:
In this paper, we present a resource allocation mechanism for the problem of incentivizing filtering among a finite number of strategic social media platforms. We consider the presence of a strategic government and private knowledge of how misinformation affects the users of the social media platforms. Our proposed mechanism incentivizes social media platforms to filter misleading information efficiently, and thus indirectly prevents the spread of fake news. In particular, we design an economically inspired mechanism that strongly implements all generalized Nash equilibria for efficient filtering of misleading information in the induced game. We show that our mechanism is individually rational, budget balanced, while it has at least one equilibrium. Finally, we show that for quasi-concave utilities and constraints, our mechanism admits a generalized Nash equilibrium and implements a Pareto efficient solution.
Submitted 7 January, 2021; v1 submitted 12 March, 2020;
originally announced March 2020.
-
Autonomous discovery of battery electrolytes with robotic experimentation and machine-learning
Authors:
Adarsh Dave,
Jared Mitchell,
Kirthevasan Kandasamy,
Sven Burke,
Biswajit Paria,
Barnabas Poczos,
Jay Whitacre,
Venkatasubramanian Viswanathan
Abstract:
Innovations in batteries take years to formulate and commercialize, requiring extensive experimentation during the design and optimization phases. We approached the design and selection of a battery electrolyte through a black-box optimization algorithm directly integrated into a robotic test-stand. We report here the discovery of a novel battery electrolyte by this experiment completely guided by the machine-learning software without human intervention. Motivated by the recent trend toward super-concentrated aqueous electrolytes for high-performance batteries, we utilize Dragonfly - a Bayesian machine-learning software package - to search mixtures of commonly used lithium and sodium salts for super-concentrated aqueous electrolytes with wide electrochemical stability windows. Dragonfly autonomously managed the robotic test-stand, recommending electrolyte designs to test and receiving experimental feedback in real time. In 40 hours of continuous experimentation over a four-dimensional design space with millions of potential candidates, Dragonfly discovered a novel, mixed-anion aqueous sodium electrolyte with a wider electrochemical stability window than state-of-the-art sodium electrolyte. A human-guided design process may have missed this optimal electrolyte. This result demonstrates the possibility of integrating robotics with machine-learning to rapidly and autonomously discover novel battery materials.
Submitted 22 October, 2019;
originally announced January 2020.
-
Learning to Track Any Object
Authors:
Achal Dave,
Pavel Tokmakov,
Cordelia Schmid,
Deva Ramanan
Abstract:
Object tracking can be formulated as "finding the right object in a video". We observe that recent approaches for class-agnostic tracking tend to focus on the "finding" part, but largely overlook the "object" part of the task, essentially doing a template matching over a frame in a sliding-window. In contrast, class-specific trackers heavily rely on object priors in the form of category-specific object detectors. In this work, we re-purpose category-specific appearance models into a generic objectness prior. Our approach converts a category-specific object detector into a category-agnostic, object-specific detector (i.e. a tracker) efficiently, on the fly. Moreover, at test time the same network can be applied to detection and tracking, resulting in a unified approach for the two tasks. We achieve state-of-the-art results on two recent large-scale tracking benchmarks (OxUvA and GOT, using external data). By simply adding a mask prediction branch, our approach is able to produce instance segmentation masks for the tracked object. Despite only using box-level information on the first frame, our method outputs high-quality masks, as evaluated on the DAVIS '17 video object segmentation benchmark.
Submitted 25 October, 2019;
originally announced October 2019.
-
The Prescription Approach to Decentralized Stochastic Control with Word-of-Mouth Communication
Authors:
Aditya Dave,
Andreas A. Malikopoulos
Abstract:
In this paper we analyze a network of agents that communicates through word of mouth. In a word-of-mouth communication system, every agent communicates with its neighbors with delays in communication. This is a non-classical information structure where the topological and temporal restrictions in communication mean that information propagates slowly through the network. We present the prescription approach to derive structural results for such problems. The structural results lead to optimal control strategies with time invariant domain-sizes. We show that these domains are smaller in size than the control strategies derived using the common information approach.
Submitted 24 November, 2021; v1 submitted 28 July, 2019;
originally announced July 2019.
-
Benchmarking conductivity predictions of the Advanced Electrolyte Model (AEM) for aqueous systems
Authors:
Adarsh Dave,
Kevin L. Gering,
Jared M. Mitchell,
Jay Whitacre,
Venkatasubramanian Viswanathan
Abstract:
High-concentration aqueous electrolytes have shown promise as candidates for a safer, lower-cost battery system. Ionic conductivity is a key property required in high performing electrolytes; the Advanced Electrolyte Model (AEM) has previously shown great accuracy in predicting ionic conductivity in highly-concentrated non-aqueous electrolytes. This work provides extensive experimental data for mixed and highly concentrated aqueous electrolyte systems, rapidly generated via a robotic electrolyte testing apparatus. These data demonstrate exceptional accuracy from AEM in predicting conductivity in aqueous systems, with the accuracy being maintained even in highly-concentrated and mixed-salt regimes.
Submitted 14 June, 2019;
originally announced June 2019.
-
Do Image Classifiers Generalize Across Time?
Authors:
Vaishaal Shankar,
Achal Dave,
Rebecca Roelofs,
Deva Ramanan,
Benjamin Recht,
Ludwig Schmidt
Abstract:
We study the robustness of image classifiers to temporal perturbations derived from videos. As part of this study, we construct two datasets, ImageNet-Vid-Robust and YTBB-Robust, containing a total of 57,897 images grouped into 3,139 sets of perceptually similar images. Our datasets were derived from ImageNet-Vid and Youtube-BB respectively and thoroughly re-annotated by human experts for image similarity. We evaluate a diverse array of classifiers pre-trained on ImageNet and show a median classification accuracy drop of 16 and 10 on our two datasets. Additionally, we evaluate three detection models and show that natural perturbations induce both classification as well as localization errors, leading to a median drop in detection mAP of 14 points. Our analysis demonstrates that perturbations occurring naturally in videos pose a substantial and realistic challenge to deploying convolutional neural networks in environments that require both reliable and low-latency predictions.
Submitted 9 December, 2019; v1 submitted 5 June, 2019;
originally announced June 2019.
-
Towards Segmenting Anything That Moves
Authors:
Achal Dave,
Pavel Tokmakov,
Deva Ramanan
Abstract:
Detecting and segmenting individual objects, regardless of their category, is crucial for many applications such as action detection or robotic interaction. While this problem has been well-studied under the classic formulation of spatio-temporal grouping, state-of-the-art approaches do not make use of learning-based methods. To bridge this gap, we propose a simple learning-based approach for spatio-temporal grouping. Our approach leverages motion cues from optical flow as a bottom-up signal for separating objects from each other. Motion cues are then combined with appearance cues that provide a generic objectness prior for capturing the full extent of objects. We show that our approach outperforms all prior work on the benchmark FBMS dataset. One potential worry with learning-based methods is that they might overfit to the particular type of objects that they have been trained on. To address this concern, we propose two new benchmarks for generic, moving object detection, and show that our model matches top-down methods on common categories, while significantly out-performing both top-down and bottom-up methods on never-before-seen categories.
Submitted 31 March, 2020; v1 submitted 10 February, 2019;
originally announced February 2019.
-
Structural Results for Decentralized Stochastic Control with a Word-of-Mouth Communication
Authors:
Aditya Dave,
Andreas A. Malikopoulos
Abstract:
In this paper, we analyze a network of agents that communicate through "word of mouth," in which every agent communicates only with its neighbors. We introduce the prescription approach, present some of its properties and show that it leads to a new information state. We also state preliminary structural results for optimal control strategies in systems that evolve using word-of-mouth communication. The proposed approach can be generalized to analyze several decentralized systems.
Submitted 14 March, 2020; v1 submitted 23 September, 2018;
originally announced September 2018.
-
Solving Inverse Computational Imaging Problems using Deep Pixel-level Prior
Authors:
Akshat Dave,
Anil Kumar Vadathya,
Ramana Subramanyam,
Rahul Baburajan,
Kaushik Mitra
Abstract:
Signal reconstruction is a challenging aspect of computational imaging as it often involves solving ill-posed inverse problems. Recently, deep feed-forward neural networks have led to state-of-the-art results in solving various inverse imaging problems. However, being task specific, these networks have to be learned for each inverse problem. On the other hand, a more flexible approach would be to learn a deep generative model once and then use it as a signal prior for solving various inverse problems. We show that among the various state of the art deep generative models, autoregressive models are especially suitable for our purpose for the following reasons. First, they explicitly model the pixel level dependencies and hence are capable of reconstructing low-level details such as texture patterns and edges better. Second, they provide an explicit expression for the image prior which can then be used for MAP based inference along with the forward model. Third, they can model long range dependencies in images which make them ideal for handling global multiplexing as encountered in various compressive imaging systems. We demonstrate the efficacy of our proposed approach in solving three computational imaging problems: Single Pixel Camera (SPC), LiSens and FlatCam. For both real and simulated cases, we obtain better reconstructions than the state-of-the-art methods in terms of perceptual and quantitative metrics.
Submitted 23 April, 2018; v1 submitted 27 February, 2018;
originally announced February 2018.
-
IITMSAT Communications System : A LeanSat Design Approach
Authors:
Akshay Gulati,
Shubham Chavan,
Joseph Samuel,
Sampoornam Srinivasan,
Pradeep Shekhar,
Akshat Dave,
Aditya Sant,
Sourbh Bhadane,
Mayug Maniparambil,
Vishnu Prasad Sivasankarakurup,
Dhanalakshmi Durairaj,
David Koilpillai,
Harishankar Ramachandran
Abstract:
IITMSAT is a student-built nano satellite mission of Indian Institute of Technology Madras, Chennai, India. The objective is to study the precipitation of high energy electrons and protons from Van-Allen radiation belts to lower altitude of 600-900 km due to resonance interaction with low frequency EM waves. The unique communications system design of IITMSAT evolves from the challenging downlink data requirement of 1 MB per day in the UHF band posed by the mission and the satellite's payload, SPEED (Space based Proton and Electron Energy Detector). To ensure continuous downlink data stream in the short Low Earth Orbit passes, a robust physical layer protocol was designed to counter time-varying aspects of a Space-Earth telecom link. For the on-board communications system, two types of design alternatives exist for each module. The first option is a custom design wherein a module is developed from scratch using discrete components. The other option is an integrated design wherein an electronics COTS module can be directly plugged into the subsystem. This module is evaluated by carrying out vibration and thermal tests. If an integrated module is low-cost and meets the design requirements, it is preferred over a custom design. In order to carry out performance tests under simulated link conditions, an RF attenuation test setup was designed that can work at extreme temperatures. Burn-In tests for 72 hours at ambient and extreme temperatures were carried out. Integrated tests indicate all IITMSAT design requirements have been met. Hence a robust communications system has been validated. The time taken for development of on-board telecom and GS was less than a year and was achieved at a low cost which agrees with a LeanSat approach.
Submitted 3 November, 2017;
originally announced November 2017.
-
Predictive-Corrective Networks for Action Detection
Authors:
Achal Dave,
Olga Russakovsky,
Deva Ramanan
Abstract:
While deep feature learning has revolutionized techniques for static-image understanding, the same does not quite hold for video processing. Architectures and optimization techniques used for video are largely based off those for static images, potentially underutilizing rich video information. In this work, we rethink both the underlying network architecture and the stochastic learning paradigm for temporal data. To do so, we draw inspiration from classic theory on linear dynamic systems for modeling time series. By extending such models to include nonlinear mappings, we derive a series of novel recurrent neural networks that sequentially make top-down predictions about the future and then correct those predictions with bottom-up observations. Predictive-corrective networks have a number of desirable properties: (1) they can adaptively focus computation on "surprising" frames where predictions require large corrections, (2) they simplify learning in that only "residual-like" corrective terms need to be learned over time and (3) they naturally decorrelate an input data stream in a hierarchical fashion, producing a more reliable signal for learning at each layer of a network. We provide an extensive analysis of our lightweight and interpretable framework, and demonstrate that our model is competitive with the two-stream network on three challenging datasets without the need for computationally expensive optical flow.
Submitted 12 December, 2017; v1 submitted 12 April, 2017;
originally announced April 2017.
-
Compressive Image Recovery Using Recurrent Generative Model
Authors:
Akshat Dave,
Anil Kumar Vadathya,
Kaushik Mitra
Abstract:
Reconstruction of signals from compressively sensed measurements is an ill-posed problem. In this paper, we leverage the recurrent generative model, RIDE, as an image prior for compressive image reconstruction. Recurrent networks can model long-range dependencies in images and hence are suitable to handle global multiplexing in reconstruction from compressive imaging. We perform MAP inference with RIDE using back-propagation to the inputs and projected gradient method. We propose an entropy thresholding based approach for preserving texture in images well. Our approach shows superior reconstructions compared to recent global reconstruction approaches like D-AMP and TVAL3 on both simulated and real data.
Submitted 3 May, 2017; v1 submitted 13 December, 2016;
originally announced December 2016.
-
GPGPU Based Parallelized Client-Server Framework for Providing High Performance Computation Support
Authors:
Poorna Banerjee,
Amit Dave
Abstract:
Parallel data processing has become indispensable for processing applications involving huge data sets. This brings into focus the Graphics Processing Units (GPUs), which emphasize many-core computing. With the advent of General Purpose GPUs (GPGPU), applications not directly associated with graphics operations can also harness the computation capabilities of GPUs. Hence, it would be beneficial if the computing capabilities of a given GPGPU could be task optimized and made available. This paper describes a client-server framework in which users can choose a processing task and submit large data-sets for processing to a remote GPGPU and receive the results back, using well defined interfaces. The framework provides extensibility in terms of the number and type of tasks that the client can choose or submit for processing at the remote GPGPU server machine, with complete transparency to the underlying hardware and operating systems. Parallelization of user-submitted tasks on the GPGPU has been achieved using NVIDIA Compute Unified Device Architecture (CUDA).
Submitted 21 May, 2015;
originally announced May 2015.
-
The Photon Underproduction Crisis
Authors:
Juna A. Kollmeier,
David H. Weinberg,
Benjamin D. Oppenheimer,
Francesco Haardt,
Neal Katz,
Romeel A. Davé,
Mark Fardal,
Piero Madau,
Charles Danforth,
Amanda B. Ford,
Molly S. Peeples,
Joseph McEwen
Abstract:
We examine the statistics of the low-redshift Lyman-alpha forest from smoothed particle hydrodynamic simulations in light of recent improvements in the estimated evolution of the cosmic ultraviolet background (UVB) and recent observations from the Cosmic Origins Spectrograph (COS). We find that the value of the metagalactic photoionization rate required by our simulations to match the observed properties of the low-redshift Lyman-alpha forest is a factor of 5 larger than the value predicted by state-of-the-art models for the evolution of this quantity. This mismatch results in the mean flux decrement of the Lyman-alpha forest being underpredicted by at least a factor of 2 (a 10-sigma discrepancy with observations) and a column density distribution of Lyman-alpha forest absorbers systematically and significantly elevated compared to observations over nearly two decades in column density. We examine potential resolutions to this mismatch and find that either conventional sources of ionizing photons (galaxies and quasars) must be significantly elevated relative to current observational estimates or our theoretical understanding of the low-redshift universe is in need of substantial revision.
Submitted 10 April, 2014;
originally announced April 2014.
-
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
Authors:
Reynold S. Xin,
Daniel Crankshaw,
Ankur Dave,
Joseph E. Gonzalez,
Michael J. Franklin,
Ion Stoica
Abstract:
From social networks to language modeling, the growing scale and importance of graph data has driven the development of numerous new graph-parallel systems (e.g., Pregel, GraphLab). By restricting the computation that can be expressed and introducing new techniques to partition and distribute the graph, these systems can efficiently execute iterative graph algorithms orders of magnitude faster than more general data-parallel systems. However, the same restrictions that enable the performance gains also make it difficult to express many of the important stages in a typical graph-analytics pipeline: constructing the graph, modifying its structure, or expressing computation that spans multiple graphs. As a consequence, existing graph analytics pipelines compose graph-parallel and data-parallel systems using external storage systems, leading to extensive data movement and complicated programming model.
To address these challenges we introduce GraphX, a distributed graph computation framework that unifies graph-parallel and data-parallel computation. GraphX provides a small, core set of graph-parallel operators expressive enough to implement the Pregel and PowerGraph abstractions, yet simple enough to be cast in relational algebra. GraphX uses a collection of query optimization techniques such as automatic join rewrites to efficiently implement these graph-parallel operators. We evaluate GraphX on real-world graphs and workloads and demonstrate that GraphX achieves comparable performance as specialized graph computation systems, while outperforming them in end-to-end graph pipelines. Moreover, GraphX achieves a balance between expressiveness, performance, and ease of use.
Submitted 11 February, 2014;
originally announced February 2014.
-
The Nature and Origin of Low-Redshift O VI Absorbers
Authors:
Benjamin D. Oppenheimer,
Romeel A. Davé
Abstract:
The O VI ion observed in quasar absorption line spectra is the most accessible tracer of the cosmic metal distribution in the low redshift (z<0.5) intergalactic medium (IGM). We explore the nature and origin of O VI absorbers using cosmological hydrodynamic simulations including galactic outflows. We consider the effects of ionization background variations, non-equilibrium ionization and cooling, uniform metallicity, and small-scale (sub-resolution) turbulence. Our main results are 1) IGM O VI is predominantly photo-ionized with T= 10^(4.2+/-0.2) K. A key reason for this is that O VI absorbers preferentially trace over-enriched regions of the IGM at a given density, which enhances metal-line cooling such that absorbers can cool within a Hubble time. As such, O VI is not a good tracer of the WHIM. 2) The predicted O VI properties fit observables only if sub-resolution turbulence is added. The required turbulence increases with O VI absorber strength such that stronger absorbers arise from more recent outflows with turbulence dissipating on the order of a Hubble time. The amount of turbulence is consistent with other examples of turbulence observed in the IGM and galactic halos. 3) Metals traced by O VI and H I do not trace exactly the same baryons, but reside in the same large-scale structure. Observed alignment statistics are reproduced in our simulations. 4) Photo-ionized O VI traces gas in a variety of environments, and is not directly associated with the nearest galaxy, though is typically nearest to ~0.1L* galaxies. Weaker O VI components trace some of the oldest cosmic metals. 5) Very strong absorbers are more likely to be collisionally ionized, tracing more recent enrichment (<2 Gyr) within or near galactic halos.
Submitted 27 February, 2009; v1 submitted 17 June, 2008;
originally announced June 2008.