
Showing 1–35 of 35 results for author: Burns, R

  1. arXiv:2402.14118  [pdf, other]

    cs.DS cs.AI

    Masked Matrix Multiplication for Emergent Sparsity

    Authors: Brian Wheatman, Meghana Madhyastha, Randal Burns

    Abstract: Artificial intelligence workloads, especially transformer models, exhibit emergent sparsity in which computations perform selective sparse access to dense data. The workloads are inefficient on hardware designed for dense computations and do not map well onto sparse data representations. We build a vectorized and parallel matrix-multiplication system A X B = C that eliminates unnecessary computati…

    Submitted 21 February, 2024; originally announced February 2024.

  2. arXiv:2402.04403  [pdf, other]

    cs.DC cs.LG

    Edge-Parallel Graph Encoder Embedding

    Authors: Ariel Lubonja, Cencheng Shen, Carey Priebe, Randal Burns

    Abstract: New algorithms for embedding graphs have reduced the asymptotic complexity of finding low-dimensional representations. One-Hot Graph Encoder Embedding (GEE) uses a single, linear pass over edges and produces an embedding that converges asymptotically to the spectral embedding. The scaling and performance benefits of this approach have been limited by a serial implementation in an interpreted langu…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 4 pages, 4 figures

  3. arXiv:2309.12576  [pdf, other]

    cs.AI cs.DC

    Understanding Patterns of Deep Learning Model Evolution in Network Architecture Search

    Authors: Robert Underwood, Meghana Madhyastha, Randal Burns, Bogdan Nicolae

    Abstract: Network Architecture Search and specifically Regularized Evolution is a common way to refine the structure of a deep learning model. However, little is known about how models empirically evolve over time, which has implications for designing caching policies, refining the search algorithm for particular applications, and other important use cases. In this work, we algorithmically analyze and q…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: 11 pages, 4 figures

    ACM Class: I.2.6; C.4

  4. arXiv:2305.05055  [pdf, other]

    cs.DS cs.DC cs.PF

    CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers

    Authors: Brian Wheatman, Randal Burns, Aydın Buluç, Helen Xu

    Abstract: This paper introduces the batch-parallel Compressed Packed Memory Array (CPMA), a compressed, dynamic, ordered set data structure based on the Packed Memory Array (PMA). Traditionally, batch-parallel sets are built on pointer-based data structures such as trees because pointer-based structures enable fast parallel unions via pointer manipulation. When compared with cache-optimized trees, PMAs were…

    Submitted 18 February, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

  5. arXiv:2201.07372  [pdf, other]

    cs.LG cs.AI

    Prospective Learning: Principled Extrapolation to the Future

    Authors: Ashwin De Silva, Rahul Ramesh, Lyle Ungar, Marshall Hussain Shuler, Noah J. Cowan, Michael Platt, Chen Li, Leyla Isik, Seung-Eon Roh, Adam Charles, Archana Venkataraman, Brian Caffo, Javier J. How, Justus M Kebschull, John W. Krakauer, Maxim Bichuch, Kaleab Alemayehu Kinfu, Eva Yezerets, Dinesh Jayaraman, Jong M. Shin, Soledad Villar, Ian Phillips, Carey E. Priebe, Thomas Hartung, Michael I. Miller, et al. (18 additional authors not shown)

    Abstract: Learning is a process which can update decision rules, based on past experience, such that future performance improves. Traditionally, machine learning is often evaluated under the assumption that the future will be identical to the past in distribution or change adversarially. But these assumptions can be either too optimistic or pessimistic for many problems in the real world. Real world scenari…

    Submitted 13 July, 2023; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Accepted at the 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023

  6. arXiv:2011.05383  [pdf, other]

    cs.DC cs.LG

    PACSET (Packed Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment

    Authors: Meghana Madhyastha, Kunal Lillaney, James Browne, Joshua Vogelstein, Randal Burns

    Abstract: We present methods to serialize and deserialize tree ensembles that optimize inference latency when models are not already loaded into memory. This arises whenever models are larger than memory, but also systematically when models are deployed on low-resource devices, such as in the Internet of Things, or run as Web micro-services where resources are allocated on demand. Our packed serialized tree…

    Submitted 10 November, 2020; originally announced November 2020.

    ACM Class: I.5.5

  7. arXiv:2002.02017  [pdf, other]

    cs.DB

    Observations on Porting In-memory KV stores to Persistent Memory

    Authors: Brian Choi, Parv Saxena, Ryan Huang, Randal Burns

    Abstract: Systems that require high throughput and fault tolerance, such as key-value stores and databases, are looking to persistent memory to combine the performance of in-memory systems with the data-consistent fault-tolerance of nonvolatile stores. Persistent memory devices provide fast byte-addressable access to non-volatile memory. We analyze the design space when integrating persistent memory into in…

    Submitted 5 February, 2020; originally announced February 2020.

  8. arXiv:1908.11780  [pdf, other]

    cs.DC

    Towards Marrying Files to Objects

    Authors: Kunal Lillaney, Vasily Tarasov, David Pease, Randal Burns

    Abstract: To deal with the constant growth of unstructured data, vendors have deployed scalable, resilient, and cost effective object-based storage systems built on RESTful web services. However, many applications rely on richer file-system APIs and semantics, and cannot benefit from object stores. This leads to storage sprawl, as object stores are deployed alongside file systems and data is accessed and ma…

    Submitted 21 August, 2019; originally announced August 2019.

  9. arXiv:1907.03335  [pdf, other]

    cs.DC cs.DB

    Graphyti: A Semi-External Memory Graph Library for FlashGraph

    Authors: Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, Randal Burns

    Abstract: Graph datasets exceed the in-memory capacity of most standalone machines. Traditionally, graph frameworks have overcome memory limitations through scale-out, distributing computing. Emerging frameworks avoid the network bottleneck of distributed data with Semi-External Memory (SEM) that uses a single multicore node and operates on graphs larger than memory. In SEM, $\mathcal{O}(m)$ data resides on…

    Submitted 7 July, 2019; originally announced July 2019.

  10. arXiv:1907.02844  [pdf, other]

    stat.ML cs.IR cs.LG stat.ME

    Geodesic Learning via Unsupervised Decision Forests

    Authors: Meghana Madhyastha, Percy Li, James Browne, Veronika Strnadova-Neeley, Carey E. Priebe, Randal Burns, Joshua T. Vogelstein

    Abstract: Geodesic distance is the shortest path between two points in a Riemannian manifold. Manifold learning algorithms, such as Isomap, seek to learn a manifold that preserves geodesic distances. However, such methods operate on the ambient dimensionality, and are therefore fragile to noise dimensions. We developed an unsupervised random forest method (URerF) to approximately learn geodesic distances in…

    Submitted 5 July, 2019; originally announced July 2019.

  11. arXiv:1904.04174  [pdf, other]

    cs.LG cs.DC cs.PF

    Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

    Authors: Rod Burns, John Lawson, Duncan McBain, Daniel Soutar

    Abstract: Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks' effectiveness in the fields of image recognition and natural language processing stems primarily from the vast amounts of data available to companies a…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: 4 pages, 3 figures. In International Workshop on OpenCL (IWOCL '19), May 13-15, 2019, Boston

  12. arXiv:1902.09527  [pdf, other]

    cs.DC

    clusterNOR: A NUMA-Optimized Clustering Framework

    Authors: Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, Randal Burns

    Abstract: Clustering algorithms are iterative and have complex data access patterns that result in many small random memory accesses. The performance of parallel implementations suffers from synchronous barriers for each iteration and skewed workloads. We rethink the parallelization of clustering for modern non-uniform memory architectures (NUMA) to maximize independent, asynchronous computation. We elimina…

    Submitted 17 January, 2021; v1 submitted 24 February, 2019; originally announced February 2019.

    Comments: arXiv admin note: Journal version of arXiv:1606.08905

  13. arXiv:1901.00885  [pdf]

    cs.RO cs.HC

    An Interactive Robotic Framework to Facilitate Sensory Experiences for Children with ASD

    Authors: Hifza Javed, Rachael Burns, Myounghoon Jeon, Ayanna M. Howard, Chung Hyuk Park

    Abstract: The diagnosis of Autism Spectrum Disorder (ASD) in children is commonly accompanied by a diagnosis of sensory processing disorders as well. Abnormalities are usually reported in multiple sensory processing domains, showing a higher prevalence of unusual responses, particularly to tactile, auditory and visual stimuli. This paper discusses a novel robot-based framework designed to target sensory dif…

    Submitted 3 January, 2019; originally announced January 2019.

    Comments: 18 pages, 12 figures

  14. arXiv:1806.07300  [pdf, other]

    cs.PF cs.DC

    Forest Packing: Fast, Parallel Decision Forests

    Authors: James Browne, Tyler M. Tomita, Disa Mhembere, Randal Burns, Joshua T. Vogelstein

    Abstract: Machine learning has an emerging critical role in high-performance computing to modulate simulations, extract knowledge from massive data, and replace numerical models with efficient approximations. Decision forests are a critical tool because they provide insight into model operation that is critical to interpreting learned results. While decision forests are trivially parallelizable, the travers…

    Submitted 19 June, 2018; originally announced June 2018.

  15. arXiv:1606.08905  [pdf, other]

    cs.DC

    knor: A NUMA-Optimized In-Memory, Distributed and Semi-External-Memory k-means Library

    Authors: Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, Randal Burns

    Abstract: k-means is one of the most influential and utilized machine learning algorithms. Its computation limits the performance and scalability of many statistical analysis and machine learning tasks. We rethink and optimize k-means in terms of modern NUMA architectures to develop a novel parallelization scheme that delays and minimizes synchronization barriers. The \textit{k-means NUMA Optimized Routine}…

    Submitted 24 June, 2017; v1 submitted 28 June, 2016; originally announced June 2016.

  16. arXiv:1604.06414  [pdf, other]

    cs.DC

    FlashR: R-Programmed Parallel and Scalable Machine Learning using SSDs

    Authors: Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, Randal Burns

    Abstract: R is one of the most popular programming languages for statistics and machine learning, but the R framework is relatively slow and unable to scale to large datasets. The general approach for speeding up an implementation in R is to implement the algorithms in C or FORTRAN and provide an R wrapper. FlashR takes a different approach: it executes R code in parallel and scales the code beyond memory c…

    Submitted 18 May, 2017; v1 submitted 21 April, 2016; originally announced April 2016.

  17. Semi-External Memory Sparse Matrix Multiplication for Billion-Node Graphs

    Authors: Da Zheng, Disa Mhembere, Vince Lyzinski, Joshua Vogelstein, Carey E. Priebe, Randal Burns

    Abstract: Sparse matrix multiplication is traditionally performed in memory and scales to large matrices using the distributed memory of multiple nodes. In contrast, we scale sparse matrix multiplication beyond memory capacity by implementing sparse matrix dense matrix multiplication (SpMM) in a semi-external memory (SEM) fashion; i.e., we keep the sparse matrix on commodity SSDs and dense matrices in memor…

    Submitted 14 October, 2016; v1 submitted 9 February, 2016; originally announced February 2016.

    Comments: published in IEEE Transactions on Parallel and Distributed Systems

  18. arXiv:1602.01421  [pdf, other]

    cs.DC cs.MS

    An SSD-based eigensolver for spectral analysis on billion-node graphs

    Authors: Da Zheng, Randal Burns, Joshua Vogelstein, Carey E. Priebe, Alexander S. Szalay

    Abstract: Many eigensolvers such as ARPACK and Anasazi have been developed to compute eigenvalues of a large sparse matrix. These eigensolvers are limited by the capacity of RAM. They run in memory of a single machine for smaller eigenvalue problems and require the distributed memory for larger problems. In contrast, we develop an SSD-based eigensolver framework called FlashEigen, which extends Anasazi ei…

    Submitted 26 February, 2016; v1 submitted 3 February, 2016; originally announced February 2016.

  19. arXiv:1506.07566  [pdf, other]

    cs.OS

    Optimize Unsynchronized Garbage Collection in an SSD Array

    Authors: Da Zheng, Randal Burns, Alexander S. Szalay

    Abstract: Solid state disks (SSDs) have advanced to outperform traditional hard drives significantly in both random reads and writes. However, heavy random writes trigger frequent garbage collection and decrease the performance of SSDs. In an SSD array, garbage collection of individual SSDs is not synchronized, leading to underutilization of some of the SSDs. We propose a software solution to tackle t…

    Submitted 24 June, 2015; originally announced June 2015.

  20. arXiv:1506.03410  [pdf, other]

    stat.ML cs.LG

    Sparse Projection Oblique Randomer Forests

    Authors: Tyler M. Tomita, James Browne, Cencheng Shen, Jaewon Chung, Jesse L. Patsolic, Benjamin Falk, Jason Yim, Carey E. Priebe, Randal Burns, Mauro Maggioni, Joshua T. Vogelstein

    Abstract: Decision forests, including Random Forests and Gradient Boosting Trees, have recently demonstrated state-of-the-art performance in a variety of machine learning settings. Decision forests are typically ensembles of axis-aligned decision trees; that is, trees that split only along feature dimensions. In contrast, many recent extensions to decision forests are based on axis-oblique splits. Unfortuna…

    Submitted 3 October, 2019; v1 submitted 10 June, 2015; originally announced June 2015.

    Comments: 31 pages; submitted to Journal of Machine Learning Research for review

    MSC Class: 68T10

    ACM Class: I.5.2

    Journal ref: Journal of Machine Learning Research 21(104), 1-39, 2020

  21. arXiv:1506.02079  [pdf, other]

    cs.GR

    Gradient-Domain Fusion for Color Correction in Large EM Image Stacks

    Authors: Michael Kazhdan, Kunal Lillaney, William Roncal, Davi Bock, Joshua Vogelstein, Randal Burns

    Abstract: We propose a new gradient-domain technique for processing registered EM image stacks to remove inter-image discontinuities while preserving intra-image detail. To this end, we process the image stack by first performing anisotropic smoothing along the slice axis and then solving a Poisson equation within each slice to re-introduce the detail. The final image stack is continuous across the slice ax…

    Submitted 5 June, 2015; originally announced June 2015.

  22. arXiv:1412.8576  [pdf, other]

    cs.SI physics.soc-ph

    Active Community Detection in Massive Graphs

    Authors: Heng Wang, Da Zheng, Randal Burns, Carey Priebe

    Abstract: A canonical problem in graph mining is the detection of dense communities. This problem is exacerbated for a graph with a large order and size -- the number of vertices and edges -- as many community detection algorithms scale poorly. In this work we propose a novel framework for detecting active communities that consist of the most active vertices in massive graphs. The framework is applicable to…

    Submitted 13 February, 2015; v1 submitted 30 December, 2014; originally announced December 2014.

    Comments: published in SDM-Networks 2015

  23. arXiv:1411.6880  [pdf, other]

    q-bio.QM cs.CV

    An Automated Images-to-Graphs Framework for High Resolution Connectomics

    Authors: William Gray Roncal, Dean M. Kleissas, Joshua T. Vogelstein, Priya Manavalan, Kunal Lillaney, Michael Pekala, Randal Burns, R. Jacob Vogelstein, Carey E. Priebe, Mark A. Chevillet, Gregory D. Hager

    Abstract: Reconstructing a map of neuronal connectivity is a critical challenge in contemporary neuroscience. Recent advances in high-throughput serial section electron microscopy (EM) have produced massive 3D image volumes of nanoscale brain tissue for the first time. The resolution of EM allows for individual neurons and their synaptic connections to be directly observed. Recovering neuronal networks by m…

    Submitted 30 April, 2015; v1 submitted 25 November, 2014; originally announced November 2014.

    Comments: 13 pages, first two authors contributed equally V2: Added additional experiments and clarifications; added information on infrastructure and pipeline environment

  24. arXiv:1408.0500  [pdf, other]

    cs.DC

    FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs

    Authors: Da Zheng, Disa Mhembere, Randal Burns, Joshua Vogelstein, Carey E. Priebe, Alexander S. Szalay

    Abstract: Graph analysis performs many random reads and writes; thus, these workloads are typically performed in memory. Traditionally, analyzing large graphs requires a cluster of machines so the aggregate memory exceeds the graph size. We demonstrate that a multicore server can process graphs with billions of vertices and hundreds of billions of edges, utilizing commodity SSDs with minimal performance los…

    Submitted 25 January, 2015; v1 submitted 3 August, 2014; originally announced August 2014.

    Comments: published in FAST'15

  25. arXiv:1405.1965  [pdf]

    cs.CV

    Automatic Annotation of Axoplasmic Reticula in Pursuit of Connectomes using High-Resolution Neural EM Data

    Authors: Ayushi Sinha, William Gray Roncal, Narayanan Kasthuri, Jeff W. Lichtman, Randal Burns, Michael Kazhdan

    Abstract: Accurately estimating the wiring diagram of a brain, known as a connectome, at an ultrastructure level is an open research problem. Specifically, precisely tracking neural processes is difficult, especially across many image slices. Here, we propose a novel method to automatically identify and annotate small subcellular structures present in axons, known as axoplasmic reticula, through a 3D volume…

    Submitted 16 April, 2014; originally announced May 2014.

    Comments: 2 pages, 1 figure; The 3rd Annual Hopkins Imaging Conference, The Johns Hopkins University, Baltimore, MD

  26. arXiv:1404.4800  [pdf, other]

    cs.CV

    Automatic Annotation of Axoplasmic Reticula in Pursuit of Connectomes

    Authors: Ayushi Sinha, William Gray Roncal, Narayanan Kasthuri, Ming Chuang, Priya Manavalan, Dean M. Kleissas, Joshua T. Vogelstein, R. Jacob Vogelstein, Randal Burns, Jeff W. Lichtman, Michael Kazhdan

    Abstract: In this paper, we present a new pipeline which automatically identifies and annotates axoplasmic reticula, which are small subcellular structures present only in axons. We run our algorithm on the Kasthuri11 dataset, which was color corrected using gradient-domain techniques to adjust contrast. We use a bilateral filter to smooth out the noise in this data while preserving edges, which highlights…

    Submitted 16 April, 2014; originally announced April 2014.

    Comments: 2 pages, 1 figure

  27. arXiv:1403.3724  [pdf, other]

    cs.CV cs.CE q-bio.QM

    VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer vision at Large Scale

    Authors: William Gray Roncal, Michael Pekala, Verena Kaynig-Fittkau, Dean M. Kleissas, Joshua T. Vogelstein, Hanspeter Pfister, Randal Burns, R. Jacob Vogelstein, Mark A. Chevillet, Gregory D. Hager

    Abstract: An open challenge problem at the forefront of modern neuroscience is to obtain a comprehensive mapping of the neural pathways that underlie human brain function; an enhanced understanding of the wiring diagram of the brain promises to lead to new breakthroughs in diagnosing and treating neurological disorders. Inferring brain structure from image data, such as that obtained via electron microscopy…

    Submitted 7 September, 2015; v1 submitted 14 March, 2014; originally announced March 2014.

    Comments: v4: added clarifying figures and updates for readability. v3: fixed metadata. 11 pp v2: Added CNN classifier, significant changes to improve performance and generalization

    Journal ref: Proceedings of the British Machine Vision Conference (BMVC), pages 81.1-81.13. BMVA Press, September 2015

  28. MIGRAINE: MRI Graph Reliability Analysis and Inference for Connectomics

    Authors: William Gray Roncal, Zachary H. Koterba, Disa Mhembere, Dean M. Kleissas, Joshua T. Vogelstein, Randal Burns, Anita R. Bowles, Dimitrios K. Donavos, Sephira Ryman, Rex E. Jung, Lei Wu, Vince Calhoun, R. Jacob Vogelstein

    Abstract: Currently, connectomes (e.g., functional or structural brain graphs) can be estimated in humans at $\approx 1~mm^3$ scale using a combination of diffusion weighted magnetic resonance imaging, functional magnetic resonance imaging and structural magnetic resonance imaging scans. This manuscript summarizes a novel, scalable implementation of open-source algorithms to rapidly estimate magnetic resona…

    Submitted 17 December, 2013; originally announced December 2013.

    Comments: Published as part of 2013 IEEE GlobalSIP conference

  29. arXiv:1310.0041  [pdf, other]

    cs.GR

    Gradient-Domain Processing for Large EM Image Stacks

    Authors: Michael Kazhdan, Randal Burns, Bobby Kasthuri, Jeff Lichtman, Jacob Vogelstein, Joshua Vogelstein

    Abstract: We propose a new gradient-domain technique for processing registered EM image stacks to remove the inter-image discontinuities while preserving intra-image detail. To this end, we process the image stack by first performing anisotropic diffusion to smooth the data along the slice axis and then solving a screened-Poisson equation within each slice to re-introduce the detail. The final image stack i…

    Submitted 30 September, 2013; originally announced October 2013.

  30. arXiv:1306.3543  [pdf, other]

    cs.DC cs.CE q-bio.NC

    The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience

    Authors: Randal Burns, William Gray Roncal, Dean Kleissas, Kunal Lillaney, Priya Manavalan, Eric Perlman, Daniel R. Berger, Davi D. Bock, Kwanghun Chung, Logan Grosenick, Narayanan Kasthuri, Nicholas C. Weiler, Karl Deisseroth, Michael Kazhdan, Jeff Lichtman, R. Clay Reid, Stephen J. Smith, Alexander S. Szalay, Joshua T. Vogelstein, R. Jacob Vogelstein

    Abstract: We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes---neural connectivity maps of the brain---using the parallel execution of computer vision algorithms on hi…

    Submitted 18 June, 2013; v1 submitted 14 June, 2013; originally announced June 2013.

    Comments: 11 pages, 13 figures

  31. arXiv:1107.1821  [pdf, other]

    cs.CR

    Where Have You Been? Secure Location Provenance for Mobile Devices

    Authors: Ragib Hasan, Randal Burns

    Abstract: With the advent of mobile computing, location-based services have recently gained popularity. Many applications use the location provenance of users, i.e., the chronological history of the users' location for purposes ranging from access control, authentication, information sharing, and evaluation of policies. However, location provenance is subject to tampering and collusion attacks by malicious…

    Submitted 9 July, 2011; originally announced July 2011.

    Comments: 14 pages

  32. arXiv:1106.6062  [pdf, other]

    cs.ET

    The Life and Death of Unwanted Bits: Towards Proactive Waste Data Management in Digital Ecosystems

    Authors: Ragib Hasan, Randal Burns

    Abstract: Our everyday data processing activities create massive amounts of data. Like physical waste and trash, unwanted and unused data also pollutes the digital environment by degrading the performance and capacity of storage systems and requiring costly disposal. In this paper, we propose using the lessons from real life waste management in handling waste data. We show the impact of waste data on the pe…

    Submitted 1 July, 2011; v1 submitted 29 June, 2011; originally announced June 2011.

    Comments: Fixed references

  33. arXiv:0909.1760  [pdf]

    cs.DB

    LifeRaft: Data-Driven, Batch Processing for the Exploration of Scientific Databases

    Authors: Xiaodan Wang, Randal Burns, Tanu Malik

    Abstract: Workloads that comb through vast amounts of data are gaining importance in the sciences. These workloads consist of "needle in a haystack" queries that are long running and data intensive so that query throughput limits performance. To maximize throughput for data-intensive queries, we put forth LifeRaft: a query processing system that batches queries with overlapping data requirements. Rather t…

    Submitted 9 September, 2009; originally announced September 2009.

    Comments: CIDR 2009

  34. arXiv:0901.3923  [pdf, ps, other]

    cs.NI cs.CV

    Model-Based Event Detection in Wireless Sensor Networks

    Authors: Jayant Gupchup, Andreas Terzis, Randal Burns, Alex Szalay

    Abstract: In this paper we present an application of techniques from statistical signal processing to the problem of event detection in wireless sensor networks used for environmental monitoring. The proposed approach uses the well-established Principal Component Analysis (PCA) technique to build a compact model of the observed phenomena that is able to capture daily and seasonal trends in the collected m…

    Submitted 25 January, 2009; originally announced January 2009.

    Journal ref: Workshop for Data Sharing and Interoperability on the World Wide Web (DSI 2007). April 2007, In Proceedings

  35. arXiv:cs/0701170  [pdf]

    cs.DB cs.CE

    Life Under Your Feet: An End-to-End Soil Ecology Sensor Network, Database, Web Server, and Analysis Service

    Authors: Katalin Szlavecz, Andreas Terzis, Stuart Ozer, Razvan Musaloiu-E, Joshua Cogan, Sam Small, Randal Burns, Jim Gray, Alex Szalay

    Abstract: Wireless sensor networks can revolutionize soil ecology by providing measurements at temporal and spatial granularities previously impossible. This paper presents a soil monitoring system we developed and deployed at an urban forest in Baltimore as a first step towards realizing this vision. Motes in this network measure and save soil moisture and temperature in situ every minute. Raw measuremen…

    Submitted 26 January, 2007; originally announced January 2007.

    Report number: MSR TR 2006 90