
Showing 1–27 of 27 results for author: Cottrell, G

Searching in archive cs.
  1. arXiv:2212.14143  [pdf, other]

    cs.CV

    Multimodal Wildland Fire Smoke Detection

    Authors: Siddhant Baldota, Shreyas Anantha Ramaprasad, Jaspreet Kaur Bhamra, Shane Luna, Ravi Ramachandra, Eugene Zen, Harrison Kim, Daniel Crawl, Ismael Perez, Ilkay Altintas, Garrison W. Cottrell, Mai H. Nguyen

    Abstract: Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency…

    Submitted 28 December, 2022; originally announced December 2022.

  2. FIgLib & SmokeyNet: Dataset and Deep Learning Model for Real-Time Wildland Fire Smoke Detection

    Authors: Anshuman Dewangan, Yash Pande, Hans-Werner Braun, Frank Vernon, Ismael Perez, Ilkay Altintas, Garrison W. Cottrell, Mai H. Nguyen

    Abstract: The size and frequency of wildland fires in the western United States have dramatically increased in recent years. On high-fire-risk days, a small fire ignition can rapidly grow and become out of control. Early detection of fire ignitions from initial smoke can assist the response to such fires before they become difficult to manage. Past deep learning approaches for wildfire smoke detection have…

    Submitted 14 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Journal ref: Remote Sensing. 2022; 14(4):1007

  3. arXiv:2005.08072  [pdf, other]

    eess.AS cs.LG cs.SD

    Speech Recognition and Multi-Speaker Diarization of Long Conversations

    Authors: Huanru Henry Mao, Shuyang Li, Julian McAuley, Garrison Cottrell

    Abstract: Speech recognition (ASR) and speaker diarization (SD) models have traditionally been trained separately to produce rich conversation transcripts with speaker labels. Recent advances have shown that joint ASR and SD models can learn to leverage audio-lexical inter-dependencies to improve word diarization performance. We introduce a new benchmark of hour-long podcasts collected from the weekly This…

    Submitted 4 November, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

  4. arXiv:2004.02032  [pdf, other]

    cs.AI cs.CL cs.CV

    Generating Rationales in Visual Question Answering

    Authors: Hammad A. Ayyubi, Md. Mehrab Tanjim, Julian J. McAuley, Garrison W. Cottrell

    Abstract: Despite recent advances in Visual Question Answering (VQA), it remains a challenge to determine how much success can be attributed to sound reasoning and comprehension ability. We seek to investigate this question by proposing a new task of rationale generation. Essentially, we task a VQA model with generating rationales for the answers it predicts. We use data from the Visual Commonsense Reasoning…

    Submitted 4 April, 2020; originally announced April 2020.

  5. arXiv:2003.04887  [pdf, other]

    cs.LG cs.CL stat.ML

    ReZero is All You Need: Fast Convergence at Large Depth

    Authors: Thomas Bachlechner, Bodhisattwa Prasad Majumder, Huanru Henry Mao, Garrison W. Cottrell, Julian McAuley

    Abstract: Deep networks often suffer from vanishing or exploding gradients due to inefficient signal propagation, leading to long training times or convergence difficulties. Various architecture designs, sophisticated residual-style networks, and initialization schemes have been shown to improve deep signal propagation. Recently, Pennington et al. used free probability theory to show that dynamical isometry…

    Submitted 24 June, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

  6. arXiv:2002.07405  [pdf, other]

    cs.LG cs.CV stat.ML

    Deflecting Adversarial Attacks

    Authors: Yao Qin, Nicholas Frosst, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

    Abstract: There has been an ongoing cycle where stronger defenses against adversarial attacks are subsequently broken by a more advanced defense-aware attack. We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that semantically resembles the attack's target class. To this end, we first propose a stronger defense based on Ca…

    Submitted 18 February, 2020; originally announced February 2020.

  7. arXiv:1908.09451  [pdf, ps, other]

    cs.LG cs.CL stat.ML

    Improving Neural Story Generation by Targeted Common Sense Grounding

    Authors: Huanru Henry Mao, Bodhisattwa Prasad Majumder, Julian McAuley, Garrison W. Cottrell

    Abstract: Stories generated with neural language models have shown promise in grammatical and stylistic consistency. However, the generated stories are still lacking in common sense reasoning, e.g., they often contain sentences deprived of world knowledge. We propose a simple multi-task learning scheme to achieve quantitatively better common sense reasoning in language models by leveraging auxiliary trainin…

    Submitted 27 February, 2020; v1 submitted 25 August, 2019; originally announced August 2019.

  8. arXiv:1907.04868  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS stat.ML

    LakhNES: Improving multi-instrumental music generation with cross-domain pre-training

    Authors: Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian McAuley

    Abstract: We are interested in the task of generating multi-instrumental music scores. The Transformer architecture has recently shown great promise for the task of piano score generation; here we adapt it to the multi-instrumental setting. Transformers are complex, high-dimensional language models which are capable of capturing long-term structure in sequence data, but require large amounts of data to fit.…

    Submitted 10 July, 2019; originally announced July 2019.

    Comments: Published as a conference paper at ISMIR 2019

  9. arXiv:1907.02957  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions

    Authors: Yao Qin, Nicholas Frosst, Sara Sabour, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

    Abstract: Adversarial examples raise questions about whether neural network models are sensitive to the same visual features as humans. In this paper, we first detect adversarial examples or otherwise corrupted images based on a class-conditional reconstruction of the input. To specifically attack our detection mechanism, we propose the Reconstructive Attack which seeks both to cause a misclassification and…

    Submitted 18 February, 2020; v1 submitted 5 July, 2019; originally announced July 2019.

    Journal ref: ICLR 2020

  10. arXiv:1903.10346  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition

    Authors: Yao Qin, Nicholas Carlini, Ian Goodfellow, Garrison Cottrell, Colin Raffel

    Abstract: Adversarial examples are inputs to machine learning models designed by an adversary to cause an incorrect output. So far, adversarial examples have been studied most extensively in the image domain. In this domain, adversarial examples can be constructed by imperceptibly modifying images to cause misclassification, and are practical in the physical world. In contrast, current targeted adversarial…

    Submitted 7 June, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

    Comments: International Conference on Machine Learning (ICML), 2019

  11. arXiv:1805.08403  [pdf, other]

    cs.CV

    Autofocus Layer for Semantic Segmentation

    Authors: Yao Qin, Konstantinos Kamnitsas, Siddharth Ancha, Jay Nanavati, Garrison Cottrell, Antonio Criminisi, Aditya Nori

    Abstract: We propose the autofocus convolutional layer for semantic segmentation with the objective of enhancing the capabilities of neural networks for multi-scale processing. Autofocus layers adaptively change the size of the effective receptive field based on the processed context to generate more powerful features. This is achieved by parallelising multiple convolutional layers with different dilation r…

    Submitted 11 June, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

    Comments: Published on MICCAI 2018

  12. DeepJ: Style-Specific Music Generation

    Authors: Huanru Henry Mao, Taylor Shin, Garrison W. Cottrell

    Abstract: Recent advances in deep neural networks have enabled algorithms to compose music that is comparable to music composed by humans. However, few algorithms allow the user to generate music with tunable parameters. The ability to tune properties of generated music will yield more practical benefits for aiding artists, filmmakers, and composers in their creative tasks. In this paper, we introduce DeepJ…

    Submitted 2 January, 2018; originally announced January 2018.

  13. arXiv:1711.05255  [pdf, other]

    cs.LG cs.AI

    Deep-ESN: A Multiple Projection-encoding Hierarchical Reservoir Computing Framework

    Authors: Qianli Ma, Lifeng Shen, Garrison W. Cottrell

    Abstract: As an efficient recurrent neural network (RNN) model, reservoir computing (RC) models, such as Echo State Networks, have attracted widespread attention in the last decade. However, while they have had great success with time series data [1], [2], many time series have a multiscale structure, which a single-hidden-layer RC model may have difficulty capturing. In this paper, we propose a novel hiera…

    Submitted 13 November, 2017; originally announced November 2017.

  14. arXiv:1708.03901  [pdf, other]

    cs.AI cs.CV

    Belief Tree Search for Active Object Recognition

    Authors: Mohsen Malmir, Garrison W. Cottrell

    Abstract: Active Object Recognition (AOR) has been approached as an unsupervised learning problem, in which optimal trajectories for object inspection are not known and are to be discovered by reducing label uncertainty measures or training with reinforcement learning. Such approaches have no guarantees of the quality of their solution. In this paper, we treat AOR as a Partially Observable Markov Decision P…

    Submitted 13 August, 2017; originally announced August 2017.

    Comments: IROS 2017

  15. arXiv:1707.05911  [pdf, other]

    cs.CV

    Recognizing and Curating Photo Albums via Event-Specific Image Importance

    Authors: Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell

    Abstract: Automatic organization of personal photos is a problem with many real world applications, and can be divided into two main tasks: recognizing the event type of the photo collection, and selecting interesting images from the collection. In this paper, we attempt to simultaneously solve both tasks: album-wise event recognition and image-wise importance prediction. We collected an album dataset wi…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: Accepted as oral in BMVC 2017

  16. arXiv:1705.09425  [pdf, other]

    cs.CV

    Hierarchical Cellular Automata for Visual Saliency

    Authors: Yao Qin, Mengyang Feng, Huchuan Lu, Garrison W. Cottrell

    Abstract: Saliency detection, finding the most important parts of an image, has become increasingly popular in computer vision. In this paper, we introduce Hierarchical Cellular Automata (HCA) -- a temporally evolving model to intelligently detect salient objects. HCA consists of two main components: Single-layer Cellular Automata (SCA) and Cuboid Cellular Automata (CCA). As an unsupervised propagation mech…

    Submitted 25 May, 2017; originally announced May 2017.

  17. arXiv:1705.04282  [pdf, other]

    cs.CV cs.AI cs.LG

    Learning to see people like people

    Authors: Amanda Song, Linjie Li, Chad Atalla, Garrison Cottrell

    Abstract: Humans make complex inferences on faces, ranging from objective properties (gender, ethnicity, expression, age, identity, etc) to subjective judgments (facial attractiveness, trustworthiness, sociability, friendliness, etc). While the objective aspects of face perception have been extensively studied, relatively fewer computational models have been developed for the social impressions of faces. Br…

    Submitted 5 May, 2017; originally announced May 2017.

    Comments: 10 pages

  18. arXiv:1704.06972  [pdf, other]

    cs.CV

    Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition

    Authors: Yufei Wang, Zhe Lin, Xiaohui Shen, Scott Cohen, Garrison W. Cottrell

    Abstract: Recently, there has been a lot of interest in automatically generating descriptions for an image. Most existing language-model based approaches for this task learn to generate an image description word by word in its original word order. However, for humans, it is more natural to locate the objects and their relationships first, and then elaborate on each object, describing notable attributes. We…

    Submitted 23 April, 2017; originally announced April 2017.

    Comments: Accepted by CVPR 2017

  19. arXiv:1704.02971  [pdf, other]

    cs.LG stat.ML

    A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction

    Authors: Yao Qin, Dongjin Song, Haifeng Chen, Wei Cheng, Guofei Jiang, Garrison Cottrell

    Abstract: The Nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades. Despite the fact that various NARX models have been developed, few of them can capture the long-term temporal dependencies appropriately and select the relev…

    Submitted 14 August, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI), 2017

  20. arXiv:1702.08502  [pdf, other]

    cs.CV

    Understanding Convolution for Semantic Segmentation

    Authors: Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, Garrison Cottrell

    Abstract: Recent advances in deep learning, especially deep convolutional neural networks (CNNs), have led to significant improvement over previous semantic segmentation systems. Here we show how to improve pixel-wise semantic segmentation by manipulating convolution-related operations that are of both theoretical and practical value. First, we design dense upsampling convolution (DUC) to generate pixel-lev…

    Submitted 31 May, 2018; v1 submitted 27 February, 2017; originally announced February 2017.

    Comments: WACV 2018. Updated acknowledgements. Source code: https://github.com/TuSimple/TuSimple-DUC

  21. arXiv:1604.07872  [pdf]

    q-bio.NC cs.CV

    Are Face and Object Recognition Independent? A Neurocomputational Modeling Exploration

    Authors: Panqu Wang, Isabel Gauthier, Garrison Cottrell

    Abstract: Are face and object recognition abilities independent? Although it is commonly believed that they are, Gauthier et al. (2014) recently showed that these abilities become more correlated as experience with nonface categories increases. They argued that there is a single underlying visual ability, v, that is expressed in performance with both face and nonface categories as experience grows. Using the…

    Submitted 26 April, 2016; originally announced April 2016.

    Journal ref: Journal of Cognitive Neuroscience, 28(4):558-574. 2016

  22. arXiv:1604.07457  [pdf, other]

    q-bio.NC cs.CV

    Modeling the Contribution of Central Versus Peripheral Vision in Scene, Object, and Face Recognition

    Authors: Panqu Wang, Garrison Cottrell

    Abstract: It is commonly believed that the central visual field is important for recognizing objects and faces, and the peripheral region is useful for scene recognition. However, the relative importance of central versus peripheral information for object, scene, and face recognition is unclear. In a behavioral study, Larson and Loschky (2009) investigated this question by measuring the scene recognition ac…

    Submitted 25 April, 2016; originally announced April 2016.

    Comments: CogSci 2016 Conference Paper

  23. arXiv:1602.08486  [pdf, other]

    q-bio.NC cs.CV cs.LG cs.NE

    A Single Model Explains both Visual and Auditory Precortical Coding

    Authors: Honghao Shan, Matthew H. Tong, Garrison W. Cottrell

    Abstract: Precortical neural systems encode information collected by the senses, but the driving principles of the encoding used have remained a subject of debate. We present a model of retinal coding that is based on three constraints: information preservation, minimization of the neural wiring, and response equalization. The resulting novel version of sparse principal components analysis successfully capt…

    Submitted 7 April, 2016; v1 submitted 26 February, 2016; originally announced February 2016.

  24. arXiv:1512.05484  [pdf, other]

    cs.AI

    Deep Active Object Recognition by Joint Label and Action Prediction

    Authors: Mohsen Malmir, Karan Sikka, Deborah Forster, Ian Fasel, Javier R. Movellan, Garrison W. Cottrell

    Abstract: An active object recognition system has the advantage of being able to act in the environment to capture images that are more suited for training and that lead to better performance at test time. In this paper, we propose a deep convolutional neural network for active object recognition that simultaneously predicts the object label, and selects the next action to perform on the object with the aim…

    Submitted 17 December, 2015; originally announced December 2015.

  25. arXiv:1511.04103  [pdf, other]

    cs.CV

    Basic Level Categorization Facilitates Visual Object Recognition

    Authors: Panqu Wang, Garrison W. Cottrell

    Abstract: Recent advances in deep learning have led to significant progress in the computer vision field, especially for visual object recognition tasks. The features useful for object classification are learned by feed-forward deep convolutional neural networks (CNNs) automatically, and they are shown to be able to predict and decode neural representations in the ventral visual pathway of humans and monkey…

    Submitted 7 January, 2016; v1 submitted 12 November, 2015; originally announced November 2015.

    Comments: ICLR 2016 submission R1

  26. arXiv:1412.6177  [pdf, other]

    cs.LG cs.AI stat.ML

    Example Selection For Dictionary Learning

    Authors: Tomoki Tsuchida, Garrison W. Cottrell

    Abstract: In unsupervised learning, an unbiased uniform sampling strategy is typically used, in order that the learned features faithfully encode the statistical structure of the training data. In this work, we explore whether active example selection strategies - algorithms that select which examples to use, based on the current estimate of the features - can accelerate learning. Specifically, we investiga…

    Submitted 31 March, 2015; v1 submitted 18 December, 2014; originally announced December 2014.

  27. arXiv:1312.6077  [pdf]

    cs.CV q-bio.NC

    Efficient Visual Coding: From Retina To V2

    Authors: Honghao Shan, Garrison Cottrell

    Abstract: The human visual system has a hierarchical structure consisting of layers of processing, such as the retina, V1, V2, etc. Understanding the functional roles of these visual processing layers would help to integrate the psychophysiological and neurophysiological models into a consistent theory of human vision, and would also provide insights to computer vision research. One classical theory of the…

    Submitted 17 December, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

    Comments: For the ICLR 2014 conference