Showing 1–7 of 7 results for author: Redmon, J

Searching in archive cs.
  1. arXiv:1804.02767

    cs.CV

    YOLOv3: An Incremental Improvement

    Authors: Joseph Redmon, Ali Farhadi

    Abstract: We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good.…

    Submitted 8 April, 2018; originally announced April 2018.

    Comments: Tech Report

  2. arXiv:1803.10827

    cs.CV

    Who Let The Dogs Out? Modeling Dog Behavior From Visual Data

    Authors: Kiana Ehsani, Hessam Bagherinezhad, Joseph Redmon, Roozbeh Mottaghi, Ali Farhadi

    Abstract: We introduce the task of directly modeling a visually intelligent agent. Computer vision typically focuses on solving various subtasks related to visual intelligence. We depart from this standard approach to computer vision; instead we directly model a visually intelligent agent. Our model takes visual information as input and directly predicts the actions of the agent. Toward this end we introduc…

    Submitted 17 May, 2018; v1 submitted 28 March, 2018; originally announced March 2018.

    Comments: Accepted to CVPR18

  3. arXiv:1712.03316

    cs.CV

    IQA: Visual Question Answering in Interactive Environments

    Authors: Daniel Gordon, Aniruddha Kembhavi, Mohammad Rastegari, Joseph Redmon, Dieter Fox, Ali Farhadi

    Abstract: We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene and a question, like: "Are there any apples in the fridge?" The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g. open refrigerators) and…

    Submitted 6 September, 2018; v1 submitted 8 December, 2017; originally announced December 2017.

    Comments: Published in CVPR 2018

  4. arXiv:1612.08242

    cs.CV

    YOLO9000: Better, Faster, Stronger

    Authors: Joseph Redmon, Ali Farhadi

    Abstract: We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78…

    Submitted 25 December, 2016; originally announced December 2016.

  5. arXiv:1603.05279

    cs.CV

    XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

    Authors: Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi

    Abstract: We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values resulting in 32x memory saving. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This resu…

    Submitted 2 August, 2016; v1 submitted 16 March, 2016; originally announced March 2016.

  6. arXiv:1506.02640

    cs.CV

    You Only Look Once: Unified, Real-Time Object Detection

    Authors: Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi

    Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detec…

    Submitted 9 May, 2016; v1 submitted 8 June, 2015; originally announced June 2015.

  7. arXiv:1412.3128

    cs.RO cs.CV

    Real-Time Grasp Detection Using Convolutional Neural Networks

    Authors: Joseph Redmon, Anelia Angelova

    Abstract: We present an accurate, real-time approach to robotic grasp detection based on convolutional neural networks. Our network performs single-stage regression to graspable bounding boxes without using standard sliding window or region proposal techniques. The model outperforms state-of-the-art approaches by 14 percentage points and runs at 13 frames per second on a GPU. Our network can simultaneously…

    Submitted 28 February, 2015; v1 submitted 9 December, 2014; originally announced December 2014.

    Comments: Accepted to ICRA 2015