Showing 1–21 of 21 results for author: Scott, M R

  1. arXiv:2108.07755

    cs.CV

    TOOD: Task-aligned One-stage Object Detection

    Authors: Chengjian Feng, Yujie Zhong, Yu Gao, Matthew R. Scott, Weilin Huang

    Abstract: One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two tasks. In this work, we propose a Task-aligned One-stage Object Detection (TOOD) that explicitly aligns the two tasks in a learning-based manner. Fir…

    Submitted 28 August, 2021; v1 submitted 17 August, 2021; originally announced August 2021.

    Comments: ICCV2021 Oral

  2. arXiv:2107.04324

    cs.CV

    Mutually-aware Sub-Graphs Differentiable Architecture Search

    Authors: Haoxian Tan, Sheng Guo, Yujie Zhong, Matthew R. Scott, Weilin Huang

    Abstract: Differentiable architecture search is prevalent in the field of NAS because of its simplicity and efficiency, where two paradigms, multi-path algorithms and single-path methods, dominate. The multi-path framework (e.g. DARTS) is intuitive but suffers from memory usage and training collapse. Single-path methods (e.g. GDAS and ProxylessNAS) mitigate the memory issue and shrink the gap between search…

    Submitted 5 November, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

  3. arXiv:2103.14003

    cs.CV

    Rethinking Deep Contrastive Learning with Embedding Memory

    Authors: Haozhi Zhang, Xun Wang, Weilin Huang, Matthew R. Scott

    Abstract: Pair-wise loss functions have been extensively studied and shown to continuously improve the performance of deep metric learning (DML). However, they are primarily designed with intuition based on simple toy examples, and experimentally identifying the truly effective design is difficult in complicated, real-world cases. In this paper, we provide a new methodology for systematically studying weigh…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: Under review

  4. arXiv:2103.11587

    cs.CV eess.IV

    Brain Image Synthesis with Unsupervised Multivariate Canonical CSC$\ell_4$Net

    Authors: Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

    Abstract: Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition. However, obtaining full sets of different modalities is limited by various factors, such as long acquisition times, high examination costs and artifact suppression. In addition, the complexity, high dimensionality and heterogeneity…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: 10 pages, 5 figures CVPR2021 oral

  5. arXiv:2101.04028

    cs.CV

    Unchain the Search Space with Hierarchical Differentiable Architecture Search

    Authors: Guanting Liu, Yujie Zhong, Sheng Guo, Matthew R. Scott, Weilin Huang

    Abstract: Differentiable architecture search (DAS) has made great progress in searching for high-performance architectures with reduced computational cost. However, DAS-based methods mainly focus on searching for a repeatable cell structure, which is then stacked sequentially in multiple stages to form the networks. This configuration significantly reduces the search space, and ignores the importance of con…

    Submitted 11 January, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: To appear in AAAI2021. Code is available

  6. arXiv:2007.12075

    cs.CV

    Representation Sharing for Fast Object Detector Search and Beyond

    Authors: Yujie Zhong, Zelu Deng, Sheng Guo, Matthew R. Scott, Weilin Huang

    Abstract: Region Proposal Network (RPN) provides strong support for handling the scale variation of objects in two-stage object detection. For one-stage detectors which do not have RPN, it is more demanding to have powerful sub-networks capable of directly capturing objects of unknown sizes. To enhance such capability, we propose an extremely efficient neural architecture search method, named Fast And Diver…

    Submitted 23 October, 2020; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 accepted

  7. arXiv:2004.06711

    cs.CV

    Deformable Siamese Attention Networks for Visual Object Tracking

    Authors: Yuechen Yu, Yilei Xiong, Weilin Huang, Matthew R. Scott

    Abstract: Siamese-based trackers have achieved excellent performance on visual object tracking. However, the target template is not updated online, and the features of the target template and search image are computed independently in a Siamese architecture. In this paper, we propose Deformable Siamese Attention Networks, referred to as SiamAttn, by introducing a new Siamese attention mechanism that compute…

    Submitted 24 March, 2021; v1 submitted 14 April, 2020; originally announced April 2020.

    Comments: CVPR 2020, with code available at: https://github.com/msight-tech/research-siamattn

  8. arXiv:2003.05235

    cs.CV

    Channel Interaction Networks for Fine-Grained Image Categorization

    Authors: Yu Gao, Xintong Han, Xun Wang, Weilin Huang, Matthew R. Scott

    Abstract: Fine-grained image categorization is challenging due to the subtle inter-class differences. We posit that exploiting the rich relationships between channels can help capture such differences since different channels correspond to different semantics. In this paper, we propose a channel interaction network (CIN), which models the channel-wise interplay both within an image and across images. For a s…

    Submitted 11 March, 2020; originally announced March 2020.

    Comments: AAAI 2020

  9. arXiv:2003.04132

    cs.CV

    iFAN: Image-Instance Full Alignment Networks for Adaptive Object Detection

    Authors: Chenfan Zhuang, Xintong Han, Weilin Huang, Matthew R. Scott

    Abstract: Training an object detector on a data-rich domain and applying it to a data-poor one with limited performance drop is highly attractive in industry, because it saves huge annotation cost. Recent research on unsupervised domain adaptive object detection has verified that aligning data distributions between source and target images through adversarial learning is very useful. The key is when, where…

    Submitted 9 March, 2020; originally announced March 2020.

    Comments: AAAI 2020

  10. arXiv:2002.07471

    cs.CV

    Knowledge Integration Networks for Action Recognition

    Authors: Shiwen Zhang, Sheng Guo, Limin Wang, Weilin Huang, Matthew R. Scott

    Abstract: In this work, we propose Knowledge Integration Networks (referred to as KINet) for video action recognition. KINet is capable of aggregating meaningful context features which are of great importance to identifying an action, such as human information and scene context. We design a three-branch architecture consisting of a main branch for action recognition, and two auxiliary branches for human parsin…

    Submitted 18 February, 2020; originally announced February 2020.

    Comments: To appear in AAAI 2020

  11. arXiv:2002.07442

    cs.CV

    V4D:4D Convolutional Neural Networks for Video-level Representation Learning

    Authors: Shiwen Zhang, Sheng Guo, Weilin Huang, Matthew R. Scott, Limin Wang

    Abstract: Most existing 3D CNNs for video representation learning are clip-based methods, and thus do not consider video-level temporal evolution of spatio-temporal features. In this paper, we propose Video-level 4D Convolutional Neural Networks, referred to as V4D, to model the evolution of long-range spatio-temporal representation with 4D convolutions, and at the same time, to preserve strong 3D spatio-tempo…

    Submitted 18 February, 2020; originally announced February 2020.

    Comments: To appear in ICLR2020

  12. arXiv:1912.06798

    cs.LG cs.CV

    Cross-Batch Memory for Embedding Learning

    Authors: Xun Wang, Haozhi Zhang, Weilin Huang, Matthew R. Scott

    Abstract: Mining informative negative instances is of central importance to deep metric learning (DML); however, this task is intrinsically limited by mini-batch training, where only a mini-batch of instances is accessible at each iteration. In this paper, we identify a "slow drift" phenomenon by observing that the embedding features drift exceptionally slowly even as the model parameters are updating througho…

    Submitted 20 April, 2020; v1 submitted 14 December, 2019; originally announced December 2019.

    Comments: CVPR 2020 Oral

  13. arXiv:1910.07954

    cs.CV

    Convolutional Character Networks

    Authors: Linjie Xing, Zhi Tian, Weilin Huang, Matthew R. Scott

    Abstract: Recent progress has been made on developing a unified framework for joint text detection and recognition in natural images, but existing joint models were mostly built on a two-stage framework by involving ROI pooling, which can degrade the performance on the recognition task. In this work, we propose convolutional character networks, referred to as CharNet, which is a one-stage model that can process two…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: To appear in ICCV 2019

  14. arXiv:1910.02624

    cs.CV

    Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation

    Authors: Weifeng Ge, Sheng Guo, Weilin Huang, Matthew R. Scott

    Abstract: Weakly-supervised instance segmentation aims to detect and segment object instances precisely, given image-level labels only. Unlike previous methods which are composed of multiple offline stages, we propose Sequential Label Propagation and Enhancement Networks (referred to as Label-PEnet) that progressively transform image-level labels to pixel-wise labels in a coarse-to-fine manner. We design four c…

    Submitted 24 April, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: Rectify some typos in arXiv title

  15. arXiv:1909.11966

    cs.CV

    Dual-Stream Pyramid Registration Network

    Authors: Miao Kang, Xiaojun Hu, Weilin Huang, Matthew R. Scott, Mauricio Reyes

    Abstract: We propose a Dual-Stream Pyramid Registration Network (referred to as Dual-PRNet) for unsupervised 3D medical image registration. Unlike recent CNN-based registration approaches, such as VoxelMorph, which explores a single-stream encoder-decoder network to compute a registration field from a pair of 3D volumes, we design a two-stream architecture able to compute multi-scale registration fields from…

    Submitted 1 April, 2023; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Published in Medical Image Analysis, 2022

  16. arXiv:1906.05750

    cs.CV

    The iMaterialist Fashion Attribute Dataset

    Authors: Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui, Yuan Li, Matthew R. Scott, Hartwig Adam, Serge Belongie

    Abstract: Large-scale image databases such as ImageNet have significantly advanced image classification and other visual recognition tasks. However, many of these datasets are constructed only for single-label and coarse object-level classification. For real-world applications, multiple labels and fine-grained categories are often needed, yet very few such datasets exist publicly, especially those of large-s…

    Submitted 14 June, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

  17. arXiv:1904.06627

    cs.CV

    Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning

    Authors: Xun Wang, Xintong Han, Weilin Huang, Dengke Dong, Matthew R. Scott

    Abstract: A family of loss functions built on pair-based computation have been proposed in the literature which provide a myriad of solutions for deep metric learning. In this paper, we provide a general weighting framework for understanding recent pair-based loss functions. Our contributions are three-fold: (1) we establish a General Pair Weighting (GPW) framework, which casts the sampling problem of deep…

    Submitted 22 March, 2020; v1 submitted 14 April, 2019; originally announced April 2019.

    Comments: Accepted CVPR 2019, rewrite main method to be more clear

    Report number: 13 pages, 4 figures, 7 tables, including supplementary materials

  18. arXiv:1902.01096

    cs.CV

    Compatible and Diverse Fashion Image Inpainting

    Authors: Xintong Han, Zuxuan Wu, Weilin Huang, Matthew R. Scott, Larry S. Davis

    Abstract: Visual compatibility is critical for fashion analysis, yet is missing in existing fashion image synthesis systems. In this paper, we propose to explicitly model visual compatibility through fashion image inpainting. To this end, we present Fashion Inpainting Networks (FiNet), a two-stage image-to-image generation framework that is able to perform compatible and diverse inpainting. Disentangling th…

    Submitted 24 April, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

  19. arXiv:1811.02629

    cs.CV cs.AI cs.LG stat.ML

    Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

    Authors: Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, et al. (402 additional authors not shown)

    Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles dissem…

    Submitted 23 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: The International Multimodal Brain Tumor Segmentation (BraTS) Challenge

  20. arXiv:1810.06951

    cs.CV

    Deep Metric Learning with Hierarchical Triplet Loss

    Authors: Weifeng Ge, Weilin Huang, Dengke Dong, Matthew R. Scott

    Abstract: We present a novel hierarchical triplet loss (HTL) capable of automatically collecting informative training samples (triplets) via a defined hierarchical tree that encodes global context information. This allows us to cope with the main limitation of random sampling in training a conventional triplet loss, which is a central issue for deep metric learning. Our main contributions are two-fold. (i)…

    Submitted 16 October, 2018; originally announced October 2018.

    Comments: Published in ECCV 2018

  21. arXiv:1808.01097

    cs.CV

    CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images

    Authors: Sheng Guo, Weilin Huang, Haozhi Zhang, Chenfan Zhuang, Dengke Dong, Matthew R. Scott, Dinglong Huang

    Abstract: We present a simple yet efficient approach capable of training deep neural networks on large-scale weakly-supervised web images, which are crawled raw from the Internet by using text queries, without any human annotation. We develop a principled learning strategy by leveraging curriculum learning, with the goal of handling a massive amount of noisy labels and data imbalance effectively. We design…

    Submitted 18 October, 2018; v1 submitted 3 August, 2018; originally announced August 2018.

    Comments: Accepted to ECCV 2018. 16 pages, 5 figures, 5 tables