Showing 1–7 of 7 results for author: Marsden, M

Searching in archive cs.
  1. arXiv:2309.01961  [pdf, other]

    cs.CV

    NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

    Authors: Taehoon Kim, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Mark Marsden, Alessandra Sala, Seung Hwan Kim, Bohyung Han, Kyoung Mu Lee, Honglak Lee, Kyounghoon Bae, Xiangyu Wu, Yi Gao, Hailiang Zhang, Yang Yang, Weili Guo, Jianfeng Lu, Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim, Wooyoung Kang, Won Young Jhoo, Byungseok Roh, et al. (17 additional authors not shown)

    Abstract: In this report, we introduce NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning models that advance the state-of-the-art both in terms of accuracy and fairness. Through the challenge, the image captioning models were tested…

    Submitted 10 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Tech report, project page https://nice.lgresearch.ai/

  2. arXiv:2211.06774  [pdf, other]

    cs.CV cs.CL

    Large-Scale Bidirectional Training for Zero-Shot Image Captioning

    Authors: Taehoon Kim, Mark Marsden, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Alessandra Sala, Seung Hwan Kim

    Abstract: When trained on large-scale datasets, image captioning models can understand the content of images from a general domain but often fail to generate accurate, detailed captions. To improve performance, pretraining-and-finetuning has been a key strategy for image captioning. However, we find that large-scale bidirectional training between image and text enables zero-shot image captioning. In this pa…

    Submitted 1 October, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

    Comments: Arxiv Preprint. Work in progress

  3. arXiv:2005.00430  [pdf, other]

    cs.CV

    Investigating Class-level Difficulty Factors in Multi-label Classification Problems

    Authors: Mark Marsden, Kevin McGuinness, Joseph Antony, Haolin Wei, Milan Redzic, Jian Tang, Zhilan Hu, Alan Smeaton, Noel E O'Connor

    Abstract: This work investigates the use of class-level difficulty factors in multi-label classification problems for the first time. Four class-level difficulty factors are proposed: frequency, visual variation, semantic abstraction, and class co-occurrence. Once computed for a given multi-label classification dataset, these difficulty factors are shown to have several potential applications including the…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: Published in ICME 2020

  4. arXiv:1711.05586  [pdf, other]

    cs.CV

    People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting

    Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Ciara E. Keogh, Noel E. O'Connor

    Abstract: In this paper we propose a technique to adapt a convolutional neural network (CNN) based object counter to additional visual domains and object types while still preserving the original counting function. Domain-specific normalisation and scaling operators are trained to allow the model to adjust to the statistical distributions of the various visual domains. The developed adaptation technique is…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: 10 pages

  5. arXiv:1705.10698  [pdf, other]

    cs.CV

    ResnetCrowd: A Residual Deep Learning Architecture for Crowd Counting, Violent Behaviour Detection and Crowd Density Level Classification

    Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Noel E. O'Connor

    Abstract: In this paper we propose ResnetCrowd, a deep residual architecture for simultaneous crowd counting, violent behaviour detection and crowd density level classification. To train and evaluate the proposed multi-objective technique, a new 100 image dataset referred to as Multi Task Crowd is constructed. This new dataset is the first computer vision dataset fully annotated for crowd counting, violent…

    Submitted 30 May, 2017; originally announced May 2017.

    Comments: 7 Pages, AVSS 2017

  6. arXiv:1612.00220  [pdf, other]

    cs.CV

    Fully Convolutional Crowd Counting On Highly Congested Scenes

    Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Noel E. O'Connor

    Abstract: In this paper we advance the state-of-the-art for crowd counting in high density scenes by further exploring the idea of a fully convolutional crowd counting model introduced by (Zhang et al., 2016). Producing an accurate and robust crowd count estimator using computer vision techniques has attracted significant research interest in recent years. Applications for crowd counting systems exist in ma…

    Submitted 17 January, 2017; v1 submitted 1 December, 2016; originally announced December 2016.

    Comments: 7 pages, VISAPP 2017

  7. arXiv:1606.05310  [pdf, other]

    cs.CV

    Holistic Features For Real-Time Crowd Behaviour Anomaly Detection

    Authors: M. Marsden, K. McGuinness, S. Little, N. E. O'Connor

    Abstract: This paper presents a new approach to crowd behaviour anomaly detection that uses a set of efficiently computed, easily interpretable, scene-level holistic features. This low-dimensional descriptor combines two features from the literature: crowd collectiveness [1] and crowd conflict [2], with two newly developed crowd features: mean motion speed and a new formulation of crowd density. Two differe…

    Submitted 16 June, 2016; originally announced June 2016.

    Comments: 4 pages