Showing 1–7 of 7 results for author: Huang, O

  1. arXiv:2406.16449  [pdf, other]

    cs.CV

    Evaluating and Analyzing Relationship Hallucinations in LVLMs

    Authors: Mingrui Wu, Jiayi Ji, Oucheng Huang, Jiale Li, Yuhang Wu, Xiaoshuai Sun, Rongrong Ji

    Abstract: The issue of hallucinations is a prevalent concern in existing Large Vision-Language Models (LVLMs). Previous efforts have primarily focused on investigating object hallucinations, which can be easily alleviated by introducing object detectors. However, these efforts neglect hallucinations in inter-object relationships, which is essential for visual comprehension. In this work, we introduce R-Benc…

    Submitted 2 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: ICML 2024; Project Page: https://github.com/mrwu-mac/R-Bench

  2. arXiv:2311.02802  [pdf, other]

    cs.CL cs.AI

    Incorporating Worker Perspectives into MTurk Annotation Practices for NLP

    Authors: Olivia Huang, Eve Fleisig, Dan Klein

    Abstract: Current practices regarding data collection for natural language processing on Amazon Mechanical Turk (MTurk) often rely on a combination of studies on data quality and heuristics shared among NLP researchers. However, without considering the perspectives of MTurk workers, these approaches are susceptible to issues regarding workers' rights and poor response quality. We conducted a critical litera…

    Submitted 15 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

  3. arXiv:2212.00802  [pdf, other]

    cs.LG

    An Introduction to Kernel and Operator Learning Methods for Homogenization by Self-consistent Clustering Analysis

    Authors: Owen Huang, Sourav Saha, Jiachen Guo, Wing Kam Liu

    Abstract: Recent advances in operator learning theory have improved our knowledge about learning maps between infinite dimensional spaces. However, for large-scale engineering problems such as concurrent multiscale simulation for mechanical properties, the training cost for the current operator learning methods is very high. The article presents a thorough analysis on the mathematical underpinnings of the o…

    Submitted 30 November, 2022; originally announced December 2022.

  4. arXiv:2211.16610  [pdf, other]

    math.NA

    Deep Learning Discrete Calculus (DLDC): A Family of Discrete Numerical Methods by Universal Approximation for STEM Education to Frontier Research

    Authors: Sourav Saha, Chanwook Park, Stefan Knapik, Jiachen Guo, Owen Huang, Wing Kam Liu

    Abstract: The article proposes formulating and codifying a set of applied numerical methods, coined as Deep Learning Discrete Calculus (DLDC), that uses the knowledge from discrete numerical methods to interpret the deep learning algorithms through the lens of applied mathematics. The DLDC methods aim to leverage the flexibility and ever increasing resources of deep learning and rich literature of numerical…

    Submitted 29 November, 2022; originally announced November 2022.

  5. arXiv:2205.12602  [pdf, other]

    cs.CV

    VTP: Volumetric Transformer for Multi-view Multi-person 3D Pose Estimation

    Authors: Yuxing Chen, Renshu Gu, Ouhan Huang, Gangyong Jia

    Abstract: This paper presents Volumetric Transformer Pose estimator (VTP), the first 3D volumetric transformer framework for multi-view multi-person 3D human pose estimation. VTP aggregates features from 2D keypoints in all camera views and directly learns the spatial relationships in the 3D voxel space in an end-to-end fashion. The aggregated 3D features are passed through 3D convolutions before being flat…

    Submitted 25 May, 2022; originally announced May 2022.

  6. arXiv:2201.11345  [pdf, other]

    cs.CV cs.AI

    Exploring Global Diversity and Local Context for Video Summarization

    Authors: Yingchao Pan, Ouhan Huang, Qinghao Ye, Zhong** Li, Wenjiang Wang, Guodun Li, Yuxing Chen

    Abstract: Video summarization aims to automatically generate a diverse and concise summary which is useful in large-scale video processing. Most of the methods tend to adopt self-attention mechanism across video frames, which fails to model the diversity of video frames. To alleviate this problem, we revisit the pairwise similarity measurement in self-attention mechanism and find that the existing inner-pro…

    Submitted 27 March, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: Accepted by IEEE Access

  7. arXiv:1908.05782  [pdf, other]

    eess.IV cs.CV stat.ML

    MimickNet, Matching Clinical Post-Processing Under Realistic Black-Box Constraints

    Authors: Ouwen Huang, Will Long, Nick Bottenus, Gregg E. Trahey, Sina Farsiu, Mark L. Palmeri

    Abstract: Image post-processing is used in clinical-grade ultrasound scanners to improve image quality (e.g., reduce speckle noise and enhance contrast). These post-processing techniques vary across manufacturers and are generally kept proprietary, which presents a challenge for researchers looking to match current clinical-grade workflows. We introduce a deep learning framework, MimickNet, that transforms…

    Submitted 15 August, 2019; originally announced August 2019.

    Comments: This work has been submitted to the IEEE Transactions on Medical Imaging on July 1st, 2019 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible