Showing 1–31 of 31 results for author: Wen, B

Searching in archive eess.
  1. arXiv:2406.08300  [pdf, other]

    eess.IV cs.CV

    From Chaos to Clarity: 3DGS in the Dark

    Authors: Zhihao Li, Yufei Wang, Alex Kot, Bihan Wen

    Abstract: Novel view synthesis from raw images provides superior high dynamic range (HDR) information compared to reconstructions from low dynamic range RGB images. However, the inherent noise in unprocessed raw images compromises the accuracy of 3D scene representation. Our study reveals that 3D Gaussian Splatting (3DGS) is particularly susceptible to this noise, leading to numerous elongated Gaussian shap…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2403.10064  [pdf, other]

    eess.IV cs.CV

    Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI

    Authors: Chong Wang, Lanqing Guo, Yufei Wang, Hao Cheng, Yi Yu, Bihan Wen

    Abstract: Deep unfolding networks (DUN) have emerged as a popular iterative framework for accelerated magnetic resonance imaging (MRI) reconstruction. However, conventional DUN aims to reconstruct all the missing information within the entire null space in each iteration. Thus it could be challenging when dealing with highly ill-posed degradation, usually leading to unsatisfactory reconstruction. In this wo…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  3. Health Guardian: Using Multi-modal Data to Understand Individual Health

    Authors: Vince S. Siu, Kuan Yu Hsieh, Italo Buleje, Takashi Itoh, Tian Hao, Ben Civjan, Nigel Hinds, Bing Dang, Jeffrey L. Rogers, Bo Wen

    Abstract: Artificial intelligence (AI) has shown great promise in revolutionizing the field of digital health by improving disease diagnosis, treatment, and prevention. This paper describes the Health Guardian platform, a non-commercial, scientific research-based platform developed by the IBM Digital Health team to rapidly translate AI research into cloud-based microservices. The platform can collect health…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 10 pages, 6 figures

    Journal ref: IEEE International Conference on Digital Health (ICDH), 2023, pp. 65-74

  4. arXiv:2309.07169  [pdf, other]

    eess.SP cs.LG

    Spectral Convergence of Complexon Shift Operators

    Authors: Purui Zhang, Xingchao Jian, Feng Ji, Wee Peng Tay, Bihan Wen

    Abstract: Topological Signal Processing (TSP) utilizes simplicial complexes to model structures with higher order than vertices and edges. In this paper, we study the transferability of TSP via a generalized higher-order version of graphon, known as complexon. We recall the notion of a complexon as the limit of a simplicial complex sequence [1]. Inspired by the graphon shift operator and message-passing neu…

    Submitted 5 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 9 pages, 2 figures

  5. arXiv:2307.07710  [pdf, other]

    cs.CV eess.IV

    ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

    Abstract: Previous raw image-based low-light image enhancement methods predominantly relied on feed-forward neural networks to learn deterministic mappings from low-light to normally-exposed images. However, they failed to capture critical distribution information, leading to visually undesirable results. This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure…

    Submitted 15 August, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

    Comments: accepted by ICCV2023

  6. arXiv:2307.04122  [pdf, other]

    cs.CV eess.IV

    Enhancing Low-Light Images Using Infrared-Encoded Images

    Authors: Shulin Tian, Yufei Wang, Renjie Wan, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Low-light image enhancement task is essential yet challenging as it is ill-posed intrinsically. Previous arts mainly focus on the low-light images captured in the visible spectrum using pixel-wise loss, which limits the capacity of recovering the brightness, contrast, and texture details due to the small number of income photons. In this work, we propose a novel approach to increase the visibility…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: The first two authors contribute equally. The work is accepted by ICIP 2023

  7. arXiv:2306.12058  [pdf, other]

    cs.CV eess.IV

    Beyond Learned Metadata-based Raw Image Reconstruction

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

    Abstract: While raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels, they are not widely adopted by general users due to their substantial storage requirements. Very recent studies propose to compress raw images by designing sampling masks within the pixel space of the raw image. However, these approaches often leave space for pursuing more effective im…

    Submitted 21 June, 2023; originally announced June 2023.

  8. arXiv:2306.00303  [pdf, other]

    cs.CV eess.IV

    Sea Ice Extraction via Remote Sensed Imagery: Algorithms, Datasets, Applications and Challenges

    Authors: Anzhu Yu, Wenjun Huang, Qing Xu, Qun Sun, Wenyue Guo, Song Ji, Bowei Wen, Chunping Qiu

    Abstract: The deep learning, which is a dominating technique in artificial intelligence, has completely changed the image understanding over the past decade. As a consequence, the sea ice extraction (SIE) problem has reached a new era. We present a comprehensive review of four important aspects of SIE, including algorithms, datasets, applications, and the future trends. Our review focuses on researches publ…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: 24 pages, 6 figures

  9. arXiv:2305.08995  [pdf, other]

    cs.CV eess.IV

    Denoising Diffusion Models for Plug-and-Play Image Restoration

    Authors: Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool

    Abstract: Plug-and-play Image Restoration (IR) has been widely recognized as a flexible and interpretable method for solving various inverse problems by utilizing any off-the-shelf denoiser as the implicit image prior. However, most existing methods focus on discriminative Gaussian denoisers. Although diffusion models have shown impressive performance for high-quality image synthesis, their potential to ser…

    Submitted 15 May, 2023; originally announced May 2023.

  10. arXiv:2303.02057  [pdf, other]

    eess.IV cs.CV

    Unsupervised Deep Digital Staining For Microscopic Cell Images Via Knowledge Distillation

    Authors: Ziwang Xu, Lanqing Guo, Shuyan Zhang, Alex C. Kot, Bihan Wen

    Abstract: Staining is critical to cell imaging and medical diagnosis, which is expensive, time-consuming, labor-intensive, and causes irreversible changes to cell tissues. Recent advances in deep learning enabled digital staining via supervised model training. However, it is difficult to obtain large-scale stained/unstained cell image pairs in practice, which need to be perfectly aligned with the supervisio…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  11. arXiv:2303.01777  [pdf, other]

    eess.IV cs.CV

    Benchmarking White Blood Cell Classification Under Domain Shift

    Authors: Satoshi Tsutsui, Zhengyang Su, Bihan Wen

    Abstract: Recognizing the types of white blood cells (WBCs) in microscopic images of human blood smears is a fundamental task in the fields of pathology and hematology. Although previous studies have made significant contributions to the development of methods and datasets, few papers have investigated benchmarks or baselines that others can easily refer to. For instance, we observed notable variations in t…

    Submitted 19 May, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted to the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2023. More datasets are cited

  12. arXiv:2302.12995  [pdf, other]

    cs.CV eess.IV

    Raw Image Reconstruction with Learned Compact Metadata

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex Kot, Bihan Wen

    Abstract: While raw images exhibit advantages over sRGB images (e.g., linearity and fine-grained quantization level), they are not widely used by common users due to the large storage requirements. Very recent works propose to compress raw images by designing the sampling masks in the raw image pixel space, leading to suboptimal image representations and redundant metadata. In this paper, we propose a novel…

    Submitted 27 February, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: Accepted by CVPR 2023

  13. arXiv:2302.05746  [pdf, other]

    cs.CV eess.IV

    Removing Image Artifacts From Scratched Lens Protectors

    Authors: Yufei Wang, Renjie Wan, Wenhan Yang, Bihan Wen, Lap-Pui Chau, Alex C. Kot

    Abstract: A protector is placed in front of the camera lens for mobile devices to avoid damage, while the protector itself can be easily scratched accidentally, especially for plastic ones. The artifacts appear in a wide variety of patterns, making it difficult to see through them clearly. Removing image artifacts from the scratched lens protector is inherently challenging due to the occasional flare artifa…

    Submitted 14 February, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted by ISCAS 2023

  14. Health Guardian Platform: A technology stack to accelerate discovery in Digital Health research

    Authors: Bo Wen, Vince S. Siu, Italo Buleje, Kuan Yu Hsieh, Takashi Itoh, Lukas Zimmerli, Nigel Hinds, Elif Eyigoz, Bing Dang, Stefan von Cavallar, Jeffrey L. Rogers

    Abstract: This paper highlights the design philosophy and architecture of the Health Guardian, a platform developed by the IBM Digital Health team to accelerate discoveries of new digital biomarkers and development of digital health technologies. The Health Guardian allows for rapid translation of artificial intelligence (AI) research into cloud-based microservices that can be tested with data from clinical…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 6 pages, 3 figures, https://ieeexplore.ieee.org/document/9861047

    Journal ref: IEEE International Conference on Digital Health (ICDH), 2022, pp. 40-46

  15. arXiv:2207.12056  [pdf, other]

    eess.IV cs.CV

    REPNP: Plug-and-Play with Deep Reinforcement Learning Prior for Robust Image Restoration

    Authors: Chong Wang, Rongkai Zhang, Saiprasad Ravishankar, Bihan Wen

    Abstract: Image restoration schemes based on the pre-trained deep models have received great attention due to their unique flexibility for solving various inverse problems. In particular, the Plug-and-Play (PnP) framework is a popular and powerful tool that can integrate an off-the-shelf deep denoiser for different image restoration tasks with known observation models. However, obtaining the observation mod…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted to ICIP 2022

  16. arXiv:2203.09656  [pdf, other]

    eess.IV

    Learning Nonlocal Sparse and Low-Rank Models for Image Compressive Sensing

    Authors: Zhiyuan Zha, Bihan Wen, Xin Yuan, Saiprasad Ravishankar, Jiantao Zhou, Ce Zhu

    Abstract: The compressive sensing (CS) scheme exploits much fewer measurements than suggested by the Nyquist-Shannon sampling theorem to accurately reconstruct images, which has attracted considerable attention in the computational imaging community. While classic image CS schemes employed sparsity using analytical transforms or bases, the learning-based approaches have become increasingly popular in recent…

    Submitted 25 October, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

  17. arXiv:2201.12716  [pdf, other]

    cs.RO cs.AI cs.CV eess.SY

    You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration

    Authors: Bowen Wen, Wenzhao Lian, Kostas Bekris, Stefan Schaal

    Abstract: Promising results have been achieved recently in category-level manipulation that generalizes across object instances. Nevertheless, it often requires expensive real-world data collection and manual specification of semantic keypoints for each object category and task. Additionally, coarse keypoint predictions and ignoring intermediate action sequences hinder adoption in complex manipulation tasks…

    Submitted 6 May, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

    Journal ref: Robotics: Science and Systems (RSS) 2022

  18. arXiv:2201.03145  [pdf, other]

    eess.IV cs.CV

    Enhancing Low-Light Images in Real World via Cross-Image Disentanglement

    Authors: Lanqing Guo, Renjie Wan, Wenhan Yang, Alex Kot, Bihan Wen

    Abstract: Images captured in the low-light condition suffer from low visibility and various imaging artifacts, e.g., real noise. Existing supervised enlightening algorithms require a large set of pixel-aligned training image pairs, which are hard to prepare in practice. Though weakly-supervised or unsupervised methods can alleviate such challenges without using paired training images, some real-world artifa…

    Submitted 7 July, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

    ACM Class: I.4.3; I.4.4

  19. arXiv:2201.00269  [pdf, ps, other]

    eess.AS cs.SD

    IQDUBBING: Prosody modeling based on discrete self-supervised speech representation for expressive voice conversion

    Authors: Wendong Gan, Bolong Wen, Ying Yan, Haitao Chen, Zhichao Wang, Hongqiang Du, Lei Xie, Kaixuan Guo, Hai Li

    Abstract: Prosody modeling is important, but still challenging in expressive voice conversion. As prosody is difficult to model, and other factors, e.g., speaker, environment and content, which are entangled with prosody in speech, should be removed in prosody modeling. In this paper, we present IQDubbing to solve this problem for expressive voice conversion. To model prosody, we leverage the recent advance…

    Submitted 1 January, 2022; originally announced January 2022.

    Comments: Submitted to ICASSP 2022

  20. arXiv:2111.06031  [pdf, other]

    cs.CV eess.IV

    FINO: Flow-based Joint Image and Noise Model

    Authors: Lanqing Guo, Siyu Huang, Haosen Liu, Bihan Wen

    Abstract: One of the fundamental challenges in image restoration is denoising, where the objective is to estimate the clean image from its noisy measurements. To tackle such an ill-posed inverse problem, the existing denoising approaches generally focus on exploiting effective natural image priors. The utilization and analysis of the noise model are often ignored, although the noise model can provide comple…

    Submitted 10 November, 2021; originally announced November 2021.

    ACM Class: I.4.4

  21. arXiv:2109.09163  [pdf, other]

    cs.RO cs.AI cs.CV eess.SY

    CaTGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation

    Authors: Bowen Wen, Wenzhao Lian, Kostas Bekris, Stefan Schaal

    Abstract: Task-relevant grasping is critical for industrial assembly, where downstream manipulation tasks constrain the set of valid grasps. Learning how to perform this task, however, is challenging, since task-relevant grasp labels are hard to define and annotate. There is also yet no consensus on proper representations for modeling or off-the-shelf tools for performing task-relevant grasps. This work pro…

    Submitted 25 February, 2022; v1 submitted 19 September, 2021; originally announced September 2021.

    Comments: IEEE International Conference on Robotics and Automation (ICRA) 2022

  22. arXiv:2107.05318  [pdf, other]

    eess.IV cs.CV

    R3L: Connecting Deep Reinforcement Learning to Recurrent Neural Networks for Image Denoising via Residual Recovery

    Authors: Rongkai Zhang, Jiang Zhu, Zhiyuan Zha, Justin Dauwels, Bihan Wen

    Abstract: State-of-the-art image denoisers exploit various types of deep neural networks via deterministic training. Alternatively, very recent works utilize deep reinforcement learning for restoring images with diverse or unknown corruptions. Though deep reinforcement learning can generate effective policy networks for operator selection or architecture search in image restoration, how it is connected to t…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: Accepted by ICIP 2021

  23. arXiv:2106.14070  [pdf, other]

    cs.RO cs.AI cs.CV eess.SY

    Vision-driven Compliant Manipulation for Reliable, High-Precision Assembly Tasks

    Authors: Andrew S. Morgan, Bowen Wen, Junchi Liang, Abdeslam Boularias, Aaron M. Dollar, Kostas Bekris

    Abstract: Highly constrained manipulation tasks continue to be challenging for autonomous robots as they require high levels of precision, typically less than 1mm, which is often incompatible with what can be achieved by traditional perception systems. This paper demonstrates that the combination of state-of-the-art object tracking with passively adaptive mechanical hardware can be leveraged to complete pre…

    Submitted 26 June, 2021; originally announced June 2021.

  24. arXiv:2010.15317  [pdf]

    cs.SD eess.AS

    The IQIYI System for Voice Conversion Challenge 2020

    Authors: Wendong Gan, Haitao Chen, Yin Yan, Jianwei Li, Bolong Wen, Xue** Xu, Hai Li

    Abstract: This paper presents the IQIYI voice conversion system (T24) for Voice Conversion 2020. In the competition, each target speaker has 70 sentences. We have built an end-to-end voice conversion system based on PPG. First, the ASR acoustic model calculates the BN feature, which represents the content-related information in the speech. Then the Mel feature is calculated through an improved prosody tacot…

    Submitted 28 October, 2020; originally announced October 2020.

  25. arXiv:2007.13866  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO eess.IV

    se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains

    Authors: Bowen Wen, Chaitanya Mitash, Baozhang Ren, Kostas E. Bekris

    Abstract: Tracking the 6D pose of objects in video sequences is important for robot manipulation. This task, however, introduces multiple challenges: (i) robot manipulation involves significant occlusions; (ii) data and annotations are troublesome and difficult to collect for 6D poses, which complicates machine learning solutions, and (iii) incremental error drift often accumulates in long term tracking to…

    Submitted 27 July, 2020; originally announced July 2020.

    Journal ref: International Conference on Intelligent Robots and Systems (IROS) 2020

  26. arXiv:2007.09077  [pdf, other]

    cs.CV cs.GR cs.LG eess.IV

    Generating Person Images with Appearance-aware Pose Stylizer

    Authors: Siyu Huang, Haoyi Xiong, Zhi-Qi Cheng, Qingzhong Wang, Xingran Zhou, Bihan Wen, Jun Huan, Dejing Dou

    Abstract: Generation of high-quality person images is challenging, due to the sophisticated entanglements among image factors, e.g., appearance, pose, foreground, background, local details, global structures, etc. In this paper, we present a novel end-to-end framework to generate realistic person images based on given person poses and appearances. The core of our framework is a novel generator called Appear…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: Appearing at IJCAI 2020. The code is available at https://github.com/siyuhuang/PoseStylizer

  27. arXiv:2005.07902  [pdf, other]

    eess.IV cs.CV

    The Power of Triply Complementary Priors for Image Compressive Sensing

    Authors: Zhiyuan Zha, Xin Yuan, Joey Tianyi Zhou, Jiantao Zhou, Bihan Wen, Ce Zhu

    Abstract: Recent works that utilized deep models have achieved superior results in various image restoration applications. Such approach is typically supervised which requires a corpus of training images with distribution similar to the images to be recovered. On the other hand, the shallow methods which are usually unsupervised remain promising performance in many inverse problems, e.g., image compressive s…

    Submitted 16 May, 2020; originally announced May 2020.

    Journal ref: 2020 International Conference on Image Processing

  28. arXiv:2003.12985  [pdf, other]

    eess.IV cs.LG eess.SP stat.ML

    A Set-Theoretic Study of the Relationships of Image Models and Priors for Restoration Problems

    Authors: Bihan Wen, Yanjun Li, Yuqi Li, Yoram Bresler

    Abstract: Image prior modeling is the key issue in image recovery, computational imaging, compressed sensing, and other inverse problems. Recent algorithms combining multiple effective priors such as the sparse or low-rank models, have demonstrated superior performance in various applications. However, the relationships among the popular image models are unclear, and no theory in general is available to dem…

    Submitted 29 March, 2020; originally announced March 2020.

  29. Robust, Occlusion-aware Pose Estimation for Objects Grasped by Adaptive Hands

    Authors: Bowen Wen, Chaitanya Mitash, Sruthi Soorian, Andrew Kimmel, Avishai Sintov, Kostas E. Bekris

    Abstract: Many manipulation tasks, such as placement or within-hand manipulation, require the object's pose relative to a robot hand. The task is difficult when the hand significantly occludes the object. It is especially hard for adaptive hands, for which it is not easy to detect the finger's configuration. In addition, RGB-only approaches face issues with texture-less objects or when the hand and the obje…

    Submitted 7 March, 2020; originally announced March 2020.

    Journal ref: IEEE International Conference on Robotics and Automation (ICRA) 2020

  30. arXiv:1910.01221  [pdf, other]

    cs.CV cs.MM eess.IV

    ROMark: A Robust Watermarking System Using Adversarial Training

    Authors: Bingyang Wen, Sergul Aydore

    Abstract: The availability and easy access to digital communication increase the risk of copyrighted material piracy. In order to detect illegal use or distribution of data, digital watermarking has been proposed as a suitable tool. It protects the copyright of digital content by embedding imperceptible information into the data in the presence of an adversary. The goal of the adversary is to remove the cop…

    Submitted 2 October, 2019; originally announced October 2019.

    Comments: 5 pages, 1 figure, Machine Learning with Guarantees workshop at NeurIPS 2019

  31. arXiv:1903.11431  [pdf, other]

    eess.IV cs.LG stat.ML

    Transform Learning for Magnetic Resonance Image Reconstruction: From Model-based Learning to Building Neural Networks

    Authors: Bihan Wen, Saiprasad Ravishankar, Luke Pfister, Yoram Bresler

    Abstract: Magnetic resonance imaging (MRI) is widely used in clinical practice, but it has been traditionally limited by its slow data acquisition. Recent advances in compressed sensing (CS) techniques for MRI reduce acquisition time while maintaining high image quality. Whereas classical CS assumes the images are sparse in known analytical dictionaries or transform domains, methods using learned image mode…

    Submitted 5 November, 2019; v1 submitted 24 March, 2019; originally announced March 2019.

    Comments: Accepted to IEEE Signal Processing Magazine, Special Issue on Computational MRI: Compressed Sensing and Beyond