Showing 1–50 of 117 results for author: Feng, B

Searching in archive cs.
  1. arXiv:2407.01029  [pdf, other]

    cs.CV

    EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting

    Authors: Chenxin Li, Brandon Y. Feng, Yifan Liu, Hengyu Liu, Cheng Wang, Weihao Yu, Yixuan Yuan

    Abstract: 3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities. Existing methods employ various advanced neural rendering techniques for photorealistic view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available, which is usually…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI2024

  2. arXiv:2406.14746  [pdf, other]

    cs.LG cs.RO

    Relational Reasoning On Graphs Using Opinion Dynamics

    Authors: Yulong Yang, Bowen Feng, Keqin Wang, Naomi Leonard, Adji Bousso Dieng, Christine Allen-Blanchette

    Abstract: From pedestrians to Kuramoto oscillators, interactions between agents govern how a multitude of dynamical systems evolve in space and time. Discovering how these agents relate to each other can improve our understanding of the often complex dynamics that underlie these systems. Recent works learn to categorize relationships between agents based on observations of their physical behavior. These app…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 14 pages, 7 figures

  3. arXiv:2406.12816  [pdf, other]

    cs.LG cs.CV eess.IV

    Neural Approximate Mirror Maps for Constrained Diffusion Models

    Authors: Berthy T. Feng, Ricardo Baptista, Katherine L. Bouman

    Abstract: Diffusion models excel at creating visually-convincing images, but they often struggle to meet subtle constraints inherent in the training data. Such constraints could be physics-based (e.g., satisfying a PDE), geometric (e.g., respecting symmetry), or semantic (e.g., including a particular number of objects). When the training data all satisfy a certain constraint, enforcing this constraint on a…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.12355  [pdf, other]

    cs.CV

    LiCAF: LiDAR-Camera Asymmetric Fusion for Gait Recognition

    Authors: Yunze Deng, Haijun Xiong, Bin Feng

    Abstract: Gait recognition is a biometric technology that identifies individuals by using walking patterns. Due to the significant achievements of multimodal fusion in gait recognition, we consider employing LiDAR-camera fusion to obtain robust gait representations. However, existing methods often overlook intrinsic characteristics of modalities, and lack fine-grained fusion and temporal modeling. In this p…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by ICIP2024

  5. arXiv:2406.08814  [pdf, other]

    cs.CV

    Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting

    Authors: Zhengqi Zhao, Xiaohu Huang, Hao Zhou, Kun Yao, Errui Ding, Jingdong Wang, Xinggang Wang, Wenyu Liu, Bin Feng

    Abstract: The key to action counting is accurately locating each video's repetitive actions. Instead of estimating the probability of each frame belonging to an action directly, we propose a dual-branch network, i.e., SkimFocusNet, working in a two-step manner. The model draws inspiration from empirical observations indicating that humans typically engage in coarse skimming of entire sequences to grasp the…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 9 figures

  6. arXiv:2406.02785  [pdf, other]

    astro-ph.IM cs.LG eess.IV

    Event-horizon-scale Imaging of M87* under Different Assumptions via Deep Generative Image Priors

    Authors: Berthy T. Feng, Katherine L. Bouman, William T. Freeman

    Abstract: Reconstructing images from the Event Horizon Telescope (EHT) observations of M87*, the supermassive black hole at the center of the galaxy M87, depends on a prior to impose desired image statistics. However, given the impossibility of directly observing black holes, there is no clear choice for a prior. We present a framework for flexibly designing a range of priors, each bringing different biases…

    Submitted 4 June, 2024; originally announced June 2024.

  7. arXiv:2405.20334  [pdf, other]

    cs.CV cs.GR

    VividDream: Generating 3D Scene with Ambient Dynamics

    Authors: Yao-Chih Lee, Yi-Ting Chen, Andrew Wang, Ting-Hsuan Liao, Brandon Y. Feng, Jia-Bin Huang

    Abstract: We introduce VividDream, a method for generating explorable 4D scenes with ambient dynamics from a single input image or text prompt. VividDream first expands an input image into a static 3D point cloud through iterative inpainting and geometry merging. An ensemble of animated videos is then generated using video diffusion models with quality refinement techniques and conditioned on renderings of…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Project page: https://vivid-dream-4d.github.io

  8. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  9. arXiv:2404.15014  [pdf, other]

    cs.CV

    OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

    Authors: Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

    Abstract: Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot 3D voxel-wise segmentation perception problem. These discriminative methods focus on learning the mapping between the inputs and occupancy map in a single step, lacking the ability to gradually refine the occupancy map and the reasonable scene imaginative capacity to complete the local regions somewhere.…

    Submitted 23 April, 2024; originally announced April 2024.

  10. arXiv:2404.13026  [pdf, other]

    cs.CV cs.AI

    PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation

    Authors: Tianyuan Zhang, Hong-Xing Yu, Rundi Wu, Brandon Y. Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman

    Abstract: Realistic object interactions are crucial for creating immersive virtual experiences, yet synthesizing realistic 3D object dynamics in response to novel interactions remains a significant challenge. Unlike unconditional or text-conditioned dynamics generation, action-conditioned dynamics requires perceiving the physical material properties of objects and grounding the 3D motion prediction on these…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Project website at: https://physdreamer.github.io/

  11. arXiv:2404.09734  [pdf, other]

    cs.IT eess.SP

    Weighted Sum-Rate Maximization for Movable Antenna-Enhanced Wireless Networks

    Authors: Biqian Feng, Yongpeng Wu, Xiang-Gen Xia, Chengshan Xiao

    Abstract: This letter investigates the weighted sum rate maximization problem in movable antenna (MA)-enhanced systems. To reduce the computational complexity, we transform it into a more tractable weighted minimum mean square error (WMMSE) problem well-suited for MA. We then adopt the WMMSE algorithm and majorization-minimization algorithm to optimize the beamforming and antenna positions, respectively. Mo…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE Wireless Communications Letters

  12. arXiv:2404.09502  [pdf, other]

    cs.CV

    SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction

    Authors: Pin Tang, Zhongdao Wang, Guoqing Wang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

    Abstract: Vision-based perception for autonomous driving requires an explicit modeling of a 3D space, where 2D latent representations are mapped and subsequent 3D operators are applied. However, operating on dense latent spaces introduces a cubic time and space complexity, which limits scalability in terms of perception range or spatial resolution. Existing approaches compress the dense representation using…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 10 pages, 4 figures, accepted by CVPR 2024

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition 2024 (CVPR 2024)

  13. arXiv:2404.07985  [pdf, other]

    cs.CV eess.IV

    WaveMo: Learning Wavefront Modulations to See Through Scattering

    Authors: Mingyang Xie, Haiyun Guo, Brandon Y. Feng, Lingbo Jin, Ashok Veeraraghavan, Christopher A. Metzler

    Abstract: Imaging through scattering media is a fundamental and pervasive challenge in fields ranging from medical diagnostics to astronomy. A promising strategy to overcome this challenge is wavefront modulation, which induces measurement diversity during image acquisition. Despite its importance, designing optimal wavefront modulations to image through scattering remains under-explored. This paper introdu…

    Submitted 11 April, 2024; originally announced April 2024.

  14. arXiv:2404.00471  [pdf, other]

    physics.med-ph cs.CV cs.LG eess.IV

    Score-Based Diffusion Models for Photoacoustic Tomography Image Reconstruction

    Authors: Sreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman

    Abstract: Photoacoustic tomography (PAT) is a rapidly-evolving medical imaging modality that combines optical absorption contrast with ultrasound imaging depth. One challenge in PAT is image reconstruction with inadequate acoustic signals due to limited sensor coverage or due to the density of the transducer array. Such cases call for solving an ill-posed inverse reconstruction problem. In this work, we use…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 5 pages

    Journal ref: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 2470-2474

  15. arXiv:2403.16095  [pdf, other]

    cs.CV cs.RO

    CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field

    Authors: Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, Zhaopeng Cui

    Abstract: Recently neural radiance fields (NeRF) have been widely exploited as 3D representations for dense simultaneous localization and mapping (SLAM). Despite their notable successes in surface modeling and novel view synthesis, existing NeRF-based methods are hindered by their computationally intensive and time-consuming volume rendering pipeline. This paper presents an efficient dense RGB-D SLAM system…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Project Page: https://zju3dv.github.io/cg-slam

  16. arXiv:2403.13800  [pdf, other]

    cs.CV

    TimeRewind: Rewinding Time with Image-and-Events Video Diffusion

    Authors: Jingxi Chen, Brandon Y. Feng, Haoming Cai, Mingyang Xie, Christopher Metzler, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: This paper addresses the novel challenge of "rewinding" time from a single captured image to recover the fleeting moments missed just before the shutter button is pressed. This problem poses a significant challenge in computer vision and computational photography, as it requires predicting plausible pre-capture motion from a single static frame, an inherently ill-posed task due to the high degre…

    Submitted 20 March, 2024; originally announced March 2024.

  17. arXiv:2403.11050  [pdf, other]

    cs.CV

    Endora: Video Generation Models as Endoscopy Simulators

    Authors: Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan

    Abstract: Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for machine learning. Despite progress in generating 2D medical images, the complex domain of clinical video generation has largely remained untapped. This paper introduces Endora, an innovative approach to generate medical videos that simulate clinical endoscopy scenes. We present a…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Project page: https://endora-medvidgen.github.io/

  18. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  19. arXiv:2312.04679  [pdf, other]

    eess.IV cs.CV

    ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations

    Authors: Haoming Cai, Jingxi Chen, Brandon Y. Feng, Weiyun Jiang, Mingyang Xie, Kevin Zhang, Ashok Veeraraghavan, Christopher Metzler

    Abstract: Atmospheric turbulence presents a significant challenge in long-range imaging. Current restoration algorithms often struggle with temporal inconsistency, as well as limited generalization ability across varying turbulence levels and scene content different than the training data. To tackle these issues, we introduce a self-supervised method, Consistent Video Restoration through Turbulence (ConVRT)…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: https://convrt-2024.github.io/

  20. arXiv:2312.03788  [pdf, other]

    cs.LG cs.CL

    SmoothQuant+: Accurate and Efficient 4-bit Post-Training Weight Quantization for LLM

    Authors: Jiayi Pan, Chengcan Wang, Kaifu Zheng, Yangguang Li, Zhenyu Wang, Bin Feng

    Abstract: Large language models (LLMs) have shown remarkable capabilities in various tasks. However their huge model size and the consequent demand for computational and memory resources also pose challenges to model deployment. Currently, 4-bit post-training quantization (PTQ) has achieved some success in LLMs, reducing the memory footprint by approximately 75% compared to FP16 models, albeit with some acc…

    Submitted 6 December, 2023; originally announced December 2023.

  21. arXiv:2312.01195  [pdf, other]

    cs.CR cs.SE

    AIM: Automatic Interrupt Modeling for Dynamic Firmware Analysis

    Authors: Bo Feng, Meng Luo, Changming Liu, Long Lu, Engin Kirda

    Abstract: The security of microcontrollers, which drive modern IoT and embedded devices, continues to raise major concerns. Within a microcontroller (MCU), the firmware is a monolithic piece of software that contains the whole software stack, whereas a variety of peripherals represent the hardware. As MCU firmware contains vulnerabilities, it is ideal to test firmware with off-the-shelf software testing tec…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: This paper was accepted to IEEE Transactions on Dependable and Secure Computing on Oct 12, 2023

  22. arXiv:2310.10835  [pdf, other]

    eess.IV cs.CV cs.LG

    Provable Probabilistic Imaging using Score-Based Generative Priors

    Authors: Yu Sun, Zihui Wu, Yifan Chen, Berthy T. Feng, Katherine L. Bouman

    Abstract: Estimating high-quality images while also quantifying their uncertainty are two desired features in an image reconstruction algorithm for solving ill-posed inverse problems. In this paper, we propose plug-and-play Monte Carlo (PMC) as a principled framework for characterizing the space of possible solutions to a general inverse problem. PMC is able to incorporate expressive score-based generative…

    Submitted 29 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

  23. arXiv:2310.06504  [pdf, other]

    cs.CL cs.AI cs.LG

    Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task

    Authors: Guanting Dong, Jinxu Zhao, Tingfeng Hui, Daichi Guo, Wenlong Wan, Boqi Feng, Yueyan Qiu, Zhuoma Gongque, Keqing He, Zechen Wang, Weiran Xu

    Abstract: With the increasing capabilities of large language models (LLMs), these high-performance models have achieved state-of-the-art results on a wide range of natural language processing (NLP) tasks. However, the models' performance on commonly-used benchmark datasets often fails to accurately reflect their reliability and robustness when applied to real-world noisy data. To address these challenges, w…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at NLPCC 2023 (Oral Presentation)

  24. arXiv:2310.03125  [pdf, other]

    cs.CV

    Shielding the Unseen: Privacy Protection through Poisoning NeRF with Spatial Deformation

    Authors: Yihan Wu, Brandon Y. Feng, Heng Huang

    Abstract: In this paper, we introduce an innovative method of safeguarding user privacy against the generative capabilities of Neural Radiance Fields (NeRF) models. Our novel poisoning attack method induces changes to observed views that are imperceptible to the human eye, yet potent enough to disrupt NeRF's ability to accurately reconstruct a 3D scene. To achieve this, we devise a bi-level optimization alg…

    Submitted 4 October, 2023; originally announced October 2023.

  25. arXiv:2309.17293  [pdf, other]

    quant-ph cs.CR cs.ET

    Quantum Privacy-preserving Two-party Circle Intersection Protocol Based on Phase-encoded Query

    Authors: Zi-Xian Li, Qi Yang, Bao Feng, Wen-Jie Liu

    Abstract: Privacy-preserving geometric intersection (PGI) is an important issue in Secure multiparty computation (SMC). The existing quantum PGI protocols are mainly based on grid coding, which requires a lot of computational complexity. The phase-encoded query method which has been used in some Quantum SMC protocols is suitable to solve the decision problem, but it needs to apply high dimensional Oracle op…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: 16 pages, 2 figures

    Journal ref: International Journal of Theoretical Physics,2023.62(7):p.138

  26. arXiv:2309.14349  [pdf, other]

    cs.LG cs.AI

    Corporate Credit Rating: A Survey

    Authors: Bojing Feng, Xi Cheng, Dan Li, Zeyu Liu, Wenfang Xue

    Abstract: Corporate credit rating (CCR) plays a very important role in the process of contemporary economic and social development. How to use credit rating methods for enterprises has always been a problem worthy of discussion. Through reading and studying the relevant literature at home and abroad, this paper makes a systematic survey of CCR. This paper combs the context of the development of CCR methods…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: 11 pages

  27. arXiv:2309.11591  [pdf, other]

    cs.CV cs.GR

    Continuous Levels of Detail for Light Field Networks

    Authors: David Li, Brandon Y. Feng, Amitabh Varshney

    Abstract: Recently, several approaches have emerged for generating neural representations with multiple levels of detail (LODs). LODs can improve the rendering by using lower resolutions and smaller model sizes when appropriate. However, existing methods generally focus on a few discrete LODs which suffer from aliasing and flicker artifacts as details are changed and limit their granularity for adapting to…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted to BMVC 2023. Webpage at https://augmentariumlab.github.io/continuous-lfn/

  28. arXiv:2309.01949  [pdf, other]

    cs.CV

    Efficient Bayesian Computational Imaging with a Surrogate Score-Based Prior

    Authors: Berthy T. Feng, Katherine L. Bouman

    Abstract: We propose a surrogate function for efficient use of score-based priors for Bayesian inverse imaging. Recent work turned score-based diffusion models into probabilistic priors for solving ill-posed imaging problems by appealing to an ODE-based log-probability function. However, evaluating this function is computationally inefficient and inhibits posterior estimation of high-dimensional images. Our…

    Submitted 5 September, 2023; originally announced September 2023.

  29. arXiv:2308.16861  [pdf, ps, other]

    cs.CR

    Facing Unknown: Open-World Encrypted Traffic Classification Based on Contrastive Pre-Training

    Authors: Xiang Li, Beibei Feng, Tianning Zang, Shuyuan Zhao, Jingrun Ma

    Abstract: Traditional Encrypted Traffic Classification (ETC) methods face a significant challenge in classifying large volumes of encrypted traffic in the open-world assumption, i.e., simultaneously classifying the known applications and detecting unknown applications. We propose a novel Open-World Contrastive Pre-training (OWCP) framework for this. OWCP performs contrastive pre-training to obtain a robust…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted by 2023 IEEE ISCC, 6 pages, 5 figures

  30. arXiv:2308.06720  [pdf, other]

    cs.IT eess.SP

    Joint Beamforming and Antenna Movement Design for Moveable Antenna Systems Based on Statistical CSI

    Authors: Xintai Chen, Biqian Feng, Yongpeng Wu, Derrick Wing Kwan Ng, Robert Schober

    Abstract: This paper studies a novel movable antenna (MA)-enhanced multiple-input multiple-output (MIMO) system to leverage the corresponding spatial degrees of freedom (DoFs) for improving the performance of wireless communications. We aim to maximize the achievable rate by jointly optimizing the MA positions and the transmit covariance matrix based on statistical channel state information (CSI). To solve…

    Submitted 18 August, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: Accepted by GLOBECOM 2023

  31. arXiv:2308.06707  [pdf, other]

    cs.CV

    Condition-Adaptive Graph Convolution Learning for Skeleton-Based Gait Recognition

    Authors: Xiaohu Huang, Xinggang Wang, Zhidianqiu Jin, Bo Yang, Botao He, Bin Feng, Wenyu Liu

    Abstract: Graph convolutional networks have been widely applied in skeleton-based gait recognition. A key challenge in this task is to distinguish the individual walking styles of different subjects across various views. Existing state-of-the-art methods employ uniform convolutions to extract features from diverse sequences and ignore the effects of viewpoint changes. To overcome these limitations, we propo…

    Submitted 13 August, 2023; originally announced August 2023.

    Comments: Accepted by TIP journal

  32. arXiv:2308.03757  [pdf, other]

    cs.CV

    3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields

    Authors: Brandon Y. Feng, Hadi Alzayer, Michael Rubinstein, William T. Freeman, Jia-Bin Huang

    Abstract: Motion magnification helps us visualize subtle, imperceptible motion. However, prior methods only work for 2D videos captured with a fixed camera. We present a 3D motion magnification method that can magnify subtle motions from scenes captured by a moving camera, while supporting novel view rendering. We represent the scene with time-varying radiance fields and leverage the Eulerian principle for…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: ICCV 2023. See the project page at https://3d-motion-magnification.github.io

  33. arXiv:2306.09348  [pdf, other]

    cs.CV

    Seeing the World through Your Eyes

    Authors: Hadi Alzayer, Kevin Zhang, Brandon Feng, Christopher Metzler, Jia-Bin Huang

    Abstract: The reflective nature of the human eye is an underappreciated source of information about what the world around us looks like. By imaging the eyes of a moving person, we can collect multiple views of a scene outside the camera's direct line of sight through the reflections in the eyes. In this paper, we reconstruct a 3D scene beyond the camera's line of sight using portrait images containing eye r…

    Submitted 2 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: CVPR 2024. First two authors contributed equally. Project page: https://world-from-eyes.github.io/

  34. arXiv:2306.07598  [pdf, other]

    cs.CV

    Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot, Generalizable Approach using RGB Images

    Authors: Panwang Pan, Zhiwen Fan, Brandon Y. Feng, Peihao Wang, Chenxin Li, Zhangyang Wang

    Abstract: The accurate estimation of six degrees-of-freedom (6DoF) object poses is essential for many applications in robotics and augmented reality. However, existing methods for 6DoF pose estimation often depend on CAD templates or dense support views, restricting their usefulness in realworld situations. In this study, we present a new cascade framework named Cas6D for few-shot 6DoF pose estimation that…

    Submitted 13 June, 2023; originally announced June 2023.

  35. arXiv:2306.05629  [pdf, other]

    cs.IT eess.SY

    R-PMAC: A Robust Preamble Based MAC Mechanism Applied in Industrial Internet of Things

    Authors: Kai Song, Biqian Feng, Yongpeng Wu, Zhen Gao, Wenjun Zhang

    Abstract: This paper proposes a novel media access control (MAC) mechanism, called the robust preamble-based MAC mechanism (R-PMAC), which can be applied to power line communication (PLC) networks in the context of the Industrial Internet of Things (IIoT). Compared with other MAC mechanisms such as P-MAC and the MAC layer of IEEE1901.1, R-PMAC has higher networking speed. Besides, it supports whitelist auth…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted by IEEE Internet of Things Journal

  36. arXiv:2305.19700  [pdf, other]

    cs.CV

    GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition

    Authors: Haijun Xiong, Yunze Deng, Bin Feng, Xinggang Wang, Wenyu Liu

    Abstract: Gait recognition, a growing field in biological recognition technology, utilizes distinct walking patterns for accurate individual identification. However, existing methods lack the incorporation of temporal information. To reach the full potential of gait recognition, we advocate for the consideration of temporal features at varying granularities and spans. This paper introduces a novel framework…

    Submitted 18 June, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted by ICIP2024

  37. arXiv:2305.07584  [pdf, other]

    cs.IT eess.SP

    Proactive Content Caching Scheme in Urban Vehicular Networks

    Authors: Biqian Feng, Chenyuan Feng, Daquan Feng, Yongpeng Wu, Xiang-Gen Xia

    Abstract: Stream media content caching is a key enabling technology to promote the value chain of future urban vehicular networks. Nevertheless, the high mobility of vehicles, intermittency of information transmissions, high dynamics of user requests, limited caching capacities and extreme complexity of business scenarios pose an enormous challenge to content caching and distribution in vehicular networks.…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE Transactions on Communications

  38. arXiv:2305.06233  [pdf, other]

    cs.GR

    View Correspondence Network for Implicit Light Field Representation

    Authors: Süleyman Aslan, Brandon Yushan Feng, Amitabh Varshney

    Abstract: We present a novel technique for implicit neural representation of light fields at continuously defined viewpoints with high quality and fidelity. Our implicit neural representation maps 4D coordinates defining two-plane parameterization of the light fields to the corresponding color values. We leverage periodic activations to achieve high expressivity and accurate reconstruction for complex data…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 10 pages, 7 figures

  39. arXiv:2304.11751  [pdf, other]

    cs.CV

    Score-Based Diffusion Models as Principled Priors for Inverse Imaging

    Authors: Berthy T. Feng, Jamie Smith, Michael Rubinstein, Huiwen Chang, Katherine L. Bouman, William T. Freeman

    Abstract: Priors are essential for reconstructing images from noisy and/or incomplete measurements. The choice of the prior determines both the quality and uncertainty of recovered images. We propose turning score-based diffusion models into principled image priors ("score-based priors") for analyzing a posterior of images given measurements. Previously, probabilistic priors were limited to handcrafted regu…

    Submitted 28 August, 2023; v1 submitted 23 April, 2023; originally announced April 2023.

    Comments: ICCV 2023

  40. arXiv:2304.02214  [pdf, other]

    cs.CV

    LogoNet: a fine-grained network for instance-level logo sketch retrieval

    Authors: Binbin Feng, Jun Li, Jianhua Xu

    Abstract: Sketch-based image retrieval, which aims to use sketches as queries to retrieve images containing the same query instance, receives increasing attention in recent years. Although dramatic progress has been made in sketch retrieval, few efforts are devoted to logo sketch retrieval which is still hindered by the following challenges: Firstly, logo sketch retrieval is more difficult than typical sket…

    Submitted 5 April, 2023; originally announced April 2023.

  41. arXiv:2303.16856  [pdf, other]

    cs.CV cs.GR

    Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data

    Authors: Bin Feng, Tenglong Ao, Zequn Liu, Wei Ju, Libin Liu, Ming Zhang

    Abstract: How to automatically synthesize natural-looking dance movements based on a piece of music is an incrementally popular yet challenging task. Most existing data-driven approaches require hard-to-get paired training data and fail to generate long sequences of motion due to error accumulation of autoregressive structure. We present a novel 3D dance synthesis system that only needs unpaired data for tr…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: Preliminary video demo: https://youtu.be/gJbxG9QlcUU

  42. arXiv:2303.02242  [pdf, other]

    cs.CL

    TrojText: Test-time Invisible Textual Trojan Insertion

    Authors: Qian Lou, Yepeng Liu, Bo Feng

    Abstract: In Natural Language Processing (NLP), intelligent neuron models can be susceptible to textual Trojan attacks. Such attacks occur when Trojan models behave normally for standard inputs but generate malicious output for inputs that contain a specific trigger. Syntactic-structure triggers, which are invisible, are becoming more popular for Trojan attacks because they are difficult to detect and defen…

    Submitted 21 August, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: In The Eleventh International Conference on Learning Representations. 2023 (ICLR 2023)

  43. arXiv:2301.10900  [pdf, other]

    cs.CV

    Graph Contrastive Learning for Skeleton-based Action Recognition

    Authors: Xiaohu Huang, Hao Zhou, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Jingdong Wang, Xinggang Wang, Wenyu Liu, Bin Feng

    Abstract: In the field of skeleton-based action recognition, current top-performing graph convolutional networks (GCNs) exploit intra-sequence context to construct adaptive graphs for feature aggregation. However, we argue that such context is still \textit{local} since the rich cross-sequence relations have not been explicitly investigated. In this paper, we propose a graph contrastive learning framework f…

    Submitted 10 June, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted by ICLR2023

  44. arXiv:2212.01602  [pdf, other]

    cs.CV

    StegaNeRF: Embedding Invisible Information within Neural Radiance Fields

    Authors: Chenxin Li, Brandon Y. Feng, Zhiwen Fan, Panwang Pan, Zhangyang Wang

    Abstract: Recent advances in neural rendering imply a future of widespread visual data distributions through sharing NeRF model weights. However, while common visual data (images and videos) have standard approaches to embed ownership or copyright information explicitly or subtly, the problem remains unexplored for the emerging NeRF format. We present StegaNeRF, a method for steganographic information embed…

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: Project page: https://xggnet.github.io/StegaNeRF/

  45. arXiv:2211.00722  [pdf, other]

    cs.CV cs.GR cs.LG

    VIINTER: View Interpolation with Implicit Neural Representations of Images

    Authors: Brandon Yushan Feng, Susmija Jabbireddy, Amitabh Varshney

    Abstract: We present VIINTER, a method for view interpolation by interpolating the implicit neural representation (INR) of the captured images. We leverage the learned code vector associated with each image and interpolate between these codes to achieve viewpoint transitions. We propose several techniques that significantly enhance the interpolation quality. VIINTER signifies a new way to achieve view inter…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: SIGGRAPH Asia 2022

  46. arXiv:2209.12708  [pdf, other]

    cs.LG cs.PF

    Faith: An Efficient Framework for Transformer Verification on GPUs

    Authors: Boyuan Feng, Tianqi Tang, Yuke Wang, Zhaodong Chen, Zheng Wang, Shu Yang, Yuan Xie, Yufei Ding

    Abstract: Transformer verification draws increasing attention in machine learning research and industry. It formally verifies the robustness of transformers against adversarial attacks such as exchanging words in a sentence with synonyms. However, the performance of transformer verification is still not satisfactory due to bound-centric computation which is significantly different from standard neural netwo…

    Submitted 23 September, 2022; originally announced September 2022.

    Comments: Published in ATC'22

  47. arXiv:2209.07936  [pdf, other]

    cs.CR cs.AR

    PA-Boot: A Formally Verified Authentication Protocol for Multiprocessor Secure Boot

    Authors: Zhuoruo Zhang, Chenyang Yu, Rui Chang, Mingshuai Chen, Bo Feng, He Huang, Qinming Dai, Wenbo Shen, Yongwang Zhao

    Abstract: Hardware supply-chain attacks are raising significant security threats to the boot process of multiprocessor systems. This paper identifies a new, prevalent hardware supply-chain attack surface that can bypass multiprocessor secure boot due to the absence of processor-authentication mechanisms. To defend against such attacks, we present PA-Boot, the first formally verified processor-authentication…

    Submitted 24 April, 2024; v1 submitted 16 September, 2022; originally announced September 2022.

  48. arXiv:2209.06800  [pdf, other]

    cs.DC cs.LG

    MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms

    Authors: Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, Yufei Ding

    Abstract: The increasing size of input graphs for graph neural networks (GNNs) highlights the demand for using multi-GPU platforms. However, existing multi-GPU GNN systems optimize the computation and communication individually based on the conventional practice of scaling dense DNNs. For irregularly sparse and fine-grained GNN workloads, such solutions miss the opportunity to jointly schedule/optimize the…

    Submitted 26 June, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Paper is accepted to OSDI'23

  49. arXiv:2208.12341

    stat.ML cs.LG

    Variance Reduction based Experience Replay for Policy Optimization

    Authors: Hua Zheng, Wei Xie, M. Ben Feng

    Abstract: For reinforcement learning on complex stochastic systems where many factors dynamically impact the output trajectories, it is desirable to effectively leverage the information from historical samples collected in previous iterations to accelerate policy optimization. Classical experience replay allows agents to remember by reusing historical observations. However, the uniform reuse strategy that t…

    Submitted 9 September, 2022; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: This work was intended as a replacement of arXiv:2110.08902 and any subsequent updates will appear there

  50. arXiv:2208.06143  [pdf, other]

    cs.CV cs.GR cs.LG

    PRIF: Primary Ray-based Implicit Function

    Authors: Brandon Yushan Feng, Yinda Zhang, Danhang Tang, Ruofei Du, Amitabh Varshney

    Abstract: We introduce a new implicit shape representation called Primary Ray-based Implicit Function (PRIF). In contrast to most existing approaches based on the signed distance function (SDF) which handles spatial locations, our representation operates on oriented rays. Specifically, PRIF is formulated to directly produce the surface hit point of a given input ray, without the expensive sphere-tracing ope…

    Submitted 12 August, 2022; originally announced August 2022.

    Comments: ECCV 2022. Project Page: https://augmentariumlab.github.io/PRIF/