Search | arXiv e-print repository

Data-Centric AI in the Age of Large Language Models

Authors: Xinyi Xu, Zhaoxuan Wu, Rui Qiao, Arun Verma, Yao Shu, Jingtan Wang, Xinyuan Niu, Zhenfeng He, Jiangwei Chen, Zijian Zhou, Gregory Kang Ruey Lau, Hieu Dao, Lucas Agussurja, Rachael Hwee Ling Sim, Xiaoqiang Lin, Wenyang Hu, Zhongxiang Dai, Pang Wei Koh, Bryan Kian Hsiang Low

Abstract: This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We start by making the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs, and yet it receives disproportionally low attention from the research community. We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization. In each scenario, we underscore the importance of data, highlight promising research directions, and articulate the potential impacts on the research community and, where applicable, the society as a whole. For instance, we advocate for a suite of data-centric benchmarks tailored to the scale and complexity of data for LLMs. These benchmarks can be used to develop new data curation methods and document research efforts and results, which can help promote openness and transparency in AI and LLM research.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Preprint

arXiv:2406.10652 [pdf, other]

MDeRainNet: An Efficient Neural Network for Rain Streak Removal from Macro-pixel Images

Authors: Tao Yan, Weijiang He, Chenglong Wang, Xiangjie Zhu, Yinghui Wang, Rynson W. H. Lau

Abstract: Since rainy weather always degrades image quality and poses significant challenges to most computer vision-based intelligent systems, image de-raining has been a hot research topic. Fortunately, in a rainy light field (LF) image, background obscured by rain streaks in one sub-view may be visible in the other sub-views, and implicit depth information and recorded 4D structural information may benefit rain streak detection and removal. However, existing LF image rain removal methods either do not fully exploit the global correlations of 4D LF data or only utilize partial sub-views, resulting in sub-optimal rain removal performance and unequal quality across de-rained sub-views. In this paper, we propose an efficient network, called MDeRainNet, for rain streak removal from LF images. The proposed network adopts a multi-scale encoder-decoder architecture, which directly works on Macro-pixel images (MPIs) to improve the rain removal performance. To fully model the global correlation between the spatial and the angular information, we propose an Extended Spatial-Angular Interaction (ESAI) module to merge them, in which a simple and effective Transformer-based Spatial-Angular Interaction Attention (SAIA) block is also proposed for modeling long-range geometric correlations and making full use of the angular information. Furthermore, to improve the generalization performance of our network on real-world rainy scenes, we propose a novel semi-supervised learning framework for our MDeRainNet, which utilizes multi-level KL loss to bridge the domain gap between features of synthetic and real-world rain streaks and introduces colored-residue image guided contrastive regularization to reconstruct rain-free images. Extensive experiments conducted on synthetic and real-world LFIs demonstrate that our method outperforms the state-of-the-art methods both quantitatively and qualitatively.

Submitted 15 June, 2024; originally announced June 2024.

Comments: 13 pages, 13 figures, 4 tables

arXiv:2406.01720 [pdf, other]

The first Palomar Gattini-IR catalog of J-band light curves: construction and public data release

Authors: Shion Murakawa, Kishalay De, Michael C. B. Ashley, Nicholas Earley, Lynne A. Hillenbrand, Mansi M. Kasliwal, Ryan M. Lau, Anna M. Moore, J. L. Sokoloski, Roberto Soria

Abstract: Palomar Gattini-IR is a wide-field, synoptic infrared time domain survey covering $\approx 15000$ sq. deg. of the sky at $\approx 1-3$ night cadence to a depth of $J\approx 13.0$ and $\approx 14.9$ Vega mag in and outside the Galactic plane, respectively. Here, we present the first data release of $J$-band light curves of 2MASS sources within the survey footprint covering approximately the first four years of operations. We describe the construction of the source catalog based on the 2MASS point sources, followed by exposure filtering criteria and forced PSF photometry. The catalog contains light curves of $\approx 286$ million unique sources with 2MASS magnitudes of $J < 15.5$ mag, with a total of $\approx 50$ billion photometric measurements and $\approx 20$ billion individual source detections at signal-to-noise-ratio $> 3$. We demonstrate the photometric fidelity of the catalog by i) quantifying the magnitude-dependent accuracy and uncertainty of the photometry with respect to 2MASS and ii) comparing against forced PGIR aperture photometry for known variable sources. We present simple filtering criteria for selecting reliable photometric measurements as well as example Python notebooks for users. This catalog is the largest compilation of nightly cadence, synoptic infrared light curves to date, comparable to those in the largest optical surveys, providing a stepping stone to upcoming infrared surveys in the coming decade.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 10 Pages, 5 figures, submitted to PASP. Full catalog will be available upon publication, individual requests welcome!

arXiv:2406.01476 [pdf, other]

DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors

Authors: Tianyu Huang, Yihan Zeng, Hui Li, Wangmeng Zuo, Rynson W. H. Lau

Abstract: Dynamic 3D interaction has witnessed great interest in recent works, while creating such 4D content remains challenging. One solution is to animate 3D scenes with physics-based simulation, and the other is to learn the deformation of static 3D objects with the distillation of video generative models. The former requires assigning precise physical properties to the target object, otherwise the simulated results would become unnatural. The latter tends to formulate the video with minor motions and discontinuous frames, due to the absence of physical constraints in deformation learning. We argue that video generative models, being trained on real-world captured data, are capable of judging physical phenomena in simulation environments. To this end, we propose DreamPhysics in this work, which estimates physical properties of 3D Gaussian Splatting with video diffusion priors. DreamPhysics supports both image- and text-conditioned guidance, optimizing physical parameters via score distillation sampling with frame interpolation and log gradient. Based on a material point method simulator with proper physical parameters, our method can generate 4D content with realistic motions. Experimental results demonstrate that, by distilling the prior knowledge of video diffusion models, inaccurate physical properties can be gradually refined for high-quality simulation. Codes are released at: https://github.com/tyhuang0428/DreamPhysics.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Technical report. Codes are released at: https://github.com/tyhuang0428/DreamPhysics

arXiv:2405.17725 [pdf, other]

Color Shift Estimation-and-Correction for Image Enhancement

Authors: Yiyu Li, Ke Xu, Gerhard Petrus Hancke, Rynson W. H. Lau

Abstract: Images captured under sub-optimal illumination conditions may contain both over- and under-exposures. Current approaches mainly focus on adjusting image brightness, which may exacerbate the color tone distortion in under-exposed areas and fail to restore accurate colors in over-exposed regions. We observe that over- and under-exposed regions display opposite color tone distribution shifts with respect to each other, which may not be easily normalized in joint modeling as they usually do not have "normal-exposed" regions/pixels as reference. In this paper, we propose a novel method to enhance images with both over- and under-exposures by learning to estimate and correct such color shifts. Specifically, we first derive the color feature maps of the brightened and darkened versions of the input image via a UNet-based network, followed by a pseudo-normal feature generator to produce pseudo-normal color feature maps. We then propose a novel COlor Shift Estimation (COSE) module to estimate the color shifts between the derived brightened (or darkened) color feature maps and the pseudo-normal color feature maps. The COSE module corrects the estimated color shifts of the over- and under-exposed regions separately. We further propose a novel COlor MOdulation (COMO) module to modulate the separately corrected colors in the over- and under-exposed regions to produce the enhanced image. Comprehensive experiments show that our method outperforms existing approaches. Project webpage: https://github.com/yiyulics/CSEC.

Submitted 29 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: CVPR2024 accepted paper

arXiv:2405.14663 [pdf, other]

WTP19aalnxx: Discovery of a bright mid-infrared transient in the emerging class of low luminosity supernovae revealed by delayed circumstellar interaction

Authors: Charlotte Myers, Kishalay De, Lin Yan, Jacob E. Jencson, Nicholas Earley, Christoffer Fremling, Daichi Hiramatsu, Mansi M. Kasliwal, Ryan M. Lau, Morgan MacLeod, Megan Masterson, Christos Panagiotou, Robert Simcoe, Samaporn Tinyanont

Abstract: While core-collapse supernovae (SNe) often show early and consistent signs of circumstellar (CSM) interaction, some exhibit delayed signatures due to interaction with distant material around the progenitor star. Here we present the discovery in NEOWISE data of WTP19aalnxx, a luminous mid-infrared (IR) transient in the outskirts of the galaxy KUG 0022-007 at $\approx 190$ Mpc. First detected in 2018, WTP19aalnxx reaches a peak absolute (Vega) magnitude of $\approx-22$ at $4.6\,\mu$m in $\approx3$ yr, comparable to the most luminous interacting SNe. Archival data reveal a $\gtrsim 5\times$ fainter optical counterpart detected since 2015, while follow-up near-IR observations in 2022 reveal an extremely red ($Ks-W2 \approx 3.7$ mag) active transient. Deep optical spectroscopy confirms strong CSM interaction signatures via intermediate-width Balmer emission lines and coronal metal lines. Modeling the broadband spectral energy distribution, we estimate the presence of $\gtrsim 10^{-2}$ M$_\odot$ of warm dust, likely formed in the shock interaction region. Together with the lack of nebular Fe emission, we suggest that WTP19aalnxx is a missed, low (optical) luminosity SN in an emerging family of core-collapse SNe distinguished by their CSM-interaction-powered mid-IR emission that outshines the optical bands. Investigating the Zwicky Transient Facility sample of SNe in NEOWISE data, we find $17$ core-collapse SNe ($\gtrsim 3$% in a volume-limited sample) without early signs of CSM interaction that exhibit delayed IR brightening, suggestive of dense CSM shells at $\lesssim 10^{17}$ cm. We suggest that synoptic IR surveys offer a new route to revealing late-time CSM interaction and the prevalence of intense terminal mass loss in massive stars.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 15 pages, 5 figures, submitted to ApJ

arXiv:2405.10454 [pdf, ps, other]

The long-period spectroscopic orbit and dust creation in the Wolf-Rayet binary system WR 125

Authors: Noel D. Richardson, Andrea R. Daly, Peredur M. Williams, Grant M. Hill, Victor I. Shenavrin, Izumi Endo, André-Nicolas Chené, Nicole Karnath, Ryan M. Lau, Anthony F. J. Moffat, Gerd Weigelt

Abstract: Several long-period binaries with a carbon-rich Wolf-Rayet star and an O star produce dust in their wind collisions. In eccentric binaries, this is seen most strongly near periastron passage. The exact conditions leading to dust creation require orbital properties to be determined, which is difficult owing to their long periods. Recently, the binary system WR 125 (WC7+O9III) began a dust creation episode seen through an infrared outburst first detected by NEOWISE-R, which was the first outburst detected since 1991. We present new near- and mid-infrared photometry, which we use to show consistency between the two outbursts and derive an orbital period of 28.12$^{+0.10}_{-0.05}$ yr. We use a long time-series of optical spectra to place the first constraints on its orbital elements, on the assumption that this system will produce dust near periastron. The orbit has a mild eccentricity of 0.29$\pm$0.12 and is only derived for the Wolf-Rayet component, as the O star's radial velocities have noise that is likely larger than the expected semi-amplitude of the orbit. We also present SOFIA/FORCAST grism spectroscopy to examine the infrared spectral energy distribution (SED) of the dust during this outburst, comparing its properties to other WCd binaries, deriving a dust temperature of 580 K in 2021. This collection of observations will allow us to plan future observations of this system and place the system in the context of dust-creating Wolf-Rayet binaries.

Submitted 16 May, 2024; originally announced May 2024.

Comments: accepted to ApJ

arXiv:2404.07662 [pdf, other]

PINNACLE: PINN Adaptive ColLocation and Experimental points selection

Authors: Gregory Kang Ruey Lau, Apivich Hemachandra, See-Kiong Ng, Bryan Kian Hsiang Low

Abstract: Physics-Informed Neural Networks (PINNs), which incorporate PDEs as soft constraints, train with a composite loss function that contains multiple training point types: different types of collocation points chosen during training to enforce each PDE and initial/boundary conditions, and experimental points which are usually costly to obtain via experiments or simulations. Training PINNs using this loss function is challenging as it typically requires selecting large numbers of points of different types, each with different training dynamics. Unlike past works that focused on the selection of either collocation or experimental points, this work introduces PINN Adaptive ColLocation and Experimental points selection (PINNACLE), the first algorithm that jointly optimizes the selection of all training point types, while automatically adjusting the proportion of collocation point types as training progresses. PINNACLE uses information on the interaction among training point types, which had not been considered before, based on an analysis of PINN training dynamics via the Neural Tangent Kernel (NTK). We theoretically show that the criterion used by PINNACLE is related to the PINN generalization error, and empirically demonstrate that PINNACLE is able to outperform existing point selection methods for forward, inverse, and transfer learning problems.

Submitted 11 April, 2024; originally announced April 2024.

Comments: Accepted to 12th International Conference on Learning Representations (ICLR 2024), 36 pages

arXiv:2403.17013 [pdf, other]

Temporal-Spatial Processing of Event Camera Data via Delay-Loop Reservoir Neural Network

Authors: Richard Lau, Anthony Tylan-Tyler, Lihan Yao, Rey de Castro Roberto, Robert Taylor, Isaiah Jones

Abstract: This paper describes a temporal-spatial model for video processing with special applications to processing event camera videos. We propose to study a conjecture motivated by our previous study of video processing with delay loop reservoir (DLR) neural network, which we call Temporal-Spatial Conjecture (TSC). The TSC postulates that there is significant information content carried in the temporal representation of a video signal and that machine learning algorithms would benefit from separate optimization of the spatial and temporal components for intelligent processing. To verify or refute the TSC, we propose a Visual Markov Model (VMM) which decomposes the video into spatial and temporal components and estimates the mutual information (MI) of these components. Since computation of video mutual information is complex and time consuming, we use a Mutual Information Neural Network to estimate the bounds of the mutual information. Our result shows that the temporal component carries significant MI compared to that of the spatial component. This finding has often been overlooked in neural network literature. In this paper, we will exploit this new finding to guide our design of a delay-loop reservoir neural network for event camera classification, which results in an 18% improvement in classification accuracy.

Submitted 12 February, 2024; originally announced March 2024.

Comments: 10 pages, 12 figures, DARPA Distribution Statement A. Approved for public release. Distribution Unlimited

arXiv:2403.16224 [pdf, other]

Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields

Authors: Haoyuan Wang, Wenbo Hu, Lei Zhu, Rynson W. H. Lau

Abstract: Inverse rendering aims at recovering both geometry and materials of objects. It provides a more compatible reconstruction for conventional rendering engines, compared with the neural radiance fields (NeRFs). On the other hand, existing NeRF-based inverse rendering methods cannot handle glossy objects with local light interactions well, as they typically oversimplify the illumination as a 2D environmental map, which assumes infinite lights only. Observing the superiority of NeRFs in recovering radiance fields, we propose a novel 5D Neural Plenoptic Function (NeP) based on NeRFs and ray tracing, such that more accurate lighting-object interactions can be formulated via the rendering equation. We also design a material-aware cone sampling strategy to efficiently integrate lights inside the BRDF lobes with the help of pre-filtered radiance fields. Our method has two stages: the geometry of the target object and the pre-filtered environmental radiance fields are reconstructed in the first stage, and materials of the target object are estimated in the second stage with the proposed NeP and material-aware cone sampling strategy. Extensive experiments on the proposed real-world and synthetic datasets demonstrate that our method can reconstruct high-fidelity geometry/materials of challenging glossy objects with complex lighting interactions from nearby objects. Project webpage: https://whyy.site/paper/nep

Submitted 24 March, 2024; originally announced March 2024.

Comments: CVPR 2024 paper. Project webpage https://whyy.site/paper/nep

arXiv:2403.15383 [pdf, other]

ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars

Authors: Zhenwei Wang, Tengfei Wang, Gerhard Hancke, Ziwei Liu, Rynson W. H. Lau

Abstract: Real-world applications often require a large gallery of 3D assets that share a consistent theme. While remarkable advances have been made in general 3D content creation from text or image, synthesizing customized 3D assets following the shared theme of input 3D exemplars remains an open and challenging problem. In this work, we present ThemeStation, a novel approach for theme-aware 3D-to-3D generation. ThemeStation synthesizes customized 3D assets based on a few given exemplars with two goals: 1) unity for generating 3D assets that thematically align with the given exemplars and 2) diversity for generating 3D assets with a high degree of variation. To this end, we design a two-stage framework that draws a concept image first, followed by a reference-informed 3D modeling stage. We propose a novel dual score distillation (DSD) loss to jointly leverage priors from both the input exemplars and the synthesized concept image. Extensive experiments and user studies confirm that ThemeStation surpasses prior works in producing diverse theme-aware 3D models with impressive quality. ThemeStation also enables various applications such as controllable 3D-to-3D generation.

Submitted 15 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

Comments: Accepted to SIGGRAPH 2024. Project page: https://3dthemestation.github.io/

arXiv:2403.04386 [pdf, other]

doi 10.1126/science.adj5796

Emission lines due to ionizing radiation from a compact object in the remnant of Supernova 1987A

Authors: C. Fransson, M. J. Barlow, P. J. Kavanagh, J. Larsson, O. C. Jones, B. Sargent, M. Meixner, P. Bouchet, T. Temim, G. S. Wright, J. A. D. L. Blommaert, N. Habel, A. S. Hirschauer, J. Hjorth, L. Lenkić, T. Tikkanen, R. Wesson, A. Coulais, O. D. Fox, R. Gastaud, A. Glasse, J. Jaspers, O. Krause, R. M. Lau, O. Nayak , et al. (9 additional authors not shown)

Abstract: The nearby Supernova 1987A was accompanied by a burst of neutrino emission, which indicates that a compact object (a neutron star or black hole) was formed in the explosion. There has been no direct observation of this compact object. In this work, we observe the supernova remnant with JWST spectroscopy finding narrow infrared emission lines of argon and sulphur. The line emission is spatially unresolved and blueshifted in velocity relative to the supernova rest frame. We interpret the lines as gas illuminated by a source of ionizing photons located close to the center of the expanding ejecta. Photoionization models show that the line ratios are consistent with ionization by a cooling neutron star or pulsar wind nebula. The velocity shift could be evidence for a neutron star natal kick.

Submitted 7 March, 2024; originally announced March 2024.

Comments: Authors' version of manuscript published in Science on 22 Feb 2024

Journal ref: SCIENCE 22 Feb 2024 Vol 383, Issue 6685 pp. 898-903

arXiv:2403.00644 [pdf, other]

Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks

Authors: Yuhao Liu, Zhanghan Ke, Fang Liu, Nanxuan Zhao, Rynson W. H. Lau

Abstract: Diffusion models trained on large-scale datasets have achieved remarkable progress in image synthesis. However, due to the randomness in the diffusion process, they often struggle with handling diverse low-level tasks that require details preservation. To overcome this limitation, we present a new Diff-Plugin framework to enable a single pre-trained diffusion model to generate high-fidelity results across a variety of low-level tasks. Specifically, we first propose a lightweight Task-Plugin module with a dual branch design to provide task-specific priors, guiding the diffusion process in preserving image content. We then propose a Plugin-Selector that can automatically select different Task-Plugins based on the text instruction, allowing users to edit images by indicating multiple low-level tasks with natural language. We conduct extensive experiments on 8 low-level vision tasks. The results demonstrate the superiority of Diff-Plugin over existing methods, particularly in real-world scenarios. Our ablations further validate that Diff-Plugin is stable, schedulable, and supports robust training across different dataset sizes.

Submitted 28 May, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

Comments: Accepted to CVPR2024. Replaced some celebrity images to avoid copyright disputes

arXiv:2402.14808 [pdf, other]

RelayAttention for Efficient Large Language Model Serving with Long System Prompts

Authors: Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau

Abstract: A practical large language model (LLM) service may involve a long system prompt, which specifies the instructions, examples, and knowledge documents of the task and is reused across requests. However, the long system prompt causes throughput/latency bottlenecks as the cost of generating the next token grows w.r.t. the sequence length. This paper aims to improve the efficiency of LLM services that involve long system prompts. Our key observation is that handling these system prompts requires heavily redundant memory accesses in existing causal attention computation algorithms. Specifically, for batched requests, the cached hidden states (i.e., key-value pairs) of system prompts are transferred from off-chip DRAM to on-chip SRAM multiple times, each corresponding to an individual request. To eliminate such a redundancy, we propose RelayAttention, an attention algorithm that allows reading these hidden states from DRAM exactly once for a batch of input tokens. RelayAttention is a free lunch: it maintains the generation quality while requiring no model retraining, as it is based on a mathematical reformulation of causal attention. We have observed significant performance improvements to a production-level system, vLLM, through integration with RelayAttention. The improvements are even more profound with longer system prompts.

Submitted 30 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: accepted by the ACL 2024 main conference
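
Editorial illustration for the RelayAttention entry above: the abstract says the method rests on a mathematical reformulation of causal attention that leaves outputs unchanged. The sketch below is not the authors' implementation; it only demonstrates the standard identity that attention over a concatenated key-value sequence equals a weighted mix of the attentions over its two segments, with weights given by each segment's share of the softmax normalizer. The function names (`attend`, `fused_attend`) and the toy data are hypothetical, and the batching and DRAM/SRAM aspects described in the abstract are not modeled.

```python
import math

def attend(q, keys, values):
    """Softmax attention for a single query vector.

    Returns the attention output and the log-sum-exp of the scores,
    which is needed later to merge two segments exactly.
    """
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)                               # subtract max for stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / z for d in range(dim)]
    return out, m + math.log(z)

def fused_attend(q, sys_kv, ctx_kv):
    """Merge attention computed separately over two key-value segments."""
    out_s, lse_s = attend(q, *sys_kv)
    out_c, lse_c = attend(q, *ctx_kv)
    # alpha = Z_sys / (Z_sys + Z_ctx), written in terms of log-sum-exp values.
    alpha = 1.0 / (1.0 + math.exp(lse_c - lse_s))
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(out_s, out_c)]

# Hypothetical toy data: a shared "system prompt" segment and a per-request segment.
sys_keys = [[0.1, 0.3], [0.5, -0.2], [-0.4, 0.9]]
sys_vals = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
ctx_keys = [[0.7, 0.2], [-0.3, -0.6]]
ctx_vals = [[2.0, -1.0], [0.3, 0.8]]
query = [0.6, -0.1]

full, _ = attend(query, sys_keys + ctx_keys, sys_vals + ctx_vals)
fused = fused_attend(query, (sys_keys, sys_vals), (ctx_keys, ctx_vals))
assert all(abs(a - b) < 1e-9 for a, b in zip(full, fused))
```

Because the system-prompt segment is identical across requests, its partial result can in principle be computed for a whole batch of queries in one pass over the shared keys and values, which is the kind of reuse the abstract describes.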

arXiv:2402.14014 [pdf, other]

JWST MIRI Imager Observations of Supernova SN 1987A

Authors: P. Bouchet, R. Gastaud, A. Coulais, M. J. Barlow, C. Fransson, P. J. Kavanagh, J. Larsson, T. Temim, O. C. Jones, A. S. Hirschauer, T. Tikkanen, J. A. D. L. Blommaert, O. D. Fox, A. Glasse, N. Habel, J. Hjorth, J. Jaspers, O. Krause, R. M. Lau, L. Lenkić, M. Meixner, O. Nayak, A. Rest, B. Sargent, R. Wesson , et al. (9 additional authors not shown)

Abstract: There exist very few mid-infrared (IR) observations of supernovae (SNe) in general. Therefore, SN 1987A, the closest visible SN in 400 years, gives us the opportunity to explore the mid-IR properties of SNe, the dust in their ejecta and surrounding medium, and to witness the birth of a SN remnant (SNR). The James Webb Space Telescope (JWST), with its high spatial resolution and extreme sensitivity, gives a new view on these issues. We report on the first imaging observations obtained with the Mid-InfraRed Instrument (MIRI). We build temperature maps and discuss the morphology of the nascent SNR. Our results show that the temperatures in the equatorial ring (ER) are quite non-uniform. This could be due to dust destruction in some parts of the ring, as had been assumed in some previous works. We show that the IR emission extends beyond the ER, illustrating the fact that the shock wave has now passed through this ring to affect the circumstellar medium on a larger scale. Finally, while sub-mm Atacama Large Millimeter Array (ALMA) observations have hinted at the location of the compact remnant of SN 1987A, we note that our MIRI data have found no such evidence.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 19 pages, 19 figures, 2 tables; Accepted for publication in the Astrophysical Journal (February 2, 2024)

arXiv:2402.13631 [pdf, other]

Delving into Dark Regions for Robust Shadow Detection

Authors: Huankang Guan, Ke Xu, Rynson W. H. Lau

Abstract: Shadow detection is a challenging task as it requires a comprehensive understanding of shadow characteristics and global/local illumination conditions. We observe from our experiment that state-of-the-art deep methods tend to have higher error rates in differentiating shadow pixels from non-shadow pixels in dark regions (i.e., regions with low-intensity values). Our key insight to this problem is that existing methods typically learn discriminative shadow features from the whole image globally, covering the full range of intensity values, and may not learn the subtle differences between shadow and non-shadow pixels in dark regions. Hence, if we can design a model to focus on a narrower range of low-intensity regions, it may be able to learn better discriminative features for shadow detection. Inspired by this insight, we propose a novel shadow detection approach that first learns global contextual cues over the entire image and then zooms into the dark regions to learn local shadow representations. To this end, we formulate an effective dark-region recommendation (DRR) module to recommend regions of low-intensity values, and a novel dark-aware shadow analysis (DASA) module to learn dark-aware shadow features from the recommended dark regions. Extensive experiments show that the proposed method outperforms the state-of-the-art methods on three popular shadow detection datasets. Code is available at https://github.com/guanhuankang/ShadowDetection2021.git.

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2402.00341 [pdf, other]

Recasting Regional Lighting for Shadow Removal

Authors: Yuhao Liu, Zhanghan Ke, Ke Xu, Fang Liu, Zhenwei Wang, Rynson W. H. Lau

Abstract: Removing shadows requires an understanding of both lighting conditions and object textures in a scene. Existing methods typically learn pixel-level color mappings between shadow and non-shadow images, in which the joint modeling of lighting and object textures is implicit and inadequate. We observe that in a shadow region, the degradation degree of object textures depends on the local illumination, while simply enhancing the local illumination cannot fully recover the attenuated textures. Based on this observation, we propose to condition the restoration of attenuated textures on the corrected local lighting in the shadow region. Specifically, we first design a shadow-aware decomposition network to estimate the illumination and reflectance layers of shadow regions explicitly. We then propose a novel bilateral correction network to recast the lighting of shadow regions in the illumination layer via a novel local lighting correction module, and to restore the textures conditioned on the corrected illumination layer via a novel illumination-guided texture restoration module. We further annotate pixel-wise shadow masks for the public SRD dataset, which originally contains only image pairs. Experiments on three benchmarks show that our method outperforms existing state-of-the-art shadow removal methods.

Submitted 1 February, 2024; originally announced February 2024.

Comments: AAAI 2024 (Oral)

arXiv:2401.02063 [pdf, other]

Windows on the Universe: Establishing the Infrastructure for a Collaborative Multi-messenger Ecosystem

Authors: The 2023 Windows on the Universe Workshop White Paper Working Group, T. Ahumada, J. E. Andrews, S. Antier, E. Blaufuss, P. R. Brady, A. M. Brazier, E. Burns, S. B. Cenko, P. Chandra, D. Chatterjee, A. Corsi, M. W. Coughlin, D. A. Coulter, S. Fu, A. Goldstein, L. P. Guy, E. J. Hooper, S. B. Howell, T. B. Humensky, J. A. Kennea, S. M. Jarrett, R. M. Lau, T. R. Lewis, L. Lu, et al. (21 additional authors not shown)

Abstract: In this White Paper, we present recommendations for the scientific community and funding agencies to foster the infrastructure for a collaborative multi-messenger and time-domain astronomy (MMA/TDA) ecosystem. MMA/TDA is poised for breakthrough discoveries in the coming decade. In much the same way that expanding beyond the optical bandpass revealed entirely new and unexpected discoveries, cosmic messengers beyond light (i.e., gravitational waves, neutrinos, and cosmic rays) open entirely new windows to answer some of the most fundamental questions in (astro)physics: heavy element synthesis, equation of state of dense matter, particle acceleration, etc. This field was prioritized as a frontier scientific pursuit in the 2020 Decadal Survey on Astronomy and Astrophysics via its "New Windows on the Dynamic Universe" theme. MMA/TDA science presents technical challenges distinct from those experienced in other disciplines. Successful observations require coordination across myriad boundaries -- different cosmic messengers, ground vs. space, international borders, etc. -- all for sources that may not be well localized, and whose brightness may be changing rapidly with time. Add that all of this work is undertaken by real human beings, with distinct backgrounds, experiences, cultures, and expectations, that often conflict. To address these challenges and help MMA/TDA realize its full scientific potential in the coming decade (and beyond), the second in a series of community workshops sponsored by the U.S. National Science Foundation (NSF) and NASA titled "Windows on the Universe: Establishing the Infrastructure for a Collaborative Multi-Messenger Ecosystem" was held on October 16-18, 2023 in Tucson, AZ. Here we present the primary recommendations from this workshop focused on three key topics -- hardware, software, and people and policy. [abridged]

Submitted 3 April, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

Comments: Workshop white paper

arXiv:2312.06439 [pdf, other]

DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior

Authors: Tianyu Huang, Yihan Zeng, Zhilu Zhang, Wan Xu, Hang Xu, Songcen Xu, Rynson W. H. Lau, Wangmeng Zuo

Abstract: 3D generation has raised great attention in recent years. With the success of text-to-image diffusion models, the 2D-lifting technique becomes a promising route to controllable 3D generation. However, these methods tend to present inconsistent geometry, which is also known as the Janus problem. We observe that the problem is caused mainly by two aspects, i.e., viewpoint bias in 2D diffusion models and overfitting of the optimization objective. To address it, we propose a two-stage 2D-lifting framework, namely DreamControl, which optimizes coarse NeRF scenes as 3D self-prior and then generates fine-grained objects with control-based score distillation. Specifically, adaptive viewpoint sampling and boundary integrity metric are proposed to ensure the consistency of generated priors. The priors are then regarded as input conditions to maintain reasonable geometries, in which conditional LoRA and weighted score are further proposed to optimize detailed textures. DreamControl can generate high-quality 3D content in terms of both geometry consistency and texture fidelity. Moreover, our control-based optimization guidance is applicable to more downstream tasks, including user-guided generation and 3D animation. The project page is available at https://github.com/tyhuang0428/DreamControl.

Submitted 12 March, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: Accepted by CVPR 2024

arXiv:2311.15948 [pdf, other]

A First Look with JWST Aperture Masking Interferometry (AMI): Resolving Circumstellar Dust around the Wolf-Rayet Binary WR 137 beyond the Rayleigh Limit

Authors: Ryan M. Lau, Matthew J. Hankins, Joel Sanchez-Bermudez, Deepashri Thatte, Anthony Soulain, Rachel A. Cooper, Anand Sivaramakrishnan, Michael F. Corcoran, Alexandra Z. Greenbaum, Theodore R. Gull, Yinuo Han, Olivia C. Jones, Thomas Madura, Anthony F. J. Moffat, Mark R. Morris, Takashi Onaka, Christopher M. P. Russell, Noel D. Richardson, Nathan Smith, Peter Tuthill, Kevin Volk, Gerd Weigelt, Peredur M. Williams

Abstract: We present infrared aperture masking interferometry (AMI) observations of newly formed dust from the colliding winds of the massive binary system Wolf-Rayet (WR) 137 with JWST using the Near Infrared Imager and Slitless Spectrograph (NIRISS). NIRISS AMI observations of WR 137 and a point-spread-function calibrator star, HD 228337, were taken using the F380M and F480M filters in 2022 July and August as part of the Director's Discretionary Early Release Science (DD-ERS) program 1349. Interferometric observables (squared visibilities and closure phases) from the WR 137 "interferogram" were extracted and calibrated using three independent software tools: ImPlaneIA, AMICAL, and SAMpip. The analysis of the calibrated observables yielded consistent values except for slightly discrepant closure phases measured by ImPlaneIA. Based on all three sets of calibrated observables, images were reconstructed using three independent software tools: BSMEM, IRBis, and SQUEEZE. All reconstructed image combinations generated consistent images in both F380M and F480M filters. The reconstructed images of WR 137 reveal a bright central core with a $\sim300$ mas linear filament extending to the northwest. A geometric colliding-wind model with dust production constrained to the orbital plane of the binary system and enhanced as the system approaches periapsis provided a general agreement with the interferometric observables and reconstructed images. Based on a colliding-wind dust condensation analysis, we suggest that dust formation within the orbital plane of WR 137 is induced by enhanced equatorial mass-loss from the rapidly rotating O9 companion star, whose axis of rotation is aligned with that of the orbit.

Submitted 22 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: 18 pages, 8 figures, Accepted for publication in ApJ. Updated plotting error in Fig. 2

arXiv:2311.01564 [pdf, other]

InSAR-Informed In-Situ Monitoring for Deep-Seated Landslides

Authors: Rachael Lau, Carolina Segui, Tyler Waterman, Nathaniel Chaney, Manolis Veveakis

Abstract: This work focuses on assessing the fidelity of Interferometric Synthetic Aperture Radar (InSAR) as it relates to subsurface ground motion monitoring, as well as understanding uncertainty in modeling active landslide scarp displacement for the case study of the in situ monitored El Forn deep seated landslide in Canillo, Andorra. We used the available Sentinel 1 data on the Alaska Satellite Facility (ASF) Vertex platform to create deformation velocity maps and time series of the El Forn landslide scarp. We compared the performances of InSAR data from the recently launched European Ground Motion Service (EGMS) platform and the ASF Vertex Platform in a time series comparison of displacement in the direction of landslide motion with in situ borehole based measurements from 2019 to 2021, suggesting that ground motion detected through InSAR can be used in tandem with field monitoring to provide optimal information with minimum in situ deployment. While identification of active landslide scarps may be possible via the use of the EGMS platform, the intents and purposes of this work are in assessment of InSAR as a monitoring tool. Based on that, geospatial interpolation with statistical analysis was conducted to better understand the necessary number of in situ observations needed to lower error on a remote sensing recreation of ground motion over the entirety of a landslide scarp, suggesting that between 20 and 25 total observations provide the optimal normalized root mean squared error for an ordinarily kriged model of the El Forn landslide scarp.

Submitted 12 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.11912 [pdf, other]

The JWST Galactic Center Survey -- A White Paper

Authors: Rainer Schoedel, Steve Longmore, Jonny Henshaw, Adam Ginsburg, John Bally, Anja Feldmeier, Matt Hosek, Francisco Nogueras Lara, Anna Ciurlo, Mélanie Chevance, J. M. Diederik Kruijssen, Ralf Klessen, Gabriele Ponti, Pau Amaro-Seoane, Konstantina Anastasopoulou, Jay Anderson, Maria Arias, Ashley T. Barnes, Cara Battersby, Giuseppe Bono, Lucía Bravo Ferres, Aaron Bryant, Miguel Cano Gonzáalez, Santi Cassisi, Leonardo Chaves-Velasquez, et al. (85 additional authors not shown)

Abstract: The inner hundred parsecs of the Milky Way hosts the nearest supermassive black hole, largest reservoir of dense gas, greatest stellar density, hundreds of massive main and post main sequence stars, and the highest volume density of supernovae in the Galaxy. As the nearest environment in which it is possible to simultaneously observe many of the extreme processes shaping the Universe, it is one of the most well-studied regions in astrophysics. Due to its proximity, we can study the center of our Galaxy on scales down to a few hundred AU, a hundred times better than in similar Local Group galaxies and thousands of times better than in the nearest active galaxies. The Galactic Center (GC) is therefore of outstanding astrophysical interest. However, in spite of intense observational work over the past decades, there are still fundamental things unknown about the GC. JWST has the unique capability to provide us with the necessary, game-changing data. In this White Paper, we advocate for a JWST NIRCam survey that aims at solving central questions, that we have identified as a community: i) the 3D structure and kinematics of gas and stars; ii) ancient star formation and its relation with the overall history of the Milky Way, as well as recent star formation and its implications for the overall energetics of our galaxy's nucleus; and iii) the (non-)universality of star formation and the stellar initial mass function. We advocate for a large-area, multi-epoch, multi-wavelength NIRCam survey of the inner 100 pc of the Galaxy in the form of a Treasury GO JWST Large Program that is open to the community. We describe how this survey will derive the physical and kinematic properties of ~10,000,000 stars, how this will solve the key unknowns and provide a valuable resource for the community with long-lasting legacy value.

Submitted 14 March, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: This White Paper will be updated when required (e.g. new authors joining, editing of content). Most recent update: 24 Oct 2023

arXiv:2310.05373 [pdf, other]

Quantum Bayesian Optimization

Authors: Zhongxiang Dai, Gregory Kang Ruey Lau, Arun Verma, Yao Shu, Bryan Kian Hsiang Low, Patrick Jaillet

Abstract: Kernelized bandits, also known as Bayesian optimization (BO), has been a prevalent method for optimizing complicated black-box reward functions. Various BO algorithms have been theoretically shown to enjoy upper bounds on their cumulative regret which are sub-linear in the number T of iterations, and a regret lower bound of Omega(sqrt(T)) has been derived which represents the unavoidable regrets for any classical BO algorithm. Recent works on quantum bandits have shown that with the aid of quantum computing, it is possible to achieve tighter regret upper bounds better than their corresponding classical lower bounds. However, these works are restricted to either multi-armed or linear bandits, and are hence not able to solve sophisticated real-world problems with non-linear reward functions. To this end, we introduce the quantum-Gaussian process-upper confidence bound (Q-GP-UCB) algorithm. To the best of our knowledge, our Q-GP-UCB is the first BO algorithm able to achieve a regret upper bound of O(polylog T), which is significantly smaller than its regret lower bound of Omega(sqrt(T)) in the classical setting. Moreover, thanks to our novel analysis of the confidence ellipsoid, our Q-GP-UCB with the linear kernel achieves a smaller regret than the quantum linear UCB algorithm from the previous work. We use simulations, as well as an experiment using a real quantum computer, to verify that the theoretical quantum speedup achieved by our Q-GP-UCB is also potentially relevant in practice.

Submitted 8 October, 2023; originally announced October 2023.

Comments: Accepted to NeurIPS 2023

arXiv:2310.03448 [pdf, other]

Serendipitous detection of the dusty Type IIL SN 1980K with JWST/MIRI

Authors: Szanna Zsíros, Tamás Szalai, Ilse De Looze, Arkaprabha Sarangi, Melissa Shahbandeh, Ori D. Fox, Tea Temim, Dan Milisavljevic, Schuyler D. Van Dyk, Nathan Smith, Alexei V. Filippenko, Thomas G. Brink, WeiKang Zheng, Luc Dessart, Jacob Jencson, Joel Johansson, Justin Pierel, Armin Rest, Samaporn Tinyanont, Maria Niculescu-Duvaz, M. J. Barlow, Roger Wesson, Jennifer Andrews, Geoff Clayton, Kishalay De, et al. (17 additional authors not shown)

Abstract: We present mid-infrared (mid-IR) imaging of the Type IIL supernova (SN) 1980K with the James Webb Space Telescope (JWST) more than 40 yr post-explosion. SN 1980K, located in the nearby ($D\approx7$ Mpc) "SN factory" galaxy NGC 6946, was serendipitously captured in JWST/MIRI images taken of the field of SN 2004et in the same galaxy. SN 1980K serves as a promising candidate for studying the transitional phase between young SNe and older SN remnants and also provides a great opportunity to investigate its close environment. SN 1980K can be identified as a clear and bright point source in all eight MIRI filters from F560W up to F2550W. We fit analytical dust models to the mid-IR spectral energy distribution that reveal a large amount ($M_d \approx 0.002 {M}_{\odot}$) of Si-dominated dust at $T_{dust}\approx 150$ K (accompanied by a hotter dust/gas component), and also computed numerical SED dust models. Radiative transfer modeling of a late-time optical spectrum obtained recently with Keck discloses that an even larger ($\sim 0.24-0.58~{M}_{\odot}$) amount of dust is needed in order for selective extinction to explain the asymmetric line profile shapes observed in SN 1980K. As a conclusion, with JWST, we may see i) pre-existing circumstellar dust heated collisionally (or, partly radiatively), analogous to the equatorial ring of SN 1987A, or ii) the mid-IR component of the presumed newly-formed dust, accompanied by much colder dust present in the ejecta (as suggested by the late-time optical spectra).

Submitted 5 October, 2023; originally announced October 2023.

Comments: 14 pages, 9 figures

arXiv:2309.17175 [pdf, other]

TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields

Authors: Tianyu Huang, Yihan Zeng, Bowen Dong, Hang Xu, Songcen Xu, Rynson W. H. Lau, Wangmeng Zuo

Abstract: Recent works learn 3D representation explicitly under text-3D guidance. However, limited text-3D data restricts the vocabulary scale and text control of generations. Generators may easily fall into a stereotype concept for certain text prompts, thus losing open-vocabulary generation ability. To tackle this issue, we introduce a conditional 3D generative model, namely TextField3D. Specifically, rather than using the text prompts as input directly, we suggest to inject dynamic noise into the latent space of given text prompts, i.e., Noisy Text Fields (NTFs). In this way, limited 3D data can be mapped to the appropriate range of textual latent space that is expanded by NTFs. To this end, an NTFGen module is proposed to model general text latent code in noisy fields. Meanwhile, an NTFBind module is proposed to align view-invariant image latent code to noisy fields, further supporting image-conditional 3D generation. To guide the conditional generation in both geometry and texture, multi-modal discrimination is constructed with a text-3D discriminator and a text-2.5D discriminator. Compared to previous methods, TextField3D includes three merits: 1) large vocabulary, 2) text consistency, and 3) low latency. Extensive experiments demonstrate that our method achieves a potential open-vocabulary 3D generation capability.

Submitted 14 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

Comments: Accepted by ICLR 2024

arXiv:2309.09774 [pdf, other]

Towards Self-Adaptive Pseudo-Label Filtering for Semi-Supervised Learning

Authors: Lei Zhu, Zhanghan Ke, Rynson Lau

Abstract: Recent semi-supervised learning (SSL) methods typically include a filtering strategy to improve the quality of pseudo labels. However, these filtering strategies are usually hand-crafted and do not change as the model is updated, resulting in a lot of correct pseudo labels being discarded and incorrect pseudo labels being selected during the training process. In this work, we observe that the distribution gap between the confidence values of correct and incorrect pseudo labels emerges at the very beginning of the training, which can be utilized to filter pseudo labels. Based on this observation, we propose a Self-Adaptive Pseudo-Label Filter (SPF), which automatically filters noise in pseudo labels in accordance with model evolvement by modeling the confidence distribution throughout the training process. Specifically, with an online mixture model, we weight each pseudo-labeled sample by the posterior of it being correct, which takes into consideration the confidence distribution at that time. Unlike previous handcrafted filters, our SPF evolves together with the deep neural network without manual tuning. Extensive experiments demonstrate that incorporating SPF into the existing SSL methods can help improve the performance of SSL, especially when the labeled data is extremely scarce.

Submitted 18 September, 2023; originally announced September 2023.

Comments: This paper was first submitted to NeurIPS 2021

arXiv:2308.14575 [pdf, other]

Referring Image Segmentation Using Text Supervision

Authors: Fang Liu, Yuhao Liu, Yuqiu Kong, Ke Xu, Lihe Zhang, Baocai Yin, Gerhard Hancke, Rynson Lau

Abstract: Existing Referring Image Segmentation (RIS) methods typically require expensive pixel-level or box-level annotations for supervision. In this paper, we observe that the referring texts used in RIS already provide sufficient information to localize the target object. Hence, we propose a novel weakly-supervised RIS framework to formulate the target localization problem as a classification process to differentiate between positive and negative text expressions. While the referring text expressions for an image are used as positive expressions, the referring text expressions from other images can be used as negative expressions for this image. Our framework has three main novelties. First, we propose a bilateral prompt method to facilitate the classification process, by harmonizing the domain discrepancy between visual and linguistic features. Second, we propose a calibration method to reduce noisy background information and improve the correctness of the response maps for target object localization. Third, we propose a positive response map selection strategy to generate high-quality pseudo-labels from the enhanced response maps, for training a segmentation network for RIS inference. For evaluation, we propose a new metric to measure localization accuracy. Experiments on four benchmarks show that our framework achieves promising performances to existing fully-supervised RIS methods while outperforming state-of-the-art weakly-supervised methods adapted from related areas. Code is available at https://github.com/fawnliu/TRIS.

Submitted 28 August, 2023; originally announced August 2023.

Comments: ICCV 2023

arXiv:2308.11798 [pdf, ps, other]

FORCASTing the spectroscopic dust properties of the WC+O binary WR137 with SOFIA

Authors: Megan J. Peatt, Noel D. Richardson, Peredur M. Williams, Nicole Karnath, Victor I. Shenavrin, Ryan M. Lau, Anthony F. J. Moffat, Gerd Weigelt

Abstract: WR 137 (HD 192641) is a binary system consisting of a carbon-rich Wolf-Rayet star and an Oe companion star in a 13-year orbit. Near periastron, the winds of the two stars collide and form carbonaceous dust. We obtained three mid-infrared grism spectra of the system with SOFIA and FORCAST during the last year of SOFIA's operations in July 2021, February 2021, and May 2022 (Cycle 9). Within these spectra, we have identified several wind lines from He I, He II, C III, and C IV that are emitted from the Wolf-Rayet wind as well as a weak emission feature around 6.3-6.4 μm that may have shifted its peak flux from 6.29 to 6.41 μm through this time period. The weak feature grew as the continuum dust emission grew while the WR emission appeared to decline due to lower contrast with the continuum. Furthermore, we observe that the peak of the feature shifts to redder wavelengths during the observations. We compare this feature to the UIR feature and other emission lines identified in dusty WC binaries. For WR 137, we speculate that mixing of the winds in the system with the Oe star's disk is important for starting the dust formation and that it is less important as dust formation continues. Previous infrared photometry shows "mini-eruptions" of dust production which could then be explained with variations of the Oe star disk.

Submitted 22 August, 2023; originally announced August 2023.

Comments: 13 pages, accepted to ApJ

arXiv:2308.03059 [pdf, other]

doi 10.1145/3592111

Language-based Photo Color Adjustment for Graphic Designs

Authors: Zhenwei Wang, Nanxuan Zhao, Gerhard Hancke, Rynson W. H. Lau

Abstract: Adjusting the photo color to associate with some design elements is an essential way for a graphic design to effectively deliver its message and make it aesthetically pleasing. However, existing tools and previous works face a dilemma between the ease of use and level of expressiveness. To this end, we introduce an interactive language-based approach for photo recoloring, which provides an intuitive system that can assist both experts and novices on graphic design. Given a graphic design containing a photo that needs to be recolored, our model can predict the source colors and the target regions, and then recolor the target regions with the source colors based on the given language-based instruction. The multi-granularity of the instruction allows diverse user intentions. The proposed novel task faces several unique challenges, including: 1) color accuracy for recoloring with exactly the same color from the target design element as specified by the user; 2) multi-granularity instructions for parsing instructions correctly to generate a specific result or multiple plausible ones; and 3) locality for recoloring in semantically meaningful local regions to preserve original image semantics. To address these challenges, we propose a model called LangRecol with two main components: the language-based source color prediction module and the semantic-palette-based photo recoloring module. We also introduce an approach for generating a synthetic graphic design dataset with instructions to enable model training. We evaluate our model via extensive experiments and user studies. We also discuss several practical applications, showing the effectiveness and practicality of our approach. Code and data for this paper are at: https://zhenwwang.github.io/langrecol.

Submitted 6 August, 2023; originally announced August 2023.

Comments: 15 pages, 19 figures. Accepted by SIGGRAPH 2023. Project page: https://zhenwwang.github.io/langrecol

arXiv:2307.10664 [pdf, other]

Lighting up NeRF via Unsupervised Decomposition and Enhancement

Authors: Haoyuan Wang, Xiaogang Xu, Ke Xu, Rynson WH. Lau

Abstract: Neural Radiance Field (NeRF) is a promising approach for synthesizing novel views, given a set of images and the corresponding camera poses of a scene. However, images photographed from a low-light scene can hardly be used to train a NeRF model to produce high-quality results, due to their low pixel intensities, heavy noise, and color distortion. Combining existing low-light image enhancement methods with NeRF methods also does not work well due to the view inconsistency caused by the individual 2D enhancement process. In this paper, we propose a novel approach, called Low-Light NeRF (or LLNeRF), to enhance the scene representation and synthesize normal-light novel views directly from sRGB low-light images in an unsupervised manner. The core of our approach is a decomposition of radiance field learning, which allows us to enhance the illumination, reduce noise and correct the distorted colors jointly with the NeRF optimization process. Our method is able to produce novel view images with proper lighting and vivid colors and details, given a collection of camera-finished low dynamic range (8-bits/channel) images from a low-light scene. Experiments demonstrate that our method outperforms existing low-light enhancement methods and NeRF methods.

Submitted 20 July, 2023; originally announced July 2023.

Comments: ICCV 2023. Project website: https://whyy.site/paper/llnerf

arXiv:2307.06692 [pdf, other]

Ejecta, Rings, and Dust in SN 1987A with JWST MIRI/MRS

Authors: O. C. Jones, P. J. Kavanagh, M. J. Barlow, T. Temim, C. Fransson, J. Larsson, J. A. D. L. Blommaert, M. Meixner, R. M. Lau, B. Sargent, P. Bouchet, J. Hjorth, G. S. Wright, A. Coulais, O. D. Fox, R. Gastaud, A. Glasse, N. Habel, A. S. Hirschauer, J. Jaspers, O. Krause, L. Lenkić, O. Nayak, A. Rest, T. Tikkanen , et al. (9 additional authors not shown)

Abstract: Supernova (SN) 1987A is the nearest supernova in $\sim$400 years. Using the JWST MIRI Medium Resolution Spectrograph, we spatially resolved the ejecta, equatorial ring (ER) and outer rings in the mid-infrared 12,927 days after the explosion. The spectra are rich in line and dust continuum emission, both in the ejecta and the ring. Broad emission lines (280-380 km s$^{-1}$ FWHM) seen from all singly-ionized species originate from the expanding ER, with properties consistent with dense post-shock cooling gas. Narrower emission lines (100-170 km s$^{-1}$ FWHM) are seen from species originating from a more extended lower-density component whose high ionization may have been produced by shocks progressing through the ER, or by the UV radiation pulse associated with the original supernova event. The asymmetric east-west dust emission in the ER has continued to fade, with constant temperature, signifying a reduction in dust mass. Small grains in the ER are preferentially destroyed, with larger grains from the progenitor surviving the transition from SN into SNR. The ER is fit with a single set of optical constants, eliminating the need for a secondary featureless hot dust component. We find several broad ejecta emission lines from [Ne II], [Ar II], [Fe II], and [Ni II]. With the exception of [Fe II] 25.99 μm, these all originate from the ejecta close to the ring and are likely being excited by X-rays from the interaction. The [Fe II] 5.34 μm to 25.99 μm line ratio indicates a temperature of only a few hundred K in the inner core, consistent with being powered by $^{44}$Ti decay.

Submitted 29 February, 2024; v1 submitted 13 July, 2023; originally announced July 2023.

Comments: 27 pages, 16 figures, 4 tables. Accepted ApJ

arXiv:2306.12451 [pdf, other]

doi 10.3847/1538-4357/acebc4

Impact of Pycnonuclear Fusion Uncertainties on the Cooling of Accreting Neutron Star Crusts

Authors: R. Jain, E. F. Brown, H. Schatz, A. V. Afanasjev, M. Beard, L. R. Gasques, S. S. Gupta, G. W. Hitt, W. R. Hix, R. Lau, P. Moller, W. J. Ong, M. Wiescher, Y. Xu

Abstract: The observation of X-rays during quiescence from transiently accreting neutron stars provides unique clues about the nature of dense matter. This, however, requires extensive modeling of the crusts and matching the results to observations. The pycnonuclear fusion reaction rates implemented in these models are theoretically calculated by extending phenomenological expressions and have large uncertainties spanning many orders of magnitude. We present the first sensitivity studies of these pycnonuclear fusion reactions in realistic network calculations. We also couple the reaction network with the thermal evolution code dStar to further study their impact on the neutron star cooling curves in quiescence. Varying the pycnonuclear fusion reaction rates alters the depth at which nuclear heat is deposited although the total heating remains constant. The enhancement of the pycnonuclear fusion reaction rates leads to an overall shallower deposition of nuclear heat. The impurity factors are also altered depending on the type of ashes deposited on the crust. These total changes correspond to a variation of up to 9 eV in the modeled cooling curves. While this is not sufficient to explain the shallow heat source, it is comparable to the observational uncertainties and can still be important for modeling the neutron star crust.

Submitted 1 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: AASTeX63, 11 pages with 9 figures

Journal ref: ApJ 955 51 (2023)

arXiv:2306.08678 [pdf, other]

doi 10.3847/2041-8213/ace618

A Luminous Red Supergiant and Dusty Long-period Variable Progenitor for SN 2023ixf

Authors: Jacob E. Jencson, Jeniveve Pearson, Emma R. Beasor, Ryan M. Lau, Jennifer E. Andrews, K. Azalee Bostroem, Yize Dong, Michael Engesser, Sebastian Gomez, Muryel Guolo, Emily Hoang, Griffin Hosseinzadeh, Saurabh W. Jha, Viraj Karambelkar, Mansi M. Kasliwal, Michael Lundquist, Nicolas E. Meza Retamal, Armin Rest, David J. Sand, Melissa Shahbandeh, Manisha Shrestha, Nathan Smith, Jay Strader, Stefano Valenti, Qinan Wang , et al. (1 additional authors not shown)

Abstract: We analyze pre-explosion near- and mid-infrared (IR) imaging of the site of SN 2023ixf in the nearby spiral galaxy M101 and characterize the candidate progenitor star. The star displays compelling evidence of variability with a possible period of $\approx$1000 days and an amplitude of $Δm \approx 0.6$ mag in extensive monitoring with the Spitzer Space Telescope since 2004, likely indicative of radial pulsations. Variability consistent with this period is also seen in the near-IR $J$ and $K_{s}$ bands between 2010 and 2023, up to just 10 days before the explosion. Beyond the periodic variability, we do not find evidence for any IR-bright pre-supernova outbursts in this time period. The IR brightness ($M_{K_s} = -10.7$ mag) and color ($J-K_{s} = 1.6$ mag) of the star suggest a luminous and dusty red supergiant. Modeling of the phase-averaged spectral energy distribution (SED) yields constraints on the stellar temperature ($T_{\mathrm{eff}} = 3500_{-1400}^{+800}$ K) and luminosity ($\log L/L_{\odot} = 5.1\pm0.2$). This places the candidate among the most luminous Type II supernova progenitors with direct imaging constraints, with the caveat that many of these rely only on optical measurements. Comparison with stellar evolution models gives an initial mass of $M_{\mathrm{init}} = 17\pm4 M_{\odot}$. We estimate the pre-supernova mass-loss rate of the star between 3 and 19 yr before explosion from the SED modeling at $\dot M \approx 3\times10^{-5}$ to $3\times10^{-4} M_{\odot}$ yr$^{-1}$ for an assumed wind velocity of $v_w = 10$ km s$^{-1}$, perhaps pointing to enhanced mass loss in a pulsation-driven wind.

Submitted 1 August, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: 13 pages, 5 figures, published in ApJL, replacement with revisions to match published version

Journal ref: ApJL 952 (2023) L30

arXiv:2305.14557 [pdf, other]

doi 10.3847/1538-4357/acd4c5

From Dust to Nanodust: Resolving Circumstellar Dust from the Colliding-Wind Binary Wolf-Rayet (WR) 140

Authors: Ryan M. Lau, Jason Wang, Matthew J. Hankins, Thayne Currie, Vincent Deo, Izumi Endo, Olivier Guyon, Yinuo Han, Anthony P. Jones, Nemanja Jovanovic, Julien Lozi, Anthony F. J. Moffat, Takashi Onaka, Garreth Ruane, Andreas A. C. Sander, Samaporn Tinyanont, Peter G. Tuthill, Gerd Weigelt, Peredur M. Williams, Sebastien Vievard

Abstract: Wolf-Rayet (WR) 140 is the archetypal periodic dust-forming colliding-wind binary that hosts a carbon-rich WR (WC) star and an O-star companion with an orbital period of 7.93 years and an orbital eccentricity of 0.9. Throughout the past several decades, multiple dust-formation episodes from WR 140 have been observed that are linked to the binary orbit and occur near the time of periastron passage. Given its predictable dust-formation episodes, WR 140 presents an ideal astrophysical laboratory for investigating the formation and evolution of dust in the hostile environment around a massive binary system. In this paper, we present near- and mid-infrared (IR) spectroscopic and imaging observations of WR 140 with Subaru/SCExAO+CHARIS, Keck/NIRC2+PyWFS, and Subaru/COMICS taken between 2020 June and Sept that resolve the circumstellar dust emission linked to its most recent dust-formation episode in 2016 Dec. Our spectral energy distribution (SED) analysis of WR 140's resolved circumstellar dust emission reveals the presence of a hot ($T_\mathrm{d}\sim1000$ K) near-IR dust component that is co-spatial with the previously known and cooler ($T_\mathrm{d}\sim500$ K) mid-IR dust component composed of $300-500$ Å-sized dust grains. We attribute the hot near-IR dust emission to the presence of nano-sized ("nanodust") grains and suggest they were formed from grain-grain collisions or the rotational disruption of the larger grain size population by radiative torques in the strong radiation field from the central binary. Lastly, we speculate on the astrophysical implications of nanodust formation around colliding-wind WC binaries, which may present an early source of carbonaceous nanodust in the interstellar medium.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 21 pages, 8 figures, Accepted for publication in ApJ

arXiv:2303.13511 [pdf, other]

Neural Preset for Color Style Transfer

Authors: Zhanghan Ke, Yuhao Liu, Lei Zhu, Nanxuan Zhao, Rynson W. H. Lau

Abstract: In this paper, we present a Neural Preset technique to address the limitations of existing color style transfer methods, including visual artifacts, vast memory requirement, and slow style switching speed. Our method is based on two core designs. First, we propose Deterministic Neural Color Mapping (DNCM) to consistently operate on each pixel via an image-adaptive color mapping matrix, avoiding artifacts and supporting high-resolution inputs with a small memory footprint. Second, we develop a two-stage pipeline by dividing the task into color normalization and stylization, which allows efficient style switching by extracting color styles as presets and reusing them on normalized input images. Due to the unavailability of pairwise datasets, we describe how to train Neural Preset via a self-supervised strategy. Various advantages of Neural Preset over existing methods are demonstrated through comprehensive evaluations. Notably, Neural Preset enables stable 4K color style transfer in real-time without artifacts. Besides, we show that our trained model can naturally support multiple applications without fine-tuning, including low-light image enhancement, underwater image correction, image dehazing, and image harmonization. Project page with demos: https://zhkkke.github.io/NeuralPreset .

Submitted 24 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: Project page with demos: https://zhkkke.github.io/NeuralPreset . Artifact-free real-time 4K color style transfer via AI-generated presets. CVPR 2023

arXiv:2303.08810 [pdf, other]

BiFormer: Vision Transformer with Bi-Level Routing Attention

Authors: Lei Zhu, Xinjiang Wang, Zhanghan Ke, Wayne Zhang, Rynson Lau

Abstract: As the core building block of vision transformers, attention is a powerful tool to capture long-range dependency. However, such power comes at a cost: it incurs a huge computation burden and heavy memory footprint as pairwise token interaction across all spatial locations is computed. A series of works attempt to alleviate this problem by introducing handcrafted and content-agnostic sparsity into attention, such as restricting the attention operation to be inside local windows, axial stripes, or dilated windows. In contrast to these approaches, we propose a novel dynamic sparse attention via bi-level routing to enable a more flexible allocation of computations with content awareness. Specifically, for a query, irrelevant key-value pairs are first filtered out at a coarse region level, and then fine-grained token-to-token attention is applied in the union of remaining candidate regions (i.e., routed regions). We provide a simple yet effective implementation of the proposed bi-level routing attention, which utilizes the sparsity to save both computation and memory while involving only GPU-friendly dense matrix multiplications. Built with the proposed bi-level routing attention, a new general vision transformer, named BiFormer, is then presented. As BiFormer attends to a small subset of relevant tokens in a query adaptive manner without distraction from other irrelevant ones, it enjoys both good performance and high computational efficiency, especially in dense prediction tasks. Empirical results across several computer vision tasks such as image classification, object detection, and semantic segmentation verify the effectiveness of our design. Code is available at https://github.com/rayleizhu/BiFormer.

Submitted 15 March, 2023; originally announced March 2023.

Comments: CVPR 2023 camera-ready

arXiv:2302.03576 [pdf, other]

doi 10.3847/2041-8213/acd555

JWST NIRSpec observations of Supernova 1987A -- from the inner ejecta to the reverse shock

Authors: J. Larsson, C. Fransson, B. Sargent, O. C. Jones, M. J. Barlow, P. Bouchet, M. Meixner, J. A. D. L. Blommaert, A. Coulais, O. D. Fox, R. Gastaud, A. Glasse, N. Habel, A. S. Hirschauer, J. Hjorth, J. Jaspers, P. J. Kavanagh, O. Krause, R. M. Lau, L. Lenkic, O. Nayak, A. Rest, T. Temim, T. Tikkanen, R. Wesson , et al. (1 additional authors not shown)

Abstract: We present initial results from JWST NIRSpec integral field unit observations of the nearby Supernova (SN) 1987A. The observations provide the first spatially-resolved spectroscopy of the ejecta and equatorial ring (ER) over the 1-5 μm range. We construct 3D emissivity maps of the [Fe I] 1.443 μm line from the inner ejecta and the He I 1.083 μm line from the reverse shock (RS), where the former probes the explosion geometry and the latter traces the structure of the circumstellar medium. We also present a model for the integrated spectrum of the ejecta. The [Fe I] 3D map reveals a highly-asymmetric morphology resembling a broken dipole, dominated by two large clumps with velocities of ~2300 km/s. We also find evidence that the Fe-rich inner ejecta have started to interact with the RS. The RS surface traced by the He I line extends from just inside the ER to higher latitudes on both sides of the ER with a half-opening angle ~45 degrees, forming a bubble-like structure. The spectral model for the ejecta allows us to identify the many emission lines, including numerous H$_2$ lines. We find that the H$_2$ is most likely excited by far-UV emission, while the metal lines ratios are consistent with a combination of collisional excitation and recombination in the low-temperature ejecta. We also find several high-ionization coronal lines from the ER, requiring a temperature $> 2 \times 10^6$ K.

Submitted 16 May, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

Comments: Accepted for publication in ApJL

arXiv:2301.10778 [pdf, other]

doi 10.1093/mnras/stad1681

JWST Discovery of Dust Reservoirs in Nearby Type IIP Supernovae 2004et and 2017eaw

Authors: Melissa Shahbandeh, Arkaprabha Sarangi, Tea Temim, Tamas Szalai, Ori D. Fox, Samaporn Tinyanont, Eli Dwek, Luc Dessart, Alexei V. Filippenko, Thomas G. Brink, Ryan J. Foley, Jacob Jencson, Justin Pierel, Szanna Zsiros, Armin Rest, WeiKang Zheng, Jennifer Andrews, Geoffrey C. Clayton, Kishalay De, Michael Engesser, Suvi Gezari, Sebastian Gomez, Shireen Gonzaga, Joel Johansson, Mansi Kasliwal , et al. (14 additional authors not shown)

Abstract: Supernova (SN) explosions have been sought for decades as a possible source of dust in the Universe, providing the seeds of galaxies, stars, and planetary systems. SN 1987A offers one of the most promising examples of significant SN dust formation, but until the James Webb Space Telescope (JWST), instruments have traditionally lacked the sensitivity at both late times (>1 yr post-explosion) and longer wavelengths (i.e., >10 um) to detect analogous dust reservoirs. Here we present JWST/MIRI observations of two historic Type IIP SNe, 2004et and SN 2017eaw, at nearly 18 and 5 yr post-explosion, respectively. We fit the spectral energy distributions as functions of dust mass and temperature, from which we are able to constrain the dust geometry, origin, and heating mechanism. We place a 90% confidence lower limit on the dust masses for SNe 2004et and 2017eaw of >0.014 and >4e-4 M_sun, respectively. More dust may exist at even colder temperatures or may be obscured by high optical depths. We conclude dust formation in the ejecta to be the most plausible and consistent scenario. The observed dust is radiatively heated to ~100-150 K by ongoing shock interaction with the circumstellar medium. Regardless of the best fit or heating mechanism adopted, the inferred dust mass for SN 2004et is the second highest (next to SN 1987A) inferred dust mass in extragalactic SNe thus far, promoting the prospect of SNe as potential significant sources of dust in the Universe.

Submitted 25 January, 2023; originally announced January 2023.

Comments: 12 pages, 7 figures, submitting to MNRAS

arXiv:2301.03182 [pdf, other]

Structure-Informed Shadow Removal Networks

Authors: Yuhao Liu, Qing Guo, Lan Fu, Zhanghan Ke, Ke Xu, Wei Feng, Ivor W. Tsang, Rynson W. H. Lau

Abstract: Existing deep learning-based shadow removal methods still produce images with shadow remnants. These shadow remnants typically exist in homogeneous regions with low-intensity values, making them untraceable in the existing image-to-image mapping paradigm. We observe that shadows mainly degrade images at the image-structure level (in which humans perceive object shapes and continuous colors). Hence, in this paper, we propose to remove shadows at the image structure level. Based on this idea, we propose a novel structure-informed shadow removal network (StructNet) to leverage the image-structure information to address the shadow remnant problem. Specifically, StructNet first reconstructs the structure information of the input image without shadows and then uses the restored shadow-free structure prior to guiding the image-level shadow removal. StructNet contains two main novel modules: (1) a mask-guided shadow-free extraction (MSFE) module to extract image structural features in a non-shadow-to-shadow directional manner, and (2) a multi-scale feature & residual aggregation (MFRA) module to leverage the shadow-free structure information to regularize feature consistency. In addition, we also propose to extend StructNet to exploit multi-level structure information (MStructNet), to further boost the shadow removal performance with minimum computational overheads. Extensive experiments on three shadow removal benchmarks demonstrate that our method outperforms existing shadow removal methods, and our StructNet can be integrated with existing methods to improve them further.

Submitted 1 February, 2024; v1 submitted 9 January, 2023; originally announced January 2023.

Comments: IEEE TIP

arXiv:2212.09708 [pdf, other]

Recurring outbursts of the supernova impostor AT 2016blu in NGC 4559

Authors: Mojgan Aghakhanloo, Nathan Smith, Peter Milne, Jennifer E. Andrews, Schuyler D. Van Dyk, Alexei V. Filippenko, Jacob E. Jencson, Ryan M. Lau, David J. Sand, Samuel Wyatt, WeiKang Zheng

Abstract: We present the first photometric analysis of the supernova (SN) impostor AT 2016blu in NGC 4559. This transient was discovered by the Lick Observatory Supernova Search in 2012 and has continued its outbursts since then. Optical and infrared photometry of AT 2016blu reveals at least 19 outbursts in 2012-2022. Similar photometry from 1999-2009 shows no outbursts, indicating that the star was relatively stable in the decade before discovery. Archival Hubble Space Telescope observations suggest that the progenitor had a minimum initial mass of $M \geq 33$ M$_{\odot}$ and a luminosity of $L \geq 10^{5.7}$ L$_{\odot}$. AT 2016blu's outbursts show irregular variability with multiple closely spaced peaks having typical amplitudes of 1-2 mag and durations of 1-4 weeks. While individual outbursts have irregular light curves, concentrations of these peaks recur with a period of $\sim 113 \pm 2$ d. Based on this period, we predict times for upcoming outbursts in 2023 and 2024. AT 2016blu shares similarities with SN 2000ch in NGC 3432, where outbursts may arise from periastron encounters in an eccentric binary containing a luminous blue variable (LBV). We propose that AT 2016blu's outbursts are also driven by interactions that intensify around periastron in an eccentric system. Intrinsic variability of the LBV-like primary star may cause different intensity and duration of binary interaction at each periastron passage. AT 2016blu also resembles the periastron encounters of η Carinae prior to its Great Eruption and the erratic pre-SN eruptions of SN 2009ip. This similarity and the onset of eruptions in the past decade hint that AT 2016blu may also be headed for a catastrophe, making it a target of great interest.

Submitted 20 September, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: 18 pages, 14 figures, 6 tables, MNRAS Accepted

arXiv:2211.15644 [pdf, other]

Efficient Mirror Detection via Multi-level Heterogeneous Learning

Authors: Ruozhen He, Jiaying Lin, Rynson W. H. Lau

Abstract: We present HetNet (Multi-level Heterogeneous Network), a highly efficient mirror detection network. Current mirror detection methods focus more on performance than efficiency, limiting the real-time applications (such as drones). Their lack of efficiency is aroused by the common design of adopting homogeneous modules at different levels, which ignores the difference between different levels of features. In contrast, HetNet detects potential mirror regions initially through low-level understandings (e.g., intensity contrasts) and then combines with high-level understandings (contextual discontinuity for instance) to finalize the predictions. To perform accurate yet efficient mirror detection, HetNet follows an effective architecture that obtains specific information at different stages to detect mirrors. We further propose a multi-orientation intensity-based contrasted module (MIC) and a reflection semantic logical module (RSL), equipped on HetNet, to predict potential mirror regions by low-level understandings and analyze semantic logic in scenarios by high-level understandings, respectively. Compared to the state-of-the-art method, HetNet runs 664% faster and draws an average performance gain of 8.9% on MAE, 3.1% on IoU, and 2.0% on F-measure on two mirror detection benchmarks.

Submitted 28 November, 2022; originally announced November 2022.

Comments: Accepted to AAAI 2023. The code is available at https://github.com/Catherine-R-He/HetNet

arXiv:2210.06556 [pdf]

doi 10.1038/s41586-022-05155-5

Radiation-driven acceleration in the expanding WR140 dust shell

Authors: Yinuo Han, Peter G. Tuthill, Ryan M. Lau, Anthony Soulain

Abstract: The Wolf-Rayet (WR) binary system WR140 is a close (0.9-16.7 mas) binary star consisting of an O5 primary and WC7 companion and is known as the archetype of episodic dust-producing WRs. Dust in WR binaries is known to form in a confined stream originating from the collision of the two stellar winds, with orbital motion of the binary sculpting the large-scale dust structure into arcs as dust is swept radially outwards. It is understood that sensitive conditions required for dust production in WR140 are only met around periastron when the two stars are sufficiently close. Here we present multiepoch imagery of the circumstellar dust shell of WR140. We constructed geometric models that closely trace the expansion of the intricately structured dust plume, showing that complex effects induced by orbital modulation may result in a 'Goldilocks zone' for dust production. We find that the expansion of the dust plume cannot be reproduced under the assumption of a simple uniform-speed outflow, finding instead the dust to be accelerating. This constitutes a direct kinematic record of dust motion under acceleration by radiation pressure and further highlights the complexity of the physical conditions in colliding-wind binaries.

Submitted 12 October, 2022; originally announced October 2022.

Comments: Published in Nature

Journal ref: Nature 610, 269-272 (2022)

arXiv:2210.06452 [pdf, other]

doi 10.1038/s41550-022-01812-x

Nested Dust Shells around the Wolf-Rayet Binary WR 140 observed with JWST

Authors: Ryan M. Lau, Matthew J. Hankins, Yinuo Han, Ioannis Argyriou, Michael F. Corcoran, Jan J. Eldridge, Izumi Endo, Ori D. Fox, Macarena Garcia Marin, Theodore R. Gull, Olivia C. Jones, Kenji Hamaguchi, Astrid Lamberts, David R. Law, Thomas Madura, Sergey V. Marchenko, Hideo Matsuhara, Anthony F. J. Moffat, Mark R. Morris, Patrick W. Morris, Takashi Onaka, Michael E. Ressler, Noel D. Richardson, Christopher M. P. Russell, Joel Sanchez-Bermudez , et al. (7 additional authors not shown)

Abstract: Massive colliding-wind binaries that host a Wolf-Rayet (WR) star present a potentially important source of dust and chemical enrichment in the interstellar medium (ISM). However, the chemical composition and survival of dust formed from such systems is not well understood. The carbon-rich WR (WC) binary WR 140 presents an ideal astrophysical laboratory for investigating these questions given its well-defined orbital period and predictable dust-formation episodes every 7.93 years around periastron passage. We present observations from our Early Release Science program (ERS1349) with the James Webb Space Telescope (JWST) Mid-Infrared Instrument (MIRI) Medium-Resolution Spectrometer (MRS) and Imager that reveal the spectral and spatial signatures of nested circumstellar dust shells around WR 140. MIRI MRS spectroscopy of the second dust shell and Imager detections of over 17 shells formed throughout the past $\gtrsim 130$ years confirm the survival of carbonaceous dust grains from WR 140 that are likely carriers of "unidentified infrared" (UIR)-band features at 6.4 and 7.7 μm. The observations indicate that dust-forming WC binaries can enrich the ISM with organic compounds and carbonaceous dust.

Submitted 12 October, 2022; originally announced October 2022.

Comments: Published in Nature Astronomy on Oct 12, 2022; 21 pages, 5 figures, 2 tables

Journal ref: Lau, R.M., Hankins, M.J., Han, Y. et al. Nested dust shells around the Wolf-Rayet binary WR 140 observed with JWST. Nat Astron (2022)

arXiv:2210.01055 [pdf, other]

CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training

Authors: Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, Wangmeng Zuo

Abstract: Pre-training across 3D vision and language remains under development because of limited training data. Recent works attempt to transfer vision-language pre-training models to 3D vision. PointCLIP converts point cloud data to multi-view depth maps, adopting CLIP for shape classification. However, its performance is restricted by the domain gap between rendered depth maps and images, as well as the diversity of depth distributions. To address this issue, we propose CLIP2Point, an image-depth pre-training method by contrastive learning to transfer CLIP to the 3D domain, and adapt it to point cloud classification. We introduce a new depth rendering setting that forms a better visual effect, and then render 52,460 pairs of images and depth maps from ShapeNet for pre-training. The pre-training scheme of CLIP2Point combines cross-modality learning to enforce the depth features for capturing expressive visual and textual features and intra-modality learning to enhance the invariance of depth aggregation. Additionally, we propose a novel Dual-Path Adapter (DPA) module, i.e., a dual-path structure with simplified adapters for few-shot learning. The dual-path structure allows the joint use of CLIP and CLIP2Point, and the simplified adapter can well fit few-shot tasks without post-search. Experimental results show that CLIP2Point is effective in transferring CLIP knowledge to 3D vision. Our CLIP2Point outperforms PointCLIP and other self-supervised 3D networks, achieving state-of-the-art results on zero-shot and few-shot classification.

Submitted 22 August, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

Comments: Accepted by ICCV 2023

arXiv:2209.04639 [pdf, other]

doi 10.1109/TPAMI.2022.3181973

Large-Field Contextual Feature Learning for Glass Detection

Authors: Haiyang Mei, Xin Yang, Letian Yu, Qiang Zhang, Xiaopeng Wei, Rynson W. H. Lau

Abstract: Glass is very common in our daily life. Existing computer vision systems neglect it and thus may have severe consequences, e.g., a robot may crash into a glass wall. However, sensing the presence of glass is not straightforward. The key challenge is that arbitrary objects/scenes can appear behind the glass. In this paper, we propose an important problem of detecting glass surfaces from a single RGB image. To address this problem, we construct the first large-scale glass detection dataset (GDD) and propose a novel glass detection network, called GDNet-B, which explores abundant contextual cues in a large field-of-view via a novel large-field contextual feature integration (LCFI) module and integrates both high-level and low-level boundary features with a boundary feature enhancement (BFE) module. Extensive experiments demonstrate that our GDNet-B achieves satisfying glass detection results on the images within and beyond the GDD testing set. We further validate the effectiveness and generalization capability of our proposed GDNet-B by applying it to other vision tasks, including mirror segmentation and salient object detection. Finally, we show the potential applications of glass detection and discuss possible future research directions.

Submitted 10 September, 2022; originally announced September 2022.

arXiv:2209.01884 [pdf, other]

doi 10.1093/mnras/stac2999

Smoke on the wind: dust nucleation in archetype colliding wind pinwheel WR104

Authors: A. Soulain, A. Lamberts, F. Millour, P. Tuthill, R. M. Lau

Abstract: A handful of binary Wolf-Rayet stars are known to harbour spectacular spiral structures spanning a few hundred AU. These systems host some of the highest dust production rates in the Universe and are therefore interesting candidates to address the origin of the enigmatic dust excess observed across galactic evolution. The substantial interaction between the winds of the Wolf-Rayet star and its companion constitutes a unique laboratory to study the mechanisms of dust nucleation in a hostile environment. Using the grid-based $\texttt{RAMSES}$ code, we investigate this problem by performing a 3D hydrodynamic simulation of the inner region of the prototypical spiral nebula around WR104. We then process the $\texttt{RAMSES}$ results using the radiative transfer code $\texttt{RADMC3d}$ to generate a candidate observable scene. This allows us to estimate the geometrical parameters of the shocked region. We link those quantities to the specific chemical pathway for dust nucleation, where the hydrogen-rich companion's wind catalyses dust formation. The scaling laws we derive constitute a unique tool that can be directly compared to observations. Depending on the dust nucleation locus, the velocity field reveals a differential wind speed. Thus, the initial dust speed could be more balanced between the speeds of the two stellar winds ($\sim$1600 km/s). With $\texttt{RADMC3d}$, we provide constraints on the dust nucleation radius for different combinations of dust-to-gas ratio, hydrogen enrichment and dust grain properties. Finally, our models reveal that dust may escape beyond the boundaries of the spiral due to hydrodynamical instabilities in the wind collision zone.

Submitted 5 September, 2022; originally announced September 2022.

arXiv:2208.07735 [pdf, other]

doi 10.1109/TAP.2022.3218759

Rain Removal from Light Field Images with 4D Convolution and Multi-scale Gaussian Process

Authors: Tao Yan, Mingyue Li, Bin Li, Yang Yang, Rynson W. H. Lau

Abstract: Existing deraining methods focus mainly on a single input image. However, with just a single input image, it is extremely difficult to accurately detect and remove rain streaks, in order to restore a rain-free image. In contrast, a light field image (LFI) embeds abundant 3D structure and texture information of the target scene by recording the direction and position of each incident ray via a plenoptic camera. LFIs are becoming popular in the computer vision and graphics communities. However, making full use of the abundant information available from LFIs, such as 2D array of sub-views and the disparity map of each sub-view, for effective rain removal is still a challenging problem. In this paper, we propose a novel method, 4D-MGP-SRRNet, for rain streak removal from LFIs. Our method takes as input all sub-views of a rainy LFI. To make full use of the LFI, it adopts 4D convolutional layers to simultaneously process all sub-views of the LFI. In the pipeline, the rain detection network, MGPDNet, with a novel Multi-scale Self-guided Gaussian Process (MSGP) module is proposed to detect high-resolution rain streaks from all sub-views of the input LFI at multi-scales. Semi-supervised learning is introduced for MSGP to accurately detect rain streaks by training on both virtual-world rainy LFIs and real-world rainy LFIs at multi-scales via computing pseudo ground truths for real-world rain streaks. We then feed all sub-views subtracting the predicted rain streaks into a 4D convolution-based Depth Estimation Residual Network (DERNet) to estimate the depth maps, which are later converted into fog maps. Finally, all sub-views concatenated with the corresponding rain streaks and fog maps are fed into a powerful rainy LFI restoring model based on the adversarial recurrent neural network to progressively eliminate rain streaks and recover the rain-free LFI.

Submitted 27 January, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

Comments: This paper has been published in IEEE Transactions on Image Processing

Journal ref: IEEE Transactions on Image Processing (2023), v32, pages 921-936

arXiv:2207.14083 [pdf, other]

Weakly-Supervised Camouflaged Object Detection with Scribble Annotations

Authors: Ruozhen He, Qihua Dong, Jiaying Lin, Rynson W. H. Lau

Abstract: Existing camouflaged object detection (COD) methods rely heavily on large-scale datasets with pixel-wise annotations. However, due to the ambiguous boundary, annotating camouflage objects pixel-wisely is very time-consuming and labor-intensive, taking ~60mins to label one image. In this paper, we propose the first weakly-supervised COD method, using scribble annotations as supervision. To achieve this, we first relabel 4,040 images in existing camouflaged object datasets with scribbles, which takes ~10s to label one image. As scribble annotations only describe the primary structure of objects without details, for the network to learn to localize the boundaries of camouflaged objects, we propose a novel consistency loss composed of two parts: a cross-view loss to attain reliable consistency over different images, and an inside-view loss to maintain consistency inside a single prediction map. Besides, we observe that humans use semantic information to segment regions near the boundaries of camouflaged objects. Hence, we further propose a feature-guided loss, which includes visual features directly extracted from images and semantically significant features captured by the model. Finally, we propose a novel network for COD via scribble learning on structural information and semantic relations. Our network has two novel modules: the local-context contrasted (LCC) module, which mimics visual inhibition to enhance image contrast/sharpness and expand the scribbles into potential camouflaged regions, and the logical semantic relation (LSR) module, which analyzes the semantic relation to determine the regions representing the camouflaged object. Experimental results show that our model outperforms relevant SOTA methods on three COD benchmarks with an average improvement of 11.0% on MAE, 3.2% on S-measure, 2.5% on E-measure, and 4.4% on weighted F-measure.

Submitted 28 November, 2022; v1 submitted 28 July, 2022; originally announced July 2022.

Comments: Accepted to AAAI 2023. The code and dataset are available at https://github.com/dddraxxx/Weakly-Supervised-Camouflaged-Object-Detection-with-Scribble-Annotations

arXiv:2207.06332 [pdf, other]

Symmetry-Aware Transformer-based Mirror Detection

Authors: Tianyu Huang, Bowen Dong, Jiaying Lin, Xiaohui Liu, Rynson W. H. Lau, Wangmeng Zuo

Abstract: Mirror detection aims to identify the mirror regions in the given input image. Existing works mainly focus on integrating the semantic features and structural features to mine specific relations between mirror and non-mirror regions, or introducing mirror properties like depth or chirality to help analyze the existence of mirrors. In this work, we observe that a real object typically forms a loose symmetry relationship with its corresponding reflection in the mirror, which is beneficial in distinguishing mirrors from real objects. Based on this observation, we propose a dual-path Symmetry-Aware Transformer-based mirror detection Network (SATNet), which includes two novel modules: Symmetry-Aware Attention Module (SAAM) and Contrast and Fusion Decoder Module (CFDM). Specifically, we first adopt a transformer backbone to model global information aggregation in images, extracting multi-scale features in two paths. We then feed the high-level dual-path features to SAAMs to capture the symmetry relations. Finally, we fuse the dual-path features and refine our prediction maps progressively with CFDMs to obtain the final mirror mask. Experimental results show that SATNet outperforms both RGB and RGB-D mirror detection methods on all available mirror detection datasets. Codes and trained models are available at: https://github.com/tyhuang0428/SATNet.

Submitted 4 September, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

arXiv:2207.01322 [pdf, other]

Harmonizer: Learning to Perform White-Box Image and Video Harmonization

Authors: Zhanghan Ke, Chunyi Sun, Lei Zhu, Ke Xu, Rynson W. H. Lau

Abstract: Recent works on image harmonization solve the problem as a pixel-wise image translation task via large autoencoders. They have unsatisfactory performances and slow inference speeds when dealing with high-resolution images. In this work, we observe that adjusting the input arguments of basic image filters, e.g., brightness and contrast, is sufficient for humans to produce realistic images from the composite ones. Hence, we frame image harmonization as an image-level regression problem to learn the arguments of the filters that humans use for the task. We present a Harmonizer framework for image harmonization. Unlike prior methods that are based on black-box autoencoders, Harmonizer contains a neural network for filter argument prediction and several white-box filters (based on the predicted arguments) for image harmonization. We also introduce a cascade regressor and a dynamic loss strategy for Harmonizer to learn filter arguments more stably and precisely. Since our network only outputs image-level arguments and the filters we used are efficient, Harmonizer is much lighter and faster than existing methods. Comprehensive experiments demonstrate that Harmonizer surpasses existing methods notably, especially with high-resolution inputs. Finally, we apply Harmonizer to video harmonization, which achieves consistent results across frames and 56 fps at 1080P resolution. Code and models are available at: https://github.com/ZHKKKe/Harmonizer.

Submitted 20 July, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

Showing 1–50 of 172 results for author: Lau, R