Search | arXiv e-print repository

arXiv:2307.00895 [pdf, other]

Synthesis of Contrast-Enhanced Breast MRI Using Multi-b-Value DWI-based Hierarchical Fusion Network with Attention Mechanism

Authors: Tianyu Zhang, Luyi Han, Anna D'Angelo, Xin Wang, Yuan Gao, Chunyao Lu, Jonas Teuwen, Regina Beets-Tan, Tao Tan, Ritse Mann

Abstract: Magnetic resonance imaging (MRI) is the most sensitive technique for breast cancer detection among current clinical imaging modalities. Contrast-enhanced MRI (CE-MRI) provides superior differentiation between tumors and invaded healthy tissue, and has become an indispensable technique in the detection and evaluation of cancer. However, the use of gadolinium-based contrast agents (GBCA) to obtain CE-MRI may be associated with nephrogenic systemic fibrosis and may lead to bioaccumulation in the brain, posing a potential risk to human health. Moreover, and likely more important, the use of gadolinium-based contrast agents requires the cannulation of a vein, and the injection of the contrast media which is cumbersome and places a burden on the patient. To reduce the use of contrast agents, diffusion-weighted imaging (DWI) is emerging as a key imaging technique, although currently usually complementing breast CE-MRI. In this study, we develop a multi-sequence fusion network to synthesize CE-MRI based on T1-weighted MRI and DWIs. DWIs with different b-values are fused to efficiently utilize the difference features of DWIs. Rather than proposing a pure data-driven approach, we invent a multi-sequence attention module to obtain refined feature maps, and leverage hierarchical representation information fused at different scales while utilizing the contributions from different sequences from a model-driven approach by introducing the weighted difference module. The results show that the multi-b-value DWI-based fusion model can potentially be used to synthesize CE-MRI, thus theoretically reducing or avoiding the use of GBCA, thereby minimizing the burden to patients. Our code is available at https://github.com/Netherlands-Cancer-Institute/CE-MRI.

Submitted 3 July, 2023; originally announced July 2023.

Comments: This paper has been accepted by MICCAI 2023

arXiv:2306.16918 [pdf, other]

PCDAL: A Perturbation Consistency-Driven Active Learning Approach for Medical Image Segmentation and Classification

Authors: Tao Wang, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Tao Tan, Min Du, Qinquan Gao, Tong Tong

Abstract: In recent years, deep learning has become a breakthrough technique in assisting medical image diagnosis. Supervised learning using convolutional neural networks (CNN) provides state-of-the-art performance and has served as a benchmark for various medical image segmentation and classification. However, supervised learning deeply relies on large-scale annotated data, which is expensive, time-consuming, and even impractical to acquire in medical imaging applications. Active Learning (AL) methods have been widely applied in natural image classification tasks to reduce annotation costs by selecting more valuable examples from the unlabeled data pool. However, their application in medical image segmentation tasks is limited, and there is currently no effective and universal AL-based method specifically designed for 3D medical image segmentation. To address this limitation, we propose an AL-based method that can be simultaneously applied to 2D medical image classification, segmentation, and 3D medical image segmentation tasks. We extensively validated our proposed active learning method on three publicly available and challenging medical image datasets, Kvasir Dataset, COVID-19 Infection Segmentation Dataset, and BraTS2019 Dataset. The experimental results demonstrate that our PCDAL can achieve significantly improved performance with fewer annotations in 2D classification and segmentation and 3D segmentation tasks. The codes of this study are available at https://github.com/ortonwang/PCDAL.

Submitted 29 June, 2023; originally announced June 2023.

arXiv:2306.11784 [pdf, other]

NANCY: Next-generation All-sky Near-infrared Community surveY

Authors: Jiwon Jesse Han, Arjun Dey, Adrian M. Price-Whelan, Joan Najita, Edward F. Schlafly, Andrew Saydjari, Risa H. Wechsler, Ana Bonaca, David J Schlegel, Charlie Conroy, Anand Raichoor, Alex Drlica-Wagner, Juna A. Kollmeier, Sergey E. Koposov, Gurtina Besla, Hans-Walter Rix, Alyssa Goodman, Douglas Finkbeiner, Abhijeet Anand, Matthew Ashby, Benedict Bahr-Kalus, Rachel Beaton, Jayashree Behera, Eric F. Bell, Eric C Bellm, et al. (184 additional authors not shown)

Abstract: The Nancy Grace Roman Space Telescope is capable of delivering an unprecedented all-sky, high-spatial resolution, multi-epoch infrared map to the astronomical community. This opportunity arises in the midst of numerous ground- and space-based surveys that will provide extensive spectroscopy and imaging together covering the entire sky (such as Rubin/LSST, Euclid, UNIONS, SPHEREx, DESI, SDSS-V, GALAH, 4MOST, WEAVE, MOONS, PFS, UVEX, NEO Surveyor, etc.). Roman can uniquely provide uniform high-spatial-resolution (~0.1 arcsec) imaging over the entire sky, vastly expanding the science reach and precision of all of these near-term and future surveys. This imaging will not only enhance other surveys, but also facilitate completely new science. By imaging the full sky over two epochs, Roman can measure the proper motions for stars across the entire Milky Way, probing 100 times fainter than Gaia out to the very edge of the Galaxy. Here, we propose NANCY: a completely public, all-sky survey that will create a high-value legacy dataset benefiting innumerable ongoing and forthcoming studies of the universe. NANCY is a pure expression of Roman's potential: it images the entire sky, at high spatial resolution, in a broad infrared bandpass that collects as many photons as possible. The majority of all ongoing astronomical surveys would benefit from incorporating observations of NANCY into their analyses, whether these surveys focus on nearby stars, the Milky Way, near-field cosmology, or the broader universe.

Submitted 20 June, 2023; originally announced June 2023.

Comments: Submitted to the call for white papers for the Roman Core Community Survey (June 16th, 2023), and to the Bulletin of the AAS

arXiv:2306.09573 [pdf, other]

Reevaluation of Stark-induced transition polarizabilities in cesium

Authors: H. B. Tran Tan, D. Xiao, A. Derevianko

Abstract: Extracting electroweak observables from experiments on atomic parity violation (APV) using the Stark interference technique requires accurate knowledge of transition polarizabilities. In cesium, the focus of our paper, the $6S_{1/2}\rightarrow{7S_{1/2}}$ APV amplitude is deduced from the measured ratio of the APV amplitude to the vector transition polarizability, $β$. This ratio was measured with a $0.35\%$ uncertainty by the Boulder group [Science 275, 1759 (1997)]. Currently, there is a sizable discrepancy in different determinations of $β$ critically limiting the interpretation of the APV measurement. The most recent value [Phys. Rev. Lett. 123, 073002 (2019)] of $β=27.139(42)\, \mathrm{a.u.}$ was deduced from a semi-empirical sum-over-state determination of the scalar transition polarizability $α$ and the measured $α/β$ ratio [Phys. Rev. A 55, 1007 (1997)]. This value of $β$, however, differs by $\sim 0.7\%$ or $2.8σ$ from the previous determination of $β=26.957(51)$ by [Phys. Rev. A 62, 052101 (2000)] based on the measured ratio $M1/β$ of the magnetic-dipole $6S_{1/2}\rightarrow{7S_{1/2}}$ matrix element to $β$. Here, we revise the determination of $β$ by [Phys. Rev. Lett. 123, 073002 (2019)], using a more consistent and more theoretically complete treatment of contributions from the excited intermediate states in the sum-over-state $α/β$ method. Our result of $β=26.887(38)\, \mathrm{a.u.}$ resolves the tension between the $α/β$ and $M1/β$ approaches. We recommend the value of $β=26.912(30)$ obtained by averaging our result and that of [Phys. Rev. A 62, 052101 (2000)].

Submitted 14 August, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: 9 pages, 2 figures v2: Reference added, small cosmetic changes to the text
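
The numbers quoted in the abstract above can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes that "averaging" means an inverse-variance weighted mean of the two determinations, and that the quoted tension is the difference divided by the quadrature sum of the two uncertainties. Neither assumption is stated in the abstract, and the function names are hypothetical.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty.

    Assumption: the two inputs are independent and Gaussian-distributed.
    """
    weights = [1.0 / s**2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


def tension(a, sigma_a, b, sigma_b):
    """Difference between two values in units of their combined uncertainty."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)


# Values of beta (atomic units) as quoted in the abstract.
beta_revised, sig_revised = 26.887, 0.038  # revised alpha/beta determination
beta_m1, sig_m1 = 26.957, 0.051            # M1/beta determination (2000)
beta_2019, sig_2019 = 27.139, 0.042        # earlier alpha/beta value (2019)

mean, sigma = weighted_mean([beta_revised, beta_m1], [sig_revised, sig_m1])
print(f"weighted mean: {mean:.3f} +/- {sigma:.3f}")
print(f"2019 vs 2000 tension: {tension(beta_2019, sig_2019, beta_m1, sig_m1):.1f} sigma")
print(f"revised vs 2000 tension: {tension(beta_revised, sig_revised, beta_m1, sig_m1):.1f} sigma")
```

Under these assumptions the weighted mean reproduces the recommended 26.912(30), and the earlier discrepancy comes out at about 2.8σ, consistent with the figures in the abstract.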

arXiv:2306.07841 [pdf, other]

Electrostatic moiré potential from twisted-hBN layers

Authors: Dong Seob Kim, Roy C. Dominguez, Rigo Mayorga-Luna, Dingyi Ye, Jacob Embley, Tixuan Tan, Yue Ni, Zhida Liu, Mitchell Ford, Frank Y. Gao, Saba Arash, Kenji Watanabe, Takashi Taniguchi, Suenne Kim, Chih-Kang Shih, Keji Lai, Wang Yao, Li Yang, Xiaoqin Li, Yoichi Miyahara

Abstract: Moiré superlattices formed by vertically stacking van der Waals layers host a rich variety of correlated electronic phases and function as novel photonic materials. The moiré potential of the superlattice, however, is fixed by the interlayer coupling of the stacked functional layers (e.g. graphene) and dependent on carrier types (e.g. electrons or holes) and valleys (e.g. Γ vs. K). In contrast, twisted hexagonal boron nitride (hBN) layers are predicted to impose a periodic electrostatic potential that may be used to engineer the properties of an adjacent functional thin layer. Here, we show that this potential is described by a simple theory of electric polarization originating from the interfacial charge redistribution, validated by its dependence on supercell sizes and distance from the twisted interfaces. We demonstrate that the potential depth and profile can be further controlled by assembling a double moiré structure. When the twist angles are similar at the two interfaces, the potential is deepened by adding the potential from the two twisted interfaces, reaching ~ 400 meV. When the twist angles are dissimilar at the two interfaces, multi-level polarization states are observed. As an example of controlling a functional layer, we demonstrate how the electrostatic potential from a twisted hBN substrate impedes exciton diffusion in a semiconductor monolayer. These findings suggest exciting opportunities for engineering properties of an adjacent functional layer using the surface potential of a twisted hBN substrate.

Submitted 13 June, 2023; originally announced June 2023.

arXiv:2306.06669 [pdf, other]

TransMRSR: Transformer-based Self-Distilled Generative Prior for Brain MRI Super-Resolution

Authors: Shan Huang, Xiaohong Liu, Tao Tan, Menghan Hu, Xiaoer Wei, Tingli Chen, Bin Sheng

Abstract: Magnetic resonance images (MRI) acquired with low through-plane resolution compromise time and cost. The poor resolution in one orientation is insufficient to meet the requirement of high resolution for early diagnosis of brain disease and morphometric study. The common single-image super-resolution (SISR) solutions face two main challenges: (1) local detailed and global anatomical structural information combination; and (2) large-scale restoration when applied for reconstructing thick-slice MRI into high-resolution (HR) isotropic data. To address these problems, we propose a novel two-stage network for brain MRI SR named TransMRSR based on the convolutional blocks to extract local information and transformer blocks to capture long-range dependencies. TransMRSR consists of three modules: the shallow local feature extraction, the deep non-local feature capture, and the HR image reconstruction. We perform a generative task to encapsulate diverse priors into a generative network (GAN), which is the decoder sub-module of the deep non-local feature capture part, in the first stage. The pre-trained GAN is used for the second stage of SR task. We further eliminate the potential latent space shift caused by the two-stage training strategy through the self-distilled truncation trick. The extensive experiments show that our method achieves superior performance to other SISR methods on both public and private datasets. Code is released at https://github.com/goddesshs/TransMRSR.git.

Submitted 11 June, 2023; originally announced June 2023.

Comments: 2023 CGI

arXiv:2306.06316 [pdf, other]

Optimal 1D Ly$α$ Forest Power Spectrum Estimation -- III. DESI early data

Authors: Naim Göksel Karaçaylı, Paul Martini, Julien Guy, Corentin Ravoux, Marie Lynn Abdul Karim, Eric Armengaud, Michael Walther, J. Aguilar, S. Ahlen, S. Bailey, J. Bautista, S. F. Beltran, D. Brooks, L. Cabayol-Garcia, S. Chabanier, E. Chaussidon, J. Chaves-Montero, K. Dawson, R. de la Cruz, A. de la Macorra, P. Doel, A. Font-Ribera, J. E. Forero-Romero, S. Gontcho A Gontcho, A. X. Gonzalez-Morales, et al. (37 additional authors not shown)

Abstract: The one-dimensional power spectrum $P_{\mathrm{1D}}$ of the Ly$α$ forest provides important information about cosmological and astrophysical parameters, including constraints on warm dark matter models, the sum of the masses of the three neutrino species, and the thermal state of the intergalactic medium. We present the first measurement of $P_{\mathrm{1D}}$ with the quadratic maximum likelihood estimator (QMLE) from the Dark Energy Spectroscopic Instrument (DESI) survey early data sample. This early sample of $54~600$ quasars is already comparable in size to the largest previous studies, and we conduct a thorough investigation of numerous instrumental and analysis systematic errors to evaluate their impact on DESI data with QMLE. We demonstrate the excellent performance of the spectroscopic pipeline noise estimation and the impressive accuracy of the spectrograph resolution matrix with two-dimensional image simulations of raw DESI images that we processed with the DESI spectroscopic pipeline. We also study metal line contamination and noise calibration systematics with quasar spectra on the red side of the Ly$α$ emission line. In a companion paper, we present a similar analysis based on the Fast Fourier Transform estimate of the power spectrum. We conclude with a comparison of these two approaches and implications for the upcoming DESI Year 1 analysis.

Submitted 12 January, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 23 pages, 20 figures. To be published in MNRAS

arXiv:2306.06312 [pdf, other]

The Lyman-$α$ forest catalog from the Dark Energy Spectroscopic Instrument Early Data Release

Authors: César Ramírez-Pérez, Ignasi Pérez-Ràfols, Andreu Font-Ribera, M. Abdul Karim, E. Armengaud, J. Bautista, S. F. Beltran, L. Cabayol-Garcia, Z. Cai, S. Chabanier, E. Chaussidon, J. Chaves-Montero, A. Cuceu, R. de la Cruz, J. García-Bellido, A. X. Gonzalez-Morales, C. Gordon, H. K. Herrera-Alcantar, V. Iršič, M. Ishak, N. G. Karaçaylı, Zarija Lukić, C. J. Manser, P. Montero-Camacho, L. Napolitano, et al. (45 additional authors not shown)

Abstract: We present and validate the catalog of Lyman-$α$ forest fluctuations for 3D analyses using the Early Data Release (EDR) from the Dark Energy Spectroscopic Instrument (DESI) survey. We used 88,511 quasars collected from DESI Survey Validation (SV) data and the first two months of the main survey (M2). We present several improvements to the method used to extract the Lyman-$α$ absorption fluctuations performed in previous analyses from the Sloan Digital Sky Survey (SDSS). In particular, we modify the weighting scheme and show that it can improve the precision of the correlation function measurement by more than 20%. This catalog can be downloaded from https://data.desi.lbl.gov/public/edr/vac/edr/lya/fuji/v0.3 and it will be used in the near future for the first DESI measurements of the 3D correlations in the Lyman-$α$ forest.

Submitted 25 December, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

arXiv:2306.06311 [pdf, other]

doi 10.1093/mnras/stad3008

The Dark Energy Spectroscopic Instrument: One-dimensional power spectrum from first Lyman-$α$ forest samples with Fast Fourier Transform

Authors: Corentin Ravoux, Marie Lynn Abdul Karim, Eric Armengaud, Michael Walther, Naim Göksel Karaçaylı, Paul Martini, Julien Guy, Jessica Nicole Aguilar, Steven Ahlen, Stephen Bailey, Julian Bautista, Sergio Felipe Beltran, David Brooks, Laura Cabayol-Garcia, Solène Chabanier, Edmond Chaussidon, Jonás Chaves-Montero, Kyle Dawson, Rodrigo de la Cruz, Axel de la Macorra, Peter Doel, Kevin Fanning, Andreu Font-Ribera, Jaime Forero-Romero, Satya Gontcho A Gontcho, et al. (41 additional authors not shown)

Abstract: We present the one-dimensional Lyman-$α$ forest power spectrum measurement using the first data provided by the Dark Energy Spectroscopic Instrument (DESI). The data sample comprises $26,330$ quasar spectra, at redshift $z > 2.1$, contained in the DESI Early Data Release and the first two months of the main survey. We employ a Fast Fourier Transform (FFT) estimator and compare the resulting power spectrum to an alternative likelihood-based method in a companion paper. We investigate methodological and instrumental contaminants associated with the new DESI instrument, applying techniques similar to previous Sloan Digital Sky Survey (SDSS) measurements. We use synthetic data based on log-normal approximation to validate and correct our measurement. We compare our resulting power spectrum with previous SDSS and high-resolution measurements. With relatively small number statistics, we successfully perform the FFT measurement, which is already competitive in terms of the scale range. At the end of the DESI survey, we expect a five times larger Lyman-$α$ forest sample than SDSS, providing an unprecedentedly precise one-dimensional power spectrum measurement.

Submitted 24 October, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 23 pages, 23 figures, Journal version

Journal ref: MNRAS, Volume 526, Issue 4, December 2023, Pages 5118-5140

arXiv:2306.06308 [pdf, other]

doi 10.5281/zenodo.7964161

The Early Data Release of the Dark Energy Spectroscopic Instrument

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, G. Aldering, D. M. Alexander, R. Alfarsy, C. Allende Prieto, M. Alvarez, O. Alves, A. Anand, F. Andrade-Oliveira, E. Armengaud, J. Asorey, S. Avila, A. Aviles, S. Bailey, A. Balaguera-Antolínez, O. Ballester, C. Baltay, A. Bault, J. Bautista, J. Behera, S. F. Beltran, et al. (240 additional authors not shown)

Abstract: The Dark Energy Spectroscopic Instrument (DESI) completed its five-month Survey Validation in May 2021. Spectra of stellar and extragalactic targets from Survey Validation constitute the first major data sample from the DESI survey. This paper describes the public release of those spectra, the catalogs of derived properties, and the intermediate data products. In total, the public release includes good-quality spectral information from 466,447 objects targeted as part of the Milky Way Survey, 428,758 as part of the Bright Galaxy Survey, 227,318 as part of the Luminous Red Galaxy sample, 437,664 as part of the Emission Line Galaxy sample, and 76,079 as part of the Quasar sample. In addition, the release includes spectral information from 137,148 objects that expand the scope beyond the primary samples as part of a series of secondary programs. Here, we describe the spectral data, data quality, data products, Large-Scale Structure science catalogs, access to the data, and references that provide relevant background to using these spectra.

Submitted 15 June, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 43 pages, 7 figures, 17 tables, submitted to AJ, DESI EDR references added

arXiv:2306.06307 [pdf, other]

doi 10.5281/zenodo.7858207

Validation of the Scientific Program for the Dark Energy Spectroscopic Instrument

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, G. Aldering, D. M. Alexander, R. Alfarsy, C. Allende Prieto, M. Alvarez, O. Alves, A. Anand, F. Andrade-Oliveira, E. Armengaud, J. Asorey, S. Avila, A. Aviles, S. Bailey, A. Balaguera-Antolínez, O. Ballester, C. Baltay, A. Bault, J. Bautista, J. Behera, S. F. Beltran, et al. (239 additional authors not shown)

Abstract: The Dark Energy Spectroscopic Instrument (DESI) was designed to conduct a survey covering 14,000 deg$^2$ over five years to constrain the cosmic expansion history through precise measurements of Baryon Acoustic Oscillations (BAO). The scientific program for DESI was evaluated during a five-month Survey Validation (SV) campaign before beginning full operations. This program produced deep spectra of tens of thousands of objects from each of the stellar (MWS), bright galaxy (BGS), luminous red galaxy (LRG), emission line galaxy (ELG), and quasar target classes. These SV spectra were used to optimize redshift distributions, characterize exposure times, determine calibration procedures, and assess observational overheads for the five-year program. In this paper, we present the final target selection algorithms, redshift distributions, and projected cosmology constraints resulting from those studies. We also present a 'One-Percent survey' conducted at the conclusion of Survey Validation covering 140 deg$^2$ using the final target selection algorithms with exposures of a depth typical of the main survey. The Survey Validation indicates that DESI will be able to complete the full 14,000 deg$^2$ program with spectroscopically-confirmed targets from the MWS, BGS, LRG, ELG, and quasar programs with total sample sizes of 7.2, 13.8, 7.46, 15.7, and 2.87 million, respectively. These samples will allow exploration of the Milky Way halo, clustering on all scales, and BAO measurements with a statistical precision of 0.28% over the redshift interval $z<1.1$, 0.39% over the redshift interval $1.1<z<1.9$, and 0.46% over the redshift interval $1.9<z<3.5$.

Submitted 12 January, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 42 pages, 18 figures, accepted by AJ

arXiv:2305.15565 [pdf, other]

doi 10.1051/0004-6361/202244617

TOI-1130: A photodynamical analysis of a hot Jupiter in resonance with an inner low-mass planet

Authors: J. Korth, D. Gandolfi, J. Šubjak, S. Howard, S. Ataiee, K. A. Collins, S. N. Quinn, A. J. Mustill, T. Guillot, N. Lodieu, A. M. S. Smith, M. Esposito, F. Rodler, A. Muresan, L. Abe, S. H. Albrecht, A. Alqasim, K. Barkaoui, P. G. Beck, C. J. Burke, R. P. Butler, D. M. Conti, K. I. Collins, J. D. Crane, F. Dai, et al. (37 additional authors not shown)

Abstract: TOI-1130 is a known planetary system around a K-dwarf consisting of a gas giant planet, TOI-1130 c, on an 8.4-day orbit, accompanied by an inner Neptune-sized planet, TOI-1130 b, with an orbital period of 4.1 days. We collected precise radial velocity (RV) measurements of TOI-1130 with the HARPS and PFS spectrographs as part of our ongoing RV follow-up program. We perform photodynamical modeling of the HARPS and PFS RVs, and transit photometry from the Transiting Exoplanet Survey Satellite (TESS) and the TESS Follow-up Observing Program. We determine the planet masses and radii of TOI-1130 b and TOI-1130 c to be Mb = 19.28 $\pm$ 0.97 M$_\oplus$ and Rb = 3.56 $\pm$ 0.13 R$_\oplus$, and Mc = 325.59 $\pm$ 5.59 M$_\oplus$ and Rc = 13.32$^{+1.55}_{-1.41}$ R$_\oplus$, respectively. We spectroscopically confirm TOI-1130 b, which was previously only validated. We find that the two planets orbit with small eccentricities in a 2:1 resonant configuration. This is the first known system with a hot Jupiter and an inner lower mass planet locked in a mean-motion resonance. TOI-1130 belongs to the small yet increasing population of hot Jupiters with an inner low-mass planet that challenges the pathway for hot Jupiter formation. We also detect a linear RV trend possibly due to the presence of an outer massive companion.

Submitted 24 May, 2023; originally announced May 2023.

Comments: 19 pages, Accepted to A&A

Journal ref: A&A 675, A115 (2023)
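
As a rough illustration of the 2:1 configuration described in the abstract above, the sketch below computes the period ratio and its fractional offset from exact commensurability, using only the rounded periods quoted there (4.1 and 8.4 days). The helper name `resonance_offset` is hypothetical. Closeness of the period ratio to 2 does not by itself establish a mean-motion resonance; the abstract's claim rests on the photodynamical analysis, which this sketch does not reproduce.

```python
def resonance_offset(p_inner, p_outer, p=2, q=1):
    """Fractional offset of the period ratio from an exact p:q commensurability.

    A value of 0 means P_outer / P_inner equals p / q exactly.
    """
    return (p_outer / p_inner) / (p / q) - 1.0


# Orbital periods in days, rounded as quoted in the abstract.
period_b = 4.1  # inner Neptune-sized planet, TOI-1130 b
period_c = 8.4  # outer gas giant, TOI-1130 c

ratio = period_c / period_b
delta = resonance_offset(period_b, period_c)
print(f"period ratio = {ratio:.3f}, offset from 2:1 = {delta:+.1%}")
```

With these rounded inputs the ratio sits a few percent above 2; more precise periods would be needed for anything beyond a sanity check.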

arXiv:2305.12228 [pdf, other]

Dynamic Transformers Provide a False Sense of Efficiency

Authors: Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, Robby T. Tan, Haizhou Li

Abstract: Despite much success in natural language processing (NLP), pre-trained language models typically lead to a high computational cost during inference. Multi-exit is a mainstream approach to address this issue by making a trade-off between efficiency and accuracy, where the saving of computation comes from an early exit. However, whether such saving from early-exiting is robust remains unknown. Motivated by this, we first show that directly adapting existing adversarial attack approaches targeting model accuracy cannot significantly reduce inference efficiency. To this end, we propose a simple yet effective attacking framework, SAME, a novel slowdown attack framework on multi-exit models, which is specially tailored to reduce the efficiency of the multi-exit models. By leveraging the multi-exit models' design characteristics, we utilize all internal predictions to guide the adversarial sample generation instead of merely considering the final prediction. Experiments on the GLUE benchmark show that SAME can effectively diminish the efficiency gain of various multi-exit models by 80% on average, convincingly validating its effectiveness and generalization ability.

Submitted 20 May, 2023; originally announced May 2023.

Comments: Accepted by ACL2023

arXiv:2305.11522 [pdf, other]

DSFNet: Dual Space Fusion Network for Occlusion-Robust 3D Dense Face Alignment

Authors: Heyuan Li, Bo Wang, Yu Cheng, Mohan Kankanhalli, Robby T. Tan

Abstract: Sensitivity to severe occlusion and large view angles limits the usage scenarios of the existing monocular 3D dense face alignment methods. The state-of-the-art 3DMM-based method directly regresses the model's coefficients, underutilizing the low-level 2D spatial and semantic information, which can actually offer cues for face shape and orientation. In this work, we demonstrate how modeling 3D facial geometry in image and model space jointly can solve the occlusion and view angle problems. Instead of predicting the whole face directly, we regress image space features in the visible facial region by dense prediction first. Subsequently, we predict our model's coefficients based on the regressed feature of the visible regions, leveraging the prior knowledge of whole face geometry from the morphable models to complete the invisible regions. We further propose a fusion network that combines the advantages of both the image and model space predictions to achieve high robustness and accuracy in unconstrained scenarios. Thanks to the proposed fusion module, our method is robust not only to occlusion and large pitch and roll view angles, which is the benefit of our image space approach, but also to noise and large yaw angles, which is the benefit of our model space method. Comprehensive evaluations demonstrate the superior performance of our method compared with the state-of-the-art methods. On the 3D dense face alignment task, we achieve 3.80% NME on the AFLW2000-3D dataset, which outperforms the state-of-the-art method by 5.5%. Code is available at https://github.com/lhyfst/DSFNet.

Submitted 19 May, 2023; originally announced May 2023.

Comments: Accepted into CVPR'23

arXiv:2305.09512 [pdf, other]

Light-VQA: A Multi-Dimensional Quality Assessment Model for Low-Light Video Enhancement

Authors: Yunlong Dong, Xiaohong Liu, Yixuan Gao, Xunchu Zhou, Tao Tan, Guangtao Zhai

Abstract: Recently, User-Generated Content (UGC) videos have become ubiquitous in our daily lives. However, due to the limitations of photographic equipment and techniques, UGC videos often contain various degradations, of which one of the most visually unfavorable is underexposure. Therefore, corresponding video enhancement algorithms such as Low-Light Video Enhancement (LLVE) have been proposed to deal with this specific degradation. However, unlike video enhancement algorithms, almost all existing Video Quality Assessment (VQA) models are built generally rather than specifically, measuring the quality of a video from a comprehensive perspective. To the best of our knowledge, there is no VQA model specially designed for videos enhanced by LLVE algorithms. To this end, we first construct a Low-Light Video Enhancement Quality Assessment (LLVE-QA) dataset in which 254 original low-light videos are collected and then enhanced by leveraging 8 LLVE algorithms to obtain 2,060 videos in total. Moreover, we propose a quality assessment model specialized in LLVE, named Light-VQA. More concretely, since brightness and noise have the most impact on low-light enhanced VQA, we handcraft corresponding features and integrate them with deep-learning-based semantic features as the overall spatial information. As for temporal information, in addition to deep-learning-based motion features, we also investigate the handcrafted brightness consistency among video frames, and the overall temporal information is their concatenation. Subsequently, spatial and temporal information is fused to obtain the quality-aware representation of a video. Extensive experimental results show that our Light-VQA achieves the best performance against the current State-Of-The-Art (SOTA) on LLVE-QA and a public dataset. Dataset and code can be found at https://github.com/wenzhouyidu/Light-VQA.

Submitted 6 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

arXiv:2305.05548 [pdf, ps, other]

CIT-EmotionNet: CNN Interactive Transformer Network for EEG Emotion Recognition

Authors: Wei Lu, Hua Ma, Tien-Ping Tan

Abstract: Emotion recognition using Electroencephalogram (EEG) signals has emerged as a significant research challenge in affective computing and intelligent interaction. However, effectively combining global and local features of EEG signals to improve performance in emotion recognition is still a difficult task. In this study, we propose a novel CNN Interactive Transformer Network for EEG Emotion Recognition, known as CIT-EmotionNet, which efficiently integrates global and local features of EEG signals. Initially, we convert raw EEG signals into spatial-frequency representations, which serve as inputs. Then, we integrate Convolutional Neural Network (CNN) and Transformer within a single framework in a parallel manner. Finally, we design a CNN interactive Transformer module, which facilitates the interaction and fusion of local and global features, thereby enhancing the model's ability to extract both types of features from EEG spatial-frequency representations. The proposed CIT-EmotionNet outperforms state-of-the-art methods, achieving an average recognition accuracy of 98.57% and 92.09% on two publicly available datasets, SEED and SEED-IV, respectively.

Submitted 7 May, 2023; originally announced May 2023.

Comments: 10 pages, 3 tables

arXiv:2304.13583 [pdf, other]

Multi-Modality Deep Network for Extreme Learned Image Compression

Authors: Xuhao Jiang, Weimin Tan, Tian Tan, Bo Yan, Liquan Shen

Abstract: Image-based single-modality compression learning approaches have demonstrated exceptionally powerful encoding and decoding capabilities in the past few years, but suffer from blur and severe semantics loss at extremely low bitrates. To address this issue, we propose a multimodal machine learning method for text-guided image compression, in which the semantic information of text is used as prior information to guide image compression for better compression performance. We fully study the role of text description in different components of the codec, and demonstrate its effectiveness. In addition, we adopt the image-text attention module and image-request complement module to better fuse image and text features, and propose an improved multimodal semantic-consistent loss to produce semantically complete reconstructions. Extensive experiments, including a user study, prove that our method can obtain visually pleasing results at extremely low bitrates, and achieves a comparable or even better performance than state-of-the-art methods, even though these methods are at 2x to 4x bitrates of ours.

Submitted 26 April, 2023; originally announced April 2023.

Comments: 13 pages, 14 figures, accepted by AAAI 2023

arXiv:2304.13493 [pdf]

Towards clinical AI fairness: A translational perspective

Authors: Mingxuan Liu, Yilin Ning, Salinelat Teixayavong, Mayli Mertens, Jie Xu, Daniel Shu Wei Ting, Lionel Tim-Ee Cheng, Jasmine Chiat Ling Ong, Zhen Ling Teo, Ting Fang Tan, Ravi Chandran Narrendar, Fei Wang, Leo Anthony Celi, Marcus Eng Hock Ong, Nan Liu

Abstract: Artificial intelligence (AI) has demonstrated the ability to extract insights from data, but the issue of fairness remains a concern in high-stakes fields such as healthcare. Despite extensive discussion and efforts in algorithm development, AI fairness and clinical concerns have not been adequately addressed. In this paper, we discuss the misalignment between technical and clinical perspectives of AI fairness, highlight the barriers to AI fairness' translation to healthcare, advocate multidisciplinary collaboration to bridge the knowledge gap, and provide possible solutions to address the clinical concerns pertaining to AI fairness.

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2304.12566 [pdf, other]

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation

Authors: Yi-Fan Zhang, Xue Wang, Kexin Jin, Kun Yuan, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

Abstract: Many recent machine learning tasks focus on developing models that can generalize to unseen distributions. Domain generalization (DG) has become one of the key topics in various fields. Several studies show that DG can be arbitrarily hard without exploiting target domain information. To address this issue, test-time adaptive (TTA) methods have been proposed. Existing TTA methods require offline target data or extra sophisticated optimization procedures during the inference stage. In this work, we adopt a Non-Parametric Classifier to perform test-time adaptation (AdaNPC). In particular, we construct a memory that contains the feature and label pairs from training domains. During inference, given a test instance, AdaNPC first recalls the K closest samples from the memory to vote for the prediction, and then the test feature and predicted label are added to the memory. In this way, the sample distribution in the memory can be gradually changed from the training distribution towards the test distribution with very little extra computation cost. We theoretically justify the rationality behind the proposed method. In addition, we test our model in extensive numerical experiments. AdaNPC significantly outperforms competitive baselines on various DG benchmarks. In particular, when the adaptation target is a series of domains, the adaptation accuracy of AdaNPC is 50% higher than advanced TTA methods. The code is available at https://github.com/yfzhang114/AdaNPC.

Submitted 9 May, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

Comments: 30 pages, 12 figures

Journal ref: The Fortieth International Conference on Machine Learning, ICML, 2023
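
The inference loop described in the AdaNPC abstract (recall K nearby samples from a memory, vote, then store the test feature with its predicted label) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the class name `KNNMemoryClassifier`, the Euclidean distance, the plain majority vote, and the toy 2-D features are all assumptions made for the example.

```python
import math
from collections import Counter


class KNNMemoryClassifier:
    """Toy non-parametric classifier with a growing memory (hypothetical sketch)."""

    def __init__(self, features, labels, k=3):
        # Memory of (feature, label) pairs taken from the training domains.
        self.features = [tuple(f) for f in features]
        self.labels = list(labels)
        self.k = k

    def predict_and_update(self, x):
        x = tuple(x)
        # Rank memory entries by Euclidean distance to the test feature
        # (the distance metric is an assumption, not taken from the abstract).
        order = sorted(
            range(len(self.features)),
            key=lambda i: math.dist(self.features[i], x),
        )
        nearest = order[: self.k]
        # Majority vote among the K nearest stored labels.
        votes = Counter(self.labels[i] for i in nearest)
        pred = votes.most_common(1)[0][0]
        # Add the test feature and its predicted label to the memory, so the
        # stored distribution drifts towards the test distribution over time.
        self.features.append(x)
        self.labels.append(pred)
        return pred


memory = KNNMemoryClassifier(
    features=[(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)],
    labels=["a", "a", "a", "b", "b", "b"],
    k=3,
)
first = memory.predict_and_update((0.2, 0.2))
second = memory.predict_and_update((0.9, 0.9))
print(first, second, len(memory.features))
```

Because every prediction is written back into the memory, a wrong early prediction can influence later votes; the sketch makes no attempt to handle that.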

arXiv:2304.12034 [pdf]

Context Sensitivity without Contexts: A Cut-Shortcut Approach to Fast and Precise Pointer Analysis

Authors: Wenjie Ma, Shengyuan Yang, Tian Tan, Xiaoxing Ma, Chang Xu, Yue Li

Abstract: Over the past decades, context sensitivity has been considered as one of the most effective ideas for improving the precision of pointer analysis for Java. However, despite great precision benefits, as each method is equivalently cloned and analyzed under each context, context sensitivity brings heavy efficiency costs. In this work, we present a fundamentally different approach called Cut-Shortcut for fast and precise pointer analysis for Java. Its insight is simple: the main effect of cloning methods under different contexts is to filter spurious object flows that have been merged inside a callee method; from the view of a typical pointer flow graph (PFG), such effect can be simulated by cutting off (Cut) the edges that introduce precision loss to certain pointers and adding Shortcut edges directly from source pointers to the target ones circumventing the method on PFG. As a result, we can achieve the effect of context sensitivity without contexts. We identify three general program patterns and develop algorithms based on them to safely cut off and add shortcut edges on PFG, formalize them and formally prove the soundness. To comprehensively validate Cut-Shortcut's effectiveness, we implement two versions of Cut-Shortcut for two state-of-the-art pointer analysis frameworks for Java, one in Datalog for the declarative Doop and the other in Java for the imperative Tai-e, and we consider all the large and complex programs used in recent literature that meet the experimental requirements. The evaluation results are extremely promising: Cut-Shortcut is even able to run faster than context insensitivity for most evaluated programs while obtaining high precision that is comparable to context sensitivity (if scalable) in both frameworks. This is the first time that we have been able to achieve such a good efficiency and precision trade-off for those hard-to-analyze programs.

Submitted 24 April, 2023; originally announced April 2023.

Comments: Accepted paper at PLDI 2023

arXiv:2304.08078 [pdf, other]

Collaborative Feature Learning for Fine-grained Facial Forgery Detection and Segmentation

Authors: Weinan Guan, Wei Wang, Jing Dong, Bo Peng, Tieniu Tan

Abstract: Detecting maliciously falsified facial images and videos has attracted extensive attention from digital-forensics and computer-vision communities. An important topic in manipulation detection is the localization of the fake regions. Previous work related to forgery detection mostly focuses on entire faces. However, recent forgery methods have developed to edit important facial components while leaving others unchanged. This drives us to focus not only on forgery detection but also on fine-grained falsified region segmentation. In this paper, we propose a collaborative feature learning approach to simultaneously detect manipulation and segment the falsified components. In this collaborative manner, detection and segmentation can boost each other efficiently. To enable our study of forgery detection and segmentation, we build a facial forgery dataset consisting of both entire and partial face forgeries with their pixel-level manipulation ground-truth. Experimental results have justified the mutual promotion between forgery detection and manipulated region segmentation. The overall performance of the proposed approach is better than the state-of-the-art detection or segmentation approaches. The visualization results have shown that our proposed model always captures the artifacts on facial regions, which is more reasonable.

Submitted 17 April, 2023; originally announced April 2023.

arXiv:2304.07239

Separating Key Agreement and Computational Differential Privacy

Authors: Vipul Arora, Eldon Chung, Zeyong Li, Thomas Tan

Abstract: Two-party differential privacy allows two parties who do not trust each other to come together and perform a joint analysis on their data whilst maintaining individual-level privacy. We show that any efficient, computationally differentially private protocol that has black-box access to key agreement (and nothing stronger) is also an efficient, information-theoretically differentially private protocol. In other words, the existence of efficient key agreement protocols is insufficient for efficient, computationally differentially private protocols. In doing so, we make progress in answering an open question posed by Vadhan about the minimal computational assumption needed for computational differential privacy. Combined with the information-theoretic lower bound due to McGregor, Mironov, Pitassi, Reingold, Talwar, and Vadhan in [FOCS'10], we show that there is no fully black-box reduction from efficient, computationally differentially private protocols for computing the Hamming distance (or equivalently inner product over the integers) on $n$ bits, with additive error lower than $O\left(\frac{\sqrt{n}}{e^ε\log(n)}\right)$, to key agreement. This complements the result by Haitner, Mazor, Silbak, and Tsfadia in [STOC'22], which showed that computing the Hamming distance implies key agreement. We conclude that key agreement is strictly weaker than computational differential privacy for computing the inner product, thereby answering their open question on whether key agreement is sufficient.

Submitted 28 August, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

Comments: A key step in relating the probability that can be computed by the PSPACE algorithm to the statistical distinguishing probability is missing and not yet shown. Our arguments in this work so far have not yet been able to show this step. Thus the final conclusion that key agreement is black-box insufficient for CDP is not yet proven

arXiv:2304.06655 [pdf, other]

doi 10.1051/0004-6361/202345961

TOI-733 b: a planet in the small-planet radius valley orbiting a Sun-like star

Authors: Iskra Y. Georgieva, Carina M. Persson, Elisa Goffo, Lorena Acuña, Artyom Aguichine, Luisa M. Serrano, Kristine W. F. Lam, Davide Gandolfi, Karen A. Collins, Steven B. Howell, Fei Dai, Malcolm Fridlund, Judith Korth, Magali Deleuil, Oscar Barragán, William D. Cochran, Szilárd Csizmadia, Hans J. Deeg, Eike Guenther, Artie P. Hatzes, Jon M. Jenkins, John Livingston, Rafael Luque, Olivier Mousis, Hannah L. M. Osborne, et al. (18 additional authors not shown)

Abstract: We report the discovery of a hot ($T_{\rm eq} \approx 1055$ K) planet in the small planet radius valley transiting the Sun-like star TOI-733, as part of the KESPRINT follow-up program of TESS planets carried out with the HARPS spectrograph. TESS photometry from sectors 9 and 36 yields an orbital period of $P_{\rm orb} = 4.884765_{-2.4e-5}^{+1.9e-5}$ days and a radius of $R_{\mathrm{p}} = 1.992_{-0.090}^{+0.085}$ $R_{\oplus}$. Multi-dimensional Gaussian process modelling of the radial velocity measurements from HARPS and activity indicators gives a semi-amplitude of $K = 2.23 \pm 0.26$ m s$^{-1}$, translating into a planet mass of $M_{\mathrm{p}} = 5.72_{-0.68}^{+0.70}$ $M_{\oplus}$. These parameters imply that the planet is of moderate density ($\rho_\mathrm{p} = 3.98_{-0.66}^{+0.77}$ g cm$^{-3}$) and place it in the transition region between rocky and volatile-rich planets with H/He-dominated envelopes on the mass-radius diagram. Combining these with stellar parameters and abundances, we calculate planet interior and atmosphere models, which in turn suggest that TOI-733 b has a volatile-enriched, most likely secondary outer envelope, and may represent a highly irradiated ocean world - one of only a few such planets around G-type stars that are well-characterised.

Submitted 26 April, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: Accepted for publication in A&A

Journal ref: A&A 674, A117 (2023)
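
As a rough consistency check on the bulk density quoted in the TOI-733 b abstract, one can scale Earth's mean density by the central mass and radius values. This assumes a spherical planet and $\rho_{\oplus} \approx 5.51$ g cm$^{-3}$, a value that does not appear in the abstract:

$$\rho_\mathrm{p} \approx \rho_{\oplus}\,\frac{M_\mathrm{p}/M_{\oplus}}{\left(R_\mathrm{p}/R_{\oplus}\right)^{3}} = 5.51 \times \frac{5.72}{1.992^{3}} \approx 5.51 \times \frac{5.72}{7.904} \approx 3.99\ \mathrm{g\,cm^{-3}}$$

This agrees with the quoted $3.98_{-0.66}^{+0.77}$ g cm$^{-3}$ well within its uncertainty. The small offset is presumably because the published value is derived from the full posterior distribution and not from the central values alone.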

arXiv:2303.17480 [pdf, other]

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Authors: Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li

Abstract: Talking face generation, also known as speech-to-lip generation, reconstructs facial motions concerning lips given coherent speech input. The previous studies revealed the importance of lip-speech synchronization and visual quality. Despite much progress, they hardly focus on the content of lip movements i.e., the visual intelligibility of the spoken words, which is an important aspect of generation quality. To address the problem, we propose using a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing the incorrect generation results. Moreover, to compensate for data scarcity, we train the lip-reading expert in an audio-visual self-supervised manner. With a lip-reading expert, we propose a novel contrastive learning to enhance lip-speech synchronization, and a transformer to encode audio synchronically with video, while considering global temporal dependency of audio. For evaluation, we propose a new strategy with two different lip-reading experts to measure intelligibility of the generated videos. Rigorous experiments show that our proposal is superior to other State-of-the-art (SOTA) methods, such as Wav2Lip, in reading intelligibility i.e., over 38% Word Error Rate (WER) on LRS2 dataset and 27.8% accuracy on LRW dataset. We also achieve the SOTA performance in lip-speech synchronization and comparable performances in visual quality.

Submitted 29 March, 2023; originally announced March 2023.

Comments: accepted by CVPR 2023

arXiv:2303.15361 [pdf, other]

A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts

Authors: Jian Liang, Ran He, Tieniu Tan

Abstract: Machine learning methods strive to acquire a robust model during training that can generalize well to test samples, even under distribution shifts. However, these methods often suffer from a performance drop due to unknown test distributions. Test-time adaptation (TTA), an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions. Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference. In this survey, we divide TTA into several distinct categories, namely, test-time (source-free) domain adaptation, test-time batch adaptation, online test-time adaptation, and test-time prior adaptation. For each category, we provide a comprehensive taxonomy of advanced algorithms, followed by a discussion of different learning scenarios. Furthermore, we analyze relevant applications of TTA and discuss open challenges and promising areas for future research. A comprehensive list of TTA methods can be found at https://github.com/tim-learn/awesome-test-time-adaptation.

Submitted 27 March, 2023; originally announced March 2023.

Comments: Discussions, comments, and questions are all welcome at https://github.com/tim-learn/awesome-test-time-adaptation

arXiv:2303.14123 [pdf, other]

Semantic Prompt for Few-Shot Image Recognition

Authors: Wentao Chen, Chenyang Si, Zhang Zhang, Liang Wang, Zilei Wang, Tieniu Tan

Abstract: Few-shot learning is a challenging problem since only a few examples are provided to recognize a new class. Several recent studies exploit additional semantic information, e.g. text embeddings of class names, to address the issue of rare samples through combining semantic prototypes with visual prototypes. However, these methods still suffer from the spurious visual features learned from the rare support samples, resulting in limited benefits. In this paper, we propose a novel Semantic Prompt (SP) approach for few-shot learning. Instead of the naive exploitation of semantic information for remedying classifiers, we explore leveraging semantic information as prompts to tune the visual feature extraction network adaptively. Specifically, we design two complementary mechanisms to insert semantic prompts into the feature extractor: one is to enable the interaction between semantic prompts and patch embeddings along the spatial dimension via self-attention, another is to supplement visual features with the transformed semantic prompts along the channel dimension. By combining these two mechanisms, the feature extractor presents a better ability to attend to the class-specific features and obtains more generalized image representations with merely a few support samples. Through extensive experiments on four datasets, the proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.

Submitted 24 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR 2023

arXiv:2303.13853 [pdf, other]

2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection

Authors: Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli, Robby T. Tan

Abstract: Object detection at night is a challenging problem due to the absence of night image annotations. Despite several domain adaptation methods, achieving high-precision results remains an issue. False-positive error propagation is still observed in methods using the well-established student-teacher framework, particularly for small-scale and low-light objects. This paper proposes a two-phase consistency unsupervised domain adaptation network, 2PCNet, to address these issues. The network employs high-confidence bounding-box predictions from the teacher in the first phase and appends them to the student's region proposals for the teacher to re-evaluate in the second phase, resulting in a combination of high and low confidence pseudo-labels. The night images and pseudo-labels are scaled-down before being used as input to the student, providing stronger small-scale pseudo-labels. To address errors that arise from low-light regions and other night-related attributes in images, we propose a night-specific augmentation pipeline called NightAug. This pipeline involves applying random augmentations, such as glare, blur, and noise, to daytime images. Experiments on publicly available datasets demonstrate that our method achieves superior results to state-of-the-art methods by 20%, and to supervised models trained directly on the target data.

Submitted 24 March, 2023; originally announced March 2023.

Comments: Accepted into CVPR'23

arXiv:2303.10876 [pdf, other]

EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning

Authors: Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Yu Guang Wang, Xinchao Wang, Yanfeng Wang

Abstract: Learning to predict agent motions with relationship reasoning is important for many applications. In motion prediction tasks, maintaining motion equivariance under Euclidean geometric transformations and invariance of agent interaction is a critical and fundamental principle. However, such equivariance and invariance properties are overlooked by most existing methods. To fill this gap, we propose EqMotion, an efficient equivariant motion prediction model with invariant interaction reasoning. To achieve motion equivariance, we propose an equivariant geometric feature learning module to learn a Euclidean transformable feature through dedicated designs of equivariant operations. To reason about agents' interactions, we propose an invariant interaction reasoning module to achieve more stable interaction modeling. To further promote more comprehensive motion features, we propose an invariant pattern feature learning module to learn an invariant pattern feature, which cooperates with the equivariant geometric feature to enhance network expressiveness. We conduct experiments for the proposed model on four distinct scenarios: particle dynamics, molecule dynamics, human skeleton motion prediction and pedestrian trajectory prediction. Experimental results show that our method is not only generally applicable, but also achieves state-of-the-art prediction performances on all four tasks, improving by 24.0/30.1/8.6/9.2%. Code is available at https://github.com/MediaBrain-SJTU/EqMotion.

Submitted 27 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Accepted to CVPR 2023

arXiv:2303.10594 [pdf, other]

AdaptGuard: Defending Against Universal Attacks for Model Adaptation

Authors: Lijun Sheng, Jian Liang, Ran He, Zilei Wang, Tieniu Tan

Abstract: Model adaptation aims at solving the domain transfer problem under the constraint of only accessing the pretrained source models. With the increasing considerations of data privacy and transmission efficiency, this paradigm has been gaining recent popularity. This paper studies the vulnerability to universal attacks transferred from the source domain during model adaptation algorithms due to the existence of malicious providers. We explore both universal adversarial perturbations and backdoor attacks as loopholes on the source side and discover that they still survive in the target models after adaptation. To address this issue, we propose a model preprocessing framework, named AdaptGuard, to improve the security of model adaptation algorithms. AdaptGuard avoids direct use of the risky source parameters through knowledge distillation and utilizes the pseudo adversarial samples under adjusted radius to enhance the robustness. AdaptGuard is a plug-and-play module that requires neither robust pretrained models nor any changes for the following model adaptation algorithms. Extensive results on three commonly used datasets and two popular adaptation methods validate that AdaptGuard can effectively defend against universal attacks and maintain clean accuracy in the target domain simultaneously. We hope this research will shed light on the safety and robustness of transfer learning. Code is available at https://github.com/TomSheng21/AdaptGuard.

Submitted 27 November, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

Comments: ICCV2023

arXiv:2303.09849 [pdf, other]

Exploiting Semantic Attributes for Transductive Zero-Shot Learning

Authors: Zhengbo Wang, Jian Liang, Zilei Wang, Tieniu Tan

Abstract: Zero-shot learning (ZSL) aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes. A recent paradigm called transductive zero-shot learning further leverages unlabeled unseen data during training and has obtained impressive results. These methods always synthesize unseen features from attributes through a generative adversarial network to mitigate the bias towards seen classes. However, they neglect the semantic information in the unlabeled unseen data and thus fail to generate high-fidelity attribute-consistent unseen features. To address this issue, we present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process. In particular, we first train an attribute decoder that learns the mapping from visual features to semantic attributes. Then, from the attribute decoder, we obtain pseudo-attributes of unlabeled data and integrate them into the generative model, which helps capture the detailed differences within unseen classes so as to synthesize more discriminative features. Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.

Submitted 17 March, 2023; originally announced March 2023.

Comments: ICASSP 2023

arXiv:2303.08320 [pdf, other]

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

Authors: Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

Abstract: A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distribution. Despite its recent success in image synthesis, applying DPMs to video generation is still challenging due to high-dimensional data spaces. Previous methods usually adopt a standard diffusion process, where frames in the same video clip are destroyed with independent noises, ignoring the content redundancy and temporal correlation. This work presents a decomposed diffusion process via resolving the per-frame noise into a base noise that is shared among all frames and a residual noise that varies along the time axis. The denoising pipeline employs two jointly-learned networks to match the noise decomposition accordingly. Experiments on various datasets confirm that our approach, termed as VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation. We further show that our decomposed formulation can benefit from pre-trained image diffusion models and well-support text-conditioned video creation.

Submitted 12 October, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: Accepted to CVPR2023

arXiv:2303.06285 [pdf, other]

DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

Authors: Yueming Lyu, Tianwei Lin, Fu Li, Dongliang He, Jing Dong, Tieniu Tan

Abstract: Text-driven image manipulation remains challenging in training or inference flexibility. Conditional generative models depend heavily on expensive annotated training data. Meanwhile, recent frameworks, which leverage pre-trained vision-language models, are limited by either per text-prompt optimization or inference-time hyper-parameters tuning. In this work, we propose a novel framework named DeltaEdit to address these problems. Our key idea is to investigate and identify a space, namely delta image and text space that has well-aligned distribution between CLIP visual feature differences of two images and CLIP textual embedding differences of source and target texts. Based on the CLIP delta space, the DeltaEdit network is designed to map the CLIP visual features differences to the editing directions of StyleGAN at training phase. Then, in inference phase, DeltaEdit predicts the StyleGAN's editing directions from the differences of the CLIP textual features. In this way, DeltaEdit is trained in a text-free manner. Once trained, it can well generalize to various text prompts for zero-shot inference without bells and whistles. Code is available at https://github.com/Yueming6568/DeltaEdit.

Submitted 10 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR2023. Code is available at https://github.com/Yueming6568/DeltaEdit

arXiv:2303.05171 [pdf, other]

RiDDLE: Reversible and Diversified De-identification with Latent Encryptor

Authors: Dongze Li, Wei Wang, Kang Zhao, Jing Dong, Tieniu Tan

Abstract: This work presents RiDDLE, short for Reversible and Diversified De-identification with Latent Encryptor, to protect the identity information of people from being misused. Built upon a pre-learned StyleGAN2 generator, RiDDLE manages to encrypt and decrypt the facial identity within the latent space. The design of RiDDLE has three appealing properties. First, the encryption process is cipher-guided and hence allows diverse anonymization using different passwords. Second, the true identity can only be decrypted with the correct password, otherwise the system will produce another de-identified face to maintain the privacy. Third, both encryption and decryption share an efficient implementation, benefiting from a carefully tailored lightweight encryptor. Comparisons with existing alternatives confirm that our approach accomplishes the de-identification task with better quality, higher diversity, and stronger reversibility. We further demonstrate the effectiveness of RiDDLE in anonymizing videos. Code and models will be made publicly available.

Submitted 23 April, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR 2023

arXiv:2303.03721 [pdf, other]

doi 10.1103/PhysRevA.107.042809

Precision theoretical determination of electric-dipole matrix elements in atomic cesium

Authors: H. B. Tran Tan, A. Derevianko

Abstract: We compute the reduced electric-dipole matrix elements $\langle{nS_{1/2}}||D||{n'P_J}\rangle$ with $n=6,7$ and $n'=6,7,\ldots,12$ in cesium using the most complete to date ab initio relativistic coupled-cluster method which includes singles, doubles, perturbative core triples, and valence triples. Our results agree with previous calculations at the linearized single double level but also show large contributions from nonlinear singles and doubles as well as valence triples. We also calculate the normalized ratio $\xi_{n,n'}\equiv(1/\sqrt{2})\langle{nS_{1/2}}||D||{n'P_{1/2}}\rangle/\langle{nS_{1/2}}||D||{n'P_{3/2}}\rangle$ which is important for experimental determination of matrix elements. The ratios $\xi_{6,n}$ display large deviations from the nonrelativistic limit which we associate with Cooper-like minima. Several appendices are provided where we document the procedure for constructing finite basis sets and our implementation of the random phase approximation and Brueckner-orbitals method.

Submitted 14 April, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

Comments: 26 pages, 17 figures v.2: uncertainties for removal energies provided, further comments and additional references added, typos corrected

arXiv:2302.08009 [pdf, other]

doi 10.1103/PhysRevResearch.6.023001

Scrambling and Recovery of Quantum Information in Inhomogeneous Quenches in Two-dimensional Conformal Field Theories

Authors: Kanato Goto, Masahiro Nozaki, Shinsei Ryu, Kotaro Tamaoka, Mao Tian Tan

Abstract: We study various quantum quench processes induced by the Möbius/sine-square deformation of the Hamiltonian in two-dimensional conformal field theories starting from the thermofield double state in the two copies of the Hilbert space. These quantum quenches, some of which are directly related to the operator entanglement of the time-evolution operators, allow us to study scrambling and recovery of quantum information. In particular, under the SSD time-evolution, we show from the time-dependence of mutual information that the Bell pairs, initially shared by the subsystems of the two Hilbert spaces, may revive even after the mutual information for small subsystems is completely destroyed by quantum information scrambling dynamics. This mutual information is robust against the strong scrambling dynamics. As a consequence, the steady state has a non-local correlation shared not by any of two parties but by three parties. In the holographic dual description, a wormhole connecting the two Hilbert spaces may non-linearly grow with time during the quantum quenches. We also propose effective pictures that describe the dynamics of mutual information during the time-evolution by inhomogeneous Hamiltonians.

Submitted 15 February, 2023; originally announced February 2023.

Comments: 36+26 pages, 23 figures

Report number: RIKEN-iTHEMS-Report-23

Journal ref: Phys. Rev. Research 6, 023001 (2024)

arXiv:2302.04922 [pdf, other]

Validating AU Microscopii d with Transit Timing Variations

Authors: Justin M. Wittrock, Peter Plavchan, Bryson L. Cale, Thomas Barclay, Mathis R. Ludwig, Richard P. Schwarz, Djamel Mekarnia, Amaury Triaud, Lyu Abe, Olga Suarez, Tristan Guillot, Dennis M. Conti, Karen A. Collins, Ian A. Waite, John F. Kielkopf, Kevin I. Collins, Stefan Dreizler, Mohammed El Mufti, Dax Feliz, Eric Gaidos, Claire Geneser, Keith Horne, Stephen R. Kane, Patrick J. Lowrance, Eder Martioli , et al. (9 additional authors not shown)

Abstract: AU Mic is a young (22 Myr) nearby exoplanetary system that exhibits excess TTVs that cannot be accounted for by the two known transiting planets nor stellar activity. We present the statistical "validation" of the tentative planet AU Mic d (even though there are examples of "confirmed" planets with ambiguous orbital periods). We add 18 new transits and nine midpoint times in an updated TTV analysis to prior work. We perform the joint modeling of transit light curves using EXOFASTv2 and extract the transit midpoint times. Next, we construct an O-C diagram and use Exo-Striker to model the TTVs. We generate TTV log-likelihood periodograms to explore possible solutions for the period of planet d and then follow those up with detailed TTV and RV MCMC modeling and stability tests. We find several candidate periods for AU Mic d, all of which are near resonances with AU Mic b and c of varying order. Based on our model comparisons, the most-favored orbital period of AU Mic d is 12.73596+/-0.00793 days (T_{C,d}=2458340.55781+/-0.11641 BJD), which puts the three planets near a 4:6:9 mean-motion orbital resonance. The mass for d is 1.053+/-0.511 M_E, making this planet Earth-like in mass. If confirmed, AU Mic d would be the first known Earth-mass planet orbiting a young star and would provide a valuable opportunity in probing a young terrestrial planet's atmosphere. Additional TTV observations of the AU Mic system are needed to further constrain the planetary masses, search for possible transits of AU Mic d, and detect possible additional planets beyond AU Mic c.

Submitted 15 September, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

Comments: 89 pages, 35 figures, 34 tables. Redid EXOFASTv2 transit modeling to recover more reasonable stellar posteriors, so redid Exo-Striker TTV modeling for consistency. Despite these changes, the overall results remain unchanged: the 12.7-day case is still the most favored. Submitted to AAS Journals on 2023 Feb 9th

arXiv:2302.01788 [pdf, other]

IMPORTANT-Net: Integrated MRI Multi-Parameter Reinforcement Fusion Generator with Attention Network for Synthesizing Absent Data

Authors: Tianyu Zhang, Tao Tan, Luyi Han, Xin Wang, Yuan Gao, Jonas Teuwen, Regina Beets-Tan, Ritse Mann

Abstract: Magnetic resonance imaging (MRI) is highly sensitive for lesion detection in the breasts. Sequences obtained with different settings can capture the specific characteristics of lesions. Such multi-parameter MRI information has been shown to improve radiologist performance in lesion classification, as well as improving the performance of artificial intelligence models in various tasks. However, obtaining multi-parameter MRI makes the examination costly in both financial and time perspectives, and there may be safety concerns for special populations, thus making acquisition of the full spectrum of MRI sequences less durable. In this study, different than naive input fusion or feature concatenation from existing MRI parameters, a novel Integrated MRI Multi-Parameter reinfOrcement fusion generatoR wiTh AtteNTion Network (IMPORTANT-Net) is developed to generate missing parameters. First, the parameter reconstruction module is used to encode and restore the existing MRI parameters to obtain the corresponding latent representation information at any scale level. Then the multi-parameter fusion with attention module enables the interaction of the encoded information from different parameters through a set of algorithmic strategies, and applies different weights to the information through the attention mechanism after information fusion to obtain refined representation information. Finally, a reinforcement fusion scheme embedded in a $V^{-}$-shape generation module is used to combine the hierarchical representations to generate the missing MRI parameter. Results showed that our IMPORTANT-Net is capable of generating missing MRI parameters and outperforms comparable state-of-the-art networks. Our code is available at https://github.com/Netherlands-Cancer-Institute/MRI_IMPORTANT_NET.

Submitted 3 February, 2023; originally announced February 2023.

arXiv:2302.01608 [pdf, other]

CFFT-GAN: Cross-domain Feature Fusion Transformer for Exemplar-based Image Translation

Authors: Tianxiang Ma, Bingchuan Li, Wei Liu, Miao Hua, Jing Dong, Tieniu Tan

Abstract: Exemplar-based image translation refers to the task of generating images with the desired style, while conditioning on certain input image. Most of the current methods learn the correspondence between two input domains and lack the mining of information within the domains. In this paper, we propose a more general learning approach by considering two domain features as a whole and learning both inter-domain correspondence and intra-domain potential information interactions. Specifically, we propose a Cross-domain Feature Fusion Transformer (CFFT) to learn inter- and intra-domain feature fusion. Based on CFFT, the proposed CFFT-GAN works well on exemplar-based image translation. Moreover, CFFT-GAN is able to decouple and fuse features from multiple domains by cascading CFFT modules. We conduct rich quantitative and qualitative experiments on several image translation tasks, and the results demonstrate the superiority of our approach compared to state-of-the-art methods. Ablation studies show the importance of our proposed CFFT. Application experimental results reflect the potential of our method.

Submitted 3 February, 2023; originally announced February 2023.

Comments: Accepted by AAAI2023

arXiv:2302.01579 [pdf, other]

Semantic 3D-aware Portrait Synthesis and Manipulation Based on Compositional Neural Radiance Field

Authors: Tianxiang Ma, Bingchuan Li, Qian He, Jing Dong, Tieniu Tan

Abstract: Recently 3D-aware GAN methods with neural radiance field have developed rapidly. However, current methods model the whole image as an overall neural radiance field, which limits the partial semantic editability of synthetic results. Since NeRF renders an image pixel by pixel, it is possible to split NeRF in the spatial dimension. We propose a Compositional Neural Radiance Field (CNeRF) for semantic 3D-aware portrait synthesis and manipulation. CNeRF divides the image by semantic regions and learns an independent neural radiance field for each region, and finally fuses them and renders the complete image. Thus we can manipulate the synthesized semantic regions independently, while fixing the other parts unchanged. Furthermore, CNeRF is also designed to decouple shape and texture within each semantic region. Compared to state-of-the-art 3D-aware GAN methods, our approach enables fine-grained semantic region manipulation, while maintaining high-quality 3D-consistent synthesis. The ablation studies show the effectiveness of the structure and loss function used by our method. In addition real image inversion and cartoon portrait 3D editing experiments demonstrate the application potential of our method.

Submitted 10 April, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: Accepted by AAAI2023 Oral

arXiv:2302.00517 [pdf, other]

Synthesis-based Imaging-Differentiation Representation Learning for Multi-Sequence 3D/4D MRI

Authors: Luyi Han, Tao Tan, Tianyu Zhang, Yunzhi Huang, Xin Wang, Yuan Gao, Jonas Teuwen, Ritse Mann

Abstract: Multi-sequence MRIs can be necessary for reliable diagnosis in clinical practice due to the complementary information within sequences. However, redundant information exists across sequences, which interferes with mining efficient representations by modern machine learning or deep learning models. To handle various clinical scenarios, we propose a sequence-to-sequence generation framework (Seq2Seq) for imaging-differentiation representation learning. In this study, not only do we propose arbitrary 3D/4D sequence generation within one model to generate any specified target sequence, but also we are able to rank the importance of each sequence based on a new metric estimating the difficulty of a sequence being generated. Furthermore, we also exploit the generation inability of the model to extract regions that contain unique information for each sequence. We conduct extensive experiments using three datasets including a toy dataset of 20,000 simulated subjects, a brain MRI dataset of 1,251 subjects, and a breast MRI dataset of 2,101 subjects, to demonstrate that (1) our proposed Seq2Seq is efficient and lightweight for complex clinical datasets and can achieve excellent image quality; (2) top-ranking sequences can be used to replace complete sequences with non-inferior performance; (3) combining MRI with our imaging-differentiation map leads to better performance in clinical tasks such as glioblastoma MGMT promoter methylation status prediction and breast cancer pathological complete response status prediction. Our code is available at https://github.com/fiy2W/mri_seq2seq.

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2302.00194 [pdf, other]

Free Lunch for Domain Adversarial Training: Environment Label Smoothing

Authors: YiFan Zhang, Xue Wang, Jian Liang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

Abstract: A fundamental challenge for machine learning models is how to generalize learned models for out-of-distribution (OOD) data. Among various approaches, exploiting invariant features by Domain Adversarial Training (DAT) received widespread attention. Despite its success, we observe training instability from DAT, mostly due to over-confident domain discriminator and environment label noise. To address this issue, we proposed Environment Label Smoothing (ELS), which encourages the discriminator to output soft probability, which thus reduces the confidence of the discriminator and alleviates the impact of noisy environment labels. We demonstrate, both experimentally and theoretically, that ELS can improve training stability, local convergence, and robustness to noisy environment labels. By incorporating ELS with DAT methods, we are able to yield state-of-art results on a wide range of domain generalization/adaptation tasks, particularly when the environment labels are highly noisy.

Submitted 31 January, 2023; originally announced February 2023.

Comments: ICLR 2023, 38 pages, 8 figures, 18 tables

arXiv:2301.09865 [pdf, other]

doi 10.3847/1538-3881/acd548

VaTEST II: Statistical Validation of 11 TESS-Detected Exoplanets Orbiting K-type Stars

Authors: Priyashkumar Mistry, Kamlesh Pathak, Aniket Prasad, Georgios Lekkas, Surendra Bhattarai, Sarvesh Gharat, Mousam Maity, Dhruv Kumar, Karen A. Collins, Richard P. Schwarz, Christopher R. Mann, Elise Furlan, Steve B. Howell, David Ciardi, Allyson Bieryla, Elisabeth C. Matthews, Erica Gonzales, Carl Ziegler, Ian Crossfield, Steven Giacalone, Thiam-Guan Tan, Phil Evans, Krzysztof G. Helminiak, Kevin I. Collins, Norio Narita , et al. (26 additional authors not shown)

Abstract: NASA's Transiting Exoplanet Survey Satellite (TESS) is an all-sky survey mission designed to find transiting exoplanets orbiting nearby bright stars. It has identified more than 329 transiting exoplanets, and almost 6,000 candidates remain unvalidated. In this manuscript, we discuss the findings from the ongoing VaTEST (Validation of Transiting Exoplanets using Statistical Tools) project, which aims to validate new exoplanets for further characterization. We validated 11 new exoplanets by examining the light curves of 24 candidates using the LATTE and TESS-Plot tools and computing the False Positive Probabilities using the statistical validation tool TRICERATOPS. These include planets suitable for atmospheric characterization using transmission spectroscopy (TOI-2194b), emission spectroscopy (TOI-3082b and TOI-5704b) and for both transmission and emission spectroscopy (TOI-672b, TOI-1694b, and TOI-2443b); One super-Earth (TOI-2194b) orbiting a bright (V = 8.42 mag), metal-poor ([Fe/H] = -0.3720 $\pm$ 0.1) star; one short-period Neptune-like planet (TOI-5704) in the Hot Neptune Desert. In total, we validated 1 super-Earth, 7 sub-Neptunes, 1 Neptune-like, and 2 sub-Saturn or super-Neptune-like exoplanets. Additionally, we identify five likely planet candidates (TOI-323, TOI-1180, TOI-2200, TOI-2408 and TOI-3913) which can be further studied to establish their planetary nature.

Submitted 13 May, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Comments: Accepted for Publication in Astronomical Journal, 28 Pages, 7 Figures

arXiv:2301.07950 [pdf]

doi 10.1002/adma.202207777

A Monolithic Graphene-Functionalized Microlaser for Multispecies Gas Detection

Authors: Yanhong Guo, Zhaoyu Li, Ning An, Yongzheng Guo, Yuchen Wang, Yusen Yuan, Hao Zhang, Teng Tan, Caihao Wu, Bo Peng, Giancarlo Soavi, Yunjiang Rao, Baicheng Yao

Abstract: Optical microcavity enhanced light-matter interaction offers a powerful tool to develop fast and precise sensing techniques, spurring applications in the detection of biochemical targets ranging from cells, nanoparticles, and large molecules. However, the intrinsic inertness of such pristine microresonators limits their spread in new fields such as gas detection. Here, a functionalized microlaser sensor is realized by depositing graphene in an erbium-doped over-modal microsphere. By using a 980 nm pump, multiple laser lines excited in different mode families of the microresonator are co-generated in a single device. The interference between these splitting mode lasers produces beat notes in the electrical domain (0.2-1.1 MHz) with sub-kHz accuracy, thanks to the graphene-induced intracavity backward scattering. This allows for multispecies gas identification from a mixture, and ultrasensitive gas detection down to individual molecule.

Submitted 19 January, 2023; originally announced January 2023.

Journal ref: Advanced Materials 34 (2022) 2207777

arXiv:2301.06304 [pdf]

LYSTO: The Lymphocyte Assessment Hackathon and Benchmark Dataset

Authors: Yiping Jiao, Jeroen van der Laak, Shadi Albarqouni, Zhang Li, Tao Tan, Abhir Bhalerao, Jiabo Ma, Jiamei Sun, Johnathan Pocock, Josien P. W. Pluim, Navid Alemi Koohbanani, Raja Muhammad Saad Bashir, Shan E Ahmed Raza, Sibo Liu, Simon Graham, Suzanne Wetstein, Syed Ali Khurram, Thomas Watson, Nasir Rajpoot, Mitko Veta, Francesco Ciompi

Abstract: We introduce LYSTO, the Lymphocyte Assessment Hackathon, which was held in conjunction with the MICCAI 2019 Conference in Shenzhen (China). The competition required participants to automatically assess the number of lymphocytes, in particular T-cells, in histopathological images of colon, breast, and prostate cancer stained with CD3 and CD8 immunohistochemistry. Differently from other challenges setup in medical image analysis, LYSTO participants were solely given a few hours to address this problem. In this paper, we describe the goal and the multi-phase organization of the hackathon; we describe the proposed methods and the on-site results. Additionally, we present post-competition results where we show how the presented methods perform on an independent set of lung cancer slides, which was not part of the initial competition, as well as a comparison on lymphocyte assessment between presented methods and a panel of pathologists. We show that some of the participants were capable of achieving pathologist-level performance at lymphocyte assessment. After the hackathon, LYSTO was left as a lightweight plug-and-play benchmark dataset on grand-challenge website, together with an automatic evaluation platform. LYSTO has supported a number of research in lymphocyte assessment in oncology. LYSTO will be a long-lasting educational challenge for deep learning and digital pathology, it is available at https://lysto.grand-challenge.org/.

Submitted 13 April, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

Comments: will be submitted to IEEE-JBHI

MSC Class: 68T07 ACM Class: I.4.9; I.5.4; I.2.1

arXiv:2212.08896 [pdf]

doi 10.1145/3665869

Human Image Generation: A Comprehensive Survey

Authors: Zhen Jia, Zhang Zhang, Liang Wang, Tieniu Tan

Abstract: Image and video synthesis has become a blooming topic in computer vision and machine learning communities along with the developments of deep generative models, due to its great academic and application value. Many researchers have been devoted to synthesizing high-fidelity human images as one of the most commonly seen object categories in daily lives, where a large number of studies are performed based on various models, task settings and applications. Thus, it is necessary to give a comprehensive overview of these varied methods for human image generation. In this paper, we divide human image generation techniques into three paradigms, i.e., data-driven methods, knowledge-guided methods and hybrid methods. For each paradigm, the most representative models and the corresponding variants are presented, where the advantages and characteristics of different methods are summarized in terms of model architectures. Besides, the main public human image datasets and evaluation metrics in the literature are summarized. Furthermore, due to the wide application potential, the typical downstream usages of synthesized human images are covered. Finally, the challenges and potential opportunities of human image generation are discussed to shed light on future research.

Submitted 23 May, 2024; v1 submitted 17 December, 2022; originally announced December 2022.

Comments: Accepted by ACM Computing Surveys (CSUR)

arXiv:2212.08242 [pdf, other]

doi 10.3847/1538-3881/acc3a0

Spinning up a Daze: TESS Uncovers a Hot Jupiter orbiting the Rapid-Rotator TOI-778

Authors: Jake Clark, Brett Addison, Jack Okumura, Sydney Vach, Alexis Heitzmann, Joseph Rodriguez, Duncan Wright, Mathieu Clerte, Carolyn Brown, Tara Fetherolf, Robert Wittenmyer, Peter Plavchan, Stephen Kane, Jonathan Horner, John Kielkopf, Avi Shporer, C. Tinney, Liu Hui-Gen, Sarah Ballard, Brendan Bowler, Matthew Mengel, George Zhou, Annette Lee, Avelyn David, Jessica Heim, et al. (46 additional authors not shown)

Abstract: NASA's Transiting Exoplanet Survey Satellite (TESS) mission has been uncovering a growing number of exoplanets orbiting nearby, bright stars. Most exoplanets that have been discovered by TESS orbit narrow-line, slow-rotating stars, facilitating the confirmation and mass determination of these worlds. We present the discovery of a hot Jupiter orbiting a rapidly rotating ($v\sin{(i)}= 35.1\pm1.0$ km/s) early F3V-dwarf, HD115447 (TOI-778). The transit signal taken from Sectors 10 and 37 of TESS's initial detection of the exoplanet is combined with follow-up ground-based photometry and velocity measurements taken from Minerva-Australis, TRES, CORALIE and CHIRON to confirm and characterise TOI-778b. A joint analysis of the light curves and the radial velocity measurements yields a mass, radius, and orbital period for TOI-778b of $2.76^{+0.24}_{-0.23}$ Mjup, $1.370\pm0.043$ Rjup and $\sim4.63$ days, respectively. The planet orbits a bright ($V = 9.1$ mag) F3-dwarf with $M=1.40\pm0.05$ Msun, $R=1.70\pm0.05$ Rsun, and $\log g=4.05\pm0.17$. We observed a spectroscopic transit of TOI-778b, which allowed us to derive a sky-projected spin-orbit angle of $18^{\circ}\pm11^{\circ}$, consistent with an aligned planetary system. This discovery demonstrates the capability of smaller aperture telescopes such as Minerva-Australis to detect the radial velocity signals produced by planets orbiting broad-line, rapidly rotating stars.

Submitted 30 April, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

Comments: 18 pages, 9 figures, and 4 tables. Submitted to the Astronomical Journal

Journal ref: AJ 165 207 (2023)

arXiv:2212.04385 [pdf, other]

BEVBert: Multimodal Map Pre-training for Language-guided Navigation

Authors: Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, Jing Shao

Abstract: Large-scale pre-training has shown promising results on the vision-and-language navigation (VLN) task. However, most existing pre-training methods employ discrete panoramas to learn visual-textual associations. This requires the model to implicitly correlate incomplete, duplicate observations within the panoramas, which may impair an agent's spatial understanding. Thus, we propose a new map-based pre-training paradigm that is spatial-aware for use in VLN. Concretely, we build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map. This hybrid design can balance the demand of VLN for both short-term reasoning and long-term planning. Then, based on the hybrid map, we devise a pre-training framework to learn a multimodal map representation, which enhances spatial-aware cross-modal reasoning thereby facilitating the language-guided navigation goal. Extensive experiments demonstrate the effectiveness of the map-based pre-training route for VLN, and the proposed method achieves state-of-the-art on four VLN benchmarks.

Submitted 3 August, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

Comments: ICCV 2023, project page: https://github.com/MarSaKi/VLN-BEVBert

arXiv:2211.14751 [pdf, other]

Estimating Reflectance Layer from A Single Image: Integrating Reflectance Guidance and Shadow/Specular Aware Learning

Authors: Yeying Jin, Ruoteng Li, Wenhan Yang, Robby T. Tan

Abstract: Estimating the reflectance layer from a single image is a challenging task. It becomes more challenging when the input image contains shadows or specular highlights, which often render an inaccurate estimate of the reflectance layer. Therefore, we propose a two-stage learning method, including reflectance guidance and a Shadow/Specular-Aware (S-Aware) network to tackle the problem. In the first stage, an initial reflectance layer free from shadows and specularities is obtained with the constraint of novel losses that are guided by prior-based shadow-free and specular-free images. To further enforce the reflectance layer to be independent of shadows and specularities in the second-stage refinement, we introduce an S-Aware network that distinguishes the reflectance image from the input image. Our network employs a classifier to categorize shadow/shadow-free, specular/specular-free classes, enabling the activation features to function as attention maps that focus on shadow/specular regions. Our quantitative and qualitative evaluations show that our method outperforms the state-of-the-art methods in the reflectance layer estimation that is free from shadows and specularities. Code is at: https://github.com/jinyeying/S-Aware-network.

Submitted 5 August, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

Comments: Accepted to AAAI2023, https://github.com/jinyeying/S-Aware-network

Journal ref: published AAAI2023

arXiv:2211.13409 [pdf, other]

Object Detection in Foggy Scenes by Embedding Depth and Reconstruction into Domain Adaptation

Authors: Xin Yang, Michael Bi Mi, Yuan Yuan, Xin Wang, Robby T. Tan

Abstract: Most existing domain adaptation (DA) methods align the features based on the domain feature distributions and ignore aspects related to fog, background and target objects, rendering suboptimal performance. In our DA framework, we retain the depth and background information during the domain feature alignment. A consistency loss between the generated depth and fog transmission map is introduced to strengthen the retention of the depth information in the aligned features. To address false object features potentially generated during the DA process, we propose an encoder-decoder framework to reconstruct the fog-free background image. This reconstruction loss also reinforces the encoder, i.e., our DA backbone, to minimize false object features. Moreover, we involve our target data in training both our DA module and our detection module in a semi-supervised manner, so that our detection module is also exposed to the unlabeled target data, the type of data used in the testing stage. Using these ideas, our method significantly outperforms the state-of-the-art method (47.6 mAP against the 44.3 mAP on the Foggy Cityscapes dataset), and obtains the best performance on multiple real-image public datasets. Code is available at: https://github.com/VIML-CVDL/Object-Detection-in-Foggy-Scenes

Submitted 23 November, 2022; originally announced November 2022.

Comments: Accepted by ACCV

arXiv:2211.12674 [pdf, other]

Semantic-aware One-shot Face Re-enactment with Dense Correspondence Estimation

Authors: Yunfan Liu, Qi Li, Zhenan Sun, Tieniu Tan

Abstract: One-shot face re-enactment is a challenging task due to the identity mismatch between source and driving faces. Specifically, the suboptimally disentangled identity information of driving subjects would inevitably interfere with the re-enactment results and lead to face shape distortion. To solve this problem, this paper proposes to use 3D Morphable Model (3DMM) for explicit facial semantic decomposition and identity disentanglement. Instead of using 3D coefficients alone for re-enactment control, we take advantage of the generative ability of 3DMM to render textured face proxies. These proxies contain abundant yet compact geometric and semantic information of human faces, which enable us to compute the face motion field between source and driving images by estimating the dense correspondence. In this way, we could approximate re-enactment results by warping source images according to the motion field, and a Generative Adversarial Network (GAN) is adopted to further improve the visual quality of warping results. Extensive experiments on various datasets demonstrate the advantages of the proposed method over existing state-of-the-art benchmarks in both identity preservation and re-enactment fulfillment.

Submitted 22 November, 2022; originally announced November 2022.

Showing 101–150 of 662 results for author: Tan, T