Memory Complexity of Entropy Estimation

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: We observe an infinite sequence of independent identically distributed random variables $X_1,X_2,\ldots$ drawn from an unknown distribution $p$ over $[n]$, and our goal is to estimate the entropy $H(p)=-\mathbb{E}[\log p(X)]$ within an $\varepsilon$-additive error. To that end, at each time point we are allowed to update a finite-state machine with $S$ states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity $S^*$ of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least $1-δ$ asymptotically, uniformly in $p$. Specifically, we show that there exist universal constants $C_1$ and $C_2$ such that $S^* \leq C_1\cdot\frac{n (\log n)^4}{\varepsilon^2δ}$ for $\varepsilon$ not too small, and $S^* \geq C_2 \cdot \max \{n, \frac{\log n}{\varepsilon}\}$ for $\varepsilon$ not too large. The upper bound is proved using approximate counting to estimate the logarithm of $p$, and a finite memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.

Submitted 10 June, 2024; originally announced June 2024.
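
Illustrative note on "approximate counting": the abstract above says the upper bound relies on approximate counting to estimate the logarithm of $p$. The sketch below shows the classic Morris counter, which is one well-known form of approximate counting; it is not taken from the paper and is not claimed to be the authors' construction. The names `MorrisCounter` and `average_estimate` are hypothetical. The counter stores only an exponent `c` that grows roughly like the base-2 logarithm of the number of events, which is why a counter of this kind is a natural fit when a logarithm is the quantity of interest.

```python
import random


class MorrisCounter:
    """Classic Morris approximate counter (illustrative, not the paper's machine).

    Only the exponent c is stored, so the state space is tiny compared
    with an exact counter.
    """

    def __init__(self, rng):
        self.c = 0
        self.rng = rng

    def increment(self):
        # Raise the exponent with probability 2^(-c).
        if self.rng.random() < 2.0 ** (-self.c):
            self.c += 1

    def estimate(self):
        # 2^c - 1 is an unbiased estimate of the number of increments.
        return 2 ** self.c - 1


def average_estimate(n_events, n_counters, seed=0):
    """Average many independent counters to reduce the variance."""
    rng = random.Random(seed)
    counters = [MorrisCounter(rng) for _ in range(n_counters)]
    for _ in range(n_events):
        for counter in counters:
            counter.increment()
    return sum(counter.estimate() for counter in counters) / n_counters


if __name__ == "__main__":
    print(average_estimate(n_events=500, n_counters=1000))
```

A single counter is noisy (its standard deviation is on the order of the count itself), so the sketch averages independent copies; how the paper controls this error within a finite-state machine is not stated in the abstract.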

arXiv:2404.10058 [pdf]

GHOST Commissioning Science Results III: Characterizing an iron-poor damped Lyman $α$ system

Authors: Trystyn A. M. Berg, Christian R. Hayes, Stefano Cristiani, Alan McConnachie, J. Gordon Robertson, Federico Sestito, Chris Simpson, Fletcher Waller, Timothy Chin, Adam Densmore, Ruben J. Diaz, Michael L. Edgar, Javier Fuentes Lettura, Manuel Gómez-Jiménez, Venu M. Kalari, Jon Lawrence, Steven Margheim, John Pazder, Roque Ruiz-Carmona, Ricardo Salinas, Karleyne M. G. Silva, Katherine Silversides, Kim A. Venn

Abstract: The Gemini High-resolution Optical SpecTrograph (GHOST) is a new echelle spectrograph available on the Gemini-South telescope as of Semester 2024A. We present the first high resolution spectrum of the quasar J1449-1227 (redshift z_em=3.27) using data taken during the commissioning of GHOST. The observed quasar hosts an intervening iron-poor ([Fe/H] = -2.5) damped Lyman alpha (DLA) system at redshift z=2.904. Taking advantage of the high spectral resolving power of GHOST (R~55000), we are able to accurately model the metal absorption lines of the metal-poor DLA and find a supersolar [Si/Fe], suggesting the DLA gas is in an early stage of chemical enrichment. Using simple ionization models, we find that the large range in the C IV/Si IV column density ratio of individual components within the DLA's high ionization absorption profile can be reproduced by several metal-poor Lyman limit systems surrounding the low-ionization gas of the DLA. It is possible that this metal-poor DLA resides within a complex system of metal-poor galaxies or filaments with inflowing gas. The high spectral resolution, wavelength coverage and sensitivity of GHOST make it an ideal spectrograph for characterizing the chemistry and kinematics of quasar absorption lines.

Submitted 18 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: Accepted for publication in MNRAS. 8 Pages, 5 figures

arXiv:2403.15379 [pdf]

Time-efficient, high-resolution 3T whole-brain relaxometry using Cartesian 3D MR-STAT with CSF suppression

Authors: Hongyan Liu, Edwin Versteeg, Miha Fuderer, Oscar van der Heide, Martin B. Schilder, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: Purpose: Current 3D Magnetic Resonance Spin TomogrAphy in Time-domain (MR-STAT) protocols use transient-state, gradient-spoiled gradient-echo sequences that are prone to cerebrospinal fluid (CSF) pulsation artifacts when applied to the brain. This study aims at developing a 3D MR-STAT protocol for whole-brain relaxometry that overcomes the challenges posed by CSF-induced ghosting artifacts. Method: We optimized the flip-angle train within the Cartesian 3D MR-STAT framework to achieve two objectives: (1) minimization of the noise level in the reconstructed quantitative maps, and (2) reduction of the CSF-to-white-matter signal ratio to suppress CSF signal and the associated pulsation artifacts. The optimized new sequence was tested on a gel/water-phantom to evaluate the accuracy of the quantitative maps, and on healthy volunteers to explore the effectiveness of the CSF artifact suppression and robustness of the new protocol. Results: A new optimized sequence with both high parameter encoding capability and low CSF intensity was proposed and initially validated in the gel/water-phantom experiment. From in-vivo experiments with five volunteers, the proposed CSF-suppressed sequence shows no CSF ghosting artifacts and overall greatly improved image quality for all quantitative maps compared to the baseline sequence. Statistical analysis indicated low inter-subject and inter-scan variability for quantitative parameters in gray matter and white matter (1.6%-2.4% for T1 and 2.0%-4.6% for T2), demonstrating the robustness of the new sequence. Conclusion: We presented a new 3D MR-STAT sequence with CSF suppression that effectively eliminates CSF pulsation artifacts. The new sequence ensures consistently high-quality, 1mm^3 whole-brain relaxometry within a rapid 5.5-minute scan time.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2401.07452 [pdf, other]

The Science Performance of the Gemini High Resolution Optical Spectrograph

Authors: Alan W. McConnachie, Christian R. Hayes, J. Gordon Robertson, John Pazder, Michael Ireland, Greg Burley, Vladimir Churilov, Jordan Lothrop, Ross Zhelem, Venu Kalari, André Anthony, Gabriella Baker, Trystyn Berg, Edward L. Chapin, Timothy Chin, Adam Densmore, Ruben Diaz, Jennifer Dunn, Michael L. Edgar, Tony Farrell, Veronica Firpo, Javier Fuentes, Manuel Gomez-Jimenez, Tim Hardy, David Henderson , et al. (24 additional authors not shown)

Abstract: The Gemini High Resolution Optical Spectrograph (GHOST) is a fiber-fed spectrograph system on the Gemini South telescope that provides simultaneous wavelength coverage from 348 - 1061nm, and is designed for optimal performance between 363 - 950nm. It can observe up to two objects simultaneously in a 7.5 arcmin diameter field of regard at R = 56,000 or a single object at R = 75,000. The spectral resolution modes are obtained by using integral field units to image slice a 1.2" aperture by a factor of five in width using 19 fibers in the high resolution mode and by a factor of three in width using 7 fibers in the standard resolution mode. GHOST is equipped with hardware to allow for precision radial velocity measurements, expected to approach meters per second precision. Here, we describe the basic design and operational capabilities of GHOST, and proceed to derive and quantify the key aspects of its on-sky performance that are of most relevance to its science users.

Submitted 14 January, 2024; originally announced January 2024.

Comments: 37 pages, 27 figures. Accepted for publication in Publications of the Astronomical Society of the Pacific

arXiv:2312.15225 [pdf, ps, other]

Statistical Inference with Limited Memory: A Survey

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: The problem of statistical inference in its various forms has been the subject of decades-long extensive research. Most of the effort has been focused on characterizing the behavior as a function of the number of available samples, with far less attention given to the effect of memory limitations on performance. Recently, this latter topic has drawn much interest in the engineering and computer science literature. In this survey paper, we attempt to review the state-of-the-art of statistical inference under memory constraints in several canonical problems, including hypothesis testing, parameter estimation, and distribution property testing/estimation. We discuss the main results in this developing field, and by identifying recurrent themes, we extract some fundamental building blocks for algorithmic construction, as well as useful techniques for lower bound derivations.

Submitted 23 December, 2023; originally announced December 2023.

Comments: Submitted to JSAIT Special Issue

arXiv:2310.17024 [pdf, other]

SPLUS J142445.34-254247.1: An R-Process Enhanced, Actinide-Boost, Extremely Metal-Poor star observed with GHOST

Authors: Vinicius M. Placco, Felipe Almeida-Fernandes, Erika M. Holmbeck, Ian U. Roederer, Mohammad K. Mardini, Christian R. Hayes, Kim Venn, Kristin Chiboucas, Emily Deibert, Roberto Gamen, Jeong-Eun Heo, Miji Jeong, Venu Kalari, Eder Martioli, Siyi Xu, Ruben Diaz, Manuel Gomez-Jimenez, David Henderson, Pablo Prado, Carlos Quiroz, Roque Ruiz-Carmona, Chris Simpson, Cristian Urrutia, Alan W. McConnachie, John Pazder , et al. (11 additional authors not shown)

Abstract: We report on the chemo-dynamical analysis of SPLUS J142445.34-254247.1, an extremely metal-poor halo star enhanced in elements formed by the rapid neutron-capture process. This star was first selected as a metal-poor candidate from its narrow-band S-PLUS photometry and followed up spectroscopically in medium-resolution with Gemini South/GMOS, which confirmed its low-metallicity status. High-resolution spectroscopy was gathered with GHOST at Gemini South, allowing for the determination of chemical abundances for 36 elements, from carbon to thorium. At [Fe/H]=-3.39, SPLUS J1424-2542 is one of the lowest metallicity stars with measured Th and has the highest logeps(Th/Eu) observed to date, making it part of the "actinide-boost" category of r-process enhanced stars. The analysis presented here suggests that the gas cloud from which SPLUS J1424-2542 was formed must have been enriched by at least two progenitor populations. The light-element (Z<=30) abundance pattern is consistent with the yields from a supernova explosion of metal-free stars with 11.3-13.4 Msun, and the heavy-element (Z>=38) abundance pattern can be reproduced by the yields from a neutron star merger (1.66Msun and 1.27Msun) event. A kinematical analysis also reveals that SPLUS J1424-2542 is a low-mass, old halo star with a likely in-situ origin, not associated with any known early merger events in the Milky Way.

Submitted 25 October, 2023; originally announced October 2023.

Comments: 26 pages, 11 figures, accepted for publication on ApJ

arXiv:2310.13732 [pdf, other]

doi 10.1051/0004-6361/202347867

Directly constraining the spatial coherence of the $z\sim1$ circumgalactic medium

Authors: A. Afruni, S. Lopez, P. Anshul, N. Tejos, P. Noterdaeme, T. A. M. Berg, C. Ledoux, M. Solimano, J. Gonzalez-Lopez, M. Gronke, F. Barrientos, E. J. Johnston

Abstract: One of the biggest puzzles regarding the circumgalactic medium (CGM) is the structure of its cool ($T\sim10^4$ K) gas phase. While the kinematics of quasar absorption systems suggests the CGM is composed of a population of different clouds, constraining the clouds' extent and spatial distribution has proven challenging, both from the theoretical and observational points of view. In this work we study the spatial structure of the $z\sim 1$ CGM with unprecedented detail via resolved spectroscopy of giant gravitational arcs. We put together a sample of Mg II$λλ2796,2803$ detections obtained with VLT/MUSE in 91 spatially independent and contiguous sight-lines toward 3 arcs, each probing an isolated star-forming galaxy believed to be detected in absorption. We constrain the coherence scale of this gas ($C_{\rm{length}}$), which represents the spatial scale over which the Mg II equivalent width (EW) remains constant, by comparing EW variations measured across all sight-lines with empirical models. We find $1.4 <C_{\rm{length}}/\rm{kpc} <7.8$ (95% confidence). This measurement, of unprecedented accuracy, represents the scale over which the cool gas tends to cluster in separate structures. We argue that, if $C_{\rm{length}}$ is a universal property of the CGM, it needs to be reproduced by current and future theoretical models in order to understand the exact role of this medium in galaxy evolution.

Submitted 20 October, 2023; originally announced October 2023.

Comments: 19 pages, 13 figures, accepted for publication in A&A

Journal ref: A&A 680, A112 (2023)

arXiv:2310.07622 [pdf, other]

Time-Resolved Reconstruction of Motion, Force, and Stiffness using Spectro-Dynamic MRI

Authors: Max H. C. van Riel, Tristan van Leeuwen, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: Measuring the dynamics and mechanical properties of muscles and joints is important to understand the (patho)physiology of muscles. However, acquiring dynamic time-resolved MRI data is challenging. We have previously developed Spectro-Dynamic MRI which allows the characterization of dynamical systems at a high spatial and temporal resolution directly from k-space data. This work presents an extended Spectro-Dynamic MRI framework that reconstructs 1) time-resolved MR images, 2) time-resolved motion fields, 3) dynamical parameters, and 4) an activation force, at a temporal resolution of 11 ms. An iterative algorithm solves a minimization problem containing four terms: a motion model relating the motion to the fully-sampled k-space data, a dynamical model describing the expected type of dynamics, a data consistency term describing the undersampling pattern, and finally a regularization term for the activation force. We acquired MRI data using a dynamic motion phantom programmed to move like an actively driven linear elastic system, from which all dynamic variables could be accurately reconstructed, regardless of the sampling pattern. The proposed method performed better than a two-step approach, where time-resolved images were first reconstructed from the undersampled data without any information about the motion, followed by a motion estimation step.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 11 pages, 7 figures, 5 supplementary figures, 1 supplementary video. The video can be viewed by downloading the source file under "Other Formats"

arXiv:2310.03075 [pdf, other]

doi 10.1093/mnras/stad3673

Probing the early Milky Way with GHOST spectra of an extremely metal-poor star in the Galactic disk

Authors: Anya Dovgal, Kim A. Venn, Federico Sestito, Christian R. Hayes, Alan W. McConnachie, Julio F. Navarro, Vinicius M. Placco, Else Starkenburg, Nicolas F. Martin, John S. Pazder, Kristin Chiboucas, Emily Deibert, Roberto Gamen, Jeong-Eun Heo, Venu M. Kalari, Eder Martioli, Siyi Xu, Ruben Diaz, Manuel Gomez-Jiminez, David Henderson, Pablo Prado, Carlos Quiroz, J. Gordon Robertson, Roque Ruiz-Carmona, Chris Simpson , et al. (9 additional authors not shown)

Abstract: Pristine_183.6849+04.8619 (P1836849) is an extremely metal-poor ([Fe/H]$=-3.3\pm0.1$) star on a prograde orbit confined to the Galactic disk. Such stars are rare and may have their origins in protogalactic fragments that formed the early Milky Way, in low mass satellites accreted later, or forming in situ in the Galactic plane. Here we present a chemo-dynamical analysis of the spectral features between $3700-11000$Å from a high-resolution spectrum taken during Science Verification of the new Gemini High-resolution Optical SpecTrograph (GHOST). Spectral features for many chemical elements are analysed (Mg, Al, Si, Ca, Sc, Ti, Cr, Mn, Fe, Ni), and valuable upper limits are determined for others (C, Na, Sr, Ba). This main sequence star exhibits several rare chemical signatures, including (i) extremely low metallicity for a star in the Galactic disk, (ii) very low abundances of the light $α$-elements (Na, Mg, Si) compared to other metal-poor stars, and (iii) unusually large abundances of Cr and Mn, where [Cr, Mn/Fe]$_{\rm NLTE}>+0.5$. A comparison to theoretical yields from supernova models suggests that two low mass Population III objects (one 10 M$_\odot$ supernova and one 17 M$_\odot$ hypernova) can reproduce the abundance pattern well (reduced $χ^2<1$). When this star is compared to other extremely metal-poor stars on quasi-circular, prograde planar orbits, differences in both chemistry and kinematics imply there is little evidence for a common origin. The unique chemistry of P1836849 is discussed in terms of the earliest stages in the formation of the Milky Way.

Submitted 26 November, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: 16 pages, 10 figures, 6 tables. Accepted by MNRAS November 22; Revisions include comparisons to more EMP stars, results unchanged

Journal ref: MNRAS 527 (2024) 7810-7824

arXiv:2308.07366 [pdf, other]

GHOST Commissioning Science Results II: a very metal-poor star witnessing the early Galactic assembly

Authors: Federico Sestito, Christian R. Hayes, Kim A. Venn, Jaclyn Jensen, Alan W. McConnachie, John Pazder, Fletcher Waller, Anke Arentsen, Pascale Jablonka, Nicolas F. Martin, Tadafumi Matsuno, Julio F. Navarro, Else Starkenburg, Sara Vitali, John Bassett, Trystyn A. M. Berg, Ruben Diaz, Michael L. Edgar, Veronica Firpo, Manuel Gomez-Jimenez, Venu Kalari, Sam Lambert, Jon Lawrence, Gordon Robertson, Roque Ruiz-Carmona , et al. (3 additional authors not shown)

Abstract: This study focuses on Pristine_180956.78-294759.8 (hereafter P180956, [Fe/H] $=-1.95\pm0.02$), a star selected from the Pristine Inner Galaxy Survey (PIGS), and followed-up with the recently commissioned Gemini High-resolution Optical SpecTrograph (GHOST) at the Gemini South telescope. The GHOST spectrograph's high efficiency in the blue spectral region ($3700-4800$ Å) enables the detection of elemental tracers of early supernovae (e.g. Al, Mn, Sr, Eu). The star exhibits chemical signatures resembling those found in ultra-faint dwarf systems, characterised by very low abundances of neutron-capture elements (Sr, Ba, Eu), which are uncommon among stars in the Milky Way halo. Our analysis suggests that P180956 bears the chemical imprints of a small number (2 or 4) of low-mass hypernovae ($\sim10-15 M_{\odot}$), which are needed to mostly reproduce the abundance pattern of the light-elements (e.g. [Si, Ti/Mg, Ca] $\sim0.6$), and one fast-rotating intermediate-mass supernova ($\sim300$ km s$^{-1}$, $\sim80-120 M_{\odot}$), which is the main channel contributing to the high [Sr/Ba] ($\sim +1.2$). The small pericentric ($\sim0.7$ kpc) and apocentric ($\sim13$ kpc) distances and its orbit confined to the plane ($\lesssim 2$ kpc), indicate that this star was likely accreted during the early Galactic assembly phase. Its chemo-dynamical properties suggest that P180956 formed in a system similar to an ultra-faint dwarf galaxy accreted either alone, as one of the low-mass building blocks of the proto-Galaxy, or as a satellite of Gaia-Sausage-Enceladus. The combination of Gemini's large aperture with GHOST's high efficiency and broad spectral coverage makes this new spectrograph one of the leading instruments for near-field cosmology investigations.

Submitted 20 January, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: Accepted version, minor editing. New figure showing Sr and Ba lines. Section 4.7 revised

arXiv:2307.06161 [pdf, other]

Towards an automatic approach to modelling the circumgalactic medium: new tools for mock making and fitting of metal profiles in large surveys

Authors: Alessia Longobardi, Matteo Fossati, Michele Fumagalli, Bhaskar Agarwal, Emma Lofthouse, Marta Galbiati, Rajeshwari Dutta, Trystyn A. M. Berg, Louise A. Welsh

Abstract: We present two new tools for studying and modelling metal absorption lines in the circumgalactic medium. The first tool, dubbed "NMF Profile Maker" (NMF-PM), uses a non-negative matrix factorization (NMF) method and provides a robust means to generate large libraries of realistic metal absorption profiles. The method is trained and tested on 650 unsaturated metal absorbers in the redshift interval $z=0.9-4.2$ with column densities between $11.2 \le \log{(\mathrm{N/cm^{-2}})} \le 16.3$, obtained from high-resolution ($R> 4000$) and high signal-to-noise ratio ($S/N \ge 10$) quasar spectroscopy. To avoid spurious features, we train on infinite $S/N$ Voigt models of the observed line profiles derived using the code "Monte-Carlo Absorption Line Fitter" (MC-ALF), a novel automatic Bayesian fitting code that is the second tool we present in this work. MC-ALF is a Monte Carlo code based on nested sampling that, without the need for any prior guess or human intervention, can decompose metal lines into individual Voigt components. Both MC-ALF and NMF-PM are made publicly available to allow the community to produce large libraries of synthetic metal profiles and to reconstruct Voigt models of absorption lines in an automatic fashion. Both tools contribute to the scientific effort of simulating and analysing metal absorbers in very large spectroscopic surveys of quasars like the ongoing Dark Energy Spectroscopic Instrument (DESI), the 4-meter Multi-Object Spectroscopic Telescope (4MOST), and the WHT Enhanced Area Velocity Explorer (WEAVE) surveys.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 22 pages, 19 figures. Accepted on RASTI

arXiv:2306.11079 [pdf]

doi 10.1088/1361-6560/ace023

Real-time myocardial landmark tracking for MRI-guided cardiac radio-ablation using Gaussian Processes

Authors: Niek R. F. Huttinga, Osman Akdag, Martin F. Fast, Joost Verhoeff, Firdaus A. A. Mohamed Hoesein, Cornelis A. T. van den Berg, Alessandro Sbrizzi, Stefano Mandija

Abstract: The high speed of cardiorespiratory motion introduces a unique challenge for cardiac stereotactic radio-ablation (STAR) treatments with the MR-linac. Such treatments require tracking myocardial landmarks with a maximum latency of 100 ms, which includes the acquisition of the required data. The aim of this study is to present a new method that allows tracking of myocardial landmarks from a few readouts of MRI data, thereby achieving a latency sufficient for STAR treatments. We present a tracking framework that requires only a few readouts of k-space data as input, which can be acquired at least an order of magnitude faster than MR-images. Combined with the real-time tracking speed of a probabilistic machine learning framework called Gaussian Processes, this allows tracking of myocardial landmarks with a sufficiently low latency for cardiac STAR guidance, including both the acquisition of required data, and the tracking inference. The framework is demonstrated in 2D on a motion phantom, and in vivo on volunteers and a ventricular tachycardia (arrhythmia) patient. Moreover, the feasibility of an extension to 3D was demonstrated by in silico 3D experiments with a digital motion phantom. The framework was compared with template matching - a reference, image-based, method - and linear regression methods. Results indicate an order of magnitude lower total latency (<10 ms) for the proposed framework in comparison with alternative methods. The root-mean-square-distances and mean end-point-distance with the reference tracking method were less than 0.8 mm for all experiments, showing excellent (sub-voxel) agreement. The high accuracy in combination with a total latency of less than 10 ms - including data acquisition and processing - makes the proposed method a suitable candidate for tracking during STAR treatments.

Submitted 19 June, 2023; originally announced June 2023.

arXiv:2305.13022 [pdf]

doi 10.1002/nbm.5050

A three-dimensional MR-STAT protocol for high-resolution multi-parametric quantitative MRI

Authors: Hongyan Liu, Oscar van der Heide, Edwin Versteeg, Martijn Froeling, Miha Fuderer, Fei Xu, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: Magnetic Resonance Spin Tomography in Time-Domain (MR-STAT) is a multiparametric quantitative MR framework, which allows for simultaneously acquiring quantitative tissue parameters such as T1, T2 and proton density from one single short scan. A typical 2D MR-STAT acquisition uses a gradient-spoiled, gradient-echo sequence with a slowly varying RF flip-angle train and Cartesian readouts, and the quantitative tissue maps are reconstructed by an iterative, model-based optimization algorithm. In this work, we design a 3D MR-STAT framework based on previous 2D work, in order to achieve better image SNR, higher through-plane resolution and better tissue characterization. Specifically, we design a 7-minute, high-resolution 3D MR-STAT sequence, and the corresponding two-step reconstruction algorithm for the large-scale dataset. To reduce the long acquisition time, Cartesian undersampling strategies such as SENSE are adopted in our transient-state quantitative framework. To reduce the computational burden, a data splitting scheme is designed for decoupling the 3D reconstruction problem into independent 2D reconstructions. The proposed 3D framework is validated by numerical simulations, phantom experiments and in-vivo experiments. High-quality knee quantitative maps with 0.8 x 0.8 x 1.5 mm^3 resolution and bilateral lower leg maps with 1.6 mm isotropic resolution can be acquired using the proposed 7-minute acquisition sequence and the 3-minute-per-slice decoupled reconstruction algorithm. The proposed 3D MR-STAT framework could have wide clinical applications in the future.

Submitted 22 May, 2023; originally announced May 2023.

Journal ref: NMR Biomed. 2023. e5050

arXiv:2305.12570 [pdf, ps, other]

doi 10.1002/mp.16884

Generalizable synthetic MRI with physics-informed convolutional networks

Authors: Luuk Jacobs, Stefano Mandija, Hongyan Liu, Cornelis A. T. van den Berg, Alessandro Sbrizzi, Matteo Maspero

Abstract: In this study, we develop a physics-informed deep learning-based method to synthesize multiple brain magnetic resonance imaging (MRI) contrasts from a single five-minute acquisition and investigate its ability to generalize to arbitrary contrasts to accelerate neuroimaging protocols. A dataset of fifty-five subjects acquired with a standard MRI protocol and a five-minute transient-state sequence was used to develop a physics-informed deep learning-based method. The model, based on a generative adversarial network, maps data acquired from the five-minute scan to "effective" quantitative parameter maps, here named q*-maps, by using its generated PD, T1, and T2 values in a signal model to synthesize four standard contrasts (proton density-weighted, T1-weighted, T2-weighted, and T2-weighted fluid-attenuated inversion recovery), from which losses are computed. The q*-maps are compared to literature values and the synthetic contrasts are compared to an end-to-end deep learning-based method proposed in the literature. The generalizability of the proposed method is investigated for five volunteers by synthesizing three non-standard contrasts unseen during training and comparing these to respective ground truth acquisitions via contrast-to-noise ratio and quantitative assessment. The physics-informed method was able to match the high-quality synthMRI of the end-to-end method for the four standard contrasts, with mean ± standard deviation structural similarity metrics above 0.75 ± 0.08 and peak signal-to-noise ratios above 22.4 ± 1.9 and 22.6 ± 2.1. Additionally, the physics-informed method provided retrospective contrast adjustment, with visually similar signal contrast and comparable contrast-to-noise ratios to the ground truth acquisitions for three sequences unused for model training, demonstrating its generalizability and potential application to accelerate neuroimaging protocols.

Submitted 21 May, 2023; originally announced May 2023.

Comments: 23 pages, 7 figures, 1 table. Presented at ISMRM 2022. Will be submitted to NMR in biomedicine

Journal ref: Med Phys. (2023)

arXiv:2305.02346 [pdf, other]

doi 10.3847/1538-4357/acc39f

Evidence of First Stars-enriched Gas in High-redshift Absorbers

Authors: A. Saccardi, S. Salvadori, V. D'Odorico, G. Cupani, M. Fumagalli, T. A. M. Berg, G. D. Becker, S. Ellison, S. Lopez

Abstract: The first stars were born from chemically pristine gas. They were likely massive, and thus they rapidly exploded as supernovae, enriching the surrounding gas with the first heavy elements. In the Local Group, the chemical signatures of the first stellar population were identified among low-mass, long-lived, very metal-poor ([Fe/H]<-2) stars, characterized by high abundances of carbon over iron ([C/Fe]>+0.7): the so-called carbon-enhanced metal-poor stars. Conversely, a similar carbon excess caused by first-star pollution was not found in dense neutral gas traced by absorption systems at different cosmic times. Here we present the detection of 14 very metal-poor, optically thick absorbers at redshift z~3-4. Among these, 3 are carbon-enhanced and reveal an overabundance with respect to Fe of all the analyzed chemical elements (O, Mg, Al, and Si). Their relative abundances show a distribution with respect to [Fe/H] that is in very good agreement with those observed in nearby very metal-poor stars. All the tests we performed support the idea that these C-rich absorbers preserve the chemical yields of the first stars. Our new findings suggest that the first-star signatures can survive in optically thick but relatively diffuse absorbers, which are not sufficiently dense to sustain star formation and hence are not dominated by the chemical products of normal stars.

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2209.14134 [pdf, other]

doi 10.1093/mnras/stac2851

Orientation effects on cool gas absorption from gravitational-arc tomography of a z = 0.77 disc galaxy

Authors: A. Fernandez-Figueroa, S. Lopez, N. Tejos, T. A. M. Berg, C. Ledoux, P. Noterdaeme, A. Afruni, L. F. Barrientos, J. Gonzalez-Lopez, M. Hamel, E. J. Johnston, A. Katsianis, K. Sharon, M. Solimano

Abstract: We use spatially-resolved spectroscopy of a distant giant gravitational arc to test orientation effects on MgII absorption equivalent width (EW) and covering fraction (kappa) in the circumgalactic medium of a foreground star-forming galaxy (G1) at z~0.77. Forty-two spatially-binned arc positions uniformly sample impact parameters (D) to G1 between 10 and 30 kpc and azimuthal angles alpha between 30 and 90 degrees (minor axis). We find an EW-D anti-correlation, akin to that observed statistically in quasar absorber studies, and an apparent correlation of both EW and kappa with alpha, revealing a non-isotropic gas distribution. In line with our previous results on MgII kinematics suggesting the presence of outflows in G1, at minimum a simple 3-D static double-cone model (to represent the trace of bipolar outflows) is required to recreate the EW spatial distribution. The D and alpha values probed by the arc cannot confirm the presence of a disc, but the data highly disfavor a disc alone. Our results support the interpretation that the EW-alpha correlation observed statistically using other extant probes is partly shaped by bipolar metal-rich winds.

Submitted 28 September, 2022; originally announced September 2022.

Comments: Accepted in Monthly Notices of the Royal Astronomical Society

arXiv:2206.09395 [pdf, ps, other]

On The Memory Complexity of Uniformity Testing

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: In this paper we consider the problem of uniformity testing with limited memory. We observe a sequence of independent identically distributed random variables drawn from a distribution $p$ over $[n]$, which is either uniform or is $\varepsilon$-far from uniform under the total variation distance, and our goal is to determine the correct hypothesis. At each time point we are allowed to update the state of a finite-memory machine with $S$ states, where each state of the machine is assigned one of the hypotheses, and we are interested in obtaining an asymptotic probability of error at most $0<δ<1/2$ uniformly under both hypotheses. The main contribution of this paper is deriving upper and lower bounds on the number of states $S$ needed in order to achieve a constant error probability $δ$, as a function of $n$ and $\varepsilon$, where our upper bound is $O(\frac{n\log n}{\varepsilon})$ and our lower bound is $Ω(n+\frac{1}{\varepsilon})$. Prior works in the field have almost exclusively used collision counting for upper bounds, and the Paninski mixture for lower bounds. Somewhat surprisingly, in the limited memory with unlimited samples setup, the optimal solution does not involve counting collisions, and the Paninski prior is not hard. Thus, different proof techniques are needed in order to attain our bounds.

Submitted 19 June, 2022; originally announced June 2022.

Comments: To be presented in COLT 2022

arXiv:2206.09390 [pdf, ps, other]

Deterministic Finite-Memory Bias Estimation

Authors: Tomer Berg, Or Ordentlich, Ofer Shayevitz

Abstract: In this paper we consider the problem of estimating a Bernoulli parameter using finite memory. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables with expectation $θ$, where $θ\in [0,1]$. Consider a finite-memory deterministic machine with $S$ states, that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that the machine outputs an estimate at each time point according to some fixed mapping from the state space to the unit interval. The quality of the estimation procedure is measured by the asymptotic risk, which is the long-term average of the instantaneous quadratic risk. The main contribution of this paper is an upper bound on the smallest worst-case asymptotic risk any such machine can attain. This bound coincides with a lower bound derived by Leighton and Rivest, to imply that $Θ(1/S)$ is the minimax asymptotic risk for deterministic $S$-state machines. In particular, our result disproves a longstanding $Θ(\log S/S)$ conjecture for this quantity, also posed by Leighton and Rivest.

Submitted 19 June, 2022; originally announced June 2022.

Comments: Presented in COLT 2021
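
Illustrative note: the setup in the abstract above (a deterministic, time-invariant update $M_n = f(M_{n-1},X_n)$ over states $\{1,\ldots,S\}$ plus a fixed map from states to $[0,1]$) can be sketched in a few lines of Python. The saturating counter used here is only a simple instance of that formalism, chosen for readability; it is an assumption of this sketch and not the construction from the paper, and it is not claimed to achieve the $Θ(1/S)$ risk. All function names are hypothetical.

```python
from typing import Callable, Iterable, List


def make_saturating_counter(num_states: int) -> Callable[[int, int], int]:
    """Return a deterministic, time-invariant update rule f(state, x).

    States are 1..num_states. A sample x == 1 moves one state up,
    any other sample moves one state down; both ends saturate.
    (Hypothetical toy rule, not the machine analysed in the paper.)
    """
    def f(state: int, x: int) -> int:
        if x == 1:
            return min(state + 1, num_states)
        return max(state - 1, 1)
    return f


def estimate_from_state(state: int, num_states: int) -> float:
    """Fixed map from the state space to the unit interval."""
    return (state - 1) / (num_states - 1)


def run_machine(f: Callable[[int, int], int],
                initial_state: int,
                samples: Iterable[int]) -> List[int]:
    """Apply M_n = f(M_{n-1}, X_n) along the sample sequence."""
    states = []
    state = initial_state
    for x in samples:
        state = f(state, x)
        states.append(state)
    return states


# Usage: a 5-state machine started in the middle state.
S = 5
f = make_saturating_counter(S)
trajectory = run_machine(f, initial_state=3, samples=[1, 1, 1, 0, 0])
estimates = [estimate_from_state(m, S) for m in trajectory]
```

The point of the sketch is only that the machine's entire memory is the single integer `state`; the estimate at any time depends on the past samples through that state alone.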

arXiv:2206.03428 [pdf, other]

Revealing Single Frame Bias for Video-and-Language Learning

Authors: Jie Lei, Tamara L. Berg, Mohit Bansal

Abstract: Training an effective video-and-language model intuitively requires multiple frames as model inputs. However, it is unclear whether using multiple frames is beneficial to downstream tasks, and if yes, whether the performance gain is worth the drastically-increased computation and memory costs resulting from using more frames. In this work, we explore single-frame models for video-and-language learning. On a diverse set of video-and-language tasks (including text-to-video retrieval and video question answering), we show the surprising result that, with large-scale pre-training and a proper frame ensemble strategy at inference time, a single-frame trained model that does not consider temporal information can achieve better performance than existing methods that use multiple frames for training. This result reveals the existence of a strong "static appearance bias" in popular video-and-language datasets. Therefore, to allow for a more comprehensive evaluation of video-and-language models, we propose two new retrieval tasks based on existing fine-grained action recognition datasets that encourage temporal modeling. Our code is available at https://github.com/jayleicn/singularity

Submitted 7 June, 2022; originally announced June 2022.

Comments: 19 pages, 8 figures

arXiv:2205.04488 [pdf, other]

doi 10.1051/0004-6361/202243208

Performance of ESPRESSO's high resolution 4x2 binning for characterizing intervening absorbers towards faint quasars

Authors: Trystyn A. M. Berg, Guido Cupani, Pedro Figueira, Andrea Mehner

Abstract: As of October 2021 (Period 108), the European Southern Observatory (ESO) offers a new mode of the ESPRESSO spectrograph designed to use the High Resolution grating with 4x2 binning (spatial by spectral; HR42 mode) with the specific objective of observing faint targets with a single Unit Telescope at Paranal. We validated the new HR42 mode using four hours of on-target observations of the quasar J0003-2603, known to host an intervening metal-poor absorber along the line of sight. The capabilities of the ESPRESSO HR42 mode (resolving power R~137 000) were evaluated by comparing to a UVES spectrum of the same target with a similar integration time but lower resolving power (R~48 000). For both data sets we tested the ability to decompose the velocity profile of the intervening absorber using Voigt profile fitting and extracted the total column densities of CIV, NI, SiII, AlII, FeII, and NiII. With ~3x the resolving power and ~2x lower S/N for a nearly equivalent exposure time, the ESPRESSO data is able to just as accurately characterize the individual components of the absorption lines as the comparison UVES data, but has the added bonus of identifying narrower components not detected by UVES. For UVES to provide similar spectral resolution (R>100 000; 0.3'' slit) and the broad wavelength coverage of ESPRESSO, the Exposure Time Calculator (ETC) supplied by ESO estimates 8 hrs of exposure time spread over two settings, requiring double the time investment of ESPRESSO's HR42 mode whilst not properly sampling the UVES spectral resolution element. Thus ESPRESSO's HR42 mode offers nearly triple the resolving power of UVES (0.8'' slit to match typical ambient conditions at Paranal) and provides more accurate characterization of quasar absorption features for an equivalent exposure time.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 9 pages, 5 figures, accepted for publication in A&A

Journal ref: A&A 662, A35 (2022)

arXiv:2205.02335 [pdf]

doi 10.1109/TMI.2022.3168436

Acceleration Strategies for MR-STAT: Achieving High-Resolution Reconstructions on a Desktop PC within 3 minutes

Authors: Hongyan Liu, Oscar van der Heide, Stefano Mandija, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: MR-STAT is an emerging quantitative magnetic resonance imaging technique which aims at obtaining multi-parametric tissue parameter maps from single short scans. It describes the relationship between the spatial-domain tissue parameters and the time-domain measured signal by using a comprehensive, volumetric forward model. The MR-STAT reconstruction solves a large-scale nonlinear problem and is thus computationally very challenging. In previous work, MR-STAT reconstruction using Cartesian readout data was accelerated by approximating the Hessian matrix with sparse, banded blocks, and can be done on high performance CPU clusters within tens of minutes. In the current work, we propose an accelerated Cartesian MR-STAT algorithm incorporating two different strategies: firstly, a neural network is trained as a fast surrogate to learn the magnetization signal not only in the full time-domain but also in the compressed low-rank domain; secondly, based on the surrogate model, the Cartesian MR-STAT problem is re-formulated and split into smaller sub-problems by the alternating direction method of multipliers. The proposed method substantially reduces the computational requirements for runtime and memory. Simulated and in-vivo balanced MR-STAT experiments show similar reconstruction results using the proposed algorithm compared to the previous sparse Hessian method, and the reconstruction times are at least 40 times shorter. Incorporating sensitivity encoding and regularization terms is straightforward, and allows for better image quality with a negligible increase in reconstruction time. The proposed algorithm could reconstruct both balanced and gradient-spoiled in-vivo data within 3 minutes on a desktop PC, and could thereby facilitate the translation of MR-STAT in clinical settings.

Submitted 4 May, 2022; originally announced May 2022.

Comments: 12 pages, 7 figures, accepted by IEEE Transactions on Medical Imaging (in press)

arXiv:2205.01668 [pdf, other]

End-to-End Visual Editing with a Generatively Pre-Trained Artist

Authors: Andrew Brown, Cheng-Yang Fu, Omkar Parkhi, Tamara L. Berg, Andrea Vedaldi

Abstract: We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change. Differently from prior works, we solve this problem by learning a conditional probability distribution of the edits, end-to-end. Training such a model requires addressing a fundamental technical challenge: the lack of example edits for training. To this end, we propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain. The benefits are remarkable: implemented as a state-of-the-art auto-regressive transformer, our approach is simple, sidesteps difficulties with previous methods based on GAN-like priors, obtains significantly better edits, and is efficient. Furthermore, we show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture. We demonstrate the superiority of this approach across several datasets in extensive quantitative and qualitative experiments, including human studies, significantly outperforming prior work.

Submitted 3 May, 2022; originally announced May 2022.

arXiv:2204.09873 [pdf, other]

Gaussian Processes for real-time 3D motion and uncertainty estimation during MR-guided radiotherapy

Authors: Niek R. F. Huttinga, Tom Bruijnen, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: Respiratory motion during radiotherapy causes uncertainty in the tumor's location, which is typically addressed by an increased radiation area and a decreased dose. As a result, the treatments' efficacy is reduced. The recently proposed hybrid MR-linac scanner holds the promise to efficiently deal with such respiratory motion through real-time adaptive MR-guided radiotherapy (MRgRT). For MRgRT, motion-fields should be estimated from MR-data and the radiotherapy plan should be adapted in real-time according to the estimated motion-fields. All of this should be performed with a total latency of at most 200 ms, including data acquisition and reconstruction. A measure of confidence in such estimated motion-fields is highly desirable, for instance to ensure the patient's safety in case of unexpected and undesirable motion. In this work, we propose a framework based on Gaussian Processes to infer 3D motion-fields and uncertainty maps in real-time from only three readouts of MR-data. We demonstrated an inference frame rate up to 69 Hz including data acquisition and reconstruction, thereby exploiting the limited amount of required MR-data. Additionally, we designed a rejection criterion based on the motion-field uncertainty maps to demonstrate the framework's potential for quality assurance. The framework was validated in silico and in vivo on healthy volunteer data (n=5) acquired using an MR-linac, thereby taking into account different breathing patterns and controlled bulk motion. Results indicate end-point-errors with a 75th percentile below 1 mm in silico, and a correct detection of erroneous motion estimates with the rejection criterion. Altogether, the results show the potential of the framework for application in real-time MR-guided radiotherapy with an MR-linac.

Submitted 21 April, 2022; originally announced April 2022.

Comments: This manuscript has supplementary files which can be downloaded at https://surfdrive.surf.nl/files/index.php/s/scLts9nJYXfbLMx. The files include videos that show reconstructed motion-fields and spatial uncertainty maps. See the Appendix for a description of all individual files

arXiv:2203.05465 [pdf, other]

LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval

Authors: Jie Lei, Xinlei Chen, Ning Zhang, Mengjiao Wang, Mohit Bansal, Tamara L. Berg, Licheng Yu

Abstract: Dual encoders and cross encoders have been widely used for image-text retrieval. Between the two, the dual encoder encodes the image and text independently followed by a dot product, while the cross encoder jointly feeds image and text as the input and performs dense multi-modal fusion. These two architectures are typically modeled separately without interaction. In this work, we propose LoopITR, which combines them in the same network for joint learning. Specifically, we let the dual encoder provide hard negatives to the cross encoder, and use the more discriminative cross encoder to distill its predictions back to the dual encoder. Both steps are efficiently performed together in the same model. Our work centers on empirical analyses of this combined architecture, putting the main focus on the design of the distillation objective. Our experimental results highlight the benefits of training the two encoders in the same network, and demonstrate that distillation can be quite effective with just a few hard negative examples. Experiments on two standard datasets (Flickr30K and COCO) show our approach achieves state-of-the-art dual encoder performance when compared with approaches using a similar amount of data.

Submitted 10 March, 2022; originally announced March 2022.

arXiv:2202.12206 [pdf, other]

doi 10.1093/mnras/stac545

The evolution of the Si IV content in the Universe from the epoch of reionization to cosmic noon

Authors: V. D'Odorico, K. Finlator, S. Cristiani, G. Cupani, S. Perrotta, F. Calura, M. Cènturion, G. Becker, T. A. M. Berg, S. Lopez, S. Ellison, E. Pomante

Abstract: We investigate the abundance and distribution of metals in the high-redshift intergalactic medium and circum-galactic medium through the analysis of a sample of almost 600 SiIV absorption lines detected in high and intermediate resolution spectra of 147 quasars. The evolution of the number density of SiIV lines, the column density distribution function and the cosmic mass density are studied in the redshift interval 1.7 <= z <= 6.2 and for log N(SiIV) >= 12.5. All quantities show a rapid increase between z~6 and z< 5 and then an almost constant behaviour to z~2 in very good agreement with what is already observed for CIV absorption lines. The present results are challenging for numerical simulations: when simulations reproduce our SiIV results, they tend to underpredict the properties of CIV, and when the properties of CIV are reproduced, the number of strong SiIV lines (log N(SiIV) > 14) is overpredicted.

Submitted 24 February, 2022; originally announced February 2022.

Comments: 14 pages, 8 figures, accepted for publication in MNRAS

arXiv:2202.07247 [pdf, other]

CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval

Authors: Licheng Yu, Jun Chen, Animesh Sinha, Mengjiao MJ Wang, Hugo Chen, Tamara L. Berg, Ning Zhang

Abstract: We introduce CommerceMM - a multimodal model capable of providing a diverse and granular understanding of commerce topics associated with the given piece of content (image, text, image+text), and having the capability to generalize to a wide range of tasks, including Multimodal Categorization, Image-Text Retrieval, Query-to-Product Retrieval, Image-to-Product Retrieval, etc. We follow the pre-training + fine-tuning training regime and present 5 effective pre-training tasks on image-text pairs. To embrace more common and diverse commerce data with text-to-multimodal, image-to-multimodal, and multimodal-to-multimodal mapping, we propose another 9 novel cross-modal and cross-pair retrieval tasks, called Omni-Retrieval pre-training. The pre-training is conducted in an efficient manner with only two forward/backward updates for the combined 14 tasks. Extensive experiments and analysis show the effectiveness of each task. When combining all pre-training tasks, our model achieves state-of-the-art performance on 7 commerce-related downstream tasks after fine-tuning. Additionally, we propose a novel approach of modality randomization to dynamically adjust our model under different efficiency constraints.

Submitted 15 February, 2022; originally announced February 2022.

Comments: 10 pages, 7 figures. Commerce Multimodal Model towards Real Applications at Facebook

arXiv:2202.03021 [pdf, other]

Free-breathing motion compensated 4D (3D+respiration) T2-weighted turbo spin-echo MRI for body imaging

Authors: T. Bruijnen, T. Schake, O. Akdag, C. V. M. Bruel, J. J. W. Lagendijk, C. A. T. van den Berg, R. H. N. Tijssen

Abstract: Purpose: To develop and evaluate a free-breathing respiratory motion compensated 4D (3D+respiration) $T_2$-weighted turbo spin echo sequence with application to radiology and MR-guided radiotherapy. Methods: k-space data are continuously acquired using a rewound Cartesian acquisition with spiral profile ordering (rCASPR) to provide matching contrast to the conventional linear phase encode ordering and to sort data into multiple respiratory phases. Low-resolution respiratory-correlated 4D images were reconstructed with compressed sensing and used to estimate non-rigid deformation vector fields, which were subsequently used for a motion compensated image reconstruction. rCASPR sampling was compared to linear and CASPR sampling in terms of point-spread-function (PSF) and image contrast with in silico, phantom and in vivo experiments. Reconstruction parameters for low-resolution 4D-MRI (spatial resolution and temporal regularization) were determined using a grid search. The proposed motion compensated rCASPR was evaluated in eight healthy volunteers and compared to free-breathing scans with linear sampling. Image quality was compared based on visual inspection and quantitatively by means of the gradient entropy. Results: rCASPR provided a superior PSF (similar in ky and narrower in kz) and showed no considerable differences in image contrast compared to linear sampling. The optimal 4D-MRI reconstruction parameters were spatial resolution=$4.5 mm^3$ and $λ_t=10^{-4}$. The groupwise average gradient entropy was 22.31 for linear, 22.20 for rCASPR, 22.14 for soft-gated rCASPR and 22.02 for motion compensated rCASPR. Conclusion: The proposed motion compensated rCASPR enables high quality free-breathing T2-TSE with minimal changes in image contrast and scan time. The proposed method therefore enables direct transfer of clinically used 3D TSE sequences to free-breathing.

Submitted 7 February, 2022; originally announced February 2022.

Comments: 19 pages, 11 figures

arXiv:2108.00061 [pdf, other]

MTVR: Multilingual Moment Retrieval in Videos

Authors: Jie Lei, Tamara L. Berg, Mohit Bansal

Abstract: We introduce mTVR, a large-scale multilingual video moment retrieval dataset, containing 218K English and Chinese queries from 21.8K TV show video clips. The dataset is collected by extending the popular TVR dataset (in English) with paired Chinese queries and subtitles. Compared to existing moment retrieval datasets, mTVR is multilingual, larger, and comes with diverse annotations. We further propose mXML, a multilingual moment retrieval model that learns and operates on data from both languages, via encoder parameter sharing and language neighborhood constraints. We demonstrate the effectiveness of mXML on the newly collected MTVR dataset, where mXML outperforms strong monolingual baselines while using fewer parameters. In addition, we also provide detailed dataset analyses and model ablations. Data and code are publicly available at https://github.com/jayleicn/mTVRetrieval

Submitted 30 July, 2021; originally announced August 2021.

Comments: ACL 2021 (9 pages, 4 figures)

arXiv:2107.09609 [pdf, other]

QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries

Authors: Jie Lei, Tamara L. Berg, Mohit Bansal

Abstract: Detecting customized moments and highlights from videos given natural language (NL) user queries is an important but under-studied topic. One of the challenges in pursuing this direction is the lack of annotated data. To address this issue, we present the Query-based Video Highlights (QVHIGHLIGHTS) dataset. It consists of over 10,000 YouTube videos, covering a wide range of topics, from everyday activities and travel in lifestyle vlog videos to social and political activities in news videos. Each video in the dataset is annotated with: (1) a human-written free-form NL query, (2) relevant moments in the video w.r.t. the query, and (3) five-point scale saliency scores for all query-relevant clips. This comprehensive annotation enables us to develop and evaluate systems that detect relevant moments as well as salient highlights for diverse, flexible user queries. We also present a strong baseline for this task, Moment-DETR, a transformer encoder-decoder model that views moment retrieval as a direct set prediction problem, taking extracted video and query representations as inputs and predicting moment coordinates and saliency scores end-to-end. While our model does not utilize any human prior, we show that it performs competitively when compared to well-engineered architectures. With weakly supervised pretraining using ASR captions, Moment-DETR substantially outperforms previous methods. Lastly, we present several ablations and visualizations of Moment-DETR. Data and code are publicly available at https://github.com/jayleicn/moment_detr

Submitted 29 November, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

Comments: Accepted to NeurIPS 2021

arXiv:2106.04632 [pdf, other]

VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation

Authors: Linjie Li, Jie Lei, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara Lee Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu

Abstract: Most existing video-and-language (VidL) research focuses on a single dataset, or multiple datasets of a single task. In reality, a truly useful VidL system is expected to be easily generalizable to diverse tasks, domains, and datasets. To facilitate the evaluation of such systems, we introduce Video-And-Language Understanding Evaluation (VALUE) benchmark, an assemblage of 11 VidL datasets over 3 popular tasks: (i) text-to-video retrieval; (ii) video question answering; and (iii) video captioning. VALUE benchmark aims to cover a broad range of video genres, video lengths, data volumes, and task difficulty levels. Rather than focusing on single-channel videos with visual information only, VALUE promotes models that leverage information from both video frames and their associated subtitles, as well as models that share knowledge across multiple tasks. We evaluate various baseline methods with and without large-scale VidL pre-training, and systematically investigate the impact of video input channels, fusion methods, and different video representations. We also study the transferability between tasks, and conduct multi-task learning under different settings. The significant gap between our best model and human performance calls for future study for advanced VidL models. VALUE is available at https://value-benchmark.github.io/.

Submitted 18 August, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

Comments: To appear in 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks

arXiv:2105.11373 [pdf, other]

Large-Scale Attribute-Object Compositions

Authors: Filip Radenovic, Animesh Sinha, Albert Gordo, Tamara Berg, Dhruv Mahajan

Abstract: We study the problem of learning how to predict attribute-object compositions from images, and its generalization to unseen compositions missing from the training data. To the best of our knowledge, this is a first large-scale study of this problem, involving hundreds of thousands of compositions. We train our framework with images from Instagram using hashtags as noisy weak supervision. We make careful design choices for data collection and modeling, in order to handle noisy annotations and unseen compositions. Finally, extensive evaluations show that learning to compose classifiers outperforms late fusion of individual attribute and object predictions, especially in the case of unseen attribute-object pairs.

Submitted 24 May, 2021; originally announced May 2021.

arXiv:2105.05964 [pdf, other]

Connecting What to Say With Where to Look by Modeling Human Attention Traces

Authors: Zihang Meng, Licheng Yu, Ning Zhang, Tamara Berg, Babak Damavandi, Vikas Singh, Amy Bearman

Abstract: We introduce a unified framework to jointly model images, text, and human attention traces. Our work is built on top of the recent Localized Narratives annotation framework [30], where each word of a given caption is paired with a mouse trace segment. We propose two novel tasks: (1) predict a trace given an image and caption (i.e., visual grounding), and (2) predict a caption and a trace given only an image. Learning the grounding of each word is challenging, due to noise in the human-provided traces and the presence of words that cannot be meaningfully visually grounded. We present a novel model architecture that is jointly trained on dual tasks (controlled trace generation and controlled caption generation). To evaluate the quality of the generated traces, we propose a local bipartite matching (LBM) distance metric which allows the comparison of two traces of different lengths. Extensive experiments show our model is robust to the imperfect training data and outperforms the baselines by a clear margin. Moreover, we demonstrate that our model pre-trained on the proposed tasks can be also beneficial to the downstream task of COCO's guided image captioning. Our code and project page are publicly available.

Submitted 12 May, 2021; originally announced May 2021.

arXiv:2105.01673 [pdf, other]

doi 10.1093/mnras/stab2147

Telltale signs of metal recycling in the circumgalactic medium of a $z \sim 0.77$ galaxy

Authors: N. Tejos, S. López, C. Ledoux, A. Fernández-Figueroa, N. Rivas, K. Sharon, E. J. Johnston, M. K. Florian, G. D'Ago, A. Katsianis, F. Barrientos, T. Berg, F. Corro-Guerra, M. Hamel, C. Moya-Sierralta, S. Poudel, J. R. Rigby, M. Solimano

Abstract: We present gravitational-arc tomography of the cool-warm enriched circumgalactic medium (CGM) of an isolated galaxy (``G1'') at $z \approx 0.77$. Combining VLT/MUSE adaptive-optics and Magellan/MagE echelle spectroscopy we obtain partially-resolved kinematics of MgII in absorption and [OII] in emission. The unique arc configuration allows us to probe 42 spatially independent arc positions transverse to G1, plus 4 positions in front of it. The transverse positions cover G1's minor and major axes at impact parameters of $\approx 10-30$ kpc and $\approx 60$ kpc, respectively. We observe a direct kinematic connection between the cool-warm enriched CGM (traced by MgII) and the interstellar medium (traced by [OII]). This provides strong evidence for the existence of an extended disc that co-rotates with the galaxy out to tens of kiloparsecs. The MgII velocity dispersion ($σ\approx 30-100$ km s$^{-1}$, depending on position) is of the same order as the modeled galaxy rotational velocity ($v_{\rm rot} \approx 80$ km s$^{-1}$), providing evidence for the presence of a turbulent and pressure-supported CGM component. We regard the absorption to be modulated by a galactic-scale outflow, as it offers a natural scenario for the observed line-of-sight dispersion and asymmetric profiles observed against both the arcs and the galaxy. An extended enriched co-rotating disc together with the signatures of a galactic outflow, are telltale signs of metal recycling in the $z\sim 1$ CGM.

Submitted 22 July, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

Comments: MNRAS accepted. Moderate changes after addressing referee feedback, including a correction in the systemic redshift; all qualitative results and conclusions remain the same

arXiv:2104.07957 [pdf, other]

doi 10.1109/TMI.2021.3112818

Real-time non-rigid 3D respiratory motion estimation for MR-guided radiotherapy using MR-MOTUS

Authors: Niek R. F. Huttinga, Tom Bruijnen, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: The MR-Linac is a combination of an MR-scanner and radiotherapy linear accelerator (Linac) which holds the promise to increase the precision of radiotherapy treatments with MR-guided radiotherapy by monitoring motion during radiotherapy with MRI, and adjusting the radiotherapy plan accordingly. Optimal MR-guidance for respiratory motion during radiotherapy requires MR-based 3D motion estimation with a latency of 200-500 ms. Currently this is still challenging since typical methods rely on MR-images, and are therefore limited by the 3D MR-imaging latency. In this work, we present a method to perform non-rigid 3D respiratory motion estimation with 170 ms latency, including both acquisition and reconstruction. The proposed method called real-time low-rank MR-MOTUS reconstructs motion-fields directly from k-space data, and leverages an explicit low-rank decomposition of motion-fields to split the large scale 3D+t motion-field reconstruction problem posed in our previous work into two parts: (I) a medium-scale offline preparation phase and (II) a small-scale online inference phase which exploits the results of the offline phase for real-time computations. The method was validated on free-breathing data of five volunteers, acquired with a 1.5T Elekta Unity MR-Linac. Results show that the reconstructed 3D motion-field are anatomically plausible, highly correlated with a self-navigation motion surrogate (R = 0.975 +/- 0.0110), and can be reconstructed with a total latency of 170 ms that is sufficient for real-time MR-guided abdominal radiotherapy.

Submitted 14 September, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: This manuscript has supplementary files which can be downloaded at https://surfdrive.surf.nl/files/index.php/s/vz2xmwliglRmcjo. The files include six videos that show reconstructed motion-fields and a document with supporting figures. See Appendix I for a description of all individual files

arXiv:2102.06183 [pdf, other]

Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

Authors: Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu

Abstract: The canonical approach to video-and-language learning (e.g., video question answering) dictates a neural model to learn from offline-extracted dense video features from vision models and text features from language models. These feature extractors are trained independently and usually on tasks different from the target domains, rendering these fixed features sub-optimal for downstream tasks. Moreover, due to the high computational overload of dense video features, it is often difficult (or infeasible) to plug feature extractors directly into existing approaches for easy finetuning. To provide a remedy to this dilemma, we propose a generic framework ClipBERT that enables affordable end-to-end learning for video-and-language tasks, by employing sparse sampling, where only a single or a few sparsely sampled short clips from a video are used at each training step. Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms (or is on par with) existing methods that exploit full-length videos, suggesting that end-to-end learning with just a few sparsely sampled clips is often more accurate than using densely extracted offline features from full-length videos, proving the proverbial less-is-more principle. Videos in the datasets are from considerably different domains and lengths, ranging from 3-second generic domain GIF videos to 180-second YouTube human activity videos, showing the generalization ability of our approach. Comprehensive ablation studies and thorough analyses are provided to dissect what factors lead to this success. Our code is publicly available at https://github.com/jayleicn/ClipBERT

Submitted 11 February, 2021; originally announced February 2021.

Comments: 12 pages, 5 figures, 11 tables. - Happy Chinese New Year!

arXiv:2101.07821 [pdf, other]

doi 10.1093/mnras/stab184

Sub-damped Lyman alpha systems in the XQ-100 survey II -- Chemical evolution at 2.4<z<4.3

Authors: Trystyn A. M. Berg, Michele Fumagalli, Valentina D'Odorico, Sara L. Ellison, Sebastian Lopez, George D. Becker, Lise Christensen, Guido Cupani, Kelly D. Denney, Ruben Sanchez-Ramirez, Gabor Worseck

Abstract: We present the measured gas-phase metal column densities in 155 sub-damped Lyman alpha systems (subDLAs) with the aim to investigate the contribution of subDLAs to the chemical evolution of the Universe. The sample was identified within the absorber-blind XQ-100 quasar spectroscopic survey over the redshift range 2.4<=z<=4.3. Using all available column densities of the ionic species investigated (mainly CIV, SiII, MgII, SiIV, AlII, FeII, CII, and OI; in order of decreasing detection frequency), we estimate the ionization-corrected gas-phase metallicity of each system using Markov Chain Monte Carlo techniques to explore a large grid of Cloudy ionization models. Without accounting for ionization and dust depletion effects, we find that the HI-weighted gas-phase metallicity evolution of subDLAs are consistent with damped Lyman alpha systems (DLAs). When ionization corrections are included, subDLAs are systematically more metal-poor than DLAs (between ~0.5 sigma and ~3 sigma significance) by up to ~1.0 dex over the redshift range 3<=z<=4.3. The correlation of gas-phase [Si/Fe] with metallicity in subDLAs appears to be consistent with that of DLAs, suggesting that the two classes of absorbers have a similar relative dust depletion pattern. As previously seen for Lyman limit systems, the gas-phase [C/O] in subDLAs remains constantly solar for all metallicities indicating that both subDLAs and Lyman limit systems could trace carbon-rich ejecta, potentially in circumgalactic environments.

Submitted 19 January, 2021; originally announced January 2021.

Comments: Accepted for publication in MNRAS. 64 pages (20 pages of main text, 44 pages of Figures in appendix). Machine-readable versions of Tables 2 and 3 are available in the source files or available online on MNRAS

arXiv:2012.02897 [pdf, other]

Discovering Underground Maps from Fashion

Authors: Utkarsh Mall, Kavita Bala, Tamara Berg, Kristen Grauman

Abstract: The fashion sense -- meaning the clothing styles people wear -- in a geographical region can reveal information about that region. For example, it can reflect the kind of activities people do there, or the type of crowds that frequently visit the region (e.g., tourist hot spot, student neighborhood, business center). We propose a method to automatically create underground neighborhood maps of cities by analyzing how people dress. Using publicly available images from across a city, our method finds neighborhoods with a similar fashion sense and segments the map without supervision. For 37 cities worldwide, we show promising results in creating good underground maps, as evaluated using experiments with human judges and underground map benchmarks derived from non-image data. Our approach further allows detecting distinct neighborhoods (what is the most unique region of LA?) and answering analogy questions between cities (what is the "Downtown LA" of Bogota?).

Submitted 4 December, 2020; originally announced December 2020.

arXiv:2010.08471 [pdf, other]

doi 10.1103/PhysRevB.103.155419

A quantum-network approach to spin interferometry driven by Abelian and non-Abelian fields

Authors: A. Hijano, T. van den Berg, D. Frustaglia, D. Bercioux

Abstract: We present a theory of conducting quantum networks that accounts for Abelian and non-Abelian fields acting on spin carriers. We apply this approach to model the conductance of mesoscopic spin interferometers of different geometry (such as squares and rings), reproducing recent experimental findings in nanostructured InAsGa quantum wells subject to Rashba spin-orbit and Zeeman fields (as, e.g., the manipulation of Aharonov-Casher interference patterns by geometric means). Moreover, by introducing an additional field-texture engineering, we manage to single out a previously unnoticed spin-phase suppression mechanism. We notice that our approach can also be used for the study of complex networks and the spectral properties of closed systems.

Submitted 22 April, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: 13 pages with 9 figures

Journal ref: Phys. Rev. B 103, 155419 (2021)

arXiv:2010.07999 [pdf, other]

What is More Likely to Happen Next? Video-and-Language Future Event Prediction

Authors: Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal

Abstract: Given a video with aligned dialogue, people can often infer what is more likely to happen next. Making such predictions requires not only a deep understanding of the rich dynamics underlying the video and dialogue, but also a significant amount of commonsense knowledge. In this work, we explore whether AI models are able to learn to make such multimodal commonsense next-event predictions. To support research in this direction, we collect a new dataset, named Video-and-Language Event Prediction (VLEP), with 28,726 future event prediction examples (along with their rationales) from 10,234 diverse TV Show and YouTube Lifestyle Vlog video clips. In order to promote the collection of non-trivial challenging examples, we employ an adversarial human-and-model-in-the-loop data collection procedure. We also present a strong baseline incorporating information from video, dialogue, and commonsense knowledge. Experiments show that each type of information is useful for this challenging task, and that compared to the high human performance on VLEP, our model provides a good starting point but leaves large room for future work. Our dataset and code are available at: https://github.com/jayleicn/VideoLanguageFuturePred

Submitted 15 October, 2020; originally announced October 2020.

Comments: EMNLP 2020 (17 pages)

arXiv:2008.07440 [pdf]

doi 10.1002/nbm.4527

Fast and Accurate Modeling of Transient-State Gradient-Spoiled Sequences by Recurrent Neural Networks

Authors: Hongyan Liu, Oscar van der Heide, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: Fast and accurate modeling of MR signal responses are typically required for various quantitative MRI applications, such as MR Fingerprinting and MR-STAT. This work uses a new EPG-Bloch model for accurate simulation of transient-state gradient-spoiled MR sequences, and proposes a Recurrent Neural Network (RNN) as a fast surrogate of the EPG-Bloch model for computing large-scale MR signals and derivatives. The computational efficiency of the RNN model is demonstrated by comparing with other existing models, showing one to three orders of acceleration comparing to the latest GPU-accelerated open-source EPG package. By using numerical and in-vivo brain data, two use cases, namely MRF dictionary generation and optimal experimental design, are also provided. Results show that the RNN surrogate model can be efficiently used for computing large-scale dictionaries of transient-states signals and derivatives within tens of seconds, resulting in several orders of magnitude acceleration with respect to state-of-the-art implementations. The practical application of transient-states quantitative techniques can therefore be substantially facilitated.

Submitted 21 August, 2020; v1 submitted 17 August, 2020; originally announced August 2020.

Comments: Correct for typo errors

arXiv:2007.08019 [pdf, other]

Attention-Based Query Expansion Learning

Authors: Albert Gordo, Filip Radenovic, Tamara Berg

Abstract: Query expansion is a technique widely used in image search consisting in combining highly ranked images from an original query into an expanded query that is then reissued, generally leading to increased recall and precision. An important aspect of query expansion is choosing an appropriate way to combine the images into a new query. Interestingly, despite the undeniable empirical success of query expansion, ad-hoc methods with different caveats have dominated the landscape, and not a lot of research has been done on learning how to do query expansion. In this paper we propose a more principled framework to query expansion, where one trains, in a discriminative manner, a model that learns how images should be aggregated to form the expanded query. Within this framework, we propose a model that leverages a self-attention mechanism to effectively learn how to transfer information between the different images before aggregating them. Our approach obtains higher accuracy than existing approaches on standard benchmarks. More importantly, our approach is the only one that consistently shows high accuracy under different regimes, overcoming caveats of existing methods.

Submitted 15 July, 2020; originally announced July 2020.

Comments: Accepted for publication at ECCV2020

arXiv:2007.06209 [pdf, other]

doi 10.1088/1361-6560/abbb9d

Technical feasibility of Magnetic Resonance Fingerprinting on a 1.5T MRI-Linac

Authors: T. Bruijnen, O. van der Heide, M. P. W. Intven, S. Mook, J. J. W. Lagendijk, C. A. T. van den Berg, R. H. N. Tijssen

Abstract: Hybrid MRI-linac (MRL) systems enable daily multiparametric quantitative MRI to assess tumor response to radiotherapy. Magnetic Resonance Fingerprinting (MRF) may provide time efficient means of rapid multiparametric quantitative MRI. The accuracy of MRF, however, relies on adequate control over system imperfections, such as eddy currents and B1+, which are different and not as well established on MRL systems compared to diagnostic systems. In this study we investigate the technical feasibility of gradient spoiled 2D MRF on a 1.5T MRL. We show with phantom experiments that the MRL generates reliable MRF signals that are temporally stable during the day and have good agreement with spin-echo reference measurements. Subsequent in-vivo MRF scans in healthy volunteers and a patient with a colorectal liver metastasis showed good image quality, where the quantitative values of selected organs corresponded with the values reported in literature. Therefore we conclude that gradient spoiled 2D MRF is feasible on a 1.5T MRL with similar performance as on a diagnostic system. The precision and accuracy of the parametric maps are sufficient for further investigation of the clinical utility of MRF for online quantitatively MRI-guided radiotherapy.

Submitted 13 July, 2020; originally announced July 2020.

Comments: 17 pages, 9 figures, Submitted to Physics in Medicine & Biology as a technical note

arXiv:2007.01021 [pdf, other]

doi 10.1140/epjp/s13360-020-00837-3

Wave-particle duality of electrons with spin-momentum locking

Authors: D. Bercioux, T. van den Berg, D. Ferraro, J. Rech, T. Jonckheere, T. Martin

Abstract: We investigate the effects of spin-momentum locking on the interference and diffraction patterns due to a double- or single-slit in an electronic \emph{Gedankenexperiment}. We show that the inclusion of the spin-degree-of-freedom, when coupled to the motion direction of the carrier -- a typical situation that occurs in systems with spin-orbit interaction -- leads to a modification of the interference and diffraction patterns that depend on the geometrical parameters of the system.

Submitted 14 October, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

Comments: 10 pages and 8 figures

Journal ref: Eur. Phys. J. Plus 135, 811 (2020)

arXiv:2007.00488 [pdf, other]

Non-rigid 3D motion estimation at high temporal resolution from prospectively undersampled k-space data using low-rank MR-MOTUS

Authors: Niek R. F. Huttinga, Tom Bruijnen, Cornelis A. T. van den Berg, Alessandro Sbrizzi

Abstract: With the recent introduction of the MR-LINAC, an MR-scanner combined with a radiotherapy LINAC, MR-based motion estimation has become of increasing interest to (retrospectively) characterize tumor and organs-at-risk motion during radiotherapy. To this extent, we introduce low-rank MR-MOTUS, a framework to retrospectively reconstruct time-resolved non-rigid 3D+t motion-fields from a single low-resolution reference image and prospectively undersampled k-space data acquired during motion. Low-rank MR-MOTUS exploits spatio-temporal correlations in internal body motion with a low-rank motion model, and inverts a signal model that relates motion-fields directly to a reference image and k-space data. The low-rank model reduces the degrees-of-freedom, memory consumption and reconstruction times by assuming a factorization of space-time motion-fields in spatial and temporal components. Low-rank MR-MOTUS was employed to estimate motion in 2D/3D abdominothoracic scans and 3D head scans. Data were acquired using golden-ratio radial readouts. Reconstructed 2D and 3D respiratory motion-fields were respectively validated against time-resolved and respiratory-resolved image reconstructions, and the head motion against static image reconstructions from fully-sampled data acquired right before and right after the motion. Results show that 2D+t respiratory motion can be estimated retrospectively at 40.8 motion-fields-per-second, 3D+t respiratory motion at 7.6 motion-fields-per-second and 3D+t head-neck motion at 9.3 motion-fields-per-second. The validations show good consistency with image reconstructions. The proposed framework can estimate time-resolved non-rigid 3D motion-fields, which allows to characterize drifts and intra and inter-cycle patterns in breathing motion during radiotherapy, and could form the basis for real-time MR-guided radiotherapy.

Submitted 1 July, 2020; originally announced July 2020.

Comments: 18 pages main text, 8 main figures, 1 main table, 12 supporting videos, 2 supporting figures, 1 supporting information PDF. Submitted to Magnetic Resonance in Medicine as Full Paper

arXiv:2005.07445 [pdf, ps, other]

Binary Hypothesis Testing with Deterministic Finite-Memory Decision Rules

Authors: Tomer Berg, Ofer Shayevitz, Or Ordentlich

Abstract: In this paper we consider the problem of binary hypothesis testing with finite memory systems. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables, with expectation $p$ under $\mathcal{H}_0$ and $q$ under $\mathcal{H}_1$. Consider a finite-memory deterministic machine with $S$ states that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that we let the process run for a very long time ($n\rightarrow \infty$), and then make our decision according to some mapping from the state space to the hypothesis space. The main contribution of this paper is a lower bound on the Bayes error probability $P_e$ of any such machine. In particular, our findings show that the ratio between the maximal exponential decay rate of $P_e$ with $S$ for a deterministic machine and for a randomized one, can become unbounded, complementing a result by Hellman.

Submitted 15 May, 2020; originally announced May 2020.

Comments: To be presented at ISIT 2020

arXiv:2005.05402 [pdf, other]

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

Authors: Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal

Abstract: Generating multi-sentence descriptions for videos is one of the most challenging captioning tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph. Towards this goal, we propose a new approach called Memory-Augmented Recurrent Transformer (MART), which uses a memory module to augment the transformer architecture. The memory module generates a highly summarized memory state from the video segments and the sentence history to help better predict the next sentence (w.r.t. coreference and repetition aspects), thus encouraging coherent paragraph generation. Extensive experiments, human evaluations, and qualitative analyses on two popular datasets, ActivityNet Captions and YouCookII, show that MART generates more coherent and less repetitive paragraph captions than baseline methods, while maintaining relevance to the input video events. All code is available open-source at: https://github.com/jayleicn/recurrent-transformer

Submitted 11 May, 2020; originally announced May 2020.

Comments: ACL 2020 (12 pages)

arXiv:2004.03293 [pdf, other]

doi 10.1103/PhysRevResearch.2.023373

Volkov-Pankratov states in topological graphene nanoribbons

Authors: Tineke L. van den Berg, Alessandro De Martino, M. Reyes Calvo, Dario Bercioux

Abstract: In topological systems, a modulation in the gap onset near interfaces can lead to the appearance of massive edge states, as were first described by Volkov and Pankratov. In this work, we study graphene nanoribbons in the presence of intrinsic spin-orbit coupling smoothly modulated near the system edges. We show that this space modulation leads to the appearance of Volkov-Pankratov states, in addition to the topologically protected ones. We obtain this result by means of two complementary methods, one based on the effective low-energy Dirac equation description and the other on a fully numerical tight-binding approach, finding excellent agreement between the two. We then show how transport measurements might reveal the presence of Volkov-Pankratov states, and discuss possible graphene-like structures in which such states might be observed.

Submitted 2 July, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: 12 pages, 8 figures

Journal ref: Phys. Rev. Research 2, 023373 (2020)

arXiv:2001.09099 [pdf, other]

TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval

Authors: Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal

Abstract: We introduce TV show Retrieval (TVR), a new multimodal retrieval dataset. TVR requires systems to understand both videos and their associated subtitle (dialogue) texts, making it more realistic. The dataset contains 109K queries collected on 21.8K videos from 6 TV shows of diverse genres, where each query is associated with a tight temporal window. The queries are also labeled with query types that indicate whether each of them is more related to video or subtitle or both, allowing for in-depth analysis of the dataset and the methods built on top of it. Strict qualification and post-annotation verification tests are applied to ensure the quality of the collected data. Further, we present several baselines and a novel Cross-modal Moment Localization (XML) network for multimodal moment retrieval tasks. The proposed XML model uses a late fusion design with a novel Convolutional Start-End detector (ConvSE), surpassing baselines by a large margin and with better efficiency, providing a strong starting point for future work. We have also collected additional descriptions for each annotated moment in TVR to form a new multimodal captioning dataset with 262K captions, named TV show Caption (TVC). Both datasets are publicly available. TVR: https://tvr.cs.unc.edu, TVC: https://tvr.cs.unc.edu/tvc.html.

Submitted 18 August, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

Comments: ECCV 2020 (extended version, with TVC dataset+models; 35 pages)

arXiv:2001.05045 [pdf, other]

doi 10.1103/PhysRevB.101.201301

Single-charge occupation in ambipolar quantum dots

Authors: A. J. Sousa de Almeida, A. Marquez Seco, T. van den Berg, B. van de Ven, F. Bruijnes, S. V. Amitonov, F. A. Zwanenburg

Abstract: We demonstrate single-charge occupation of ambipolar quantum dots in silicon via charge sensing. We have fabricated ambipolar quantum dot (QD) devices in a silicon metal-oxide-semiconductor heterostructure comprising a single-electron transistor next to a single-hole transistor. Both QDs can be tuned to simultaneously sense charge transitions of the other. We further detect the few-electron and few-hole regimes in the QDs of our ambipolar device by active charge sensing.

Submitted 14 January, 2020; originally announced January 2020.

Comments: 13 pages, 4 figures

Journal ref: Phys. Rev. B 101, 201301 (2020)

arXiv:1911.09363 [pdf, other]

doi 10.1103/PhysRevResearch.2.013171

Living on the edge: Topology, electrostatics and disorder

Authors: Tineke L. van den Berg, M. Reyes Calvo, Dario Bercioux

Abstract: We address the co-existence of massless and massive topological edge states at the interface between two materials with different topological phases. We modify the well-known Bernevig-Hughes-Zhang model to introduce a smooth function describing the band inversion and the band bending due to electrostatic effects between the bulk of the quantum well and the vacuum. Within this minimal model we identify distinct parameter sets that can lead to the co-existence of the two types of edge states, and that determine their number and characteristics. We propose several experimental setups that could demonstrate their presence in two-dimensional topological systems, as well as ways to regulate or tune the contribution of the massive edge states to the conductance of associated electronic devices. Our results suggest that such states may also be present in novel two-dimensional Van der Waals topological materials.

Submitted 21 November, 2019; originally announced November 2019.

Comments: 12 pages, 9 figures

Journal ref: Phys. Rev. Research 2, 013171 (2020)

Showing 1–50 of 109 results for author: Berg, T