-
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Authors:
Asmar Nadeem,
Faegheh Sardari,
Robert Dawes,
Syed Sameed Husain,
Adrian Hilton,
Armin Mustafa
Abstract:
Existing video captioning benchmarks and models lack coherent representations of causal-temporal narrative, that is, sequences of events linked through cause and effect, unfolding over time and driven by characters or agents. This lack of narrative restricts models' ability to generate text descriptions that capture the causal and temporal dynamics inherent in video content. To address this gap, we propose NarrativeBridge, an approach comprising: (1) a novel Causal-Temporal Narrative (CTN) captions benchmark generated using a large language model and few-shot prompting, explicitly encoding cause-effect temporal relationships in video descriptions, evaluated automatically to ensure caption quality and relevance; and (2) a dedicated Cause-Effect Network (CEN) architecture with separate encoders for capturing cause and effect dynamics independently, enabling effective learning and generation of captions with causal-temporal narrative. Extensive experiments demonstrate that CEN is more accurate in articulating the causal and temporal aspects of video content than the second-best model (GIT): 17.88 and 17.44 CIDEr on the MSVD and MSR-VTT datasets, respectively. The proposed framework understands and generates nuanced text descriptions with the intricate causal-temporal narrative structures present in videos, addressing a critical limitation in video captioning. For project details, visit https://narrativebridge.github.io/.
Submitted 10 June, 2024;
originally announced June 2024.
-
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
Authors:
Asmar Nadeem,
Adrian Hilton,
Robert Dawes,
Graham Thomas,
Armin Mustafa
Abstract:
In the context of Audio Visual Question Answering (AVQA) tasks, the audio-visual modalities could be learnt on three levels: 1) Spatial, 2) Temporal, and 3) Semantic. Existing AVQA methods suffer from two major shortcomings: the audio-visual (AV) information passing through the network is not aligned on the Spatial and Temporal levels, and inter-modal (audio and visual) Semantic information is often not balanced within a context. Both result in poor performance. In this paper, we propose a novel end-to-end Contextual Multi-modal Alignment (CAD) network that addresses the challenges in AVQA methods by i) introducing a parameter-free stochastic Contextual block that ensures robust audio and visual alignment on the Spatial level; ii) proposing a pre-training technique for dynamic audio and visual alignment on the Temporal level in a self-supervised setting; and iii) introducing a cross-attention mechanism to balance audio and visual information on the Semantic level. The proposed CAD network improves the overall performance over the state-of-the-art methods on average by 9.4% on the MUSIC-AVQA dataset. We also demonstrate that our proposed contributions to AVQA can be added to the existing methods to improve their performance without additional complexity requirements.
Submitted 27 October, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Ab initio quantum scattering calculations and a new potential energy surface for the HCl($X^1Σ^+$)-O$_{2}$($X^3Σ^-_g$) system: collision-induced line-shape parameters for O$_{2}$-perturbed R(0) 0-0 line in H$^{35}$Cl
Authors:
Artur Olejnik,
Hubert Jóźwiak,
Maciej Gancewski,
Ernesto Quintas-Sánchez,
Richard Dawes,
Piotr Wcisło
Abstract:
The remote sensing of the abundance and properties of HCl -- the main atmospheric reservoir of Cl atoms which directly participate in ozone depletion -- is important for monitoring the partitioning of chlorine between "ozone-depleting" and "reservoir" species. Such remote studies require knowledge of the shapes of molecular resonances of HCl, which are perturbed by collisions with the molecules of the surrounding air. In this work, we report the first fully quantum calculations of collisional perturbations of the shape of a pure rotational line in H$^{35}$Cl perturbed by an air-relevant molecule (as the first model system we choose the R(0) line in HCl perturbed by O$_2$). The calculations are performed on our new highly accurate HCl($X^1Σ^+$)-O$_2$($X^3Σ^-_g$) potential energy surface. In addition to pressure broadening and shift, we also determine their speed dependencies and the complex Dicke parameter. This gives important input to the community discussion on the physical meaning of the complex Dicke parameter and its relevance for atmospheric spectra (previously, the complex Dicke parameter for such systems was mainly determined from phenomenological fits to experimental spectra, and the physical meaning of its value in that context is questionable). We also calculate the temperature dependence of the line-shape parameters and obtain agreement with the available experimental data. We estimate the total combined uncertainties of our calculations at 2% relative RMSE residuals in the simulated line shape at 296 K. This result constitutes an important step towards computational population of spectroscopic databases with accurate ab initio line-shape parameters for molecular systems of terrestrial atmospheric importance.
Submitted 15 September, 2023;
originally announced September 2023.
-
SEM-POS: Grammatically and Semantically Correct Video Captioning
Authors:
Asmar Nadeem,
Adrian Hilton,
Robert Dawes,
Graham Thomas,
Armin Mustafa
Abstract:
Generating grammatically and semantically correct captions in video captioning is a challenging task. The captions generated by existing methods are either produced word by word, so that they do not align with grammatical structure, or miss key information from the input videos. To address these issues, we introduce a novel global-local fusion network, with a Global-Local Fusion Block (GLFB) that encodes and fuses features from different parts of speech (POS) components with visual-spatial features. We use novel combinations of different POS components - 'determinant + subject', 'auxiliary verb', 'verb', and 'determinant + object' for supervision of the POS blocks - Det + Subject, Aux Verb, Verb, and Det + Object respectively. The novel global-local fusion network together with POS blocks helps align the visual features with language description to generate grammatically and semantically correct captions. Extensive qualitative and quantitative experiments on benchmark MSVD and MSRVTT datasets demonstrate that the proposed approach generates more grammatically and semantically correct captions compared to the existing methods, achieving the new state-of-the-art. Ablations on the POS blocks and the GLFB demonstrate the impact of the contributions on the proposed method.
Submitted 4 April, 2023; v1 submitted 26 March, 2023;
originally announced March 2023.
-
Collisional excitation and non-LTE modelling of interstellar chiral propylene oxide
Authors:
K. Dzenis,
A. Faure,
B. A. McGuire,
A. J. Remijan,
P. J. Dagdigian,
C. Rist,
R. Dawes,
E. Quintas-Sanchez,
F. Lique,
M. Hochlaf
Abstract:
The first set of theoretical cross sections for propylene oxide (CH3CHCH2O) colliding with cold He atoms has been obtained at the full quantum level using a high-accuracy potential energy surface. By scaling the collision reduced mass, rotational rate coefficients for collisions with para-H2 are deduced in the temperature range 5-30 K. These collisional coefficients are combined with radiative data in a non-LTE radiative transfer model in order to reproduce observations of propylene oxide made towards the Sagittarius B2(N) molecular cloud with the Green Bank and Parkes radio telescopes. The three detected absorption lines are found to probe the cold (~ 10 K) and translucent (nH ~ 2000 cm-3) gas in the outer edges of the extended Sgr B2(N) envelope. The derived column density for propylene oxide is Ntot ~ 3e12 cm-2, corresponding to a fractional abundance relative to total hydrogen of ~ 2.5e-11. The present results are expected to help our understanding of the chemistry of propylene oxide, including a potential enantiomeric excess, in the cold interstellar medium.
Submitted 5 February, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
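The two numbers quoted in the abstract above (Ntot ~ 3e12 cm-2 and a fractional abundance of ~ 2.5e-11) together imply a total hydrogen column density along the line of sight. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the hydrogen column density is inferred here from the two quoted values rather than stated in the abstract.

```python
def hydrogen_column_density(n_molecule_cm2, fractional_abundance):
    """Total hydrogen column density (cm^-2) implied by a molecular column
    density and its fractional abundance relative to total hydrogen."""
    return n_molecule_cm2 / fractional_abundance


# Values quoted in the abstract (both approximate).
n_propylene_oxide = 3e12   # cm^-2
abundance = 2.5e-11        # relative to total hydrogen

n_hydrogen = hydrogen_column_density(n_propylene_oxide, abundance)
print(f"Implied hydrogen column density: {n_hydrogen:.1e} cm^-2")
```

Because both inputs are order-of-magnitude estimates, the result (about 1.2e23 cm-2) should be read with the same tolerance.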
-
CF$^+$ excitation in the interstellar medium
Authors:
Benjamin Desrousseaux,
François Lique,
Javier R. Goicoechea,
Ernesto Quintas-Sánchez,
Richard Dawes
Abstract:
The detection of CF$^+$ in interstellar clouds potentially allows astronomers to infer the elemental fluorine abundance and the ionization fraction in ultraviolet-illuminated molecular gas. Because local thermodynamic equilibrium (LTE) conditions are rarely fulfilled in the interstellar medium (ISM), the accurate determination of the CF$^+$ abundance requires one to model its non-LTE excitation via both radiative and collisional processes. Here, we report quantum calculations of rate coefficients for the rotational excitation of CF$^+$ in collisions with para- and ortho-H2 (for temperatures up to 150 K). As an application, we present non-LTE excitation models that reveal population inversion in physical conditions typical of ISM photodissociation regions (PDRs). We successfully applied these models to fit the CF$^+$ emission lines previously observed toward the Orion Bar and Horsehead PDRs. The radiative transfer models achieved with these new rate coefficients allow the use of CF$^+$ as a powerful probe to study molecular clouds exposed to strong stellar radiation fields.
Submitted 3 December, 2020;
originally announced December 2020.
-
Temperature-dependent rotationally inelastic collisions of OH- and He
Authors:
Eric S. Endres,
Steve Ndengue,
Olga Lakhmanskaya,
Seunghyun Lee,
Francesco A. Gianturco,
Richard Dawes,
Roland Wester
Abstract:
We have studied the fundamental rotational relaxation and excitation collision of OH- J=0 <-> 1 with helium at different collision energies. Using state-selected photodetachment in a cryogenic ion trap, the collisional excitation of the first excited rotational state of OH- has been investigated and absolute inelastic collision rate coefficients have been extracted for collision temperatures between 20 and 35 K. The rates are compared with accurate quantum scattering calculations for three different potential energy surfaces. Good agreement is found within the experimental accuracy, but the experimental trend of increasing collision rates with temperature is only in part reflected in the calculations.
Submitted 20 October, 2020;
originally announced October 2020.
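The relaxation and excitation rates for J=0 <-> 1 mentioned in the abstract above are linked, for thermal rate coefficients, by detailed balance. The sketch below illustrates that relation over the quoted 20 to 35 K range. It assumes a rigid-rotor spacing of 2B with an OH- rotational constant of roughly 18.7 cm^-1; that value is an assumption brought in for illustration and is not given in the abstract, and the function name is hypothetical.

```python
import math

K_B_CM = 0.695035   # Boltzmann constant in cm^-1 per kelvin
B_OH_MINUS = 18.7   # ASSUMED rotational constant of OH- in cm^-1 (not from the abstract)


def excitation_to_relaxation_ratio(temperature_k, b_cm=B_OH_MINUS):
    """Detailed-balance ratio k(0->1) / k(1->0) for a rigid rotor.

    The J=1 level has degeneracy 3, J=0 has degeneracy 1, and the
    level spacing is 2B in the rigid-rotor approximation.
    """
    delta_e = 2.0 * b_cm
    return 3.0 * math.exp(-delta_e / (K_B_CM * temperature_k))


for t in (20.0, 35.0):
    ratio = excitation_to_relaxation_ratio(t)
    print(f"T = {t:.0f} K: k(0->1)/k(1->0) = {ratio:.2f}")
```

Under these assumptions the excitation rate is a fraction of the relaxation rate at 20 K and grows steeply with temperature, which is why the temperature window matters for such measurements.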
-
Pressure Shifts in High-Precision Hydrogen Spectroscopy: I. Long-Range Atom-Atom and Atom-Molecule Interactions
Authors:
U. D. Jentschura,
C. M. Adhikari,
R. Dawes,
A. Matveev,
N. Kolachevsky
Abstract:
We study the theoretical foundations for the pressure shifts in high-precision atomic beam spectroscopy of hydrogen, with a particular emphasis on transitions involving higher excited P states. In particular, the long-range interaction of an excited hydrogen atom in a 4P state with a ground-state and metastable hydrogen atom is studied, with a full resolution of the hyperfine structure. It is found that the full inclusion of the 4P_1/2 and 4P_3/2 manifolds becomes necessary in order to obtain reliable theoretical predictions, because the 1S ground state hyperfine frequency is commensurate with the 4P fine-structure splitting. An even more complex problem is encountered in the case of the 4P-2S interaction, where the inclusion of the quasi-degenerate 4S-2P_1/2 state becomes necessary in view of the dipole couplings induced by the van der Waals Hamiltonian. Matrices of dimension up to 40 have to be treated despite all efforts to reduce the problem to irreducible submanifolds within the quasi-degenerate basis. We focus on the phenomenologically important second-order van der Waals shifts, proportional to 1/R^6 where R is the interatomic distance, and obtain results with full resolution of the hyperfine structure. The magnitude of van der Waals coefficients for hydrogen atom-atom collisions involving excited P states is drastically enhanced due to energetic quasi-degeneracy; we find no such enhancement for atom-molecule collisions involving atomic nP states, even if the complex molecular spectrum involving ro-vibrational levels requires a deeper analysis.
Submitted 4 April, 2019;
originally announced April 2019.
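The 1/R^6 scaling of the second-order van der Waals shift described in the abstract above can be illustrated with a short sketch. It assumes the common convention E(R) = -C6 / R^6; the coefficient value used is a placeholder chosen only to show the scaling, not a result from the paper, and the function name is hypothetical.

```python
def van_der_waals_shift(c6, r):
    """Second-order van der Waals energy shift, E(R) = -C6 / R^6.

    c6 and r may be in any consistent unit system (for example atomic
    units); the sign of C6 depends on the states involved.
    """
    return -c6 / r**6


C6_PLACEHOLDER = 1.0  # hypothetical coefficient, for scaling only

shift_near = van_der_waals_shift(C6_PLACEHOLDER, 10.0)
shift_far = van_der_waals_shift(C6_PLACEHOLDER, 20.0)

# Doubling the interatomic distance reduces the shift by 2**6 = 64.
print(f"shift(R) / shift(2R) = {shift_near / shift_far:.0f}")
```

The steep falloff is why a drastic enhancement of the coefficients, as reported for the atom-atom case, matters for pressure shifts at the densities of an atomic beam.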
-
Infrared spectrum and intermolecular potential energy surface of the CO-O2 dimer
Authors:
A. J. Barclay,
A. R. W. McKellar,
N. Moazzen-Ahmadi,
Richard Dawes,
Xiao-Gang Wang,
Tucker Carrington Jr
Abstract:
Only a few weakly bound complexes containing the O2 molecule have been characterized by high-resolution spectroscopy, no doubt due to the complications added by the oxygen molecule's unpaired electron spin. Here we report an extensive infrared spectrum of CO-O2, observed in the CO fundamental band region using a tunable quantum cascade laser to probe a pulsed supersonic jet expansion. The rotational energy level pattern derived from the spectrum consists of stacks of levels characterized by the total angular momentum, J, and its projection on the intermolecular axis, K. Five such stacks are observed in the ground vibrational state, and ten in the excited state (v(CO) = 1). They are divided into two groups, with no observed transitions between groups. The groups correspond to different projections of the O2 electron spin, and correlate with the two lowest rotational states of O2, (N, J) = (1, 0) and (1, 2). The rotational constant of the lowest K = 0 stack implies an effective intermolecular separation of 3.82 Angstroms, but this should be interpreted with caution since it ignores possible effects of electron spin. A new high-level 4-dimensional potential energy surface is developed for CO-O2, and rotational energy levels are calculated for this surface, ignoring electron spin. By comparing calculated and observed levels, it is possible to assign detailed quantum labels to the observed level stacks.
Submitted 7 April, 2018;
originally announced April 2018.
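The link between a rotational constant and an effective intermolecular separation, as mentioned in the abstract above, can be sketched with a pseudo-diatomic model, B = h / (8 π² c μ R²), in which each monomer is treated as a point mass. This is an assumed simplification and not necessarily the procedure used by the authors; the monomer masses below are standard values for 12C16O and 16O2, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
C_CM = 2.99792458e10      # speed of light, cm/s
AMU = 1.66053906660e-27   # atomic mass unit, kg

M_CO = 27.9949            # mass of 12C16O in u (assumed isotopologue)
M_O2 = 31.9898            # mass of 16O2 in u (assumed isotopologue)


def reduced_mass(m1, m2):
    """Reduced mass of two point masses (same unit as the inputs)."""
    return m1 * m2 / (m1 + m2)


def rotational_constant_cm(mu_u, r_angstrom):
    """Pseudo-diatomic rotational constant in cm^-1."""
    inertia = mu_u * AMU * (r_angstrom * 1e-10) ** 2  # kg m^2
    return H / (8.0 * math.pi**2 * C_CM * inertia)


def separation_angstrom(mu_u, b_cm):
    """Inverse relation: effective separation from a rotational constant."""
    inertia = H / (8.0 * math.pi**2 * C_CM * b_cm)
    return math.sqrt(inertia / (mu_u * AMU)) * 1e10


mu = reduced_mass(M_CO, M_O2)
b = rotational_constant_cm(mu, 3.82)
print(f"mu = {mu:.2f} u, B = {b:.4f} cm^-1 for R = 3.82 Angstroms")
```

The model ignores the monomers' own moments of inertia and, as the abstract itself cautions, any effect of electron spin, so the number is only indicative.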