Search | arXiv e-print repository

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

Authors: Yanli Zhao, Andrew Gu, Rohan Varma, Liang Luo, Chien-Chin Huang, Min Xu, Less Wright, Hamid Shojanazeri, Myle Ott, Sam Shleifer, Alban Desmaison, Can Balioglu, Pritam Damania, Bernard Nguyen, Geeta Chauhan, Yuchen Hao, Ajit Mathews, Shen Li

Abstract: It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development and exploration of large models, such abilities remain confined to a small group of advanced users and industry leaders, resulting in an implicit technical barrier for the wider community to access and leverage these technologies. In this paper, we introduce PyTorch Fully Sharded Data Parallel (FSDP) as an industry-grade solution for large model training. FSDP has been closely co-designed with several key PyTorch core components including Tensor implementation, dispatcher system, and CUDA memory caching allocator, to provide non-intrusive user experiences and high training efficiency. Additionally, FSDP natively incorporates a range of techniques and settings to optimize resource utilization across a variety of hardware configurations. The experimental results demonstrate that FSDP is capable of achieving comparable performance to Distributed Data Parallel while providing support for significantly larger models with near-linear scalability in terms of TFLOPS.

Submitted 12 September, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2208.08481 [pdf, other]

doi 10.1016/j.physletb.2022.137361

Single neutron transfer on 23Ne and its relevance for the pathway of nucleosynthesis in astrophysical X-ray bursts

Authors: G. Lotay, J. Henderson, W. N. Catford, F. A. Ali, J. Berean, N. Bernier, S. S. Bhattacharjee, M. Bowry, R. Caballero-Folch, B. Davids, T. E. Drake, A. B. Garnsworthy, F. GhaziMoradi, S. A. Gillespie, B. Greaves, G. Hackman, S. Hallam, D. Hymers, E. Kasanda, D. Levy, B. K. Luna, A. Mathews, Z. Meisel, M. Moukaddam, D. Muecher, et al. (10 additional authors not shown)

Abstract: We present new experimental measurements of resonance strengths in the astrophysical 23Al(p, γ)24Si reaction, constraining the pathway of nucleosynthesis beyond 22Mg in X-ray burster scenarios. Specifically, we have performed the first measurement of the (d, p) reaction using a radioactive beam of 23Ne to explore levels in 24Ne, the mirror analog of 24Si. Four strong single-particle states were observed and corresponding neutron spectroscopic factors were extracted with a precision of $\sim$20%. Using these spectroscopic factors, together with mirror state identifications, we have reduced uncertainties in the strength of the key $\ell = 0$ resonance at $E_r = 157$ keV, in the astrophysical 23Al(p, γ) reaction, by a factor of 4. Our results show that the 22Mg(p, γ)23Al(p, γ) pathway dominates over the competing 22Mg(α, p) reaction in all but the most energetic X-ray burster events (T > 0.85 GK), significantly affecting energy production and the preservation of hydrogen fuel.

Submitted 17 August, 2022; originally announced August 2022.

Comments: 5 pages, 3 figures

arXiv:2205.11467 [pdf, other]

A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations

Authors: Md Mosharaf Hossain, Luke Holman, Anusha Kakileti, Tiffany Iris Kao, Nathan Raul Brito, Aaron Abraham Mathews, Eduardo Blanco

Abstract: This paper explores a question-answer driven approach to reveal affirmative interpretations from verbal negations (i.e., when a negation cue grammatically modifies a verb). We create a new corpus consisting of 4,472 verbal negations and discover that 67.1% of them convey that an event actually occurred. Annotators generate and answer 7,277 questions for the 3,001 negations that convey an affirmative interpretation. We first cast the problem of revealing affirmative interpretations from negations as a natural language inference (NLI) classification task. Experimental results show that state-of-the-art transformers trained with existing NLI corpora are insufficient to reveal affirmative interpretations. We also observe, however, that fine-tuning brings small improvements. In addition to NLI classification, we also explore the more realistic task of generating affirmative interpretations directly from negations with the T5 transformer. We conclude that the generation task remains a challenge as T5 substantially underperforms humans.

Submitted 23 May, 2022; originally announced May 2022.

Comments: Accepted at the Findings of NAACL 2022

arXiv:2205.07838 [pdf, other]

Physics-informed machine learning techniques for edge plasma turbulence modelling in computational theory and experiment

Authors: Abhilash Mathews

Abstract: Edge plasma turbulence is critical to the performance of magnetic confinement fusion devices. Towards better understanding edge turbulence in both theory and experiment, a custom-built physics-informed deep learning framework constrained by partial differential equations is developed to accurately learn turbulent fields consistent with the two-fluid theory from partial observations of electron pressure. This calculation is not otherwise possible using conventional equilibrium models. With this technique, the first direct quantitative comparisons of turbulent fields between electrostatic two-fluid theory and electromagnetic gyrokinetic modelling are demonstrated with good overall agreement found in magnetized helical plasmas at low normalized pressure. To translate these computational techniques to experimental fusion plasmas, a novel method to translate brightness measurements of HeI line radiation into local plasma fluctuations is demonstrated via a newly created deep learning framework that integrates neutral transport physics and collisional radiative theory for the $3^3 D - 2^3 P$ transition in atomic helium. Using fast camera data on the Alcator C-Mod tokamak, this thesis presents the first 2-dimensional time-dependent experimental measurements of the turbulent electron density, electron temperature, and neutral density in a fusion plasma using a single spectral line. With this experimentally inferred data, initial estimates of the 2-dimensional turbulent electric field consistent with drift-reduced Braginskii theory under the framework of an axisymmetric fusion plasma with purely toroidal field are calculated. The inclusion of atomic helium effects on particle and energy sources are found to strengthen correlations between the electric field and electron pressure while broadening turbulent field amplitudes which impact ${\bf E \times B}$ flows and shearing rates.

Submitted 16 May, 2022; originally announced May 2022.

Comments: PhD thesis, 172 pages, 38 figures, 4 tables

arXiv:2205.04235 [pdf, other]

Measuring Cognitive Workload Using Multimodal Sensors

Authors: Niraj Hirachan, Anita Mathews, Julio Romero, Raul Fernandez Rojas

Abstract: This study aims to identify a set of indicators to estimate cognitive workload using a multimodal sensing approach and machine learning. A set of three cognitive tests were conducted to induce cognitive workload in twelve participants at two levels of task difficulty (Easy and Hard). Four sensors were used to measure the participants' physiological change, including electrocardiogram (ECG), electrodermal activity (EDA), respiration (RESP), and blood oxygen saturation (SpO2). To understand the perceived cognitive workload, NASA-TLX was used after each test and analysed using the Chi-Square test. Three well-known classifiers (LDA, SVM, and DT) were trained and tested independently using the physiological data. The statistical analysis showed that participants' perceived cognitive workload was significantly different (p<0.001) between the tests, which demonstrated the validity of the experimental conditions to induce different cognitive levels. Classification results showed that a fusion of ECG and EDA presented good discriminating power (acc=0.74) for cognitive workload detection. This study provides preliminary results in the identification of a possible set of indicators of cognitive workload. Future work needs to be carried out to validate the indicators using more realistic scenarios and with a larger population.

Submitted 5 May, 2022; originally announced May 2022.

arXiv:2204.11689 [pdf, other]

doi 10.1103/PhysRevLett.129.235002

Deep electric field predictions by drift-reduced Braginskii theory with plasma-neutral interactions based upon experimental images of boundary turbulence

Authors: Abhilash Mathews, Jerry Hughes, James Terry, Seung-Gyou Baek

Abstract: We present 2-dimensional turbulent electric field calculations via physics-informed deep learning consistent with (i) drift-reduced Braginskii theory under the framework of an axisymmetric fusion plasma with purely toroidal field and (ii) experimental estimates of the fluctuating electron density and temperature on open field lines obtained from analysis of gas puff imaging of a discharge on the Alcator C-Mod tokamak. The inclusion of effects from the locally puffed atomic helium on particle and energy sources within the reduced plasma turbulence model are found to strengthen correlations between the electric field and electron pressure. The neutrals are also directly associated with broadening the distribution of turbulent field amplitudes and increasing ${\bf E \times B}$ shearing rates. This demonstrates a novel approach in plasma experiments by solving for nonlinear dynamics consistent with partial differential equations and data without encoding explicit boundary or initial conditions.

Submitted 28 November, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: 6 pages, 3 figures, 2 tables

arXiv:2201.09988 [pdf, other]

doi 10.1063/5.0088216

Deep modelling of plasma and neutral fluctuations from gas puff turbulence imaging

Authors: A. Mathews, J. L. Terry, S. G. Baek, J. W. Hughes, A. Q. Kuang, B. LaBombard, M. A. Miller, D. Stotler, D. Reiter, W. Zholobenko, M. Goto

Abstract: The role of turbulence in setting boundary plasma conditions is presently a key uncertainty in projecting to fusion energy reactors. To robustly diagnose edge turbulence, we develop and demonstrate a technique to translate brightness measurements of HeI line radiation into local plasma fluctuations via a novel integrated deep learning framework that combines neutral transport physics and collisional radiative theory for the $3^3 D - 2^3 P$ transition in atomic helium. The tenets for experimental validity are reviewed, illustrating that this turbulence analysis for ionized gases is transferable to both magnetized and unmagnetized environments with arbitrary geometries. Based upon fast camera data on the Alcator C-Mod tokamak, we present the first 2-dimensional time-dependent experimental measurements of the turbulent electron density, electron temperature, and neutral density revealing shadowing effects in a fusion plasma using a single spectral line.

Submitted 19 May, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

Comments: 16 pages, 14 figures

arXiv:2112.03256 [pdf, ps, other]

Impact of Target Word and Context on End-to-End Metonymy Detection

Authors: Kevin Alex Mathews, Michael Strube

Abstract: Metonymy is a figure of speech in which an entity is referred to by another related entity. The task of metonymy detection aims to distinguish metonymic tokens from literal ones. Until now, metonymy detection methods attempt to disambiguate only a single noun phrase in a sentence, typically location names or organization names. In this paper, we disambiguate every word in a sentence by reformulating metonymy detection as a sequence labeling task. We also investigate the impact of target word and context on metonymy detection. We show that the target word is less useful for detecting metonymy in our dataset. On the other hand, the entity types that are associated with domain-specific words in their context are easier to solve. This shows that the context words are much more relevant for detecting metonymy.

Submitted 6 December, 2021; originally announced December 2021.

arXiv:2111.13802 [pdf, other]

Factorized Fourier Neural Operators

Authors: Alasdair Tran, Alexander Mathews, Lexing Xie, Cheng Soon Ong

Abstract: We propose the Factorized Fourier Neural Operator (F-FNO), a learning-based approach for simulating partial differential equations (PDEs). Starting from a recently proposed Fourier representation of flow fields, the F-FNO bridges the performance gap between pure machine learning approaches and the best numerical or hybrid solvers. This is achieved with new representations (separable spectral layers and improved residual connections) and a combination of training strategies such as the Markov assumption, Gaussian noise, and cosine learning rate decay. On several challenging benchmark PDEs on regular grids, structured meshes, and point clouds, the F-FNO can scale to deeper networks and outperform both the FNO and the geo-FNO, reducing the error by 83% on the Navier-Stokes problem, 31% on the elasticity problem, 57% on the airfoil flow problem, and 60% on the plastic forging problem. Compared to the state-of-the-art pseudo-spectral method, the F-FNO can take a step size that is an order of magnitude larger in time and achieve an order of magnitude speedup to produce the same solution quality.

Submitted 2 March, 2023; v1 submitted 26 November, 2021; originally announced November 2021.

Comments: Published in The Eleventh International Conference on Learning Representations (2023). Code is available at https://github.com/alasdairtran/fourierflow

arXiv:2107.09744 [pdf, other]

doi 10.1063/5.0066064

Turbulent field fluctuations in gyrokinetic and fluid plasmas

Authors: Abhilash Mathews, Noah Mandell, Manaure Francisquez, Jerry Hughes, Ammar Hakim

Abstract: A key uncertainty in the design and development of magnetic confinement fusion energy reactors is predicting edge plasma turbulence. An essential step in overcoming this uncertainty is the validation in accuracy of reduced turbulent transport models. Drift-reduced Braginskii two-fluid theory is one such set of reduced equations that has for decades simulated boundary plasmas in experiment, but significant questions exist regarding its predictive ability. To this end, using a novel physics-informed deep learning framework, we demonstrate the first ever direct quantitative comparisons of turbulent field fluctuations between electrostatic two-fluid theory and electromagnetic gyrokinetic modelling with good overall agreement found in magnetized helical plasmas at low normalized pressure. This framework is readily adaptable to experimental and astrophysical environments, and presents a new technique for the numerical validation and discovery of reduced global plasma turbulence models.

Submitted 6 October, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

Comments: 13 pages, 5 figures

arXiv:2107.04140 [pdf, other]

First-Generation Inference Accelerator Deployment at Facebook

Authors: Michael Anderson, Benny Chen, Stephen Chen, Summer Deng, Jordan Fix, Michael Gschwind, Aravind Kalaiah, Changkyu Kim, Jaewon Lee, Jason Liang, Haixin Liu, Yinghai Lu, Jack Montgomery, Arun Moorthy, Satish Nadathur, Sam Naghshineh, Avinash Nayak, Jongsoo Park, Chris Petersen, Martin Schatz, Narayanan Sundaram, Bangsheng Tang, Peter Tang, Amy Yang, Jiecao Yu, et al. (90 additional authors not shown)

Abstract: In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the inference accelerator platform ecosystem we developed and deployed at Facebook: both hardware, through Open Compute Platform (OCP), and software framework and tooling, through PyTorch/Caffe2/Glow. A characteristic of this ecosystem from the start is its openness to enable a variety of AI accelerators from different vendors. This platform, with six low-power accelerator cards alongside a single-socket host CPU, allows us to serve models of high complexity that cannot be easily or efficiently run on CPUs. We describe various performance optimizations, at both platform and accelerator level, which enable this platform to serve production traffic at Facebook. We also share deployment challenges, lessons learned during performance optimization, as well as provide guidance for future inference hardware co-design.

Submitted 4 August, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

arXiv:2104.05158 [pdf, other]

Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Authors: Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, et al. (28 additional authors not shown)

Abstract: Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pair it with the new evolution of Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 Trillion parameters and show that we can attain 40X speedup in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with dedicated scale-out network, provisioned with high bandwidth, optimal topology and efficient transport; (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism; (iii) developing sharding algorithms capable of hierarchical partitioning of the embedding tables along row, column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining flexibility to support optimizers with fully deterministic updates; (v) leveraging reduced precision communications, multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we develop and briefly comment on distributed data ingestion and other supporting services that are required for the robust and efficient end-to-end training in production environments.

Submitted 26 February, 2023; v1 submitted 11 April, 2021; originally announced April 2021.

arXiv:2103.01305 [pdf, other]

doi 10.1109/TPS.2021.3123046

Quantifying experimental edge plasma evolution via multidimensional adaptive Gaussian process regression

Authors: Abhilash Mathews, Jerry Hughes

Abstract: The edge density and temperature of tokamak plasmas are strongly correlated with energy and particle confinement and their quantification is fundamental to understanding edge dynamics. These quantities exhibit behaviours ranging from sharp plasma gradients and fast transient phenomena (e.g. transitions between low and high confinement regimes) to nominal stationary phases. Analysis of experimental edge measurements therefore requires robust fitting techniques to capture potentially stiff spatiotemporal evolution. Additionally, fusion plasma diagnostics inevitably involve measurement errors and data analysis requires a statistical framework to accurately quantify uncertainties. This paper outlines a generalized multidimensional adaptive Gaussian process routine capable of automatically handling noisy data and spatiotemporal correlations. We focus on the edge-pedestal region in order to underline advancements in quantifying time-dependent plasma profiles including transport barrier formation on the Alcator C-Mod tokamak.

Submitted 1 March, 2021; originally announced March 2021.

Comments: 7 pages, 7 figures

arXiv:2102.07289 [pdf, other]

doi 10.1145/3442381.3449945

Radflow: A Recurrent, Aggregated, and Decomposable Model for Networks of Time Series

Authors: Alasdair Tran, Alexander Mathews, Cheng Soon Ong, Lexing Xie

Abstract: We propose a new model for networks of time series that influence each other. Graph structures among time series are found in diverse domains, such as web traffic influenced by hyperlinks, product sales influenced by recommendation, or urban transport volume influenced by road networks and weather. There has been recent progress in graph modeling and in time series forecasting, respectively, but an expressive and scalable approach for a network of series does not yet exist. We introduce Radflow, a novel model that embodies three key ideas: a recurrent neural network to obtain node embeddings that depend on time, the aggregation of the flow of influence from neighboring nodes with multi-head attention, and the multi-layer decomposition of time series. Radflow naturally takes into account dynamic networks where nodes and edges change over time, and it can be used for prediction and data imputation tasks. On real-world datasets ranging from a few hundred to a few hundred thousand nodes, we observe that Radflow variants are the best performing model across a wide range of settings. The recurrent component in Radflow also outperforms N-BEATS, the state-of-the-art time series model. We show that Radflow can learn different trends and seasonal patterns, that it is robust to missing nodes and edges, and that correlated temporal patterns among network neighbors reflect influence strength. We curate WikiTraffic, the largest dynamic network of time series with 366K nodes and 22M time-dependent links spanning five years. This dataset provides an open benchmark for developing models in this area, with applications that include optimizing resources for the web. More broadly, Radflow has the potential to improve forecasts in correlated time series networks such as the stock market, and impute missing measurements in geographically dispersed networks of natural phenomena.

Submitted 14 February, 2021; originally announced February 2021.

Comments: Published in The Web Conference 2021. Code is available at https://github.com/alasdairtran/radflow

Journal ref: Proceedings of The Web Conference 2021 (WWW '21)

arXiv:2102.01974 [pdf, other]

doi 10.1145/3437963.3441703

AttentionFlow: Visualising Influence in Networks of Time Series

Authors: Minjeong Shin, Alasdair Tran, Siqi Wu, Alexander Mathews, Rong Wang, Georgiana Lyall, Lexing Xie

Abstract: The collective attention on online items such as web pages, search terms, and videos reflects trends that are of social, cultural, and economic interest. Moreover, attention trends of different items exhibit mutual influence via mechanisms such as hyperlinks or recommendations. Many visualisation tools exist for time series, network evolution, or network influence; however, few systems connect all three. In this work, we present AttentionFlow, a new system to visualise networks of time series and the dynamic influence they have on one another. Centred around an ego node, our system simultaneously presents the time series on each node using two visual encodings: a tree ring for an overview and a line chart for details. AttentionFlow supports interactions such as overlaying time series of influence and filtering neighbours by time or flux. We demonstrate AttentionFlow using two real-world datasets, VevoMusic and WikiTraffic. We show that attention spikes in songs can be explained by external events such as major awards, or changes in the network such as the release of a new song. Separate case studies also demonstrate how an artist's influence changes over their career, and that correlated Wikipedia traffic is driven by cultural interests. More broadly, AttentionFlow can be generalised to visualise networks of time series on physical infrastructures such as road networks, or natural phenomena such as weather and geological measurements.

Submitted 3 February, 2021; originally announced February 2021.

Comments: Published in WSDM 2021. The demo is available at https://attentionflow.ml and code is available at https://github.com/alasdairtran/attentionflow

Journal ref: The Proceedings of the Fourteenth ACM International Conference on Web Search and Data Mining (WSDM), 2021

arXiv:2010.14041 [pdf, other]

doi 10.1093/mnras/stab2070

Evidence of a shared spectro-temporal law between sources of repeating fast radio bursts

Authors: Mohammed A. Chamma, Fereshteh Rajabi, Christopher M. Wyenberg, Abhilash Mathews, Martin Houde

Abstract: We study the spectro-temporal characteristics of two repeating fast radio bursts (FRBs), namely, FRB 20180916B and FRB 20180814A, and combine the results with those from our earlier analysis on FRB 20121102A. The relationship between the frequency drift rate, or slope, of individual sub-bursts and their temporal duration is investigated. We consider a broad sample of possible dispersion measure (DM) values for each source to understand the range of valid sub-burst slope and duration measurements for all bursts and to constrain our results. We find good agreement with an inverse scaling law between the two parameters previously predicted using a simple dynamical relativistic model. The remarkably similar behaviour observed in all sources provides strong evidence that a single and common underlying physical phenomenon is responsible for the emission of signals from these three FRBs, despite their associations with different types of host galaxies at various redshifts. It also opens up the possibility that this sub-burst slope law may be a universal property among repeating FRBs, or indicates a distinct subclass among them.

Submitted 15 July, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

Comments: 16 pages, 10 figures, accepted MNRAS

arXiv:2010.02568 [pdf, other]

SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy

Authors: Umanga Bista, Alexander Patrick Mathews, Aditya Krishna Menon, Lexing Xie

Abstract: Most work on multi-document summarization has focused on generic summarization of information present in each individual document set. However, the under-explored setting of update summarization, where the goal is to identify the new information present in each set, is of equal practical interest (e.g., presenting readers with updates on an evolving news topic). In this work, we present SupMMD, a novel technique for generic and update summarization based on the maximum mean discrepancy from kernel two-sample testing. SupMMD combines both supervised learning for salience and unsupervised learning for coverage and diversity. Further, we adapt multiple kernel learning to make use of similarity across multiple information sources (e.g., text features and knowledge based concepts). We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.

Submitted 6 October, 2020; originally announced October 2020.

Comments: 15 pages

Journal ref: EMNLP 2020

arXiv:2009.05005 [pdf, other]

doi 10.1103/PhysRevE.104.025205

Uncovering turbulent plasma dynamics via deep learning from partial observations

Authors: Abhilash Mathews, Manaure Francisquez, Jerry Hughes, David Hatch, Ben Zhu, Barrett Rogers

Abstract: One of the most intensely studied aspects of magnetic confinement fusion is edge plasma turbulence which is critical to reactor performance and operation. Drift-reduced Braginskii two-fluid theory has for decades been widely applied to model boundary plasmas with varying success. Towards better understanding edge turbulence in both theory and experiment, we demonstrate that physics-informed neural networks constrained by partial differential equations can accurately learn turbulent fields consistent with the two-fluid theory from just partial observations of a synthetic plasma's electron density and temperature in contrast with conventional equilibrium models. These techniques present a novel paradigm for the advanced design of plasma diagnostics and validation of magnetized plasma turbulence theories in challenging thermonuclear environments.

Submitted 5 April, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 11 pages, 8 figures

Journal ref: Phys. Rev. E 104, 025205 (2021)

arXiv:2008.02395 [pdf, other]

doi 10.1093/mnras/staa2723

A simple relationship for the spectro-temporal structure of bursts from FRB 121102

Authors: Fereshteh Rajabi, Mohammed A. Chamma, Christopher M. Wyenberg, Abhilash Mathews, Martin Houde

Abstract: We consider a simple dynamical and relativistic model to explain the spectro-temporal structure often displayed by repeating fast radio bursts (FRBs). We show how this model can account for the downward frequency drift in a sequence of sub-bursts of increasing arrival time (the "sad trombone" effect) and their tendency for exhibiting a reduced pulse width with increasing frequency of observation. Most importantly, this model also predicts a systematic inverse relationship between the (steeper) slope of the frequency drift observed within a single sub-burst and its temporal duration. Using already published data for FRB 121102 we find and verify the relationship predicted by this model. We therefore argue that the overall behaviour observed for this object as a function of frequency is consistent with an underlying narrow-band emission process, where the wide-band nature of the measured FRB spectrum is due to relativistic motions. Although this scenario and the simple dynamics we consider could be applied to other theories, they are well-suited for a model based upon Dicke's superradiance as the physical process responsible for FRB radiation in this and similar sources.

Submitted 4 September, 2020; v1 submitted 5 August, 2020; originally announced August 2020.

Comments: 8 pages, 6 figures, accepted for publication MNRAS

arXiv:2007.14082 [pdf, other]

UNIPoint: Universally Approximating Point Processes Intensities

Authors: Alexander Soen, Alexander Mathews, Daniel Grixti-Cheng, Lexing Xie

Abstract: Point processes are a useful mathematical tool for describing events over time, and so there are many recent approaches for representing and learning them. One notable open question is how to precisely describe the flexibility of point process models and whether there exists a general model that can represent all point processes. Our work bridges this gap. Focusing on the widely used event intensity function representation of point processes, we provide a proof that a class of learnable functions can universally approximate any valid intensity function. The proof connects the well known Stone-Weierstrass Theorem for function approximation, the uniform density of non-negative continuous functions using a transfer function, the formulation of the parameters of a piece-wise continuous function as a dynamic system, and a recurrent neural network implementation for capturing the dynamics. Using these insights, we design and implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions upon each event. Evaluations on synthetic and real world datasets show that this simpler representation performs better than Hawkes process variants and more complex neural network-based approaches. We expect this result will provide a practical basis for selecting and tuning models, as well as furthering theoretical work on representational complexity and learnability.

Submitted 2 March, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

arXiv:2004.08070 [pdf, other]

Transform and Tell: Entity-Aware News Image Captioning

Authors: Alasdair Tran, Alexander Mathews, Lexing Xie

Abstract: We propose an end-to-end model which generates captions for images embedded in news articles. News images present two key challenges: they rely on real-world knowledge, especially about named entities; and they typically have linguistically rich captions that include uncommon words. We address the first challenge by associating words in the caption with faces and objects in the image, via a multi-modal, multi-head attention mechanism. We tackle the second challenge with a state-of-the-art transformer language model that uses byte-pair-encoding to generate captions as a sequence of word parts. On the GoodNews dataset, our model outperforms the previous state of the art by a factor of four in CIDEr score (13 to 54). This performance gain comes from a unique combination of language models, word representation, image embeddings, face embeddings, object embeddings, and improvements in neural network design. We also introduce the NYTimes800k dataset which is 70% larger than GoodNews, has higher article quality, and includes the locations of images within articles as an additional contextual cue.

Submitted 12 June, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

Comments: Published in CVPR 2020. Code is available at https://github.com/alasdairtran/transform-and-tell and demo is available at https://transform-and-tell.ml

ACM Class: I.4.0; I.2.7

Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 13035-13045

arXiv:1912.05152 [pdf, other]

Differences in binarity between gluon and graviton scattering amplitudes

Authors: Abhilash Mathews

Abstract: Both 3- and 4-point scattering amplitudes for spin-1 massless particles (gluons) and spin-2 massless particles (gravitons) are reviewed through self-contained step-by-step derivation. Gluon and graviton interactions are computed from on-shell diagrams by starting from complex momenta and introducing spinor-helicity formalism. By Fourier transforming spinor variables in momentum space, binarity is revealed in pure Yang-Mills scattering amplitudes while absent to a certain extent for gravity indicating fundamental differences in the probability spaces spanned by the two interactions.

Submitted 11 December, 2019; originally announced December 2019.

Comments: 5 pages

arXiv:1812.02171 [pdf, other]

doi 10.1609/aaai.v33i01.330120

Comparative Document Summarisation via Classification

Authors: Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie

Abstract: This paper considers extractive summarisation in a comparative setting: given two or more document groups (e.g., separated by publication time), the goal is to select a small number of documents that are representative of each group, and also maximally distinguishable from other groups. We formulate a set of new objective functions for this problem that connect recent literature on document summarisation, interpretable machine learning, and data subset selection. In particular, by casting the problem as a binary classification amongst different groups, we derive objectives based on the notion of maximum mean discrepancy, as well as a simple yet effective gradient-based optimisation strategy. Our new formulation allows scalable evaluations of comparative summarisation as a classification task, both automatically and via crowd-sourcing. To this end, we evaluate comparative summarisation methods on a newly curated collection of controversial news topics over 13 months. We observe that gradient-based optimisation outperforms discrete and baseline approaches in 14 out of 24 different automatic evaluation settings. In crowd-sourced evaluations, summaries from gradient optimisation elicit 7% more accurate classification from human workers than discrete optimisation. Our result contrasts with recent literature on submodular data subset selection that favours discrete optimisation. We posit that our formulation of comparative summarisation will prove useful in a diverse range of use cases such as comparing content sources, authors, related topics, or distinct view points.

Submitted 2 January, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

Comments: Accepted for AAAI 2019

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019

arXiv:1810.04364 [pdf, other]

doi 10.1093/mnras/sty3046

Triggered superradiance and fast radio bursts

Authors: Martin Houde, Fereshteh Rajabi, B. M. Gaensler, A. Mathews, Victor Tranchant

Abstract: In this paper we develop a model for fast radio bursts (FRBs) based on triggered superradiance (SR) and apply it to previously published data of FRB 110220 and FRB 121102. We show how a young pulsar located at ~100 pc or more from an SR/FRB system could initiate the onset of a powerful burst of radiation detectable over cosmological distances. Our models using the OH$^2Π_{3/2}$ $\left(J=3/2\right)$ 1612 MHz and $^2Π_{3/2}$ $\left(J=5/2\right)$ 6030 MHz spectral lines match the light curves well and suggest the entanglement of more than $10^{30}$ initially inverted molecules over lengths of approximately 300 au for a single SR sample. SR also accounts for the observed temporal narrowing of FRB pulses with increasing frequency for FRB 121102, and predicts a scaling of the FRB spectral bandwidth with the frequency of observation, which we found to be consistent with the existing data.

Submitted 6 November, 2018; v1 submitted 10 October, 2018; originally announced October 2018.

Comments: 9 pages, 5 figures, accepted MNRAS

arXiv:1809.07183 [pdf, other]

doi 10.1016/j.nima.2018.11.115

The GRIFFIN Facility for Decay-Spectroscopy Studies at TRIUMF-ISAC

Authors: A. B. Garnsworthy, C. E. Svensson, M. Bowry, R. Dunlop, A. D. MacLean, B. Olaizola, J. K. Smith, F. A. Ali, C. Andreoiu, J. E. Ash, W. H. Ashfield, G. C. Ball, T. Ballast, C. Bartlett, Z. Beadle, P. C. Bender, N. Bernier, S. S. Bhattacharjee, H. Bidaman, V. Bildstein, D. Bishop, P. Boubel, R. Braid, D. Brennan, T. Bruhn, et al. (79 additional authors not shown)

Abstract: Gamma-Ray Infrastructure For Fundamental Investigations of Nuclei, GRIFFIN, is a new high-efficiency $γ$-ray spectrometer designed for use in decay spectroscopy experiments with low-energy radioactive ion beams provided by TRIUMF's Isotope Separator and Accelerator (ISAC-I) facility. GRIFFIN is composed of sixteen Compton-suppressed large-volume clover-type high-purity germanium (HPGe) $γ$-ray detectors combined with a suite of ancillary detection systems and coupled to a custom digital data acquisition system. The infrastructure and detectors of the spectrometer as well as the performance characteristics and the analysis techniques applied to the experimental data are described.

Submitted 6 December, 2018; v1 submitted 17 September, 2018; originally announced September 2018.

arXiv:1805.07030 [pdf, other]

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

Authors: Alexander Mathews, Lexing Xie, Xuming He

Abstract: Linguistic style is an essential part of written communication, with the power to affect both clarity and attractiveness. With recent advances in vision and language, we can start to tackle the problem of generating image captions that are both visually grounded and appropriately styled. Existing approaches either require styled training captions aligned to images or generate captions with low relevance. We develop a model that learns to generate visually relevant styled captions from a large corpus of styled text without aligned images. The core idea of this model, called SemStyle, is to separate semantics and style. One key component is a novel and concise semantic term representation generated using natural language processing techniques and frame semantics. In addition, we develop a unified language model that decodes sentences with diverse word choices and syntax for different styles. Evaluations, both automatic and manual, show captions from SemStyle preserve image semantics, are descriptive, and are style shifted. More broadly, this work provides possibilities to learn richer image descriptions from the plethora of linguistic data available on the web.

Submitted 17 May, 2018; originally announced May 2018.

Comments: Accepted at CVPR 2018

arXiv:1805.05557 [pdf, other]

Simplifying Sentences with Sequence to Sequence Models

Authors: Alexander Mathews, Lexing Xie, Xuming He

Abstract: We simplify sentences with an attentive neural network sequence to sequence model, dubbed S4. The model includes a novel word-copy mechanism and loss function to exploit linguistic similarities between the original and simplified sentences. It also jointly uses pre-trained and fine-tuned word embeddings to capture the semantics of complex sentences and to mitigate the effects of limited data. When trained and evaluated on pairs of sentences from thousands of news articles, we observe an 8.8 point improvement in BLEU score over a sequence to sequence baseline; however, learning word substitutions remains difficult. Such sequence to sequence models are promising for other text generation tasks such as style transfer.

Submitted 15 May, 2018; originally announced May 2018.

arXiv:1801.03585 [pdf, other]

doi 10.1016/j.astropartphys.2018.06.001

Indications of an unexpected signal associated with the GW170817 binary neutron star inspiral

Authors: E. Fischbach, V. E. Barnes, N. Cinko, J. Heim, H. B. Kaplan, D. E. Krause, J. R. Leeman, S. A. Mathews, M. J. Mueterthies, D. Neff, M. Pattermann

Abstract: We report experimental evidence at the 2.5$σ$ level for an unexpected signal associated with the GW170817 binary neutron star inspiral. This evidence derives from a laboratory experiment simultaneously measuring the $β$-decay rates of Si-32 and Cl-36 in a common detector. Whereas the Si-32 and Cl-36 decay rates show no statistical correlation before or after the inspiral, they are highly correlated ($\sim 95\%$) in the 5 hour time interval immediately following the inspiral. If we interpret this correlation as arising from the influence of particles emitted during the inspiral, then we can estimate the mass $m_{x}$ of these particles from the time delay between the gravity-wave signal and a peak in the $β$-decay data. We find for particles of energy 10 MeV, $m_{x}$ $\lesssim$ 16 eV which includes the neutrino mass region $m_ν$ $\lesssim$ 2 eV. The latter is based on existing limits for the masses $m_{i}$ of the three known neutrino flavors. Additionally, we find that the correlation is even stronger if we include data in the 80 minute period before the arrival of the gravity wave signal. Given the large number of radionuclides whose decays are being monitored at any given time, we conjecture that other groups may also be in a position to search for statistically suggestive fluctuations of radionuclide decay rates associated with the GW170817 inspiral, and possibly with other future inspirals.

Submitted 21 June, 2018; v1 submitted 10 January, 2018; originally announced January 2018.

Comments: 11 pages, 3 figures (minor edits and reformatting to bring it closer to the published version)

Journal ref: Astroparticle Physics 103 (2018) 1-6

arXiv:1710.00401 [pdf, ps, other]

doi 10.1093/mnras/stx3205

Explaining fast radio bursts through Dicke's superradiance

Authors: Martin Houde, Abhilash Mathews, Fereshteh Rajabi

Abstract: Fast Radio Bursts (FRBs), characterized by strong bursts of radiation intensity at radio wavelengths lasting on the order of a millisecond, have yet to be firmly associated with a family, or families, of astronomical sources. It follows that despite the large number of proposed models no well-defined physical process has been identified to explain this phenomenon. In this paper, we demonstrate how Dicke's superradiance, for which evidence has recently been found in the interstellar medium, can account for the characteristics associated to FRBs. Our analysis and modelling of previously detected FRBs suggest they could originate from regions in many ways similar to those known to harbor masers or megamasers, and result from the coherent radiation emanating from populations of molecules associated with large-scale entangled quantum mechanical states. We estimate this entanglement to involve as many as ~10^(30) to ~10^(32) molecules over distances spanning 100 to 1000 AU.

Submitted 8 December, 2017; v1 submitted 1 October, 2017; originally announced October 2017.

Comments: 9 pages, 6 figures, accepted for publication in the MNRAS

arXiv:1709.08448 [pdf, ps, other]

Extracting Ontological Knowledge from Textual Descriptions

Authors: Kevin Alex Mathews, P Sreenivasa Kumar

Abstract: Authoring of OWL-DL ontologies is intellectually challenging and to make this process simpler, many systems accept natural language text as input. A text-based ontology authoring approach can be successful only when it is combined with an effective method for extracting ontological axioms from text. Extracting axioms from unrestricted English input is a substantially challenging task due to the richness of the language. Controlled natural languages (CNLs) have been proposed in this context and these tend to be highly restrictive. In this paper, we propose a new CNL called TEDEI (TExtual DEscription Identifier) whose grammar is inspired by the different ways OWL-DL constructs are expressed in English. We built a system that transforms TEDEI sentences into corresponding OWL-DL axioms. Now, ambiguity due to different possible lexicalizations of sentences and semantic ambiguity present in sentences are challenges in this context. We find that the best way to handle these challenges is to construct axioms corresponding to alternative formalizations of the sentence so that the end-user can make an appropriate choice. The output is compared against human-authored axioms and in a substantial number of cases, the human-authored axiom is indeed one of the alternatives given by the system. The proposed system substantially enhances the types of sentence structures that can be used for ontology authoring.

Submitted 28 September, 2017; v1 submitted 25 September, 2017; originally announced September 2017.

Comments: 8 pages

arXiv:1510.01431 [pdf, other]

SentiCap: Generating Image Descriptions with Sentiments

Authors: Alexander Mathews, Lexing Xie, Xuming He

Abstract: The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from the current systems. One such style is descriptions with emotions, which is commonplace in everyday communication, and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.

Submitted 13 December, 2015; v1 submitted 6 October, 2015; originally announced October 2015.

ACM Class: I.2.10; I.2.7; I.2.6

arXiv:1501.00242 [pdf, ps, other]

Combined 3D PET and Optical Projection Tomography Techniques for Plant Root Phenotyping

Authors: Qiang Wang, Sergey Komarov, Aswin J. Mathews, Ke Li, Christopher Topp, Joseph A. O'Sullivan, Yuan-Chuan Tai

Abstract: New imaging techniques are in great demand for investigating underground plant roots systems which play an important role in crop production. Compared with other non-destructive imaging modalities, PET can image plant roots in natural soil and produce dynamic 3D functional images which reveal the temporal dynamics of plant-environment interactions. In this study, we combined PET with optical projection tomography (OPT) to evaluate its potential for plant root phenotyping. We used a dedicated high resolution plant PET imager that has 14 cm transaxial and 10 cm axial fields of view, and multi-bed imaging capability. The image resolution is around 1.25 mm using ML-EM reconstruction algorithm. B73 inbred maize seeds were germinated and then grown in a sealed jar with transparent gel-based media. PET scanning started on the day when the first green leaf appeared, and was carried out once a day for 5 days. Each morning, around 10 mCi of 11CO2 was administered into a custom built plant labeling chamber. After 10 minutes, residual activity was flushed out with fresh air before a 2-h PET scan started. For the OPT imaging, the jar was placed inside an acrylic cubic container filled with water, illuminated with a uniform surface light source, and imaged by a DSLR camera from 72 angles to acquire optical images for OPT reconstruction. The same plant was imaged 3 times a day by the OPT system. Plant roots growth is measured from the optical images. Co-registered PET and optical images indicate that most of the hot spots that appeared in later time points of the PET images correspond to the most actively growing root tips. The strong linear correlation between 11C allocation at root tips measured by PET and eventual root growth measured by OPT suggests that we can use PET as a phenotyping tool to measure how a plant makes subterranean carbon allocation decisions in different environmental scenarios.

Submitted 31 December, 2014; originally announced January 2015.

Comments: 5 pages, 10 figures

arXiv:1401.3374 [pdf, ps, other]

doi 10.1088/0031-9155/59/19/5613

A dedicated high resolution PET imager for plant sciences

Authors: Qiang Wang, Aswin J. Mathews, Ke Li, Jie Wen, Sergey Komarov, Joseph A. O'Sullivan, Yuan-Chuan Tai

Abstract: PET provides in vivo molecular and functional imaging capability that is crucial to studying the interaction of plant with changing environment at the whole-plant level. We have developed a dedicated plant PET imager that features high spatial resolution, housed in a fully controlled environment provided by a plant growth chamber (PGC). The system currently contains two types of detector modules: 84 microPET R4 block detectors with 2.2 mm crystals to provide a large detecting area; and 32 Inveon block detectors with 1.5 mm crystals to provide higher spatial resolution. Outputs of the four microPET block detectors in a modular housing are concatenated by a custom printed circuit board to match the output characteristics of an Inveon detector. All the detectors are read out by QuickSilver electronics. The detector modules are configured to full rings with a 15 cm diameter trans-axial field of view (FOV) for dynamic tomographic imaging of small plants. Potentially, the Inveon detectors can be reconfigured to quarter-rings to get a 25 cm FOV using step-and-shoot motion. The imager contains 2 linear stages to position detectors at different heights for multi-bed scanning, and 2 rotation stages to collect coincidence events from all angles. The PET system has been built and integrated into the PGC. The system has a typical energy resolution of 15% for Inveon blocks and 24% for R4 blocks; timing resolution of 1.8 ns; and sensitivity of 1.3%, 1.4%, 3.0% measured at center of FOV, 5 cm off to R4 half-ring and 5 cm off to Inveon half-ring, respectively (with a 350-650 keV energy and 3.1 ns timing window). System spatial resolution is similar to that of commercial microPET systems, with 1.25 mm rod sources in the micro-Derenzo phantom resolved using ML-EM algorithm. Preliminary imaging experiments using different plants labeled with 11C-CO2 produced high-quality dynamic PET images.

Submitted 14 January, 2014; originally announced January 2014.

Comments: 19 pages

Journal ref: 2014 Phys. Med. Biol. 59 5613

arXiv:astro-ph/0501018 [pdf, ps, other]

Packet Loss in High Data Rate Internet Data Transfer for eVLBI

Authors: R. Spencer, R. Hughes-Jones, A. Mathews, S. O'Toole

Abstract: VLBI is gradually moving to the point where Gbps data rates are becoming routine. A number of experiments have shown that the internet can be used at data rates of several hundred Mbps on production networks. However, use of the network is accompanied by packet loss. The paper discusses the statistics of packet loss as found by recent tests and investigates the expected effect of packet loss on correlator performance and signal-to-noise ratio in eVLBI observations. The relative merits of UDP versus TCP are also discussed.

Submitted 3 January, 2005; originally announced January 2005.

Comments: 4 pages, 6 figures. Proceedings of the 7th European VLBI Network Symposium held in Toledo, Spain on October 12-15, 2004. Editors: R. Bachiller, F. Colomer, J.-F. Desmurs, P. de Vicente (Observatorio Astronomico Nacional), p. 261-264. Needs evn2004.cls

Showing 1–34 of 34 results for author: Mathews, A