Showing 1–20 of 20 results for author: Mathews, A

Searching in archive cs.
  1. arXiv:2304.11277  [pdf, other]

    cs.DC cs.AI cs.LG cs.PF

    PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

    Authors: Yanli Zhao, Andrew Gu, Rohan Varma, Liang Luo, Chien-Chin Huang, Min Xu, Less Wright, Hamid Shojanazeri, Myle Ott, Sam Shleifer, Alban Desmaison, Can Balioglu, Pritam Damania, Bernard Nguyen, Geeta Chauhan, Yuchen Hao, Ajit Mathews, Shen Li

    Abstract: It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development and exploration of large models, such abilities remain confined to a small group of advanced users and industry leaders, resulting in an implicit tech…

    Submitted 12 September, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

  2. arXiv:2205.11467  [pdf, other]

    cs.CL

    A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations

    Authors: Md Mosharaf Hossain, Luke Holman, Anusha Kakileti, Tiffany Iris Kao, Nathan Raul Brito, Aaron Abraham Mathews, Eduardo Blanco

    Abstract: This paper explores a question-answer driven approach to reveal affirmative interpretations from verbal negations (i.e., when a negation cue grammatically modifies a verb). We create a new corpus consisting of 4,472 verbal negations and discover that 67.1% of them convey that an event actually occurred. Annotators generate and answer 7,277 questions for the 3,001 negations that convey an affirmati…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: Accepted at the Findings of NAACL 2022

  3. arXiv:2205.07838  [pdf, other]

    physics.plasm-ph cs.LG math.NA stat.ML

    Physics-informed machine learning techniques for edge plasma turbulence modelling in computational theory and experiment

    Authors: Abhilash Mathews

    Abstract: Edge plasma turbulence is critical to the performance of magnetic confinement fusion devices. Towards better understanding edge turbulence in both theory and experiment, a custom-built physics-informed deep learning framework constrained by partial differential equations is developed to accurately learn turbulent fields consistent with the two-fluid theory from partial observations of electron pre…

    Submitted 16 May, 2022; originally announced May 2022.

    Comments: PhD thesis, 172 pages, 38 figures, 4 tables

  4. arXiv:2205.04235  [pdf, other]

    q-bio.NC cs.AI cs.HC

    Measuring Cognitive Workload Using Multimodal Sensors

    Authors: Niraj Hirachan, Anita Mathews, Julio Romero, Raul Fernandez Rojas

    Abstract: This study aims to identify a set of indicators to estimate cognitive workload using a multimodal sensing approach and machine learning. A set of three cognitive tests were conducted to induce cognitive workload in twelve participants at two levels of task difficulty (Easy and Hard). Four sensors were used to measure the participants' physiological change, including, Electrocardiogram (ECG), elect…

    Submitted 5 May, 2022; originally announced May 2022.

  5. arXiv:2204.11689  [pdf, other]

    physics.plasm-ph cs.LG stat.ML

    Deep electric field predictions by drift-reduced Braginskii theory with plasma-neutral interactions based upon experimental images of boundary turbulence

    Authors: Abhilash Mathews, Jerry Hughes, James Terry, Seung-Gyou Baek

    Abstract: We present 2-dimensional turbulent electric field calculations via physics-informed deep learning consistent with (i) drift-reduced Braginskii theory under the framework of an axisymmetric fusion plasma with purely toroidal field and (ii) experimental estimates of the fluctuating electron density and temperature on open field lines obtained from analysis of gas puff imaging of a discharge on the A…

    Submitted 28 November, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: 6 pages, 3 figures, 2 tables

  6. arXiv:2112.03256  [pdf, ps, other]

    cs.CL

    Impact of Target Word and Context on End-to-End Metonymy Detection

    Authors: Kevin Alex Mathews, Michael Strube

    Abstract: Metonymy is a figure of speech in which an entity is referred to by another related entity. The task of metonymy detection aims to distinguish metonymic tokens from literal ones. Until now, metonymy detection methods attempt to disambiguate only a single noun phrase in a sentence, typically location names or organization names. In this paper, we disambiguate every word in a sentence by reformulati…

    Submitted 6 December, 2021; originally announced December 2021.

  7. arXiv:2111.13802  [pdf, other]

    cs.LG cs.CE

    Factorized Fourier Neural Operators

    Authors: Alasdair Tran, Alexander Mathews, Lexing Xie, Cheng Soon Ong

    Abstract: We propose the Factorized Fourier Neural Operator (F-FNO), a learning-based approach for simulating partial differential equations (PDEs). Starting from a recently proposed Fourier representation of flow fields, the F-FNO bridges the performance gap between pure machine learning approaches to that of the best numerical or hybrid solvers. This is achieved with new representations - separable spectr…

    Submitted 2 March, 2023; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: Published in The Eleventh International Conference on Learning Representations (2023). Code is available at https://github.com/alasdairtran/fourierflow

  8. arXiv:2107.09744  [pdf, other]

    physics.plasm-ph cs.LG physics.comp-ph stat.ML

    Turbulent field fluctuations in gyrokinetic and fluid plasmas

    Authors: Abhilash Mathews, Noah Mandell, Manaure Francisquez, Jerry Hughes, Ammar Hakim

    Abstract: A key uncertainty in the design and development of magnetic confinement fusion energy reactors is predicting edge plasma turbulence. An essential step in overcoming this uncertainty is the validation in accuracy of reduced turbulent transport models. Drift-reduced Braginskii two-fluid theory is one such set of reduced equations that has for decades simulated boundary plasmas in experiment, but sig…

    Submitted 6 October, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

    Comments: 13 pages, 5 figures

  9. arXiv:2107.04140  [pdf, other]

    cs.AR

    First-Generation Inference Accelerator Deployment at Facebook

    Authors: Michael Anderson, Benny Chen, Stephen Chen, Summer Deng, Jordan Fix, Michael Gschwind, Aravind Kalaiah, Changkyu Kim, Jaewon Lee, Jason Liang, Haixin Liu, Yinghai Lu, Jack Montgomery, Arun Moorthy, Satish Nadathur, Sam Naghshineh, Avinash Nayak, Jongsoo Park, Chris Petersen, Martin Schatz, Narayanan Sundaram, Bangsheng Tang, Peter Tang, Amy Yang, Jiecao Yu, et al. (90 additional authors not shown)

    Abstract: In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the in…

    Submitted 4 August, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

  10. arXiv:2104.05158  [pdf, other]

    cs.DC cs.AI cs.LG cs.PF

    Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

    Authors: Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, et al. (28 additional authors not shown)

    Abstract: Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pa…

    Submitted 26 February, 2023; v1 submitted 11 April, 2021; originally announced April 2021.

  11. Radflow: A Recurrent, Aggregated, and Decomposable Model for Networks of Time Series

    Authors: Alasdair Tran, Alexander Mathews, Cheng Soon Ong, Lexing Xie

    Abstract: We propose a new model for networks of time series that influence each other. Graph structures among time series are found in diverse domains, such as web traffic influenced by hyperlinks, product sales influenced by recommendation, or urban transport volume influenced by road networks and weather. There has been recent progress in graph modeling and in time series forecasting, respectively, but a…

    Submitted 14 February, 2021; originally announced February 2021.

    Comments: Published in The Web Conference 2021. Code is available at https://github.com/alasdairtran/radflow

    Journal ref: Proceedings of The Web Conference 2021 (WWW '21)

  12. arXiv:2102.01974  [pdf, other]

    cs.SI cs.HC cs.LG cs.MM

    AttentionFlow: Visualising Influence in Networks of Time Series

    Authors: Minjeong Shin, Alasdair Tran, Siqi Wu, Alexander Mathews, Rong Wang, Georgiana Lyall, Lexing Xie

    Abstract: The collective attention on online items such as web pages, search terms, and videos reflects trends that are of social, cultural, and economic interest. Moreover, attention trends of different items exhibit mutual influence via mechanisms such as hyperlinks or recommendations. Many visualisation tools exist for time series, network evolution, or network influence; however, few systems connect all…

    Submitted 3 February, 2021; originally announced February 2021.

    Comments: Published in WSDM 2021. The demo is available at https://attentionflow.ml and code is available at https://github.com/alasdairtran/attentionflow

    Journal ref: The Proceedings of the Fourteenth ACM International Conference on Web Search and Data Mining (WSDM), 2021

  13. arXiv:2010.02568  [pdf, other]

    cs.CL

    SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy

    Authors: Umanga Bista, Alexander Patrick Mathews, Aditya Krishna Menon, Lexing Xie

    Abstract: Most work on multi-document summarization has focused on generic summarization of information present in each individual document set. However, the under-explored setting of update summarization, where the goal is to identify the new information present in each set, is of equal practical interest (e.g., presenting readers with updates on an evolving news topic). In this work, we present SupMMD, a…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 15 pages

    Journal ref: EMNLP 2020

  14. arXiv:2007.14082  [pdf, other]

    cs.LG stat.ML

    UNIPoint: Universally Approximating Point Processes Intensities

    Authors: Alexander Soen, Alexander Mathews, Daniel Grixti-Cheng, Lexing Xie

    Abstract: Point processes are a useful mathematical tool for describing events over time, and so there are many recent approaches for representing and learning them. One notable open question is how to precisely describe the flexibility of point process models and whether there exists a general model that can represent all point processes. Our work bridges this gap. Focusing on the widely used event intensi…

    Submitted 2 March, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

  15. arXiv:2004.08070  [pdf, other]

    cs.CV cs.CL

    Transform and Tell: Entity-Aware News Image Captioning

    Authors: Alasdair Tran, Alexander Mathews, Lexing Xie

    Abstract: We propose an end-to-end model which generates captions for images embedded in news articles. News images present two key challenges: they rely on real-world knowledge, especially about named entities; and they typically have linguistically rich captions that include uncommon words. We address the first challenge by associating words in the caption with faces and objects in the image, via a multi-…

    Submitted 12 June, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

    Comments: Published in CVPR 2020. Code is available at https://github.com/alasdairtran/transform-and-tell and demo is available at https://transform-and-tell.ml

    ACM Class: I.4.0; I.2.7

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 13035-13045

  16. arXiv:1812.02171  [pdf, other]

    cs.IR cs.LG stat.ML

    Comparative Document Summarisation via Classification

    Authors: Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie

    Abstract: This paper considers extractive summarisation in a comparative setting: given two or more document groups (e.g., separated by publication time), the goal is to select a small number of documents that are representative of each group, and also maximally distinguishable from other groups. We formulate a set of new objective functions for this problem that connect recent literature on document summar…

    Submitted 2 January, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

    Comments: Accepted for AAAI 2019

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019

  17. arXiv:1805.07030  [pdf, other]

    cs.CV

    SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

    Authors: Alexander Mathews, Lexing Xie, Xuming He

    Abstract: Linguistic style is an essential part of written communication, with the power to affect both clarity and attractiveness. With recent advances in vision and language, we can start to tackle the problem of generating image captions that are both visually grounded and appropriately styled. Existing approaches either require styled training captions aligned to images or generate captions with low rel…

    Submitted 17 May, 2018; originally announced May 2018.

    Comments: Accepted at CVPR 2018

  18. arXiv:1805.05557  [pdf, other]

    cs.CL

    Simplifying Sentences with Sequence to Sequence Models

    Authors: Alexander Mathews, Lexing Xie, Xuming He

    Abstract: We simplify sentences with an attentive neural network sequence to sequence model, dubbed S4. The model includes a novel word-copy mechanism and loss function to exploit linguistic similarities between the original and simplified sentences. It also jointly uses pre-trained and fine-tuned word embeddings to capture the semantics of complex sentences and to mitigate the effects of limited data. When…

    Submitted 15 May, 2018; originally announced May 2018.

  19. arXiv:1709.08448  [pdf, ps, other]

    cs.AI cs.CL

    Extracting Ontological Knowledge from Textual Descriptions

    Authors: Kevin Alex Mathews, P Sreenivasa Kumar

    Abstract: Authoring of OWL-DL ontologies is intellectually challenging and to make this process simpler, many systems accept natural language text as input. A text-based ontology authoring approach can be successful only when it is combined with an effective method for extracting ontological axioms from text. Extracting axioms from unrestricted English input is a substantially challenging task due to the ri…

    Submitted 28 September, 2017; v1 submitted 25 September, 2017; originally announced September 2017.

    Comments: 8 pages

  20. arXiv:1510.01431  [pdf, other]

    cs.CV cs.CL

    SentiCap: Generating Image Descriptions with Sentiments

    Authors: Alexander Mathews, Lexing Xie, Xuming He

    Abstract: The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from the current systems. One such style is descriptions with emotions, which is commonplace in everyday communication, and influences decision-making and interpersonal relationships. We design a sys…

    Submitted 13 December, 2015; v1 submitted 6 October, 2015; originally announced October 2015.

    ACM Class: I.2.10; I.2.7; I.2.6