Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently

Authors: Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas

Abstract: We propose a new framework for formulating optimal transport distances between Markov chains. Previously known formulations studied couplings between the entire joint distribution induced by the chains, and derived solutions via a reduction to dynamic programming (DP) in an appropriately defined Markov decision process. This formulation has, however, not led to particularly efficient algorithms so far, since computing the associated DP operators requires fully solving a static optimal transport problem, and these operators need to be applied numerous times during the overall optimization process. In this work, we develop an alternative perspective by considering couplings between a flattened version of the joint distributions that we call discounted occupancy couplings, and show that calculating optimal transport distances in the full space of joint distributions can be equivalently formulated as solving a linear program (LP) in this reduced space. This LP formulation allows us to port several algorithmic ideas from other areas of optimal transport theory. In particular, our formulation makes it possible to introduce an appropriate notion of entropy regularization into the optimization problem, which in turn enables us to directly calculate optimal transport distances via a Sinkhorn-like method we call Sinkhorn Value Iteration (SVI). We show both theoretically and empirically that this method converges quickly to an optimal coupling, essentially at the same computational cost as running vanilla Sinkhorn in each pair of states. Along the way, we point out that our optimal transport distance exactly matches the common notion of bisimulation metrics between Markov chains, and thus our results also apply to computing such metrics, and in fact our algorithm turns out to be significantly more efficient than the best known methods developed so far for this purpose.

Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2403.05575 [pdf]

doi 10.2196/51727

Enhancing Health Care Accessibility and Equity Through a Geoprocessing Toolbox for Spatial Accessibility Analysis: Development and Case Study

Authors: Soheil Hashtarkhani, David L Schwartz, Arash Shaban-Nejad

Abstract: Access to health care services is a critical determinant of population health and well-being. Measuring spatial accessibility to health services is essential for understanding health care distribution and addressing potential inequities. In this study, we developed a geoprocessing toolbox including Python script tools for the ArcGIS Pro environment to measure the spatial accessibility of health services using both classic and enhanced versions of the 2-step floating catchment area method. Each of our tools incorporated both distance buffers and travel time catchments to calculate accessibility scores based on users' choices. Additionally, we developed a separate tool to create travel time catchments that is compatible with both locally available network data sets and ArcGIS Online data sources. We conducted a case study focusing on the accessibility of hemodialysis services in the state of Tennessee using the 4 versions of the accessibility tools. Notably, the calculation of the target population considered age as a significant nonspatial factor influencing hemodialysis service accessibility. Weighted populations were calculated using end-stage renal disease incidence rates in different age groups. The implemented tools are made accessible through ArcGIS Online for free use by the research community. The case study revealed disparities in the accessibility of hemodialysis services, with urban areas demonstrating higher scores compared to rural and suburban regions. These geoprocessing tools can serve as valuable decision-support resources for health care providers, organizations, and policy makers to improve equitable access to health care services. This comprehensive approach to measuring spatial accessibility can empower health care stakeholders to address health care distribution challenges effectively.

Submitted 26 February, 2024; originally announced March 2024.

Comments: 11 pages, 5 figures

MSC Class: 68U05

Journal ref: JMIR Form Res JMIR Formative Research. 2024 Feb 21:8:e51727

arXiv:2402.15411 [pdf, other]

Optimistic Information Directed Sampling

Authors: Gergely Neu, Matteo Papini, Ludovic Schwartz

Abstract: We study the problem of online learning in contextual bandit problems where the loss function is assumed to belong to a known parametric function class. We propose a new analytic framework for this setting that bridges the Bayesian theory of information-directed sampling due to Russo and Van Roy (2018) and the worst-case theory of Foster, Kakade, Qian, and Rakhlin (2021) based on the decision-estimation coefficient. Drawing from both lines of work, we propose an algorithmic template called Optimistic Information-Directed Sampling and show that it can achieve instance-dependent regret guarantees similar to the ones achievable by the classic Bayesian IDS method, but with the major advantage of not requiring any Bayesian assumptions. The key technical innovation of our analysis is introducing an optimistic surrogate model for the regret and using it to define a frequentist version of the Information Ratio of Russo and Van Roy (2018), and a less conservative version of the Decision Estimation Coefficient of Foster et al. (2021). Keywords: Contextual bandits, information-directed sampling, decision estimation coefficient, first-order regret bounds.

Submitted 27 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

arXiv:2401.09324 [pdf, other]

doi 10.1145/3544549.3585830

Establishing Awareness through Pointing Gestures during Collaborative Decision-Making in a Wall-Display Environment

Authors: Valérie Maquil, Dimitra Anastasiou, Hoorieh Afkari, Adrien Coppens, Johannes Hermen, Lou Schwartz

Abstract: Sharing a physical environment, such as that of a wall-display, facilitates gaining awareness of others' actions and intentions, thereby bringing benefits for collaboration. Previous studies have provided first insights on awareness in the context of tabletops or smaller vertical displays. This paper seeks to advance the current understanding on how users share awareness information in wall-display environments and focusses on mid-air pointing gestures as a foundational part of communication. We present a scenario dealing with the organization of medical supply chains in crisis situations, and report on the results of a user study with 24 users, split into 6 groups of 4, performing several tasks. We investigate pointing gestures and identify three subtypes used as awareness cues during face-to-face collaboration: narrative pointing, loose pointing, and sharp pointing. Our observations show that reliance on gesture subtypes varies across participants and groups, and that sometimes vague pointing is sufficient to support verbal negotiations.

Submitted 17 January, 2024; originally announced January 2024.

Comments: © Authors | ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in the CHI'23 proceedings, http://dx.doi.org/10.1145/3544549.3585830

arXiv:2303.13544 [pdf]

Empower Children in Nigeria to Design the Future of Artificial Intelligence (AI) through Writing

Authors: Cornelius Adejoro, Luise Arn, Larissa Schwartz, Tom Yeh

Abstract: This paper presents a new approach to engaging children in Nigeria to share their views of AI. This approach is centered on an inclusive writing contest in which children in a secondary school in Abuja write about AI to compete for prizes and share their writings with others. A preliminary analysis of the first 11 articles we received exhibits diverse gender and ethnic representation that conveys cultural values and perspectives distinct from those of children in Western countries. This finding suggests future work to conduct in-depth cross-cultural analysis of the articles and to replicate similar writing contests to engage children in other underrepresented countries.

Submitted 15 March, 2023; originally announced March 2023.

arXiv:2302.14177 [pdf, other]

Soft-Search: Two Datasets to Study the Identification and Production of Research Software

Authors: Eva Maxfield Brown, Lindsey Schwartz, Richard Lewei Huang, Nicholas Weber

Abstract: Software is an important tool for scholarly work, but software produced for research is in many cases not easily identifiable or discoverable. A potential first step in linking research and software is software identification. In this paper we present two datasets to study the identification and production of research software. The first dataset contains almost 1000 human-labeled annotations of software production from National Science Foundation (NSF) awarded research projects. We use this dataset to train models that predict software production. Our second dataset is created by applying the trained predictive models across the abstracts and project outcomes reports for all NSF-funded projects between the years of 2010 and 2023. The result is an inferred dataset of software production for over 150,000 NSF awards. We release the Soft-Search dataset to aid in identifying and understanding research software production: https://github.com/si2-urssi/eager

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2205.13924 [pdf, ps, other]

Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits

Authors: Gergely Neu, Julia Olkhovskaya, Matteo Papini, Ludovic Schwartz

Abstract: We study the Bayesian regret of the renowned Thompson Sampling algorithm in contextual bandits with binary losses and adversarially-selected contexts. We adapt the information-theoretic perspective of Russo and Van Roy (2016) to the contextual setting by considering a lifted version of the information ratio defined in terms of the unknown model parameter instead of the optimal action or optimal policy as done in previous works on the same setting. This allows us to bound the regret in terms of the entropy of the prior distribution through a remarkably simple proof, and with no structural assumptions on the likelihood or the prior. The extension to priors with infinite entropy only requires a Lipschitz assumption on the log-likelihood. An interesting special case is that of logistic bandits with $d$-dimensional parameters, $K$ actions, and Lipschitz logits, for which we provide a $\widetilde{O}(\sqrt{dKT})$ regret upper-bound that does not depend on the smallest slope of the sigmoid link function.

Submitted 6 March, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

arXiv:2203.08765 [pdf, other]

Efficient conditioned face animation using frontally-viewed embedding

Authors: Maxime Oquab, Daniel Haziza, Ludovic Schwartz, Tao Xu, Katayoun Zand, Rui Wang, Peirong Liu, Camille Couprie

Abstract: As the quality of few-shot facial animation from landmarks increases, new applications become possible, such as ultra-low-bandwidth video chat compression with a high degree of realism. However, there are some important challenges to tackle in order to improve the experience in real-world conditions. In particular, the current approaches fail to represent profile views without distortions while running in a low compute regime. We focus on this key problem by introducing a multi-frame embedding dubbed Frontalizer to improve the rendering of profile views. In addition to this core improvement, we explore the learning of a latent code conditioning generations along with landmarks to better convey facial expressions. Our dense models achieve a 22% improvement in perceptual quality and a 73% reduction of landmark error over the first order model baseline on a subset of DFDC videos containing head movements. Adapted to mobile architectures, our models outperform the previous state-of-the-art (improving perceptual quality by more than 16% and reducing landmark error by more than 47% on two datasets) while running in real time on iPhone 8 with very low bandwidth requirements.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2112.12297 [pdf]

High Throughput Multi-Channel Parallelized Diffraction Convolutional Neural Network Accelerator

Authors: Zibo Hu, Shurui Li, Russell L. T. Schwartz, Maria Solyanik-Gorgone, Mario Miscuglio, Puneet Gupta, Volker J. Sorger

Abstract: Convolutional neural networks are paramount in image and signal processing, including the relevant classification and training tasks, and account for the majority of machine learning compute demand today. With convolution operations being computationally intensive, next-generation hardware accelerators need to offer parallelization and algorithmic-hardware homomorphism. Fortunately, diffractive display optics is capable of million-channel parallel data processing at low latency; however, it has thus far shown only slow, tens-of-hertz single-image and single-kernel capability, thereby significantly underdelivering on its performance potential. Here, we demonstrate an operation-parallelized high-throughput Fourier optic convolutional neural network accelerator. For the first time, simultaneous processing of multiple kernels in the Fourier domain, enabled by optical diffraction, has been achieved alongside the input parallelism already conventional in the field. Additionally, we show an about one hundred times system speed-up over existing optical diffraction-based processors, and this demonstration rivals the performance of modern electronic solutions. Therefore, this system is capable of processing large-scale matrices about ten times faster than state-of-the-art electronic systems.

Submitted 7 July, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

Comments: 13 pages, 4 figures

arXiv:2104.07780 [pdf, other]

Impact of gender on the formation and outcome of mentoring relationships in academic research

Authors: Leah P. Schwartz, Jean Liénard, Stephen V. David

Abstract: Despite increasing representation in graduate training programs, a disproportionate number of women leave academic research before obtaining an independent position. To understand factors underlying this trend, we analyzed a multidisciplinary database of Ph.D. and postdoctoral mentoring relationships covering the years 2000-2020, focusing on data from the life sciences. Student and mentor gender are both associated with differences in rates of students' continuation to independent mentor positions of their own. Although trainees of women mentors are less likely to take on independent positions than trainees of men mentors, this effect is reduced substantially after controlling for several measurements of mentor status. Thus the effect of mentor gender can be explained at least partially by gender disparities in social and financial resources available to mentors. Because trainees and mentors tend to be of the same gender, this association between mentor gender and academic continuation disproportionately impacts women trainees. On average, gender homophily in graduate training is unrelated to mentor status. A notable exception to this trend is the special case of scientists having been granted an outstanding distinction, evidenced by membership in the National Academy of Sciences, being a grantee of the Howard Hughes Medical Institute, or having been awarded the Nobel Prize. This group of mentors trains men graduate students at higher rates than their most successful colleagues. These results suggest that, in addition to other factors that limit career choices for women trainees, gender inequities in mentors' access to resources and prestige contribute to women's attrition from independent research positions.

Submitted 4 May, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

Comments: 57 pages, 17 figures

arXiv:2101.10496 [pdf, other]

A Digital Corpus of St. Lawrence Island Yupik

Authors: Lane Schwartz, Emily Chen, Hyunji Hayley Park, Edward Jahn, Sylvia L. R. Schreiner

Abstract: St. Lawrence Island Yupik (ISO 639-3: ess) is an endangered polysynthetic language in the Inuit-Yupik language family indigenous to Alaska and Chukotka. This work presents a step-by-step pipeline for the digitization of written texts, and the first publicly available digital corpus for St. Lawrence Island Yupik, created using that pipeline. This corpus has great potential for future linguistic inquiry and research in NLP. It was also developed for use in Yupik language education and revitalization, with a primary goal of enabling easy access to Yupik texts by educators and by members of the Yupik community. A secondary goal is to support development of language technology such as spell-checkers, text-completion systems, interactive e-books, and language learning apps for use by the Yupik community.

Submitted 25 January, 2021; originally announced January 2021.

Comments: ComputEL-4

arXiv:2012.06262 [pdf, other]

doi 10.1162/tacl_a_00365

Morphology Matters: A Multilingual Language Modeling Analysis

Authors: Hyunji Hayley Park, Katherine J. Zhang, Coleman Haley, Kenneth Steimel, Han Liu, Lane Schwartz

Abstract: Prior studies in multilingual language modeling (e.g., Cotterell et al., 2018; Mielke et al., 2019) disagree on whether or not inflectional morphology makes languages harder to model. We attempt to resolve the disagreement and extend those studies. We compile a larger corpus of 145 Bible translations in 92 languages and a larger number of typological features. We fill in missing typological data for several languages and consider corpus-based measures of morphological complexity in addition to expert-produced typological features. We find that several morphological measures are significantly associated with higher surprisal when LSTM models are trained with BPE-segmented data. We also investigate linguistically-motivated subword segmentation strategies like Morfessor and Finite-State Transducers (FSTs) and find that these segmentation strategies yield better performance and reduce the impact of a language's morphology on language modeling.

Submitted 11 December, 2020; originally announced December 2020.

Comments: To appear in TACL, a pre-MIT Press publication version; 15 pages, 3 figures; for the datasets, see https://github.com/hayleypark/MorphologyMatters

Journal ref: Transactions of the Association for Computational Linguistics 9 (2021) 261-276

arXiv:2005.05477 [pdf, other]

Neural Polysynthetic Language Modelling

Authors: Lane Schwartz, Francis Tyers, Lori Levin, Christo Kirov, Patrick Littell, Chi-kiu Lo, Emily Prud'hommeaux, Hyunji Hayley Park, Kenneth Steimel, Rebecca Knowles, Jeffrey Micher, Lonny Strunk, Han Liu, Coleman Haley, Katherine J. Zhang, Robbie Jimmerson, Vasilisa Andriyanets, Aldrian Obaja Muis, Naoki Otani, Jong Hyuk Park, Zhisong Zhang

Abstract: Research in natural language processing commonly assumes that approaches that work well for English and other widely-used languages are "language agnostic". In high-resource languages, especially those that are analytic, a common approach is to treat morphologically-distinct variants of a common root as completely independent word types. This assumes that there are limited morphological inflections per root, and that the majority will appear in a large enough corpus, so that the model can adequately learn statistics about each form. Approaches like stemming, lemmatization, or subword segmentation are often used when either of those assumptions does not hold, particularly in the case of synthetic languages like Spanish or Russian that have more inflection than English. In the literature, languages like Finnish or Turkish are held up as extreme examples of complexity that challenge common modelling assumptions. Yet, when considering all of the world's languages, Finnish and Turkish are closer to the average case. When we consider polysynthetic languages (those at the extreme of morphological complexity), approaches like stemming, lemmatization, or subword modelling may not suffice. These languages have very high numbers of hapax legomena, showing the need for appropriate morphological handling of words, without which it is not possible for a model to capture enough word statistics. We examine the current state-of-the-art in language modelling, machine translation, and text prediction for four polysynthetic languages: Guaraní, St. Lawrence Island Yupik, Central Alaskan Yupik, and Inuktitut. We then propose a novel framework for language modelling that combines knowledge representations from finite-state morphological analyzers with Tensor Product Representations in order to enable neural language models capable of handling the full range of typologically variant languages.

Submitted 13 May, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

arXiv:1809.03112 [pdf, other]

Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction

Authors: Lifeng Jin, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz

Abstract: There have been several recent attempts to improve the accuracy of grammar induction systems by bounding the recursive complexity of the induction model (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016; Jin et al., 2018). Modern depth-bounded grammar inducers have been shown to be more accurate than early unbounded PCFG inducers, but this technique has never been compared against unbounded induction within the same system, in part because most previous depth-bounding models are built around sequence models, the complexity of which grows exponentially with the maximum allowed depth. The present work instead applies depth bounds within a chart-based Bayesian PCFG inducer (Johnson et al., 2007b), where bounding can be switched on and off, and then samples trees with and without bounding. Results show that depth-bounding is indeed significantly effective in limiting the search space of the inducer and thereby increasing the accuracy of the resulting parsing model. Moreover, parsing results on English, Chinese and German show that this bounded model with a new inference technique is able to produce parse trees more accurately than or competitively with state-of-the-art constituency-based grammar induction models.

Submitted 9 September, 2018; originally announced September 2018.

Comments: EMNLP 2018

arXiv:1802.08545 [pdf, ps, other]

Unsupervised Grammar Induction with Depth-bounded PCFG

Authors: Lifeng Jin, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz

Abstract: There has been recent interest in applying cognitively or empirically motivated bounds on recursion depth to limit the search space of grammar induction models (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016). This work extends this depth-bounding approach to probabilistic context-free grammar induction (DB-PCFG), which has a smaller parameter space than hierarchical sequence models, and therefore more fully exploits the space reductions of depth-bounding. Results for this model on grammar acquisition from transcribed child-directed speech and newswire text exceed or are competitive with those of other models when evaluated on parse accuracy. Moreover, grammars acquired from this model demonstrate a consistent use of category labels, something which has not been demonstrated by other acquisition models.

Submitted 25 February, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

Comments: Accepted by Transactions of the Association for Computational Linguistics

arXiv:1711.03016 [pdf, other]

DLVM: A modern compiler infrastructure for deep learning systems

Authors: Richard Wei, Lane Schwartz, Vikram Adve

Abstract: Deep learning software demands reliability and performance. However, many of the existing deep learning frameworks are software libraries that act as an unsafe DSL in Python and a computation graph interpreter. We present DLVM, a design and implementation of a compiler infrastructure with a linear algebra intermediate representation, algorithmic differentiation by adjoint code generation, domain-specific optimizations and a code generator targeting GPU via LLVM. Designed as a modern compiler infrastructure inspired by LLVM, DLVM is more modular and more generic than existing deep learning compiler frameworks, and supports tensor DSLs with high expressivity. With our prototypical staged DSL embedded in Swift, we argue that the DLVM system enables a form of modular, safe and performant frameworks for deep learning.

Submitted 2 February, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

arXiv:1610.04265 [pdf, other]

Fast, Scalable Phrase-Based SMT Decoding

Authors: Hieu Hoang, Nikolay Bogoychev, Lane Schwartz, Marcin Junczys-Dowmunt

Abstract: The utilization of statistical machine translation (SMT) has grown enormously over the last decade, with many users relying on open-source software developed by the NLP community. As commercial use has increased, there is a need for software that is optimized for commercial requirements, in particular, fast phrase-based decoding and more efficient utilization of modern multicore servers. In this paper we re-examine the major components of phrase-based decoding and decoder implementation with particular emphasis on speed and scalability on multicore machines. The result is a drop-in replacement for the Moses decoder which is up to fifteen times faster and scales monotonically with the number of cores.

Submitted 18 October, 2016; v1 submitted 13 October, 2016; originally announced October 2016.

arXiv:1603.01207 [pdf]

doi 10.46298/jdmdh.1395

From manuscript catalogues to a handbook of Syriac literature: Modeling an infrastructure for Syriaca.org

Authors: Nathan P. Gibson, David A. Michelson, Daniel L. Schwartz

Abstract: Despite increasing interest in Syriac studies and growing digital availability of Syriac texts, there is currently no up-to-date infrastructure for discovering, identifying, classifying, and referencing works of Syriac literature. The standard reference work (Baumstark's Geschichte) is over ninety years old, and the perhaps 20,000 Syriac manuscripts extant worldwide can be accessed only through disparate catalogues and databases. The present article proposes a tentative data model for Syriaca.org's New Handbook of Syriac Literature, an open-access digital publication that will serve as both an authority file for Syriac works and a guide to accessing their manuscript representations, editions, and translations. The authors hope that by publishing a draft data model they can receive feedback and incorporate suggestions into the next stage of the project.

Submitted 3 March, 2016; originally announced March 2016.

Comments: Part of special issue: Computer-Aided Processing of Intertextuality in Ancient Languages. 15 pages, 4 figures

Journal ref: Journal of Data Mining & Digital Humanities, Special Issue on Computer-Aided Processing of Intertextuality in Ancient Languages (May 30, 2017) jdmdh:1395

Showing 1–18 of 18 results for author: Schwartz, L