Showing 1–22 of 22 results for author: Clark, T

Searching in archive cs.
  1. arXiv:2405.09605  [pdf, other]

    cs.CL cs.AI cs.LG

    Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models

    Authors: Anna A. Ivanova, Aalok Sathe, Benjamin Lipkin, Unnathi Kumar, Setayesh Radkani, Thomas H. Clark, Carina Kauf, Jennifer Hu, R. T. Pramod, Gabriel Grand, Vivian Paulun, Maria Ryskina, Ekin Akyürek, Ethan Wilcox, Nafisa Rashid, Leshem Choshen, Roger Levy, Evelina Fedorenko, Joshua Tenenbaum, Jacob Andreas

    Abstract: The ability to build and leverage world models is essential for a general-purpose AI agent. Testing such capabilities is hard, in part because the building blocks of world models are ill-defined. We present Elements of World Knowledge (EWOK), a framework for evaluating world modeling in language models by testing their ability to use knowledge of a concept to match a target text with a plausible/i…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 21 pages (11 main), 7 figures. Authors Anna Ivanova, Aalok Sathe, Benjamin Lipkin contributed equally

  2. Knowledge Engineering for Wind Energy

    Authors: Yuriy Marykovskiy, Thomas Clark, Justin Day, Marcus Wiens, Charles Henderson, Julian Quick, Imad Abdallah, Anna Maria Sempreviva, Jean-Paul Calbimonte, Eleni Chatzi, Sarah Barber

    Abstract: With the rapid evolution of the wind energy sector, there is an ever-increasing need to create value from the vast amounts of data made available both from within the domain, as well as from other sectors. This article addresses the challenges faced by wind energy domain experts in converting data into domain knowledge, connecting and integrating it with other sources of knowledge, and making it a…

    Submitted 1 October, 2023; originally announced October 2023.

    Journal ref: Wind Energ. Sci. 9 (2024) 883-917

  3. arXiv:2306.03734  [pdf, other]

    cs.CL

    A Cross-Linguistic Pressure for Uniform Information Density in Word Order

    Authors: Thomas Hikaru Clark, Clara Meister, Tiago Pimentel, Michael Hahn, Ryan Cotterell, Richard Futrell, Roger Levy

    Abstract: While natural languages differ widely in both canonical word order and word order flexibility, their word orders still follow shared cross-linguistic statistical patterns, often attributed to functional pressures. In the effort to identify these pressures, prior work has compared real and counterfactual word orders. Yet one functional pressure has been overlooked in such investigations: the unifor…

    Submitted 9 July, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  4. arXiv:2303.12873  [pdf, other]

    physics.acc-ph cs.SE physics.plasm-ph

    From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale

    Authors: Axel Huebl, Remi Lehe, Edoardo Zoni, Olga Shapoval, Ryan T. Sandberg, Marco Garten, Arianna Formenti, Revathi Jambunathan, Prabhat Kumar, Kevin Gott, Andrew Myers, Weiqun Zhang, Ann Almgren, Chad E. Mitchell, Ji Qiang, David Grote, Alexander Sinn, Severin Diederichs, Maxence Thevenet, Luca Fedeli, Thomas Clark, Neil Zaim, Henri Vincenti, Jean-Luc Vay

    Abstract: Developing complex, reliable advanced accelerators requires a coordinated, extensible, and comprehensive approach in modeling, from source to the end of beam lifetime. We present highlights in Exascale Computing to scale accelerator modeling software to the requirements set for contemporary science drivers. In particular, we present the first laser-plasma modeling on an exaflop supercomputer using…

    Submitted 18 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: 4 pages, 3 figures, presented at the 20th Advanced Accelerator Concepts Workshop (AAC22)

  5. arXiv:2301.01017  [pdf, other]

    cs.LG eess.SY

    Through-life Monitoring of Resource-constrained Systems and Fleets

    Authors: Felipe Montana, Adam Hartwell, Will Jacobs, Visakan Kadirkamanathan, Andrew R Mills, Tom Clark

    Abstract: A Digital Twin (DT) is a simulation of a physical system that provides information to make decisions that add economic, social or commercial value. The behaviour of a physical system changes over time, a DT must therefore be continually updated with data from the physical systems to reflect its changing behaviour. For resource-constrained systems, updating a DT is non-trivial because of challenges…

    Submitted 3 January, 2023; originally announced January 2023.

  6. arXiv:2203.17213  [pdf, other]

    cs.CL

    Analyzing Wrap-Up Effects through an Information-Theoretic Lens

    Authors: Clara Meister, Tiago Pimentel, Thomas Hikaru Clark, Ryan Cotterell, Roger Levy

    Abstract: Numerous analyses of reading time (RT) data have been implemented -- all in an effort to better understand the cognitive processes driving reading comprehension. However, data measured on words at the end of a sentence -- or even at the end of a clause -- is often omitted due to the confounding factors introduced by so-called "wrap-up effects," which manifests as a skewed distribution of RTs for t…

    Submitted 5 January, 2024; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: ACL 2022 (main conference)

  7. arXiv:2112.03765  [pdf, other]

    cs.LG

    In-flight Novelty Detection with Convolutional Neural Networks

    Authors: Adam Hartwell, Felipe Montana, Will Jacobs, Visakan Kadirkamanathan, Andrew R Mills, Tom Clark

    Abstract: Gas turbine engines are complex machines that typically generate a vast amount of data, and require careful monitoring to allow for cost-effective preventative maintenance. In aerospace applications, returning all measured data to ground is prohibitively expensive, often causing useful, high value, data to be discarded. The ability to detect, prioritise, and return useful data in real-time is ther…

    Submitted 7 December, 2021; originally announced December 2021.

  8. arXiv:2109.04810  [pdf, other]

    cs.CL

    Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT

    Authors: Zaiqiao Meng, Fangyu Liu, Thomas Hikaru Clark, Ehsan Shareghi, Nigel Collier

    Abstract: Infusing factual knowledge into pre-trained models is fundamental for many knowledge-intensive tasks. In this paper, we proposed Mixture-of-Partitions (MoP), an infusion approach that can handle a very large knowledge graph (KG) by partitioning it into smaller sub-graphs and infusing their specific knowledge into various BERT models using lightweight adapters. To leverage the overall factual knowl…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 camera-ready version

  9. arXiv:1905.08674  [pdf]

    cs.CY cs.DL

    Software Citation Implementation Challenges

    Authors: Daniel S. Katz, Daina Bouquin, Neil P. Chue Hong, Jessica Hausman, Catherine Jones, Daniel Chivvis, Tim Clark, Mercè Crosas, Stephan Druskat, Martin Fenner, Tom Gillespie, Alejandra Gonzalez-Beltran, Morane Gruenpeter, Ted Habermann, Robert Haines, Melissa Harrison, Edwin Henneken, Lorraine Hwang, Matthew B. Jones, Alastair A. Kelly, David N. Kennedy, Katrin Leinweber, Fernando Rios, Carly B. Robinson, Ilian Todorov, et al. (2 additional authors not shown)

    Abstract: The main output of the FORCE11 Software Citation working group (https://www.force11.org/group/software-citation-working-group) was a paper on software citation principles (https://doi.org/10.7717/peerj-cs.86) published in September 2016. This paper laid out a set of six high-level principles for software citation (importance, credit and attribution, unique identification, persistence, accessibilit…

    Submitted 21 May, 2019; originally announced May 2019.

  10. arXiv:1804.07273  [pdf, ps]

    cs.SE

    A Basic Model of KBS Software

    Authors: Tony Clark

    Abstract: The Euclid 6.2 project MOSES addressed quality issues in the development of military KBS. A contribution to this project was to develop a computational model of KBS that could be used to define and analyze aspects of KBS quality. Since a key characteristic of KBS is search, a computational model based on non-determinism was developed and used to express terms relating to quality. This research rep…

    Submitted 17 April, 2018; originally announced April 2018.

  11. arXiv:1804.07272  [pdf, ps]

    cs.SE

    Metaclasses and Reflection in Smalltalk

    Authors: Tony Clark

    Abstract: Many Object Oriented Programming Languages provide reflective features which may be used to control the interpretive mechanism of the language. Often these features are defined with respect to a golden braid consisting of objects classes and meta-classes. This report reviews the Smalltalk golden braid and generalize it for multiple inheritance leading to choices between many different inheritance…

    Submitted 17 April, 2018; originally announced April 2018.

  12. arXiv:1804.07271  [pdf, ps]

    cs.PL

    EBG: A Lazy Functional Programming Language Implemented on the Java Virtual Machine

    Authors: Tony Clark

    Abstract: This technical report describes the implementation of a lazy functional programming language on the Java VM.

    Submitted 17 April, 2018; originally announced April 2018.

  13. arXiv:1506.03398  [pdf, other]

    cs.SE

    A General Architecture for Heterogeneous Language Engineering and Projectional Editor Support

    Authors: Tony Clark

    Abstract: Tool support for language engineering has typically prioritises concrete syntax over abstract syntax by providing meta-languages for expressing concrete syntax and then mapping concrete to abstract structures. Text-based languages are usually specified using a BNF-like language used to generate a syntax-aware editor that includes features such as keyword completion. Similarly, graphical languages…

    Submitted 10 June, 2015; originally announced June 2015.

  14. arXiv:1506.03381  [pdf, other]

    cs.SE

    Meta-Packages: Painless Domain Specific Languages

    Authors: Tony Clark

    Abstract: Domain Specific Languages are used to provide a tailored modelling notation for a specific application domain. There are currently two main approaches to DSLs: standard notations that are tailored by adding simple properties; new notations that are designed from scratch. There are problems with both of these approaches which can be addressed by providing access to a small meta-language based on pa…

    Submitted 10 June, 2015; originally announced June 2015.

  15. arXiv:1506.03380  [pdf, other]

    cs.SE cs.PL

    Model Driven Reactive Applications

    Authors: Tony Clark, Dean Kramer, Samia Oussena

    Abstract: Reactive applications (rapps) are of interest because of the explosion of mobile, tablet and web-based platforms. The complexity and proliferation of implementation technologies makes it attractive to use model-driven techniques to develop rapp systems. This article proposes a domain specific language for rapps consisting of stereotyped class models for the structure of the application and state m…

    Submitted 10 June, 2015; originally announced June 2015.

  16. arXiv:1506.03366  [pdf, ps, other]

    cs.FL cs.PL

    Processing XML for Domain Specific Languages

    Authors: Tony Clark

    Abstract: XML is a standard and universal language for representing information. XML processing is supported by two key frameworks: DOM and SAX. SAX is efficient, but leaves the developer to encode much of the processing. This paper introduces a language for expressing XML-based languages via grammars that can be used to process XML documents and synthesize arbitrary values. The language is declarative and…

    Submitted 10 June, 2015; originally announced June 2015.

  17. arXiv:1506.03363  [pdf, other]

    cs.SE cs.PL

    Super-Languages: Developing Languages and Applications with XMF (Second Edition)

    Authors: Tony Clark, Paul Sammut, James Willans

    Abstract: The aim of this book is to introduce the language XMF. This is done by defining the language, providing some examples of applications that can be written directly in the XOCL language that comes with XMF, and then by showing how XMF can be used for language engineering. The main focus of this book is on language engineering by example.

    Submitted 10 June, 2015; originally announced June 2015.

  18. arXiv:1505.00149  [pdf, other]

    cs.SE

    Applied Metamodelling: A Foundation for Language Driven Development (Third Edition)

    Authors: Tony Clark, Paul Sammut, James Willans

    Abstract: Modern day system developers have some serious problems to contend with. The systems they develop are becoming increasingly complex as customers demand richer functionality delivered in ever shorter timescales. They have to manage a huge diversity of implementation technologies, design techniques and development processes: everything from scripting languages to web-services to the latest 'silver b…

    Submitted 1 May, 2015; originally announced May 2015.

  19. arXiv:1408.5698  [pdf]

    cs.SE

    Report on the Aachen OCL Meeting

    Authors: Achim D. Brucker, Dan Chiorean, Tony Clark, Birgit Demuth, Martin Gogolla, Dimitri Plotnikov, Bernhard Rumpe, Edward D. Willink, Burkhart Wolff

    Abstract: As a continuation of the OCL workshop during the MODELS 2013 conference in October 2013, a number of OCL experts decided to meet in November 2013 in Aachen for two days to discuss possible short term improvements of OCL for an upcoming OMG meeting and to envision possible future long-term developments of the language. This paper is a sort of "minutes of the meeting" and intended to quickly inform…

    Submitted 25 August, 2014; originally announced August 2014.

    Comments: 9 pages, 6 figures

    Journal ref: Proceedings of the MODELS 2013 OCL Workshop (OCL 2013), Miami, Florida (USA), Volume 1092 of CEUR Workshop Proceedings, Eds.: J. Cabot, M. Gogolla, I. Rath, E. Willink, pages 103-111, CEUR-WS.org, 2013

  20. Web Annotation as a First Class Object

    Authors: Paolo Ciccarese, Stian Soiland-Reyes, Tim Clark

    Abstract: Scholars have made handwritten notes and comments in books and manuscripts for centuries. Today's blogs and news sites typically invite users to express their opinions on the published content; URLs allow web resources to be shared with accompanying annotations and comments using third-party services like Twitter or Facebook. These contributions have until recently been constrained within specific…

    Submitted 19 March, 2019; v1 submitted 24 October, 2013; originally announced October 2013.

    Comments: This is authors' Accepted version as of 2013-08-09, reformatted as HTML with Linked Data. For the final, published version, see IEEE Explore: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6682930

    Report number: uk-ac-man-scw:211608

    ACM Class: H.3.5; H.3.7; H.5.4

    Journal ref: IEEE Internet Computing 2013; 17(6) 71-75

  21. arXiv:1305.3506  [pdf]

    cs.DL

    Micropublications: a Semantic Model for Claims, Evidence, Arguments and Annotations in Biomedical Communications

    Authors: Tim Clark, Paolo N. Ciccarese, Carole A. Goble

    Abstract: The Micropublications semantic model for scientific claims, evidence, argumentation and annotation in biomedical publications, is a metadata model of scientific argumentation, designed to support several key requirements for exchange and value-addition of semantic metadata across the biomedical publications ecosystem. Micropublications allow formalizing the argument structure of scientific publi…

    Submitted 2 February, 2014; v1 submitted 15 May, 2013; originally announced May 2013.

    Comments: Version 4. Minor revisions

  22. PAV ontology: Provenance, Authoring and Versioning

    Authors: Paolo Ciccarese, Stian Soiland-Reyes, Khalid Belhajjame, Alasdair J G Gray, Carole Goble, Tim Clark

    Abstract: Provenance is a critical ingredient for establishing trust of published scientific content. This is true whether we are considering a data set, a computational workflow, a peer-reviewed publication or a simple scientific claim with supportive evidence. Existing vocabularies such as DC Terms and the W3C PROV-O are domain-independent and general-purpose and they allow and encourage for extensions to…

    Submitted 6 December, 2013; v1 submitted 26 April, 2013; originally announced April 2013.

    Comments: 22 pages (incl 5 tables and 19 figures). Submitted to Journal of Biomedical Semantics 2013-04-26 (#1858276535979415). Revised article submitted 2013-08-30. Second revised article submitted 2013-10-06. Accepted 2013-10-07. Author proofs sent 2013-10-09 and 2013-10-16. Published 2013-11-22. Final version 2013-12-06. http://www.jbiomedsem.com/content/4/1/37

    Report number: University of Manchester eScholar: uk-ac-man-scw:193385

    ACM Class: I.2.4; H.2.1; H.3.7; I.7.4

    Journal ref: Journal of Biomedical Semantics 2013, 4:37