Showing 1–24 of 24 results for author: Harris, J

Searching in archive cs.
  1. arXiv:2406.04127  [pdf, other]

    cs.CL cs.AI

    Are We Done with MMLU?

    Authors: Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, Claire Barale, Robert McHardy, Joshua Harris, Jean Kaddour, Emile van Krieken, Pasquale Minervini

    Abstract: Maybe not. We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabilities of LLMs. For example, we find that 57% of the analysed questions in the Virology subset contain errors. To address this issue, we introduce a comprehensive fr…

    Submitted 7 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2405.14766  [pdf, other]

    cs.CL cs.LG

    Evaluating Large Language Models for Public Health Classification and Extraction Tasks

    Authors: Joshua Harris, Timothy Laurence, Leo Loman, Fan Grayson, Toby Nonnenmacher, Harry Long, Loes WalsGriffith, Amy Douglas, Holly Fountain, Stelios Georgiou, Jo Hardstaff, Kathryn Hopkins, Y-Ling Chi, Galena Kuyumdzhieva, Lesley Larkin, Samuel Collins, Hamish Mohammed, Thomas Finnie, Luke Hounsome, Steven Riley

    Abstract: Advances in Large Language Models (LLMs) have led to significant interest in their potential to support human experts across a range of domains, including public health. In this work we present automated evaluations of LLMs for public health tasks involving the classification and extraction of free text. We combine six externally annotated datasets with seven new internally annotated datasets to e…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 33 pages. Feedback and comments are highly appreciated

    MSC Class: 68T50

  3. arXiv:2403.09405  [pdf, other]

    cs.HC

    Which Artificial Intelligences Do People Care About Most? A Conjoint Experiment on Moral Consideration

    Authors: Ali Ladak, Jamie Harris, Jacy Reese Anthis

    Abstract: Many studies have identified particular features of artificial intelligences (AI), such as their autonomy and emotion expression, that affect the extent to which they are treated as subjects of moral consideration. However, there has not yet been a comparison of the relative importance of features as is necessary to design and understand increasingly capable, multi-faceted AI systems. We conducted…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 11 pages, 2 figures. Accepted to 2024 CHI Conference on Human Factors in Computing Systems (CHI '24)

  4. arXiv:2307.10169  [pdf, other]

    cs.CL cs.AI cs.LG

    Challenges and Applications of Large Language Models

    Authors: Jean Kaddour, Joshua Harris, Maximilian Mozes, Herbie Bradley, Roberta Raileanu, Robert McHardy

    Abstract: Large Language Models (LLMs) went from non-existent to ubiquitous in the machine learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify the remaining challenges and already fruitful application areas. In this paper, we aim to establish a systematic set of open problems and application successes so that ML researchers can comprehend the field's current…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 72 pages. v01. Work in progress. Feedback and comments are highly appreciated!

  5. Perception, performance, and detectability of conversational artificial intelligence across 32 university courses

    Authors: Hazem Ibrahim, Fengyuan Liu, Rohail Asim, Balaraju Battu, Sidahmed Benabderrahmane, Bashar Alhafni, Wifag Adnan, Tuka Alhanai, Bedoor AlShebli, Riyadh Baghdadi, Jocelyn J. Bélanger, Elena Beretta, Kemal Celik, Moumena Chaqfeh, Mohammed F. Daqaq, Zaynab El Bernoussi, Daryl Fougnie, Borja Garcia de Soto, Alberto Gandolfi, Andras Gyorgy, Nizar Habash, J. Andrew Harris, Aaron Kaufman, Lefteris Kirousis, Korhan Kocak, et al. (14 additional authors not shown)

    Abstract: The emergence of large language models has led to the development of powerful tools such as ChatGPT that can produce text indistinguishable from human-generated work. With the increasing accessibility of such technology, students across the globe may utilize it to help with their school work -- a possibility that has sparked discussions on the integrity of student evaluations in the age of artific…

    Submitted 7 May, 2023; originally announced May 2023.

    Comments: 17 pages, 4 figures

  6. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  7. arXiv:2209.09731  [pdf]

    cs.DC cs.AR

    Application Experiences on a GPU-Accelerated Arm-based HPC Testbed

    Authors: Wael Elwasif, William Godoy, Nick Hagerty, J. Austin Harris, Oscar Hernandez, Balint Joo, Paul Kent, Damien Lebrun-Grandie, Elijah Maccarthy, Veronica G. Melesse Vergara, Bronson Messer, Ross Miller, Sarp Opal, Sergei Bastrakov, Michael Bussmann, Alexander Debus, Klaus Steinger, Jan Stephan, Rene Widera, Spencer H. Bryngelson, Henry Le Berre, Anand Radhakrishnan, Jefferey Young, Sunita Chandrasekaran, Florina Ciorba, et al. (6 additional authors not shown)

    Abstract: This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems built by GIGABYTE, each one equipped with a server-class Arm CPU from Ampere Computing and A100 data center GPU from NVIDIA Corp. The syst…

    Submitted 19 December, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

  8. arXiv:2209.02446  [pdf, other]

    cs.CY cs.MM

    Web3 Challenges and Opportunities for the Market

    Authors: Dan Sheridan, James Harris, Frank Wear, Jerry Cowell Jr, Easton Wong, Abbas Yazdinejad

    Abstract: The inability of a computer to think has been a limiter in its usefulness and a point of reassurance for humanity since the first computers were created. The semantic web is the first step toward removing that barrier, enabling computers to operate based on conceptual understanding, and AI and ML are the second. Both semantic knowledge and the ability to learn are fundamental to web3, as are block…

    Submitted 6 September, 2022; originally announced September 2022.

  9. arXiv:2208.11630  [pdf, other]

    physics.comp-ph astro-ph.IM cs.MS

    Flash-X, a multiphysics simulation software instrument

    Authors: Anshu Dubey, Klaus Weide, Jared O'Neal, Akash Dhruv, Sean Couch, J. Austin Harris, Tom Klosterman, Rajeev Jain, Johann Rudi, Bronson Messer, Michael Pajkos, Jared Carlson, Ran Chu, Mohamed Wahib, Saurabh Chawdhary, Paul M. Ricker, Dongwook Lee, Katie Antypas, Katherine M. Riley, Christopher Daley, Murali Ganapathy, Francis X. Timmes, Dean M. Townsley, Marcos Vanella, John Bachan, et al. (6 additional authors not shown)

    Abstract: Flash-X is a highly composable multiphysics software system that can be used to simulate physical phenomena in several scientific domains. It derives some of its solvers from FLASH, which was first released in 2000. Flash-X has a new framework that relies on abstractions and asynchronous communications for performance portability across a range of increasingly heterogeneous hardware platforms. Fla…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: 16 pages, 5 Figures, published open access in SoftwareX

    Journal ref: SoftwareX, Volume 19, 2022, 101168, ISSN 2352-7110

  10. arXiv:2208.04714  [pdf]

    cs.CY cs.AI cs.HC

    The History of AI Rights Research

    Authors: Jamie Harris

    Abstract: This report documents the history of research on AI rights and other moral consideration of artificial entities. It highlights key intellectual influences on this literature as well as research and academic discussion addressing the topic more directly. We find that researchers addressing AI rights have often seemed to be unaware of the work of colleagues whose interests overlap with their own. Ac…

    Submitted 27 August, 2022; v1 submitted 6 July, 2022; originally announced August 2022.

    Comments: 68 pages, 2 figures

  11. arXiv:2207.05194  [pdf, other]

    cs.CL

    Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data

    Authors: Jonathan Harris, Mohammed J. Zaki

    Abstract: With an increased interest in the production of personal health technologies designed to track user data (e.g., nutrient intake, step counts), there is now more opportunity than ever to surface meaningful behavioral insights to everyday users in the form of natural language. This knowledge can increase their behavioral awareness and allow them to take action to meet their health goals. It can also…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: 5 pages, 2 figures, 1 table

  12. arXiv:2205.04362  [pdf, other]

    cs.RO

    FC$^3$: Feasibility-Based Control Chain Coordination

    Authors: Jason Harris, Danny Driess, Marc Toussaint

    Abstract: Hierarchical coordination of controllers often uses symbolic state representations that fully abstract their underlying low-level controllers, treating them as "black boxes" to the symbolic action abstraction. This paper proposes a framework to realize robust behavior, which we call Feasibility-based Control Chain Coordination (FC$^3$). Our controllers expose the geometric features and constraints…

    Submitted 9 May, 2022; originally announced May 2022.

  13. arXiv:2203.05390  [pdf, other]

    cs.RO

    Sequence-of-Constraints MPC: Reactive Timing-Optimal Control of Sequential Manipulation

    Authors: Marc Toussaint, Jason Harris, Jung-Su Ha, Danny Driess, Wolfgang Hönig

    Abstract: Task and Motion Planning has made great progress in solving hard sequential manipulation problems. However, a gap between such planning formulations and control methods for reactive execution remains. In this paper we propose a model predictive control approach dedicated to robustly execute a single sequence of constraints, which corresponds to a discrete decision sequence of a TAMP plan. We decom…

    Submitted 22 September, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: IROS 2022 - Int. Conf. on Intelligent Robots and Systems

  14. RBO Hand 3 -- A Platform for Soft Dexterous Manipulation

    Authors: Steffen Puhlmann, Jason Harris, Oliver Brock

    Abstract: We present the RBO Hand 3, a highly capable and versatile anthropomorphic soft hand based on pneumatic actuation. The RBO Hand 3 is designed to enable dexterous manipulation, to facilitate transfer of insights about human dexterity, and to serve as a robust research platform for extensive real-world experiments. It achieves these design goals by combining many degrees of actuation with intrinsic c…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: This paper is currently under revision in IEEE Transactions on Robotics

    Journal ref: IEEE Transactions on Robotics, 2022

  15. arXiv:2110.10131  [pdf, other]

    cs.HC

    Personal Health Knowledge Graph for Clinically Relevant Diet Recommendations

    Authors: Oshani Seneviratne, Jonathan Harris, Ching-Hua Chen, Deborah L. McGuinness

    Abstract: We propose a knowledge model for capturing dietary preferences and personal context to provide personalized dietary recommendations. We develop a knowledge model called the Personal Health Ontology, which is grounded in semantic technologies, and represents a patient's combined medical information, social determinants of health, and observations of daily living elicited from interviews with diabet…

    Submitted 19 October, 2021; originally announced October 2021.

  16. The Moral Consideration of Artificial Entities: A Literature Review

    Authors: Jamie Harris, Jacy Reese Anthis

    Abstract: Ethicists, policy-makers, and the general public have questioned whether artificial entities such as robots warrant rights or other forms of moral consideration. There is little synthesis of the research on this topic so far. We identify 294 relevant research or discussion items in our literature review of this topic. There is widespread agreement among scholars that some artificial entities could…

    Submitted 26 January, 2021; originally announced February 2021.

    Comments: 27 pages, 1 figure

    ACM Class: J.4; K.4.1; K.4.2; I.2

    Journal ref: Sci Eng Ethics 27, 53 (2021)

  17. Analysis of Models for Decentralized and Collaborative AI on Blockchain

    Authors: Justin D. Harris

    Abstract: Machine learning has recently enabled large advances in artificial intelligence, but these results can be highly centralized. The large datasets required are generally proprietary; predictions are often sold on a per-query basis; and published models can quickly become out of date without effort to acquire more data and maintain them. Published proposals to provide models and data for free for cer…

    Submitted 21 September, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Accepted to ICBC 2020

  18. arXiv:2006.08513  [pdf, other]

    cs.CR

    Flood & Loot: A Systemic Attack On The Lightning Network

    Authors: Jona Harris, Aviv Zohar

    Abstract: The Lightning Network promises to alleviate Bitcoin's known scalability problems. The operation of such second layer approaches relies on the ability of participants to turn to the blockchain to claim funds at any time, which is assumed to happen rarely. One of the risks that was identified early on is that of a wide systemic attack on the protocol, in which an attacker triggers the closure of man…

    Submitted 27 August, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

  19. arXiv:2003.09530  [pdf, other]

    cs.CL cs.DB

    A Framework for Generating Explanations from Temporal Personal Health Data

    Authors: Jonathan J. Harris, Ching-Hua Chen, Mohammed J. Zaki

    Abstract: Whereas it has become easier for individuals to track their personal health data (e.g., heart rate, step count, food log), there is still a wide chasm between the collection of data and the generation of meaningful explanations to help users better understand what their data means to them. With an increased comprehension of their data, users will be able to act upon the newfound information and wo…

    Submitted 9 March, 2021; v1 submitted 20 March, 2020; originally announced March 2020.

    Comments: 41 pages, 24 figures. To appear in ACM Transactions on Computing for Healthcare

  20. arXiv:1909.03716  [pdf, ps, other]

    cs.CL

    Improving Neural Question Generation using World Knowledge

    Authors: Deepak Gupta, Kaheer Suleman, Mahmoud Adada, Andrew McNamara, Justin Harris

    Abstract: In this paper, we propose a method for incorporating world knowledge (linked entities and fine-grained entity types) into a neural question generation model. This world knowledge helps to encode additional information related to the entities present in the passage required to generate human-like questions. We evaluate our models on both SQuAD and MS MARCO to demonstrate the usefulness of the world…

    Submitted 10 September, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

  21. Decentralized & Collaborative AI on Blockchain

    Authors: Justin D. Harris, Bo Waggoner

    Abstract: Machine learning has recently enabled large advances in artificial intelligence, but these tend to be highly centralized. The large datasets required are generally proprietary; predictions are often sold on a per-query basis; and published models can quickly become out of date without effort to acquire more data and re-train them. We propose a framework for participants to collaboratively build a…

    Submitted 16 July, 2019; originally announced July 2019.

    Comments: Accepted to 2019 IEEE International Conference on Blockchain

  22. arXiv:1707.04330  [pdf, other]

    cs.SE

    Open Chemistry: RESTful Web APIs, JSON, NWChem and the Modern Web Application

    Authors: Marcus D. Hanwell, Wibe A. de Jong, Christopher J. Harris

    Abstract: An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers a highly interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more…

    Submitted 13 July, 2017; originally announced July 2017.

  23. arXiv:1704.00057  [pdf, other]

    cs.CL

    Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems

    Authors: Layla El Asri, Hannes Schulz, Shikhar Sharma, Jeremie Zumer, Justin Harris, Emery Fine, Rahul Mehrotra, Kaheer Suleman

    Abstract: This paper presents the Frames dataset (Frames is available at http://datasets.maluuba.com/Frames), a corpus of 1369 human-human dialogues with an average of 15 turns per dialogue. We developed this dataset to study the role of memory in goal-oriented dialogue systems. Based on Frames, we introduce a task called frame tracking, which extends state tracking to a setting where several states are tra…

    Submitted 13 April, 2017; v1 submitted 31 March, 2017; originally announced April 2017.

  24. arXiv:1611.09830  [pdf, other]

    cs.CL cs.AI

    NewsQA: A Machine Comprehension Dataset

    Authors: Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, Kaheer Suleman

    Abstract: We present NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs. Crowdworkers supply questions and answers based on a set of over 10,000 news articles from CNN, with answers consisting of spans of text from the corresponding articles. We collect this dataset through a four-stage process designed to solicit exploratory questions that require reas…

    Submitted 7 February, 2017; v1 submitted 29 November, 2016; originally announced November 2016.