Showing 1–46 of 46 results for author: Hernandez, D

Searching in archive cs.
  1. arXiv:2407.02616  [pdf]

    eess.IV cs.CV

    Deep Learning Based Apparent Diffusion Coefficient Map Generation from Multi-parametric MR Images for Patients with Diffuse Gliomas

    Authors: Zach Eidex, Mojtaba Safari, Jacob Wynne, Richard L. J. Qiu, Tonghe Wang, David Viar Hernandez, Hui-Kuo Shu, Hui Mao, Xiaofeng Yang

    Abstract: Purpose: Apparent diffusion coefficient (ADC) maps derived from diffusion weighted (DWI) MRI provide functional measurements of the water molecules in tissues. However, DWI is time consuming and very susceptible to image artifacts, leading to inaccurate ADC measurements. This study aims to develop a deep learning framework to synthesize ADC maps from multi-parametric MR images. Methods: We pro…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2311.15044

  2. arXiv:2405.02366  [pdf, other]

    astro-ph.IM astro-ph.GA cs.LG

    Bayesian and Convolutional Networks for Hierarchical Morphological Classification of Galaxies

    Authors: Jonathan Serrano-Pérez, Raquel Díaz Hernández, L. Enrique Sucar

    Abstract: This work is focused on the morphological classification of galaxies following the Hubble sequence in which the different classes are arranged in a hierarchy. The proposed method, BCNN, is composed of two main modules. First, a convolutional neural network (CNN) is trained with images of the different classes of galaxies (image augmentation is carried out to balance some classes); the CNN outputs…

    Submitted 3 May, 2024; originally announced May 2024.

  3. arXiv:2402.08509  [pdf, other]

    cs.DB cs.AI cs.LO

    From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries (Extended Version)

    Authors: Philipp Seifer, Daniel Hernández, Ralf Lämmel, Steffen Staab

    Abstract: SPARQL CONSTRUCT queries allow for the specification of data processing pipelines that transform given input graphs into new output graphs. It is now common to constrain graphs through SHACL shapes allowing users to understand which data they can expect and which not. However, it becomes challenging to understand what graph data can be expected at the end of a data processing pipeline without know…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 19 pages, 5 figures

    Journal ref: WWW '24: Proceedings of the ACM Web Conference 2024. ACM, 2024, pp. 2064-2074

  4. arXiv:2310.14398  [pdf, other]

    cs.RO cs.AI

    Learning to bag with a simulation-free reinforcement learning framework for robots

    Authors: Francisco Munguia-Galeano, Jihong Zhu, Juan David Hernández, Ze Ji

    Abstract: Bagging is an essential skill that humans perform in their daily activities. However, deformable objects, such as bags, are complex for robots to manipulate. This paper presents an efficient learning-based framework that enables robots to learn bagging. The novelty of this framework is its ability to perform bagging without relying on simulations. The learning process is accomplished through a rei…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: IET Cyber-Systems and Robotics

  5. arXiv:2307.15494  [pdf, other]

    cs.CL cs.AI cs.HC

    ETHER: Aligning Emergent Communication for Hindsight Experience Replay

    Authors: Kevin Denamganaï, Daniel Hernandez, Ozan Vardal, Sondess Missaoui, James Alfred Walker

    Abstract: Natural language instruction following is paramount to enable collaboration between artificial agents and human beings. Natural language-conditioned reinforcement learning (RL) agents have shown how natural languages' properties, such as compositionality, can provide a strong inductive bias to learn complex policies. Previous architectures like HIGhER combine the benefit of language-conditioning w…

    Submitted 17 December, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: work in progress

  6. arXiv:2307.13702  [pdf, other]

    cs.AI cs.CL cs.LG

    Measuring Faithfulness in Chain-of-Thought Reasoning

    Authors: Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson Denison, Danny Hernandez, Dustin Li, Esin Durmus, Evan Hubinger, Jackson Kernion, Kamilė Lukošiūtė, Karina Nguyen, Newton Cheng, Nicholas Joseph, Nicholas Schiefer, Oliver Rausch, Robin Larson, Sam McCandlish, Sandipan Kundu, Saurav Kadavath, Shannon Yang, Thomas Henighan, Timothy Maxwell, Timothy Telleen-Lawton, Tristan Hume , et al. (5 additional authors not shown)

    Abstract: Large language models (LLMs) perform better when they produce step-by-step, "Chain-of-Thought" (CoT) reasoning before answering a question, but it is unclear if the stated reasoning is a faithful explanation of the model's actual reasoning (i.e., its process for answering the question). We investigate hypotheses for how CoT reasoning may be unfaithful, by examining how the model predictions change…

    Submitted 16 July, 2023; originally announced July 2023.

  7. arXiv:2307.11768  [pdf, other]

    cs.CL cs.AI cs.LG

    Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

    Authors: Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson Denison, Danny Hernandez, Esin Durmus, Evan Hubinger, Jackson Kernion, Kamilė Lukošiūtė, Newton Cheng, Nicholas Joseph, Nicholas Schiefer, Oliver Rausch, Sam McCandlish, Sheer El Showk, Tamera Lanham, Tim Maxwell, Venkatesa Chandrasekaran, Zac Hatfield-Dodds, Jared Kaplan, Jan Brauner, Samuel R. Bowman, Ethan Perez

    Abstract: As large language models (LLMs) perform more difficult tasks, it becomes harder to verify the correctness and safety of their behavior. One approach to help with this issue is to prompt LLMs to externalize their reasoning, e.g., by having them generate step-by-step reasoning as they answer a question (Chain-of-Thought; CoT). The reasoning may enable us to check the process that models use to perfo…

    Submitted 25 July, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: For few-shot examples and prompts, see https://github.com/anthropics/DecompositionFaithfulnessPaper

  8. arXiv:2307.02332  [pdf, other]

    cs.HC cs.CY

    Co-creating a Transdisciplinary Map of Technology-mediated Harms, Risks and Vulnerabilities: Challenges, Ambivalences and Opportunities

    Authors: Andrés Domínguez Hernández, Kopo M. Ramokapane, Partha Das Chowdhury, Ola Michalec, Emily Johnstone, Emily Godwin, Alicia G Cork, Awais Rashid

    Abstract: The phrase "online harms" has emerged in recent years out of a growing political willingness to address the ethical and social issues associated with the use of the Internet and digital technology at large. The broad landscape that surrounds online harms gathers a multitude of disciplinary, sectoral and organizational efforts while raising myriad challenges and opportunities for the crossing entre…

    Submitted 19 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 21 pages, 8 figures, to appear in The 26th ACM Conference On Computer-Supported Cooperative Work And Social Computing. October 13-18, 2023. Minneapolis, MN USA

  9. arXiv:2306.16388  [pdf, other]

    cs.CL cs.AI

    Towards Measuring the Representation of Subjective Global Opinions in Language Models

    Authors: Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

    Abstract: Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across dif…

    Submitted 11 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  10. arXiv:2306.07372  [pdf, other]

    cs.LG cs.AI cs.GT

    Composing Efficient, Robust Tests for Policy Selection

    Authors: Dustin Morrill, Thomas J. Walsh, Daniel Hernandez, Peter R. Wurman, Peter Stone

    Abstract: Modern reinforcement learning systems produce many high-quality policies throughout the learning process. However, to choose which policy to actually deploy in the real world, they must be tested under an intractable number of environmental conditions. We introduce RPOSST, an algorithm to select a small set of test cases from a larger pool based on a relatively small number of sample evaluations.…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 26 pages, 13 figures. To appear in Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2023)

    ACM Class: B.8.1; I.2.6

  11. A toolkit of dilemmas: Beyond debiasing and fairness formulas for responsible AI/ML

    Authors: Andrés Domínguez Hernández, Vassilis Galanos

    Abstract: Approaches to fair and ethical AI have recently fallen under the scrutiny of the emerging, chiefly qualitative, field of critical data studies, which places emphasis on these interventions' lack of sensitivity to context and complex social phenomena. We employ some of these lessons to introduce a tripartite decision-making toolkit, informed by dilemmas encountered in the pursuit of responsible AI/ML…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: 4 pages, 1 table. Accepted in IEEE International Symposium on Technology and Society 2022

    Journal ref: 2022 IEEE International Symposium on Technology and Society (ISTAS)

  12. arXiv:2302.07459  [pdf, other]

    cs.CL

    The Capacity for Moral Self-Correction in Large Language Models

    Authors: Deep Ganguli, Amanda Askell, Nicholas Schiefer, Thomas I. Liao, Kamilė Lukošiūtė, Anna Chen, Anna Goldie, Azalia Mirhoseini, Catherine Olsson, Danny Hernandez, Dawn Drain, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jackson Kernion, Jamie Kerr, Jared Mueller, Joshua Landau, Kamal Ndousse, Karina Nguyen, Liane Lovitt, Michael Sellitto, Nelson Elhage, Noemi Mercado, Nova DasSarma , et al. (24 additional authors not shown)

    Abstract: We test the hypothesis that language models trained with reinforcement learning from human feedback (RLHF) have the capability to "morally self-correct" -- to avoid producing harmful outputs -- if instructed to do so. We find strong evidence in support of this hypothesis across three different experiments, each of which reveals different facets of moral self-correction. We find that the capability…

    Submitted 18 February, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

  13. Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models

    Authors: Cosimo Gregucci, Mojtaba Nayyeri, Daniel Hernández, Steffen Staab

    Abstract: Predicting missing links between entities in a knowledge graph is a fundamental task to deal with the incompleteness of data on the Web. Knowledge graph embeddings map nodes into a vector space to predict new links, scoring them according to geometric criteria. Relations in the graph may follow patterns that can be learned, e.g., some relations might be symmetric and others might be hierarchical.…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: ACM Web Conference 2023

  14. arXiv:2212.09251  [pdf, other]

    cs.CL cs.AI cs.LG

    Discovering Language Model Behaviors with Model-Written Evaluations

    Authors: Ethan Perez, Sam Ringer, Kamilė Lukošiūtė, Karina Nguyen, Edwin Chen, Scott Heiner, Craig Pettit, Catherine Olsson, Sandipan Kundu, Saurav Kadavath, Andy Jones, Anna Chen, Ben Mann, Brian Israel, Bryan Seethor, Cameron McKinnon, Christopher Olah, Da Yan, Daniela Amodei, Dario Amodei, Dawn Drain, Dustin Li, Eli Tran-Johnson, Guro Khundadze, Jackson Kernion , et al. (38 additional authors not shown)

    Abstract: As language models (LMs) scale, they develop many novel behaviors, good and bad, exacerbating the need to evaluate how they behave. Prior work creates evaluations with crowdwork (which is time-consuming and expensive) or existing data sources (which are not always available). Here, we automatically generate evaluations with LMs. We explore approaches with varying amounts of human effort, from inst…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: For associated data visualizations, see https://www.evals.anthropic.com/model-written/; for full datasets, see https://github.com/anthropics/evals

  15. arXiv:2212.08073  [pdf, other]

    cs.CL cs.AI

    Constitutional AI: Harmlessness from AI Feedback

    Authors: Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite , et al. (26 additional authors not shown)

    Abstract: As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs. The only human oversight is provided through a list of rules or principles, and so we refer to the method as 'Constitutional AI'. The process involves both a supe…

    Submitted 15 December, 2022; originally announced December 2022.

  16. arXiv:2210.10572  [pdf, other]

    cs.DC

    Distributed Ledger Technologies for Managing Heterogenous Computing Systems at the Edge

    Authors: Daniel Montero Hernández, Jorge Peña Queralta, Tomi Westerlund

    Abstract: The increased use of Internet of Things (IoT) devices -- from basic sensors to robust embedded computers -- has boosted the demand for information processing and storing solutions closer to these devices. Edge computing has been established as a standard architecture for developing IoT solutions, since it can optimize the workload and capacity of systems that depend on cloud services by deploying…

    Submitted 14 November, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 8 pages, 8 figures, 5 tables and 2 algorithms

  17. arXiv:2210.09014  [pdf]

    cs.CY cs.AI cs.LG cs.SI

    Addressing contingency in algorithmic (mis)information classification: Toward a responsible machine learning agenda

    Authors: Andrés Domínguez Hernández, Richard Owen, Dan Saattrup Nielsen, Ryan McConville

    Abstract: Machine learning (ML) enabled classification models are becoming increasingly popular for tackling the sheer volume and speed of online misinformation and other content that could be identified as harmful. In building these models, data scientists need to take a stance on the legitimacy, authoritativeness and objectivity of the sources of "truth" used for model training and testing. This has poli…

    Submitted 13 April, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Andrés Domínguez Hernández, Richard Owen, Dan Saattrup Nielsen and Ryan McConville. 2023. Addressing contingency in algorithmic (mis)information classification: Toward a responsible machine learning agenda. Accepted in 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12-15, 2023, Chicago, United States of America. ACM, New York, NY, USA, 16 pages

  18. arXiv:2209.11895  [pdf]

    cs.LG

    In-context Learning and Induction Heads

    Authors: Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish , et al. (1 additional author not shown)

    Abstract: "Induction heads" are attention heads that implement a simple algorithm to complete token sequences like [A][B] ... [A] -> [B]. In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing loss at increasing token indices). We find that induc…

    Submitted 23 September, 2022; originally announced September 2022.

  19. arXiv:2209.07858  [pdf, other]

    cs.CL cs.AI cs.CY

    Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

    Authors: Deep Ganguli, Liane Lovitt, Jackson Kernion, Amanda Askell, Yuntao Bai, Saurav Kadavath, Ben Mann, Ethan Perez, Nicholas Schiefer, Kamal Ndousse, Andy Jones, Sam Bowman, Anna Chen, Tom Conerly, Nova DasSarma, Dawn Drain, Nelson Elhage, Sheer El-Showk, Stanislav Fort, Zac Hatfield-Dodds, Tom Henighan, Danny Hernandez, Tristan Hume, Josh Jacobson, Scott Johnston , et al. (11 additional authors not shown)

    Abstract: We describe our early efforts to red team language models in order to simultaneously discover, measure, and attempt to reduce their potentially harmful outputs. We make three main contributions. First, we investigate scaling behaviors for red teaming across 3 model sizes (2.7B, 13B, and 52B parameters) and 4 model types: a plain language model (LM); an LM prompted to be helpful, honest, and harmle…

    Submitted 22 November, 2022; v1 submitted 23 August, 2022; originally announced September 2022.

  20. arXiv:2207.05221  [pdf, other]

    cs.CL cs.AI cs.LG

    Language Models (Mostly) Know What They Know

    Authors: Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Dawn Drain, Ethan Perez, Nicholas Schiefer, Zac Hatfield-Dodds, Nova DasSarma, Eli Tran-Johnson, Scott Johnston, Sheer El-Showk, Andy Jones, Nelson Elhage, Tristan Hume, Anna Chen, Yuntao Bai, Sam Bowman, Stanislav Fort, Deep Ganguli, Danny Hernandez, Josh Jacobson, Jackson Kernion, Shauna Kravec, Liane Lovitt , et al. (11 additional authors not shown)

    Abstract: We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly. We first show that larger models are well-calibrated on diverse multiple choice and true/false questions when they are provided in the right format. Thus we can approach self-evaluation on open-ended sampling tasks by asking models to first propose answe…

    Submitted 21 November, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 23+17 pages; refs added, typos fixed

  21. arXiv:2207.00962  [pdf, other]

    physics.data-an cs.IT math.ST

    Low probability states, data statistics, and entropy estimation

    Authors: Damián G. Hernández, Ahmed Roman, Ilya Nemenman

    Abstract: A fundamental problem in analysis of complex systems is getting a reliable estimate of entropy of their probability distributions over the state space. This is difficult because unsampled states can contribute substantially to the entropy, while they do not contribute to the Maximum Likelihood estimator of entropy, which replaces probabilities by the observed frequencies. Bayesian estimators overc…

    Submitted 3 July, 2022; originally announced July 2022.

  22. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza , et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  23. arXiv:2206.00113  [pdf, other]

    cs.AI cs.GT

    BRExIt: On Opponent Modelling in Expert Iteration

    Authors: Daniel Hernandez, Hendrik Baier, Michael Kaisers

    Abstract: Finding a best response policy is a central objective in game theory and multi-agent learning, with modern population-based training approaches employing reinforcement learning algorithms as best-response oracles to improve play against candidate opponents (typically previously learnt policies). We propose Best Response Expert Iteration (BRExIt), which accelerates learning in games by incorporatin…

    Submitted 25 April, 2023; v1 submitted 31 May, 2022; originally announced June 2022.

  24. arXiv:2205.10487  [pdf, other]

    cs.LG cs.AI

    Scaling Laws and Interpretability of Learning from Repeated Data

    Authors: Danny Hernandez, Tom Brown, Tom Conerly, Nova DasSarma, Dawn Drain, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Tom Henighan, Tristan Hume, Scott Johnston, Ben Mann, Chris Olah, Catherine Olsson, Dario Amodei, Nicholas Joseph, Jared Kaplan, Sam McCandlish

    Abstract: Recent large language models have been trained on vast datasets, but also often on repeated data, either intentionally for the purpose of upweighting higher quality data, or unintentionally because data deduplication is not perfect and the model is exposed to repeated data at the sentence, paragraph, or document level. Some works have reported substantial negative performance effects of this repea…

    Submitted 20 May, 2022; originally announced May 2022.

    Comments: 23 pages, 22 figures

  25. arXiv:2204.05862  [pdf, other]

    cs.CL cs.LG

    Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

    Authors: Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, Nicholas Joseph, Saurav Kadavath, Jackson Kernion, Tom Conerly, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Tristan Hume, Scott Johnston, Shauna Kravec, Liane Lovitt, Neel Nanda, Catherine Olsson, Dario Amodei , et al. (6 additional authors not shown)

    Abstract: We apply preference modeling and reinforcement learning from human feedback (RLHF) to finetune language models to act as helpful and harmless assistants. We find this alignment training improves performance on almost all NLP evaluations, and is fully compatible with training for specialized skills such as python coding and summarization. We explore an iterated online mode of training, where prefer…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Data available at https://github.com/anthropics/hh-rlhf

  26. Predictability and Surprise in Large Generative Models

    Authors: Deep Ganguli, Danny Hernandez, Liane Lovitt, Nova DasSarma, Tom Henighan, Andy Jones, Nicholas Joseph, Jackson Kernion, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Nelson Elhage, Sheer El Showk, Stanislav Fort, Zac Hatfield-Dodds, Scott Johnston, Shauna Kravec, Neel Nanda, Kamal Ndousse, Catherine Olsson, Daniela Amodei, Dario Amodei , et al. (5 additional authors not shown)

    Abstract: Large-scale pre-training has recently emerged as a technique for creating capable, general purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we highlight a counterintuitive property of such models and discuss the policy implications of this property. Namely, these generative models have an unusual combination of predictable loss on a broad train…

    Submitted 3 October, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: Updated to reflect the version submitted (and accepted) to ACM FAccT '22. This update incorporates feedback from peer-review and fixes minor typos. See open access FAccT conference version at: https://dl.acm.org/doi/abs/10.1145/3531146.3533229

  27. arXiv:2112.03277  [pdf]

    eess.IV cs.AI cs.CV stat.ML

    Automatic quality control framework for more reliable integration of machine learning-based image segmentation into medical workflows

    Authors: Elena Williams, Sebastian Niehaus, Janis Reinelt, Alberto Merola, Paul Glad Mihai, Kersten Villringer, Konstantin Thierbach, Evelyn Medawar, Daniel Lichterfeld, Ingo Roeder, Nico Scherf, Maria del C. Valdés Hernández

    Abstract: Machine learning algorithms underpin modern diagnostic-aiding software, which has proved valuable in clinical practice, particularly in radiology. However, inaccuracies, mainly due to the limited availability of clinical samples for training these algorithms, hamper their wider applicability, acceptance, and recognition amongst clinicians. We present an analysis of state-of-the-art automatic quali…

    Submitted 19 December, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: 19 pages

  28. arXiv:2112.00861  [pdf, other]

    cs.CL cs.LG

    A General Language Assistant as a Laboratory for Alignment

    Authors: Amanda Askell, Yuntao Bai, Anna Chen, Dawn Drain, Deep Ganguli, Tom Henighan, Andy Jones, Nicholas Joseph, Ben Mann, Nova DasSarma, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Jackson Kernion, Kamal Ndousse, Catherine Olsson, Dario Amodei, Tom Brown, Jack Clark, Sam McCandlish, Chris Olah, Jared Kaplan

    Abstract: Given the broad capabilities of large language models, it should be possible to work towards a general-purpose, text-based assistant that is aligned with human values, meaning that it is helpful, honest, and harmless. As an initial foray in this direction we study simple baseline techniques and evaluations, such as prompting. We find that the benefits from modest interventions increase with model…

    Submitted 9 December, 2021; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: 26+19 pages; v2 typos fixed, refs added, figure scale / colors fixed; v3 correct very non-standard TruthfulQA formatting and metric, alignment implications slightly improved

  29. arXiv:2106.04034  [pdf, other]

    cs.NE cs.LG cs.PF

    GSGP-CUDA -- a CUDA framework for Geometric Semantic Genetic Programming

    Authors: Leonardo Trujillo, Jose Manuel Muñoz Contreras, Daniel E Hernandez, Mauro Castelli, Juan J Tapia

    Abstract: Geometric Semantic Genetic Programming (GSGP) is a state-of-the-art machine learning method based on evolutionary computation. GSGP performs search operations directly at the level of program semantics, which can be done more efficiently than operating at the syntax level like most GP systems. Efficient implementations of GSGP in C++ exploit this fact, but not to its full potential. This paper pre…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: 14 pages, 3 figures

    ACM Class: I.2.2; I.5.5

  30. arXiv:2102.01293  [pdf, other]

    cs.LG

    Scaling Laws for Transfer

    Authors: Danny Hernandez, Jared Kaplan, Tom Henighan, Sam McCandlish

    Abstract: We study empirical scaling laws for transfer learning between distributions in an unsupervised, fine-tuning setting. When we train increasingly large neural networks from-scratch on a fixed-size dataset, they eventually become data-limited and stop improving in performance (cross-entropy loss). When we do the same for models pre-trained on a large language dataset, the slope in performance gains i…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: 19 pages, 15 figures

  31. arXiv:2006.04471  [pdf, ps, other]

    cs.AI cs.GT

    A Comparison of Self-Play Algorithms Under a Generalized Framework

    Authors: Daniel Hernandez, Kevin Denamganai, Sam Devlin, Spyridon Samothrakis, James Alfred Walker

    Abstract: Throughout scientific history, overarching theoretical frameworks have allowed researchers to grow beyond personal intuitions and culturally biased theories. They make it possible to verify and replicate existing findings, and to link disconnected results. The notion of self-play, albeit often cited in multiagent Reinforcement Learning, has never been grounded in a formal model. We present a formalized frame…

    Submitted 8 June, 2020; originally announced June 2020.

  32. arXiv:2006.04419  [pdf, other]

    cs.AI cs.HC

    Metagame Autobalancing for Competitive Multiplayer Games

    Authors: Daniel Hernandez, Charles Takashi Toyin Gbadamosi, James Goodman, James Alfred Walker

    Abstract: Automated game balancing has often focused on single-agent scenarios. In this paper we present a tool for balancing multi-player games during game design. Our approach requires a designer to construct an intuitive graphical representation of their meta-game target, representing the relative scores that high-level strategies (or decks, or character types) should experience. This permits more sophis…

    Submitted 8 June, 2020; originally announced June 2020.

  33. arXiv:2005.04305  [pdf]

    cs.LG cs.CV stat.ML

    Measuring the Algorithmic Efficiency of Neural Networks

    Authors: Danny Hernandez, Tom B. Brown

    Abstract: Three factors drive the advance of AI: algorithmic innovation, data, and the amount of compute available for training. Algorithmic progress has traditionally been more difficult to quantify than compute and data. In this work, we argue that algorithmic progress has an aspect that is both straightforward to measure and interesting: reductions over time in the compute needed to reach past capabiliti…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: 20 pages, 5 figures

  34. arXiv:2004.12317  [pdf, other]

    cs.RO cs.AI

    Online Mapping and Motion Planning under Uncertainty for Safe Navigation in Unknown Environments

    Authors: Èric Pairet, Juan David Hernández, Marc Carreras, Yvan Petillot, Morteza Lahijanian

    Abstract: Safe autonomous navigation is an essential and challenging problem for robots operating in highly unstructured or completely unknown environments. Under these conditions, not only must robotic systems deal with limited localisation information, but their manoeuvrability is also constrained by their dynamics and often suffers from uncertainty. In order to cope with these constraints, this manuscript…

    Submitted 26 May, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: The International Journal of Robotics Research (under review)

  35. arXiv:2003.03052  [pdf, other]

    cs.CR

    Combining GHOST and Casper

    Authors: Vitalik Buterin, Diego Hernandez, Thor Kamphefner, Khiem Pham, Zhi Qiao, Danny Ryan, Juhyeok Sin, Ying Wang, Yan X Zhang

    Abstract: We present "Gasper," a proof-of-stake-based consensus protocol, which is an idealized version of the proposed Ethereum 2.0 beacon chain. The protocol combines Casper FFG, a finality tool, with LMD GHOST, a fork-choice rule. We prove safety, plausible liveness, and probabilistic liveness under different sets of assumptions.

    Submitted 11 May, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    MSC Class: 68W15

  36. arXiv:1811.02459  [pdf, other

    stat.ML cs.LG q-bio.NC q-bio.QM

    Nonlinear Evolution via Spatially-Dependent Linear Dynamics for Electrophysiology and Calcium Data

    Authors: Daniel Hernandez, Antonio Khalil Moretti, Ziqiang Wei, Shreya Saxena, John Cunningham, Liam Paninski

    Abstract: Latent variable models have been widely applied for the analysis of time series resulting from experimental neuroscience techniques. In these datasets, observations are relatively smooth and possibly nonlinear. We present Variational Inference for Nonlinear Dynamics (VIND), a variational inference framework that is able to uncover nonlinear, smooth latent dynamics from sequential data. The framewo… ▽ More

    Submitted 16 June, 2020; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: 8 figs, Accepted at NBDT

  37. arXiv:1810.06979  [pdf, other

    cs.RO cs.HC cs.LG

    Learning Socially Appropriate Robot Approaching Behavior Toward Groups using Deep Reinforcement Learning

    Authors: Yuan Gao, Fangkai Yang, Martin Frisk, Daniel Hernandez, Christopher Peters, Ginevra Castellano

    Abstract: Deep reinforcement learning has recently been widely applied in robotics to study tasks such as locomotion and grasping, but its application to social human-robot interaction (HRI) remains a challenge. In this paper, we present a deep learning scheme that acquires a prior model of robot approaching behavior in simulation and applies it to real-world interaction with a physical robot approaching gr… ▽ More

    Submitted 12 August, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: accepted for The 28th IEEE International Conference on Robot & Human Interactive Communication (Ro-Man)

  38. arXiv:1804.01961  [pdf

    q-bio.NC cs.CV

    Machine learning of neuroimaging to diagnose cognitive impairment and dementia: a systematic review and comparative analysis

    Authors: Enrico Pellegrini, Lucia Ballerini, Maria del C. Valdes Hernandez, Francesca M. Chappell, Victor González-Castro, Devasuda Anblagan, Samuel Danso, Susana Muñoz Maniega, Dominic Job, Cyril Pernet, Grant Mair, Tom MacGillivray, Emanuele Trucco, Joanna Wardlaw

    Abstract: INTRODUCTION: Advanced machine learning methods might help to identify dementia risk from neuroimaging, but their accuracy to date is unclear. METHODS: We systematically reviewed the literature, 2006 to late 2016, for machine learning studies differentiating healthy ageing through to dementia of various types, assessing study quality, and comparing accuracy at different disease boundaries. RES… ▽ More

    Submitted 11 April, 2018; v1 submitted 5 April, 2018; originally announced April 2018.

  39. arXiv:1801.04387  [pdf, ps, other

    cs.LO

    The Problem of Correlation and Substitution in SPARQL -- Extended Version

    Authors: Daniel Hernández, Claudio Gutierrez, Renzo Angles

    Abstract: Implementations of a standard language are expected to give same outputs to identical queries. In this paper we study why different implementations of SPARQL (Fuseki, Virtuoso, Blazegraph and rdf4j) behave differently when evaluating queries with correlated variables. We show that at the core of this problem lies the historically troubling notion of logical substitution. We present a formal framew… ▽ More

    Submitted 22 February, 2018; v1 submitted 13 January, 2018; originally announced January 2018.

  40. arXiv:1710.10363  [pdf, other

    cs.LG cs.MA math.OC stat.ML

    Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

    Authors: Sergio Valcarcel Macua, Aleksi Tukiainen, Daniel García-Ocaña Hernández, David Baldazo, Enrique Munoz de Cote, Santiago Zazo

    Abstract: We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named Diff-DAC, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from its local task only, but it aims to learn a policy that performs well on average for the whole set of tasks. During the learning process, agents communicate thei… ▽ More

    Submitted 25 October, 2020; v1 submitted 27 October, 2017; originally announced October 2017.

    Journal ref: Presented at Adaptive Learning Agents workshop (ALA2018), July 14th, 2018, Stockholm, Sweden

  41. arXiv:1607.07788  [pdf

    cs.CL cs.DL

    How scientific literature has been evolving over the time? A novel statistical approach using tracking verbal-based methods

    Authors: Daria Micaela Hernandez, Monica Becue-Bertaut, Igor Barahona

    Abstract: This paper provides a global vision of the scientific publications related with the Systemic Lupus Erythematosus (SLE), taking as starting point abstracts of articles. Through the time, abstracts have been evolving towards higher complexity on used terminology, which makes necessary the use of sophisticated statistical methods and answering questions including: how vocabulary is evolving through t… ▽ More

    Submitted 5 February, 2016; originally announced July 2016.

    Journal ref: JSM Proceedings (2014), Section on Statistical Learning and Data Mining. Alexandria, VA. American Statistical Association. 1121-1131

  42. arXiv:1606.03963  [pdf

    cs.DL cs.CY

    How marketing vocabulary was evolving from 2005 to 2014? An illustrative application of statistical methods on text mining

    Authors: Igor Barahona, Daria Micaela Hernandez, Hector Hugo Perez-Villarreal

    Abstract: Here a collection of 1169 abstracts, which corresponds to articles that the Journal of Marketing Research has published from 2005 to 2014, are analysed under a novel approach. We apply several statistical methods, such as Principal Components Analysis and Correspondence Analysis to identify the way Marketing vocabulary is evolving. Similarly those articles that introduce new vocabulary are identif… ▽ More

    Submitted 5 February, 2016; originally announced June 2016.

    Comments: Marketing, Textual Statistics, Vocabulary evolving, Influential articles, Correspondence analysis

    Journal ref: JSM Proceedings (2015) Section on Statistics in Marketing. Alexandria, VA. American Statistical Association. 1121-1131

  43. arXiv:1606.01441  [pdf, ps, other

    cs.DB

    Correlation and Substitution in SPARQL

    Authors: Daniel Hernández, Claudio Gutierrez, Renzo Angles

    Abstract: In the current SPARQL specification the notion of correlation and substitution are not well defined. This problem triggers several ambiguities in the semantics. In fact, implementations as Fuseki and Virtuoso assume different semantics. In this technical report, we provide a semantics of correlation and substitution following the classic philosophy of substitution and correlation in logic, progr… ▽ More

    Submitted 11 July, 2016; v1 submitted 4 June, 2016; originally announced June 2016.

  44. arXiv:1508.03530  [pdf, ps, other

    cs.CL physics.data-an physics.soc-ph

    Information-theoretical analysis of the statistical dependencies among three variables: Applications to written language

    Authors: Damián G. Hernández, Damián H. Zanette, Inés Samengo

    Abstract: We develop the information-theoretical concepts required to study the statistical dependencies among three variables. Some of such dependencies are pure triple interactions, in the sense that they cannot be explained in terms of a combination of pairwise correlations. We derive bounds for triple dependencies, and characterize the shape of the joint probability distribution of three binary variable… ▽ More

    Submitted 29 July, 2015; originally announced August 2015.

  45. Unifying the Linear Time-Branching Time Spectrum of Process Semantics

    Authors: David de Frutos Escrig, Carlos Gregorio-Rodríguez, Miguel Palomino, David Romero Hernández

    Abstract: Van Glabbeek's linear time-branching time spectrum is one of the most relevant work on comparative study on process semantics, in which semantics are partially ordered by their discrimination power. In this paper we bring forward a refinement of this classification and show how the process semantics can be dealt with in a uniform way: based on the very natural concept of constrained simulation we… ▽ More

    Submitted 24 June, 2013; v1 submitted 24 April, 2013; originally announced April 2013.

    Journal ref: Logical Methods in Computer Science, Volume 9, Issue 2 (June 27, 2013) lmcs:983

  46. arXiv:1302.1546  [pdf

    cs.AI

    Inference with Idempotent Valuations

    Authors: Luis D. Hernandez, Serafin Moral

    Abstract: Valuation based systems verifying an idempotent property are studied. A partial order is defined between the valuations giving them a lattice structure. Then, two different strategies are introduced to represent valuations: as infimum of the most informative valuations or as supremum of the least informative ones. It is studied how to carry out computations with both representations in an effic… ▽ More

    Submitted 6 February, 2013; originally announced February 2013.

    Comments: Appears in Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI1997)

    Report number: UAI-P-1997-PG-229-237