Search | arXiv e-print repository

What is it for a Machine Learning Model to Have a Capability?

Authors: Jacqueline Harding, Nathaniel Sharadin

Abstract: What can contemporary machine learning (ML) models do? Given the proliferation of ML models in society, answering this question matters to a variety of stakeholders, both public and private. The evaluation of models' capabilities is rapidly emerging as a key subfield of modern ML, buoyed by regulatory attention and government grants. Despite this, the notion of an ML model possessing a capability has not been interrogated: what are we saying when we say that a model is able to do something? And what sorts of evidence bear upon this question? In this paper, we aim to answer these questions, using the capabilities of large language models (LLMs) as a running example. Drawing on the large philosophical literature on abilities, we develop an account of ML models' capabilities which can be usefully applied to the nascent science of model evaluation. Our core proposal is a conditional analysis of model abilities (CAMA): crudely, a machine learning model has a capability to X just when it would reliably succeed at doing X if it 'tried'. The main contribution of the paper is making this proposal precise in the context of ML, resulting in an operationalisation of CAMA applicable to LLMs. We then put CAMA to work, showing that it can help make sense of various features of ML model evaluation practice, as well as suggest procedures for performing fair inter-model comparisons.

Submitted 14 May, 2024; originally announced May 2024.

Comments: Forthcoming in the British Journal for the Philosophy of Science (BJPS)

arXiv:2307.02627 [pdf, ps, other]

doi 10.1007/s00355-021-01345-8

Proxy Selection in Transitive Proxy Voting

Authors: Jacqueline Harding

Abstract: Transitive proxy voting (or "liquid democracy") is a novel form of collective decision making, often framed as an attractive hybrid of direct and representative democracy. Although the ideas behind liquid democracy have garnered widespread support, there have been relatively few attempts to model it formally. This paper makes three main contributions. First, it proposes a new social choice-theoretic model of liquid democracy, which is distinguished by taking a richer formal perspective on the process by which a voter chooses a proxy. Second, it examines the model from an axiomatic perspective, proving (a) a proxy vote analogue of May's Theorem and (b) an impossibility result concerning monotonicity properties in a proxy vote setting. Third, it explores the topic of manipulation in transitive proxy votes. Two forms of manipulation specific to the proxy vote setting are defined, and it is shown that manipulation occurs in strictly more cases in proxy votes than in classical votes.

Submitted 5 July, 2023; originally announced July 2023.

Journal ref: Social Choice and Welfare 58, 69-99 (2022)

arXiv:2306.08193 [pdf, other]

doi 10.1086/728685

Operationalising Representation in Natural Language Processing

Authors: Jacqueline Harding

Abstract: Despite its centrality in the philosophy of cognitive science, there has been little prior philosophical work engaging with the notion of representation in contemporary NLP practice. This paper attempts to fill that lacuna: drawing on ideas from cognitive science, I introduce a framework for evaluating the representational claims made about components of neural NLP models, proposing three criteria with which to evaluate whether a component of a model represents a property and operationalising these criteria using probing classifiers, a popular analysis technique in NLP (and deep learning more broadly). The project of operationalising a philosophically-informed notion of representation should be of interest to both philosophers of science and NLP practitioners. It affords philosophers a novel testing-ground for claims about the nature of representation, and helps NLPers organise the large literature on probing experiments, suggesting novel avenues for empirical research.

Submitted 7 October, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: Forthcoming in the British Journal for the Philosophy of Science (BJPS)

arXiv:2008.06908 [pdf, other]

Visually Aware Skip-Gram for Image Based Recommendations

Authors: Parth Tiwari, Yash Jain, Shivansh Mundra, Jenny Harding, Manoj Kumar Tiwari

Abstract: The visual appearance of a product significantly influences purchase decisions on e-commerce websites. We propose a novel framework VASG (Visually Aware Skip-Gram) for learning user and product representations in a common latent space using product image features. Our model is an amalgamation of the Skip-Gram architecture and a deep neural network based Decoder. Here the Skip-Gram attempts to capture user preference by optimizing user-product co-occurrence in a Heterogeneous Information Network while the Decoder simultaneously learns a mapping to transform product image features to the Skip-Gram embedding space. This architecture is jointly optimized in an end-to-end, multitask fashion. The proposed framework enables us to make personalized recommendations for cold-start products which have no purchase history. Experiments conducted on large real-world datasets show that the learned embeddings can generate effective recommendations using nearest neighbour searches.

Submitted 16 August, 2020; originally announced August 2020.

Comments: 8 pages, 5 figures

arXiv:1808.08079 [pdf, other]

Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information

Authors: Mario Giulianelli, Jacqueline Harding, Florian Mohnert, Dieuwke Hupkes, Willem Zuidema

Abstract: How do neural language models keep track of number agreement between subject and verb? We show that 'diagnostic classifiers', trained to predict number from the internal states of a language model, provide a detailed understanding of how, when, and where this information is represented. Moreover, they give us insight into when and where number information is corrupted in cases where the language model ends up making agreement errors. To demonstrate the causal role played by the representations we find, we then use agreement information to influence the course of the LSTM during the processing of difficult sentences. Results from such an intervention reveal a large increase in the language model's accuracy. Together, these results show that diagnostic classifiers give us an unrivalled detailed look into the representation of linguistic information in neural models, and demonstrate that this knowledge can be used to improve their performance.

Submitted 18 November, 2021; v1 submitted 24 August, 2018; originally announced August 2018.

Comments: Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

arXiv:1706.09450 [pdf]

The application of deep convolutional neural networks to ultrasound for modelling of dynamic states within human skeletal muscle

Authors: Ryan J. Cunningham, Peter J. Harding, Ian D. Loram

Abstract: This paper concerns the fully automatic direct in vivo measurement of active and passive dynamic skeletal muscle states using ultrasound imaging. Despite the long-standing medical need (myopathies, neuropathies, pain, injury, ageing), current technology (electromyography, dynamometry, shear wave imaging) provides no general, non-invasive method for online estimation of skeletal intramuscular states. Ultrasound provides a technology in which static and dynamic muscle states can be observed non-invasively, yet current computational image understanding approaches are inadequate. We propose a new approach in which deep learning methods are used for understanding the content of ultrasound images of muscle in terms of its measured state. Ultrasound data synchronized with electromyography of the calf muscles, with measures of joint torque/angle were recorded from 19 healthy participants (6 female, ages: 30 ± 7.7). A segmentation algorithm previously developed by our group was applied to extract a region of interest of the medial gastrocnemius. Then a deep convolutional neural network was trained to predict the measured states (joint angle/torque, electromyography) directly from the segmented images. Results revealed for the first time that active and passive muscle states can be measured directly from standard b-mode ultrasound images, accurately predicting for a held-out test participant changes in the joint angle, electromyography, and torque with as little error as 0.022°, 0.0001V, 0.256Nm (root mean square error) respectively.

Submitted 28 June, 2017; originally announced June 2017.

Comments: paper in preparation for submission to IEEE TMI

arXiv:1008.2160 [pdf, other]

An early warning method for crush

Authors: Peter J. Harding, Steve M. V. Gwynne, Martyn Amos

Abstract: Fatal crush conditions occur in crowds with tragic frequency. Event organisers and architects are often criticised for failing to consider the causes and implications of crush, but the reality is that the prediction and mitigation of such conditions offers a significant technical challenge. Full treatment of physical force within crowd simulations is precise but computationally expensive; the more common method of human interpretation of results is computationally "cheap" but subjective and time-consuming. In this paper we propose an alternative method for the analysis of crowd behaviour, which uses information theory to measure crowd disorder. We show how this technique may be easily incorporated into an existing simulation framework, and validate it against an historical event. Our results show that this method offers an effective and efficient route towards automatic detection of crush.

Submitted 12 August, 2010; originally announced August 2010.

Comments: Submitted

arXiv:0805.0360 [pdf, ps, other]

Prediction and Mitigation of Crush Conditions in Emergency Evacuations

Authors: Peter J. Harding, Martyn Amos, Steve Gwynne

Abstract: Several simulation environments exist for the simulation of large-scale evacuations of buildings, ships, or other enclosed spaces. These offer sophisticated tools for the study of human behaviour, the recreation of environmental factors such as fire or smoke, and the inclusion of architectural or structural features, such as elevators, pillars and exits. Although such simulation environments can provide insights into crowd behaviour, they lack the ability to examine potentially dangerous forces building up within a crowd. These are commonly referred to as crush conditions, and are a common cause of death in emergency evacuations. In this paper, we describe a methodology for the prediction and mitigation of crush conditions. The paper is organised as follows. We first establish the need for such a model, defining the main factors that lead to crush conditions, and describing several exemplar case studies. We then examine current methods for studying crush, and describe their limitations. From this, we develop a three-stage hybrid approach, using a combination of techniques. We conclude with a brief discussion of the potential benefits of our approach.

Submitted 3 May, 2008; originally announced May 2008.

Showing 1–8 of 8 results for author: Harding, J