doi 10.1016/j.compag.2023.108500

CowScreeningDB: A public benchmark dataset for lameness detection in dairy cows

Authors: Shahid Ismail, Moises Diaz, Cristina Carmona-Duarte, Jose Manuel Vilar, Miguel A. Ferrer

Abstract: Lameness is one of the costliest pathological problems affecting dairy animals. It is usually assessed by trained veterinary clinicians who observe features such as gait symmetry or gait parameters such as step counts in real time. With the development of artificial intelligence, various modular systems have been proposed to minimize subjectivity in lameness assessment. However, the major limitation in their development is the unavailability of a public dataset; existing datasets are either commercial or privately held. To tackle this limitation, we have introduced CowScreeningDB, which was created using sensory data. This dataset was sourced from 43 cows at a dairy located in Gran Canaria, Spain. It consists of a multi-sensor dataset built on data collected using an Apple Watch 6 during the normal daily routine of a dairy cow. The collection environment, sampling technique, information regarding the sensors, and the applications used for data conversion and storage make the dataset a transparent one. This transparency of data can thus be used for further development of techniques for lameness detection for dairy cows which can be objectively compared. Aside from the public sharing of the dataset, we have also shared a machine-learning technique which classifies the cows as healthy or lame by using the raw sensory data, thereby validating the major objective, which is to establish the relationship between sensor data and lameness.

Submitted 24 May, 2024; originally announced May 2024.

Journal ref: Computers and Electronics in Agriculture, vol.216, pp.108500, 2024

arXiv:2209.07802 [pdf]

doi 10.1126/sciadv.adf0673

Dynamics-informed deconvolutional neural networks for super-resolution identification of regime changes in epidemiological time series

Authors: Jose M. G. Vilar, Leonor Saiz

Abstract: Inferring the timing and amplitude of perturbations in epidemiological systems from their stochastically spread low-resolution outcomes is as relevant as challenging. It is a requirement for current approaches to overcome the need to know the details of the perturbations to proceed with the analyses. However, the general problem of connecting epidemiological curves with the underlying incidence lacks the highly effective methodology present in other inverse problems, such as super-resolution and dehazing from computer vision. Here, we develop an unsupervised physics-informed convolutional neural network approach in reverse to connect death records with incidence that allows the identification of regime changes at single-day resolution. Applied to COVID-19 data with proper regularization and model-selection criteria, the approach can identify the implementation and removal of lockdowns and other nonpharmaceutical interventions with 0.93-day accuracy over the time span of a year.

Submitted 16 September, 2022; originally announced September 2022.

Comments: 18 pages, 5 figures

Journal ref: Science Advances 9, eadf0673 (2023)

arXiv:2204.12801 [pdf, other]

Speeding Hirschberg Algorithm for Sequence Alignment

Authors: David Llorens, Juan Miguel Vilar

Abstract: The use of Hirschberg algorithm reduces the spatial cost of recovering the Longest Common Subsequence to linear space. The same technique can be applied to similar problems like Sequence Alignment. However, the price to pay is a duplication of temporal cost. We present here a technique to reduce this time overhead to a negligible amount.

Submitted 27 April, 2022; originally announced April 2022.

Comments: 15 pages. Submitted to Fundamenta Informaticae

ACM Class: F.2.2

arXiv:1902.04337 [pdf]

Winning the Big Data Technologies Horizon Prize: Fast and reliable forecasting of electricity grid traffic by identification of recurrent fluctuations

Authors: Jose M. G. Vilar

Abstract: This paper provides a description of the approach and methodology I used in winning the European Union Big Data Technologies Horizon Prize on data-driven prediction of electricity grid traffic. The methodology relies on identifying typical short-term recurrent fluctuations, which is subsequently refined through a regression-of-fluctuations approach. The key points and strategic considerations that led to selecting or discarding different methodological aspects are also discussed. The criteria include adaptability to changing conditions, reliability with outliers and missing data, robustness to noise, and efficiency in implementation.

Submitted 12 February, 2019; originally announced February 2019.

Comments: Approach and methodology used in winning the European Union Big Data Technologies Horizon Prize (https://ec.europa.eu/research/horizonprize/index.cfm?pg=prizes)

arXiv:1012.3607 [pdf]

doi 10.1016/j.bpj.2010.08.006

Accurate prediction of gene expression by integration of DNA sequence statistics with detailed modeling of transcription regulation

Authors: Jose M. G. Vilar

Abstract: Gene regulation involves a hierarchy of events that extend from specific protein-DNA interactions to the combinatorial assembly of nucleoprotein complexes. The effects of DNA sequence on these processes have typically been studied based either on its quantitative connection with single-domain binding free energies or on empirical rules that combine different DNA motifs to predict gene expression trends on a genomic scale. The middle-point approach that quantitatively bridges these two extremes, however, remains largely unexplored. Here, we provide an integrated approach to accurately predict gene expression from statistical sequence information in combination with detailed biophysical modeling of transcription regulation by multidomain binding on multiple DNA sites. For the regulation of the prototypical lac operon, this approach predicts within 0.3-fold accuracy transcriptional activity over a 10,000-fold range from DNA sequence statistics for different intracellular conditions.

Submitted 16 December, 2010; originally announced December 2010.

Comments: 15 pages, 5 figures

Journal ref: Biophys. J. 99, 2408-2413 (2010)

arXiv:1011.1212 [pdf]

doi 10.1093/bioinformatics/btq328

CplexA: a Mathematica package to study macromolecular-assembly control of gene expression

Authors: J. M. G. Vilar, L. Saiz

Abstract: Summary: Macromolecular assembly vertebrates essential cellular processes, such as gene regulation and signal transduction. A major challenge for conventional computational methods to study these processes is tackling the exponential increase of the number of configurational states with the number of components. CplexA is a Mathematica package that uses functional programming to efficiently compute probabilities and average properties over such exponentially large number of states from the energetics of the interactions. The package is particularly suited to study gene expression at complex promoters controlled by multiple, local and distal, DNA binding sites for transcription factors. Availability: CplexA is freely available together with documentation at http://sourceforge.net/projects/cplexa/.

Submitted 6 January, 2013; v1 submitted 4 November, 2010; originally announced November 2010.

Comments: 28 pages. Includes Mathematica, Matlab, and Python implementation tutorials. Software can be downloaded at http://cplexa.sourceforge.net/

Showing 1–6 of 6 results for author: Vilar, J M