Search | arXiv e-print repository

arXiv:2406.19650 [pdf, other]

DECOR: Improving Coherence in L2 English Writing with a Novel Benchmark for Incoherence Detection, Reasoning, and Rewriting

Authors: Xuanming Zhang, Anthony Diaz, Zixun Chen, Qingyang Wu, Kun Qian, Erik Voss, Zhou Yu

Abstract: Coherence in writing, an aspect that second-language (L2) English learners often struggle with, is crucial in assessing L2 English writing. Existing automated writing evaluation systems primarily use basic surface linguistic features to detect coherence in writing. However, little effort has been made to correct the detected incoherence, which could significantly benefit L2 language learners seeki… ▽ More Coherence in writing, an aspect that second-language (L2) English learners often struggle with, is crucial in assessing L2 English writing. Existing automated writing evaluation systems primarily use basic surface linguistic features to detect coherence in writing. However, little effort has been made to correct the detected incoherence, which could significantly benefit L2 language learners seeking to improve their writing. To bridge this gap, we introduce DECOR, a novel benchmark that includes expert annotations for detecting incoherence in L2 English writing, identifying the underlying reasons, and rewriting the incoherent sentences. To our knowledge, DECOR is the first coherence assessment dataset specifically designed for improving L2 English writing, featuring pairs of original incoherent sentences alongside their expert-rewritten counterparts. Additionally, we fine-tuned models to automatically detect and rewrite incoherence in student essays. We find that incorporating specific reasons for incoherence during fine-tuning consistently improves the quality of the rewrites, achieving a result that is favored in both automatic and human evaluations. △ Less

Submitted 28 June, 2024; originally announced June 2024.

Comments: 21 pages, 5 figures, 20 tables

arXiv:2406.02062 [pdf, other]

Towards Railways Remote Driving: Analysis of Video Streaming Latency and Adaptive Rate Control

Authors: Daniel Mejias, Zaloa Fernandez, Roberto Viola, Ander Aramburu, Igor Lopez, Andoni Diaz

Abstract: Remote driving aims to improve transport systems by promoting efficiency, sustainability, and accessibility. In the railway sector, remote driving makes it possible to increase flexibility, as the driver no longer has to be in the cab. However, this brings several challenges, as it has to provide at least the same level of safety obtained when the driver is in the cab. To achieve it, wireless netw… ▽ More Remote driving aims to improve transport systems by promoting efficiency, sustainability, and accessibility. In the railway sector, remote driving makes it possible to increase flexibility, as the driver no longer has to be in the cab. However, this brings several challenges, as it has to provide at least the same level of safety obtained when the driver is in the cab. To achieve it, wireless networks and video streaming technologies gain importance as they should provide real-time track visualization and obstacle detection capabilities to the remote driver. Low latency camera capture, onboard media processing devices, and streaming protocols adapted for wireless links are the necessary enablers to be developed and integrated into the railway infrastructure. This paper compares video streaming protocols such as Real-Time Streaming Protocol (RTSP) and Web Real-Time Communication (WebRTC), as they are the main alternatives based on Real-time Transport Protocol (RTP) protocol to enable low latency. As latency is the main performance metric, this paper also provides a solution to calculate the End-to-End video streaming latency analytically. Finally, the paper proposes a rate control algorithm to adapt the video stream depending on the network capacity. The objective is to keep the latency as low as possible while avoiding any visual artifacts. The proposed solutions are tested in different setups and scenarios to prove their effectiveness before the planned field testing. △ Less

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2311.14654 [pdf, other]

JetLOV: Enhancing Jet Tree Tagging through Neural Network Learning of Optimal LundNet Variables

Authors: Mauricio A. Diaz, Giorgio Cerro, Jacan Chaplais, Srinandan Dasmahapatra, Stefano Moretti

Abstract: Machine learning has played a pivotal role in advancing physics, with deep learning notably contributing to solving complex classification problems such as jet tagging in the field of jet physics. In this experiment, we aim to harness the full potential of neural networks while acknowledging that, at times, we may lose sight of the underlying physics governing these models. Nevertheless, we demons… ▽ More Machine learning has played a pivotal role in advancing physics, with deep learning notably contributing to solving complex classification problems such as jet tagging in the field of jet physics. In this experiment, we aim to harness the full potential of neural networks while acknowledging that, at times, we may lose sight of the underlying physics governing these models. Nevertheless, we demonstrate that we can achieve remarkable results obscuring physics knowledge and relying completely on the model's outcome. We introduce JetLOV, a composite comprising two models: a straightforward multilayer perceptron (MLP) and the well-established LundNet. Our study reveals that we can attain comparable jet tagging performance without relying on the pre-computed LundNet variables. Instead, we allow the network to autonomously learn an entirely new set of variables, devoid of a priori knowledge of the underlying physics. These findings hold promise, particularly in addressing the issue of model dependence, which can be mitigated through generalization and training on diverse data sets. △ Less

Submitted 24 November, 2023; originally announced November 2023.

Comments: Accepted at the NeurIPS 2023 workshop: Machine Learning and the Physical Sciences

arXiv:2307.04427 [pdf, other]

doi 10.1126/science.adc9818

Observation of high-energy neutrinos from the Galactic plane

Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., S. W. Barwick, V. Basu, S. Baur, R. Bay, J. J. Beatty, K. -H. Becker, J. Becker Tjus , et al. (364 additional authors not shown)

Abstract: The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrin… ▽ More The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrino emission using machine learning techniques applied to ten years of data from the IceCube Neutrino Observatory. We identify neutrino emission from the Galactic plane at the 4.5$σ$ level of significance, by comparing diffuse emission models to a background-only hypothesis. The signal is consistent with modeled diffuse emission from the Galactic plane, but could also arise from a population of unresolved point sources. △ Less

Submitted 10 July, 2023; originally announced July 2023.

Comments: Submitted on May 12th, 2022; Accepted on May 4th, 2023

Journal ref: Science 380, 6652, 1338-1343 (2023)

arXiv:2303.14229 [pdf, ps, other]

Sharp threshold for embedding balanced spanning trees in random geometric graphs

Authors: Alberto Espuny Díaz, Lyuben Lichev, Dieter Mitsche, Alexandra Wesolek

Abstract: A rooted tree is balanced if the degree of a vertex depends only on its distance to the root. In this paper we determine the sharp threshold for the appearance of a large family of balanced spanning trees in the random geometric graph $\mathcal{G}(n,r,d)$. In particular, we find the sharp threshold for balanced binary trees. More generally, we show that all sequences of balanced trees with uniform… ▽ More A rooted tree is balanced if the degree of a vertex depends only on its distance to the root. In this paper we determine the sharp threshold for the appearance of a large family of balanced spanning trees in the random geometric graph $\mathcal{G}(n,r,d)$. In particular, we find the sharp threshold for balanced binary trees. More generally, we show that all sequences of balanced trees with uniformly bounded degrees and height tending to infinity appear above a sharp threshold, and none of these appears below the same value. Our results hold more generally for geometric graphs satisfying a mild condition on the distribution of their vertex set, and we provide a polynomial time algorithm to find such trees. △ Less

Submitted 24 March, 2023; originally announced March 2023.

arXiv:2302.06018 [pdf, other]

Optimizing Floors in First Price Auctions: an Empirical Study of Yahoo Advertising

Authors: Miguel Alcobendas, Jonathan Ji, Hemakumar Gokulakannan, Dawit Wami, Boris Kapchits, Emilien Pouradier Duteil, Korby Satow, Maria Rosario Levy Roman, Oriol Diaz, Amado A. Diaz Jr., Rabi Kavoori

Abstract: Floors (also known as reserve prices) help publishers to increase the expected revenue of their ad space, which is usually sold via auctions. Floors are defined as the minimum bid that a seller (it can be a publisher or an ad exchange) is willing to accept for the inventory opportunity. In this paper, we present a model to set floors in first price auctions, and discuss the impact of its implement… ▽ More Floors (also known as reserve prices) help publishers to increase the expected revenue of their ad space, which is usually sold via auctions. Floors are defined as the minimum bid that a seller (it can be a publisher or an ad exchange) is willing to accept for the inventory opportunity. In this paper, we present a model to set floors in first price auctions, and discuss the impact of its implementation on Yahoo sites. The model captures important characteristics of the online advertising industry. For instance, some bidders impose restrictions on how ad exchanges can handle data from bidders, conditioning the model choice to set reserve prices. Our solution induces bidders to change their bidding behavior as a response to the floors enclosed in the bid request, hel** online publishers to increase their ad revenue. The outlined methodology has been implemented at Yahoo with remarkable results. The annualized incremental revenue is estimated at +1.3% on Yahoo display inventory, and +2.5% on video ad inventory. These are non-negligible numbers in the multi-million Yahoo ad business. △ Less

Submitted 9 February, 2024; v1 submitted 12 February, 2023; originally announced February 2023.

arXiv:2210.10263 [pdf, other]

Time and Cost-Efficient Bathymetric Map** System using Sparse Point Cloud Generation and Automatic Object Detection

Authors: Andres Pulido, Ruoyao Qin, Antonio Diaz, Andrew Ortega, Peter Ifju, Jaejeong Shin

Abstract: Generating 3D point cloud (PC) data from noisy sonar measurements is a problem that has potential applications for bathymetry map**, artificial object inspection, map** of aquatic plants and fauna as well as underwater navigation and localization of vehicles such as submarines. Side-scan sonar sensors are available in inexpensive cost ranges, especially in fish-finders, where the transducers a… ▽ More Generating 3D point cloud (PC) data from noisy sonar measurements is a problem that has potential applications for bathymetry map**, artificial object inspection, map** of aquatic plants and fauna as well as underwater navigation and localization of vehicles such as submarines. Side-scan sonar sensors are available in inexpensive cost ranges, especially in fish-finders, where the transducers are usually mounted to the bottom of a boat and can approach shallower depths than the ones attached to an Uncrewed Underwater Vehicle (UUV) can. However, extracting 3D information from side-scan sonar imagery is a difficult task because of its low signal-to-noise ratio and missing angle and depth information in the imagery. Since most algorithms that generate a 3D point cloud from side-scan sonar imagery use Shape from Shading (SFS) techniques, extracting 3D information is especially difficult when the seafloor is smooth, is slowly changing in depth, or does not have identifiable objects that make acoustic shadows. This paper introduces an efficient algorithm that generates a sparse 3D point cloud from side-scan sonar images. This computation is done in a computationally efficient manner by leveraging the geometry of the first sonar return combined with known positions provided by GPS and down-scan sonar depth measurement at each data point. Additionally, this paper implements another algorithm that uses a Convolutional Neural Network (CNN) using transfer learning to perform object detection on side-scan sonar images collected in real life and generated with a simulation. The algorithm was tested on both real and synthetic images to show reasonably accurate anomaly detection and classification. △ Less

Submitted 18 October, 2022; originally announced October 2022.

Comments: Submitted to OCEANS 2022

arXiv:2209.07838 [pdf]

Notch Fracture predictions using the Phase Field method for Ti-6Al-4V produced by Selective Laser Melting after different post-processing conditions

Authors: A. Díaz, J. M. Alegre, I. I. Cuesta, E. Martínez-Pañeda, Z. Zhang

Abstract: Ti-6Al-4V is a titanium alloy with excellent properties for lightweight applications and its production through Additive Manufacturing processes is attractive for different industrial sectors. In this work, the influence of mechanical properties on the notch fracture resistance of Ti-6Al-4V produced by Selective Laser Melting is numerically investigated. Literature data is used to inform material… ▽ More Ti-6Al-4V is a titanium alloy with excellent properties for lightweight applications and its production through Additive Manufacturing processes is attractive for different industrial sectors. In this work, the influence of mechanical properties on the notch fracture resistance of Ti-6Al-4V produced by Selective Laser Melting is numerically investigated. Literature data is used to inform material behaviour. The as-built brittle behaviour is compared to the enhanced ductile response after heat treatment (HT) and hot isostatic pressing (HIP) post-processes. A Phase Field framework is adopted to capture damage nucleation and propagation from two different notch geometries and a discussion on the influence of fracture energy and the characteristic length is carried out. In addition, the influence of oxygen uptake is analysed by reproducing non-inert atmospheres during HT and HIP, showing that oxygen shifts fracture to brittle failures due to the formation of an alpha case layer, especially for the V-notch geometry. Results show that a pure elastic behaviour can be assumed for the as-built SLM condition, whereas elastic-plastic phenomena must be modelled for specimens subjected to heat treatment or hot isostatic pressing. The present brittle Phase Field framework coupled with an elastic-plastic constitutive analysis is demonstrated to be a robust prediction tool for notch fracture after different post-processing routes. △ Less

Submitted 16 September, 2022; originally announced September 2022.

arXiv:2209.03042 [pdf, other]

doi 10.1088/1748-0221/17/11/P11003

Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

Authors: R. Abbasi, M. Ackermann, J. Adams, N. Aggarwal, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, V. Basu, R. Bay, J. J. Beatty, K. -H. Becker , et al. (359 additional authors not shown)

Abstract: IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challen… ▽ More IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN offers a reduction of the FPR by over a factor 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double of the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low energy neutrinos in online searches for transient events. △ Less

Submitted 11 October, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

Comments: Prepared for submission to JINST

arXiv:2208.07462 [pdf, other]

Speeding up random walk mixing by starting from a uniform vertex

Authors: Alberto Espuny Díaz, Patrick Morris, Guillem Perarnau, Oriol Serra

Abstract: The theory of rapid mixing random walks plays a fundamental role in the study of modern randomised algorithms. Usually, the mixing time is measured with respect to the worst initial position. It is well known that the presence of bottlenecks in a graph hampers mixing and, in particular, starting inside a small bottleneck significantly slows down the diffusion of the walk in the first steps of the… ▽ More The theory of rapid mixing random walks plays a fundamental role in the study of modern randomised algorithms. Usually, the mixing time is measured with respect to the worst initial position. It is well known that the presence of bottlenecks in a graph hampers mixing and, in particular, starting inside a small bottleneck significantly slows down the diffusion of the walk in the first steps of the process. The average mixing time is defined to be the mixing time starting at a uniformly random vertex and hence is not sensitive to the slow diffusion caused by these bottlenecks. In this paper we provide a general framework to show logarithmic average mixing time for random walks on graphs with small bottlenecks. The framework is especially effective on certain families of random graphs with heterogeneous properties. We demonstrate its applicability on two random models for which the mixing time was known to be of order $(\log n)^2$, speeding up the mixing to order $\log n$. First, in the context of smoothed analysis on connected graphs, we show logarithmic average mixing time for randomly perturbed graphs of bounded degeneracy. A particular instance is the Newman-Watts small-world model. Second, we show logarithmic average mixing time for supercritically percolated expander graphs. When the host graph is complete, this application gives an alternative proof that the average mixing time of the giant component in the supercritical Erdős-Rényi graph is logarithmic. △ Less

Submitted 27 January, 2024; v1 submitted 15 August, 2022; originally announced August 2022.

Comments: To appear in Electronic Journal of Probability

arXiv:2205.09005 [pdf, other]

Confidential Machine Learning within Graphcore IPUs

Authors: Kapil Vaswani, Stavros Volos, Cédric Fournet, Antonio Nino Diaz, Ken Gordon, Balaji Vembu, Sam Webster, David Chisnall, Saurabh Kulkarni, Graham Cunningham, Richard Osborne, Dan Wilkinson

Abstract: We present IPU Trusted Extensions (ITX), a set of experimental hardware extensions that enable trusted execution environments in Graphcore's AI accelerators. ITX enables the execution of AI workloads with strong confidentiality and integrity guarantees at low performance overheads. ITX isolates workloads from untrusted hosts, and ensures their data and models remain encrypted at all times except… ▽ More We present IPU Trusted Extensions (ITX), a set of experimental hardware extensions that enable trusted execution environments in Graphcore's AI accelerators. ITX enables the execution of AI workloads with strong confidentiality and integrity guarantees at low performance overheads. ITX isolates workloads from untrusted hosts, and ensures their data and models remain encrypted at all times except within the IPU. ITX includes a hardware root-of-trust that provides attestation capabilities and orchestrates trusted execution, and on-chip programmable cryptographic engines for authenticated encryption of code and data at PCIe bandwidth. We also present software for ITX in the form of compiler and runtime extensions that support multi-party training without requiring a CPU-based TEE. Experimental support for ITX is included in Graphcore's GC200 IPU taped out at TSMC's 7nm technology node. Its evaluation on a development board using standard DNN training workloads suggests that ITX adds less than 5% performance overhead, and delivers up to 17x better performance compared to CPU-based confidential computing systems relying on AMD SEV-SNP. △ Less

Submitted 20 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

arXiv:2106.02213 [pdf, other]

Analysis of the robustness of NMF algorithms

Authors: Alex Díaz, Damian Steele

Abstract: We examine three non-negative matrix factorization techniques; L2-norm, L1-norm, and L2,1-norm. Our aim is to establish the performance of these different approaches, and their robustness in real-world applications such as feature selection while managing computational complexity, sensitivity to noise and more. We thoroughly examine each approach from a theoretical perspective, and examine the per… ▽ More We examine three non-negative matrix factorization techniques; L2-norm, L1-norm, and L2,1-norm. Our aim is to establish the performance of these different approaches, and their robustness in real-world applications such as feature selection while managing computational complexity, sensitivity to noise and more. We thoroughly examine each approach from a theoretical perspective, and examine the performance of each using a series of experiments drawing on both the ORL and YaleB datasets. We examine the Relative Reconstruction Errors (RRE), Average Accuracy and Normalized Mutual Information (NMI) as criteria under a range of simulated noise scenarios. △ Less

Submitted 3 June, 2021; originally announced June 2021.

Comments: 16 pages

arXiv:2106.00274 [pdf, other]

Analysis of classifiers robust to noisy labels

Authors: Alex Díaz, Damian Steele

Abstract: We explore contemporary robust classification algorithms for overcoming class-dependant labelling noise: Forward, Importance Re-weighting and T-revision. The classifiers are trained and evaluated on class-conditional random label noise data while the final test data is clean. We demonstrate methods for estimating the transition matrix in order to obtain better classifier performance when working w… ▽ More We explore contemporary robust classification algorithms for overcoming class-dependant labelling noise: Forward, Importance Re-weighting and T-revision. The classifiers are trained and evaluated on class-conditional random label noise data while the final test data is clean. We demonstrate methods for estimating the transition matrix in order to obtain better classifier performance when working with noisy data. We apply deep learning to three data-sets and derive an end-to-end analysis with unknown noise on the CIFAR data-set from scratch. The effectiveness and robustness of the classifiers are analysed, and we compare and contrast the results of each experiment are using top-1 accuracy as our criterion. △ Less

Submitted 1 June, 2021; originally announced June 2021.

Comments: 12 pages

arXiv:2101.11589 [pdf, other]

doi 10.1088/1748-0221/16/07/P07041

A Convolutional Neural Network based Cascade Reconstruction for the IceCube Neutrino Observatory

Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, C. Alispach, A. A. Alves Jr., N. M. Amin, R. An, K. Andeen, T. Anderson, I. Ansseau, G. Anton, C. Argüelles, S. Axani, X. Bai, A. Balagopal V., A. Barbano, S. W. Barwick, B. Bastian, V. Basu, V. Baum, S. Baur, R. Bay , et al. (343 additional authors not shown)

Abstract: Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful an… ▽ More Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful and fast reconstruction methods are desired. Deep neural networks can be extremely powerful, and their usage is computationally inexpensive once the networks are trained. These characteristics make a deep learning-based approach an excellent candidate for the application in IceCube. A reconstruction method based on convolutional architectures and hexagonally shaped kernels is presented. The presented method is robust towards systematic uncertainties in the simulation and has been tested on experimental data. In comparison to standard reconstruction methods in IceCube, it can improve upon the reconstruction accuracy, while reducing the time necessary to run the reconstruction by two to three orders of magnitude. △ Less

Submitted 26 July, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

Comments: 39 pages, 15 figures, submitted to Journal of Instrumentation; added references

Journal ref: JINST 16 (2021) P07041

arXiv:2012.00902 [pdf]

Automatic Extraction of Ranked SNP-Phenotype Associations from Literature through Detecting Neural Candidates, Negation and Modality Markers

Authors: Behrouz Bokharaeian, Alberto Diaz

Abstract: Genome-wide association (GWA) constitutes a prominent portion of studies which have been conducted on personalized medicine and pharmacogenomics. Recently, very few methods have been developed for extracting mutation-diseases associations. However, there is no available method for extracting the association of SNP-phenotype from text which considers degree of confidence in associations. In this st… ▽ More Genome-wide association (GWA) constitutes a prominent portion of studies which have been conducted on personalized medicine and pharmacogenomics. Recently, very few methods have been developed for extracting mutation-diseases associations. However, there is no available method for extracting the association of SNP-phenotype from text which considers degree of confidence in associations. In this study, first a relation extraction method relying on linguistic-based negation detection and neutral candidates is proposed. The experiments show that negation cues and scope as well as detecting neutral candidates can be employed for implementing a superior relation extraction method which outperforms the kernel-based counterparts due to a uniform innate polarity of sentences and small number of complex sentences in the corpus. Moreover, a modality based approach is proposed to estimate the confidence level of the extracted association which can be used to assess the reliability of the reported association. Keywords: SNP, Phenotype, Biomedical Relation Extraction, Negation Detection. △ Less

Submitted 1 December, 2020; originally announced December 2020.

Comments: 8 pages, 4 figures, 3 tables

arXiv:2009.01657 [pdf, other]

A free web service for fast COVID-19 classification of chest X-Ray images

Authors: Jose David Bermudez Castro, Ricardo Rei, Jose E. Ruiz, Pedro Achanccaray Diaz, Smith Arauco Canchumuni, Cristian Muñoz Villalobos, Felipe Borges Coelho, Leonardo Forero Mendoza, Marco Aurelio C. Pacheco

Abstract: The coronavirus outbreak became a major concern for society worldwide. Technological innovation and ingenuity are essential to fight COVID-19 pandemic and bring us one step closer to overcome it. Researchers over the world are working actively to find available alternatives in different fields, such as the Healthcare System, pharmaceutic, health prevention, among others. With the rise of artificia… ▽ More The coronavirus outbreak became a major concern for society worldwide. Technological innovation and ingenuity are essential to fight COVID-19 pandemic and bring us one step closer to overcome it. Researchers over the world are working actively to find available alternatives in different fields, such as the Healthcare System, pharmaceutic, health prevention, among others. With the rise of artificial intelligence (AI) in the last 10 years, IA-based applications have become the prevalent solution in different areas because of its higher capability, being now adopted to help combat against COVID-19. This work provides a fast detection system of COVID-19 characteristics in X-Ray images based on deep learning (DL) techniques. This system is available as a free web deployed service for fast patient classification, alleviating the high demand for standards method for COVID-19 diagnosis. It is constituted of two deep learning models, one to differentiate between X-Ray and non-X-Ray images based on Mobile-Net architecture, and another one to identify chest X-Ray images with characteristics of COVID-19 based on the DenseNet architecture. For real-time inference, it is provided a pair of dedicated GPUs, which reduce the computational time. The whole system can filter out non-chest X-Ray images, and detect whether the X-Ray presents characteristics of COVID-19, highlighting the most sensitive regions. △ Less

Submitted 27 August, 2020; originally announced September 2020.

Comments: 14 pages, 12 figures

arXiv:2008.11872 [pdf, other]

doi 10.1016/j.compag.2020.105947

Towards Practical 2D Grapevine Bud Detection with Fully Convolutional Networks

Authors: Wenceslao Villegas Marset, Diego Sebastián Pérez, Carlos Ariel Díaz, Facundo Bromberg

Abstract: In Viticulture, visual inspection of the plant is a necessary task for measuring relevant variables. In many cases, these visual inspections are susceptible to automation through computer vision methods. Bud detection is one such visual task, central for the measurement of important variables such as: measurement of bud sunlight exposure, autonomous pruning, bud counting, type-of-bud classificatio… ▽ More In Viticulture, visual inspection of the plant is a necessary task for measuring relevant variables. In many cases, these visual inspections are susceptible to automation through computer vision methods. Bud detection is one such visual task, central for the measurement of important variables such as: measurement of bud sunlight exposure, autonomous pruning, bud counting, type-of-bud classification, bud geometric characterization, internode length, bud area, and bud development stage, among others. This paper presents a computer method for grapevine bud detection based on a Fully Convolutional Networks MobileNet architecture (FCN-MN). To validate its performance, this architecture was compared in the detection task with a strong method for bud detection, Scanning Windows (SW) based on a patch classifier, showing improvements over three aspects of detection: segmentation, correspondence identification and localization. The best version of FCN-MN showed a detection F1-measure of $88.6\%$ (for true positives defined as detected components whose intersection-over-union with the true bud is above $0.5$), and false positives that are small and near the true bud. Splits -- false positives overlap** the true bud -- showed a mean segmentation precision of $89.3\% (21.7)$, while false alarms -- false positives not overlap** the true bud -- showed a mean pixel area of only $8\%$ the area of a true bud, and a distance (between mass centers) of $1.1$ true bud diameters. The paper concludes by discussing how these results for FCN-MN would produce sufficiently accurate measurements of bud variables such as bud number, bud area, and internode length, suggesting a good performance in a practical setup. △ Less

Submitted 4 February, 2021; v1 submitted 26 August, 2020; originally announced August 2020.

Comments: Revised version submitted to Journal of Computers and Electronics in Agriculture November 13, 2020

Journal ref: Computers and Electronics in Agriculture, Volume 182, 2021,105947, ISSN 0168-1699

arXiv:1906.07298 [pdf]

Weighted delay-and-sum beamforming guided by visual tracking for human-robot interaction

Authors: José Novoa, Rodrigo Mahu, Alejandro Díaz, Jorge Wuth, Richard Stern, Nestor Becerra Yoma

Abstract: This paper describes the integration of weighted delay-and-sum beamforming with speech source localization using image processing and robot head visual servoing for source tracking. We take into consideration the fact that the directivity gain provided by the beamforming depends on the angular distance between its main lobe and the main response axis of the microphone array. A visual servoing sche… ▽ More This paper describes the integration of weighted delay-and-sum beamforming with speech source localization using image processing and robot head visual servoing for source tracking. We take into consideration the fact that the directivity gain provided by the beamforming depends on the angular distance between its main lobe and the main response axis of the microphone array. A visual servoing scheme is used to reduce the angular distance between the center of the video frame of a robot camera and a target object. Additionally, the beamforming strategy presented combines two information sources: the direction of the target object obtained with image processing and the audio signals provided by a microphone array. These sources of information were integrated by making use of a weighted delay-and-sum beamforming method. Experiments were carried out with a real mobile robotic testbed built with a PR2 robot. Static and dynamic robot head as well as the use of one and two external noise sources were considered. The results presented here show that the appropriate integration of visual source tracking with visual servoing and a beamforming method can lead to a reduction in WER as high as 34% compared to beamforming alone. △ Less

Submitted 17 June, 2019; originally announced June 2019.

arXiv:1903.08052 [pdf]

Trends on Computer Security: Cryptography, User Authentication, Denial of Service and Intrusion Detection

Authors: Pablo Daniel Marcillo Lara, Daniel Alejandro Maldonado-Ruiz, Santiago Daniel Arrais Díaz, Lorena Isabel Barona López, Ángel Leonardo Valdivieso Caraguay

Abstract: The new generation of security threats has been promoted by digital currencies and real-time applications, where all users develop new ways to communicate on the Internet. Security has evolved in the need of privacy and anonymity for all users and his portable devices. New technologies in every field prove that users need security features integrated into their communication applications, parallel… ▽ More The new generation of security threats has been promoted by digital currencies and real-time applications, where all users develop new ways to communicate on the Internet. Security has evolved in the need of privacy and anonymity for all users and his portable devices. New technologies in every field prove that users need security features integrated into their communication applications, parallel systems for mobile devices, internet, and identity management. This review presents the key concepts of the main areas in computer security and how it has evolved in the last years. This work focuses on cryptography, user authentication, denial of service attacks, intrusion detection and firewalls. △ Less

Submitted 19 March, 2019; originally announced March 2019.

Journal ref: Latin American Journal of Computing, 6(1), 2019, 39-49

arXiv:1811.11561 [pdf, other]

Approximate Evaluation of Label-Constrained Reachability Queries

Authors: Stefania Dumbrava, Angela Bonifati, Amaia Nazabal Ruiz Diaz, Romain Vuillemot

Abstract: The current surge of interest in graph-based data models mirrors the usage of increasingly complex reachability queries, as witnessed by recent analytical studies on real-world graph query logs. Despite the maturity of graph DBMS capabilities, complex label-constrained reachability queries, along with their corresponding aggregate versions, remain difficult to evaluate. In this paper, we focus on… ▽ More The current surge of interest in graph-based data models mirrors the usage of increasingly complex reachability queries, as witnessed by recent analytical studies on real-world graph query logs. Despite the maturity of graph DBMS capabilities, complex label-constrained reachability queries, along with their corresponding aggregate versions, remain difficult to evaluate. In this paper, we focus on the approximate evaluation of counting label-constrained reachability queries. We offer a human-explainable solution to graph Approximate Query Processing (AQP). This consists of a summarization algorithm (GRASP), as well as of a custom visualization plug-in, which allows users to explore the obtained summaries. We prove that the problem of node group minimization, associated to the creation of GRASP summaries, is NP-complete. Nonetheless, our GRASP summaries are reasonably small in practice, even for large graph instances, and guarantee approximate graph query answering, paired with controllable error estimates. We experimentally gauge the scalability and efficiency of our GRASP algorithm, and verify the accuracy and error estimation of the graph AQP module. To the best of our knowledge, ours is the first system capable of handling visualization-driven approximate graph analytics for complex label-constrained reachability queries. △ Less

Submitted 28 November, 2018; originally announced November 2018.

arXiv:1811.06078 [pdf, other]

Phishing in an Academic Community: A Study of User Susceptibility and Behavior

Authors: Alejandra Diaz, Alan T. Sherman, Anupam Joshi

Abstract: We present an observational study on the relationship between demographic factors and phishing susceptibility at the University of Maryland, Baltimore County (UMBC). In spring 2018, we delivered phishing attacks to 450 randomly-selected students on three different days (1,350 students total) to examine user click rates and demographics among UMBC's undergraduates. Participants were initially unawa… ▽ More We present an observational study on the relationship between demographic factors and phishing susceptibility at the University of Maryland, Baltimore County (UMBC). In spring 2018, we delivered phishing attacks to 450 randomly-selected students on three different days (1,350 students total) to examine user click rates and demographics among UMBC's undergraduates. Participants were initially unaware of the study. Experiment 1 claimed to bill students; Experiment 2 enticed users with monetary rewards; and Experiment 3 threatened users with account cancellation. We found correlations resulting in lowered susceptibility based on college affiliation, academic year progression, cyber training, involvement in cyber clubs or cyber scholarship programs, time spent on the computer, and age demographics. We found no significant correlation between gender and susceptibility. Contrary to our expectations, we observed greater user susceptibility with greater phishing knowledge and awareness. Students who identified themselves as understanding the definition of phishing had a higher susceptibility than did their peers who were merely aware of phishing attacks, with both groups having a higher susceptibility than those with no knowledge of phishing. Approximately 59% of subjects who opened the phishing email clicked on its phishing link, and approximately 70% of those subjects who additionally answered a demographic survey clicked. △ Less

Submitted 14 November, 2018; originally announced November 2018.

Comments: 7 pages, 5 figures, 3 tables, submitted to Cryptologia

arXiv:1809.10421 [pdf, ps, other]

Entropy versions of additive inequalities

Authors: Alberto Espuny Díaz, Oriol Serra

Abstract: The connection between inequalities in additive combinatorics and analogous versions in terms of the entropy of random variables has been extensively explored over the past few years. This paper extends a device introduced by Ruzsa in his seminal work introducing this correspondence. This extension provides a toolbox for establishing the equivalence between sumset inequalities and their entropic v… ▽ More The connection between inequalities in additive combinatorics and analogous versions in terms of the entropy of random variables has been extensively explored over the past few years. This paper extends a device introduced by Ruzsa in his seminal work introducing this correspondence. This extension provides a toolbox for establishing the equivalence between sumset inequalities and their entropic versions. It supplies simpler proofs of known results and opens a path for obtaining new ones. △ Less

Submitted 27 May, 2019; v1 submitted 27 September, 2018; originally announced September 2018.

Comments: The former version had an error the authors could not fix. This version keeps only that parts that did not depend on the incorrect statement

arXiv:1808.07269 [pdf, other]

doi 10.1103/PhysRevD.99.092001

A Deep Neural Network for Pixel-Level Electromagnetic Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber

Authors: MicroBooNE collaboration, C. Adams, M. Alrashed, R. An, J. Anthony, J. Asaadi, A. Ashkenazi, M. Auger, S. Balasubramanian, B. Baller, C. Barnes, G. Barr, M. Bass, F. Bay, A. Bhat, K. Bhattacharya, M. Bishai, A. Blake, T. Bolton, L. Camilleri, D. Caratelli, I. Caro Terrazas, R. Carr, R. Castillo Fernandez, F. Cavanna , et al. (148 additional authors not shown)

Abstract: We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction cha… ▽ More We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction chain for the MicroBooNE detector. We show the first demonstration of a network's validity on real LArTPC data using MicroBooNE collection plane images. The demonstration is performed for stop** muon and a $ν_μ$ charged current neutral pion data samples. △ Less

Submitted 22 August, 2018; originally announced August 2018.

Journal ref: Phys. Rev. D 99, 092001 (2019)

arXiv:1606.04589 [pdf, other]

Impossibility in Belief Merging

Authors: Amílcar Mata Díaz, Ramón Pino Pérez

Abstract: With the aim of studying social properties of belief merging and having a better understanding of impossibility, we extend in three ways the framework of logic-based merging introduced by Konieczny and Pino Pérez. First, at the level of representation of the information, we pass from belief bases to complex epistemic states. Second, the profiles are represented as functions of finite societies to… ▽ More With the aim of studying social properties of belief merging and having a better understanding of impossibility, we extend in three ways the framework of logic-based merging introduced by Konieczny and Pino Pérez. First, at the level of representation of the information, we pass from belief bases to complex epistemic states. Second, the profiles are represented as functions of finite societies to the set of epistemic states (a sort of vectors) and not as multisets of epistemic states. Third, we extend the set of rational postulates in order to consider the epistemic versions of the classical postulates of Social Choice Theory: Standard Domain, Pareto Property, Independence of Irrelevant Alternatives and Absence of Dictator. These epistemic versions of social postulates are given, essentially, in terms of the finite propositional logic. We state some representation theorems for these operators. These extensions and representation theorems allow us to establish an epistemic and very general version of Arrow's Impossibility Theorem. One of the interesting features of our result, is that it holds for different representations of epistemic states; for instance conditionals, Ordinal Conditional functions and, of course, total preorders. △ Less

Submitted 14 June, 2016; originally announced June 2016.

MSC Class: 68T27; 68T30

arXiv:1605.02775 [pdf, other]

Image Classification of Grapevine Buds using Scale-Invariant Features Transform, Bag of Features and Support Vector Machines

Authors: Diego Sebastián Pérez, Facundo Bromberg, Carlos Ariel Diaz

Abstract: In viticulture, there are several applications where bud detection in vineyard images is a necessary task, susceptible of being automated through the use of computer vision methods. A common and effective family of visual detection algorithms are the scanning-window type, that slide a (usually) fixed size window along the original image, classifying each resulting windowed-patch as containing or n… ▽ More In viticulture, there are several applications where bud detection in vineyard images is a necessary task, susceptible of being automated through the use of computer vision methods. A common and effective family of visual detection algorithms are the scanning-window type, that slide a (usually) fixed size window along the original image, classifying each resulting windowed-patch as containing or not containing the target object. The simplicity of these algorithms finds its most challenging aspect in the classification stage. Interested in grapevine buds detection in natural field conditions, this paper presents a classification method for images of grapevine buds ranging 100 to 1600 pixels in diameter, captured in outdoor, under natural field conditions, in winter (i.e., no grape bunches, very few leaves, and dormant buds), without artificial background, and with minimum equipment requirements. The proposed method uses well-known computer vision technologies: Scale-Invariant Feature Transform for calculating low-level features, Bag of Features for building an image descriptor, and Support Vector Machines for training a classifier. When evaluated over images containing buds of at least 100 pixels in diameter, the approach achieves a recall higher than 0.9 and a precision of 0.86 over all windowed-patches covering the whole bud and down to 60% of it, and scaled up to window patches containing a proportion of 20%-80% of bud versus background pixels. This robustness on the position and size of the window demonstrates its viability for use as the classification stage in a scanning-window detection algorithms. △ Less

Submitted 9 May, 2016; originally announced May 2016.

Comments: 21 pages, 7 figures, 2 tables

arXiv:1401.5054 [pdf]

Análisis e implementación de algoritmos evolutivos para la optimización de simulaciones en ingeniería civil. (draft)

Authors: José Alberto García Gutiérrez, Alejandro Mateo Hernández Díaz

Abstract: This paper studies the applicability of evolutionary algorithms, particularly, the evolution strategies family in order to estimate a degradation parameter in the shear design of reinforced concrete members. This problem represents a great computational task and is highly relevant in the framework of the structural engineering that for the first time is solved using genetic algorithms. You are v… ▽ More This paper studies the applicability of evolutionary algorithms, particularly, the evolution strategies family in order to estimate a degradation parameter in the shear design of reinforced concrete members. This problem represents a great computational task and is highly relevant in the framework of the structural engineering that for the first time is solved using genetic algorithms. You are viewing a draft, the authors appreciate corrections, comments and suggestions to this work. △ Less

Submitted 21 June, 2014; v1 submitted 20 January, 2014; originally announced January 2014.

Comments: In Spanish. The authors acknowledge corrections, comments and suggestions to this paper, thanks.

arXiv:0912.0898 [pdf]

Gouverner la standardisation par les changements d'arene. Le cas du XML

Authors: Pablo Andres Diaz, Francois-Xavier Dudouet, Jean-Christophe Graz, Benjamin Nguyen, Antoine Vion

Abstract: In this paper, we discuss the available approches of the new governance structures of standardization, in order to propose new hypothesis on the way computer sciences languages are dealt with. We consider the example of the XML language and its applications in order to propose a dynamic analysis of this governance, focusing on the coordination that is done by companies, and the strategic usage t… ▽ More In this paper, we discuss the available approches of the new governance structures of standardization, in order to propose new hypothesis on the way computer sciences languages are dealt with. We consider the example of the XML language and its applications in order to propose a dynamic analysis of this governance, focusing on the coordination that is done by companies, and the strategic usage they have of these arenas to further their goals. We advocate the development of more of such empirical analysis in order to cover all the perspectives of possible international policies in this area. △ Less

Submitted 4 December, 2009; originally announced December 2009.

Comments: Communication in the Economy of Politics and Politics of Economy session of the French Political Sciences Association Congress (AFSP), Grenoble, France 2009

ACM Class: K.1; K.4.2

Journal ref: Congres de l'Association Francaise de Sciences Politiques, Grenoble, France, 2009

Showing 1–27 of 27 results for author: Diaz, A