Search | arXiv e-print repository

Text Augmentations with R-drop for Classification of Tweets Self Reporting Covid-19

Authors: Sumam Francis, Marie-Francine Moens

Abstract: This paper presents models created for the Social Media Mining for Health 2023 shared task. Our team addressed the first task, classifying tweets that self-report Covid-19 diagnosis. Our approach involves a classification model that incorporates diverse textual augmentations and utilizes R-drop to augment data and mitigate overfitting, boosting model efficacy. Our leading model, enhanced with R-drop and augmentations like synonym substitution, reserved words, and back translations, outperforms the task mean and median scores. Our system achieves an impressive F1 score of 0.877 on the test set.

Submitted 6 November, 2023; originally announced November 2023.

Comments: This paper has been peer-reviewed and accepted for presentation at SMM4H'23 at AMIA 2023 Annual Symposium
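
The R-drop regularizer named in the abstract above can be sketched as follows. This is a minimal illustration of the general R-drop objective (two dropout-perturbed forward passes on the same input, with a symmetric KL term pulling their predictions together), not the authors' implementation. The function names, the `alpha` weight, and the assumption that both passes already yield strictly positive class probabilities are hypothetical choices made for this example.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one example."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions (q assumed strictly positive)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def r_drop_loss(probs_pass1, probs_pass2, label, alpha=1.0):
    """Sketch of an R-drop style loss for one example.

    probs_pass1 / probs_pass2: class probabilities from two forward passes
    of the same input with different dropout masks (assumed given here).
    alpha: hypothetical weight on the consistency term.
    """
    # Supervised term: both passes must predict the true label.
    ce = cross_entropy(probs_pass1, label) + cross_entropy(probs_pass2, label)
    # Consistency term: symmetric KL between the two predictive distributions.
    sym_kl = 0.5 * (
        kl_divergence(probs_pass1, probs_pass2)
        + kl_divergence(probs_pass2, probs_pass1)
    )
    return ce + alpha * sym_kl
```

When the two passes agree exactly, the consistency term vanishes and only the two cross-entropy terms remain; the more the passes disagree, the larger the penalty.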

arXiv:2311.03113 [pdf, other]

Injecting Categorical Labels and Syntactic Information into Biomedical NER

Authors: Sumam Francis, Marie-Francine Moens

Abstract: We present a simple approach to improve biomedical named entity recognition (NER) by injecting categorical labels and Part-of-speech (POS) information into the model. We use two approaches, in the first approach, we first train a sequence-level classifier to classify the sentences into categories to obtain the sentence-level tags (categorical labels). The sequence classifier is modeled as an entailment problem by modifying the labels as a natural language template. This helps to improve the accuracy of the classifier. Further, this label information is injected into the NER model. In this paper, we demonstrate effective ways to represent and inject these labels and POS attributes into the NER model. In the second approach, we jointly learn the categorical labels and NER labels. Here we also inject the POS tags into the model to increase the syntactic context of the model. Experiments on three benchmark datasets show that incorporating categorical label information with syntactic context is quite useful and outperforms baseline BERT-based models.

Submitted 6 November, 2023; originally announced November 2023.

Comments: Proceedings of the 18th Conference on Computational Intelligence Methods for Bioinformatics & Biostatistics (CIBB 2023)

arXiv:2107.02260 [pdf, other]

doi 10.1109/JSEN.2021.3090790

Analysis of GRACE Follow-On Laser Ranging Interferometer derived inter-satellite pointing angles

Authors: Sujata Goswami, Samuel P. Francis, Tamara Bandikova, Robert E Spero

Abstract: Gravity Recovery and Climate Experiment Follow-On (GRACE-FO) was launched on May 22, 2018. It carries the Laser Ranging Interferometer (LRI) as a technology demonstrator that measures the inter-satellite range with nanometer precision using a laser-link between satellites. To maintain the laser-link between satellites, the LRI uses the beam steering method: a Fast Steering Mirror (FSM) is actuated to correct for misalignment between the incoming and outgoing laser beams. From the FSM commands, we can compute the inter-satellite pitch and yaw angles. These angles provide information about the spacecraft's relative orientation with respect to line-of-sight (LOS). We analyze LRI derived inter-satellite pointing angles for 2019 and 2020. Further, we present its comparison with the pointing angles derived from GRACE-FO SCA1B data, which represents the spacecraft attitude computed from star cameras and Inertial Measurement Unit (IMU) data using a Kalman filter. We discuss the correlations seen between the laser based attitude data and the spacecraft temperature variations. This analysis serves as the basis to explore the potential of this new attitude product obtained from the Differential Wavefront Sensing (DWS) control of a FSM.

Submitted 5 July, 2021; originally announced July 2021.

Comments: 14 pages, 11 figures, in IEEE Sensors Journal

arXiv:2105.07347 [pdf, other]

High resolution and sensitivity bi-directional x-ray phase contrast imaging using 2D Talbot array illuminators

Authors: Alex Gustschin, Mirko Riedel, Kirsten Taphorn, Christian Petrich, Wolfgang Gottwald, Wolfgang Noichl, Madleen Busse, Sheila E. Francis, Felix Beckmann, Jörg U. Hammel, Julian Moosmann, Pierre Thibault, Julia Herzen

Abstract: Two-dimensional Talbot array illuminators (TAIs) were designed, fabricated, and evaluated for high-resolution high-contrast x-ray phase imaging of soft tissue at 10-20keV. The TAIs create intensity modulations with a high compression ratio on the micrometer scale at short propagation distances. Their performance was compared with various other wavefront markers in terms of period, visibility, flux efficiency and flexibility to be adapted for limited beam coherence and detector resolution. Differential x-ray phase contrast and dark-field imaging were demonstrated with a one-dimensional, linear phase stepping approach yielding two-dimensional phase sensitivity using Unified Modulated Pattern Analysis (UMPA) for phase retrieval. The method was employed for x-ray phase computed tomography reaching a resolution of 3$μ$m on an unstained murine artery. It opens new possibilities for three-dimensional, non-destructive, and quantitative imaging of soft matter such as virtual histology. The phase modulators can also be used for various other x-ray applications such as dynamic phase imaging, super-resolution structured illumination microscopy, or wavefront sensing.

Submitted 16 May, 2021; originally announced May 2021.

Comments: 17 pages, 10 figures

arXiv:2104.10322 [pdf, other]

Gradient Masked Federated Optimization

Authors: Irene Tenison, Sreya Francis, Irina Rish

Abstract: Federated Averaging (FedAVG) has become the most popular federated learning algorithm due to its simplicity and low communication overhead. We use simple examples to show that FedAVG has the tendency to sew together the optima across the participating clients. These sewed optima exhibit poor generalization when used on a new client with new data distribution. Inspired by the invariance principles in (Arjovsky et al., 2019; Parascandolo et al., 2020), we focus on learning a model that is locally optimal across the different clients simultaneously. We propose a modification to FedAVG algorithm to include masked gradients (AND-mask from (Parascandolo et al., 2020)) across the clients and uses them to carry out an additional server model update. We show that this algorithm achieves better accuracy (out-of-distribution) than FedAVG, especially when the data is non-identically distributed across clients.

Submitted 20 April, 2021; originally announced April 2021.

ACM Class: I.2.0

Journal ref: ICLR 2021 Distributed and Private Machine Learning(DPML) Workshop
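
The masked-gradient idea mentioned in the abstract above can be illustrated with a small sketch. It shows one common reading of the AND-mask: a gradient component is kept only when its sign agrees across clients, and is zeroed otherwise. The function name, the `agreement_threshold` parameter, and the flat-list gradient layout are assumptions made for this example; how the paper folds the masked gradient into the additional server update is not reproduced here.

```python
def and_mask_gradient(client_grads, agreement_threshold=1.0):
    """Average per-client gradients, keeping only sign-consistent components.

    client_grads: list of equal-length gradient vectors, one per client
                  (hypothetical flat layout for illustration).
    agreement_threshold: fraction in [0, 1]; 1.0 demands that every client
                         agrees on the sign of a component.
    """
    n_clients = len(client_grads)
    n_params = len(client_grads[0])
    masked = []
    for j in range(n_params):
        column = [grad[j] for grad in client_grads]
        # Sum of signs (+1, 0, -1); its magnitude measures sign agreement.
        sign_sum = sum((x > 0) - (x < 0) for x in column)
        agreement = abs(sign_sum) / n_clients
        mean_grad = sum(column) / n_clients
        # Keep the averaged component only if enough clients agree on its sign.
        masked.append(mean_grad if agreement >= agreement_threshold else 0.0)
    return masked
```

For instance, with two clients whose gradients are `[1.0, -2.0, 0.5]` and `[3.0, 2.0, 0.5]`, the second component has conflicting signs and is masked out, while the first and third are averaged as usual.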

arXiv:2104.06557 [pdf, other]

Towards Causal Federated Learning For Enhanced Robustness and Privacy

Authors: Sreya Francis, Irene Tenison, Irina Rish

Abstract: Federated Learning is an emerging privacy-preserving distributed machine learning approach to building a shared model by performing distributed training locally on participating devices (clients) and aggregating the local models into a global one. As this approach prevents data collection and aggregation, it helps in reducing associated privacy risks to a great extent. However, the data samples across all participating clients are usually not independent and identically distributed (non-iid), and Out of Distribution (OOD) generalization for the learned models can be poor. Besides this challenge, federated learning also remains vulnerable to various attacks on security wherein a few malicious participating entities work towards inserting backdoors, degrading the generated aggregated model as well as inferring the data owned by participating entities. In this paper, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup and analyze empirically how it enhances the Out of Distribution (OOD) accuracy as well as the privacy of the final learned model.

Submitted 13 April, 2021; originally announced April 2021.

ACM Class: I.2.0

Journal ref: ICLR 2021 Distributed and Private Machine Learning(DPML) Workshop

arXiv:2011.05652 [pdf]

A simple synthesis method for growing single crystals of a copper coordination polymer [Cu(C2O4)(4-aminopyridine)2(H2O)]n, and its theoretical and physical properties studies

Authors: George Mathew, Sebastian Francis, Neeraj K. Rajak, Praveen S. G., C. V. Tomy, Deepshikha Jaiswal-Nagar

Abstract: This work reports on a novel and simple synthetic route for the growth of metal-organic crystal [Cu(C2O4)(4-aminopyridine)2(H2O)]n of large size using the technique of liquid-liquid diffusion or layer diffusion. Single crystal X-ray diffraction measurements revealed a very good quality of the grown single crystals with a small value 1.101 of goodness of fit R. Rietveld refinement done on powder X-ray diffractogram obtained on few single crystals crushed together revealed a very small value of R as 3.45, indicating very good crystal quality in a batch of crystals. Density functional theory with three different basis sets generated the optimized geometry of a monomeric unit as well as its vibrational spectra. Comparison between experimentally obtained bond lengths, bond angles, IR frequencies etc. suggest (B3LYP/LanL2DZ, B3LYP/6-311++ G(d,p) basis set to describe the properties the best. Magnetic susceptibility measurements confirm the metal-organic crystal [Cu(C2O4)(4-aminopyridine)2(H2O)]n to be a very good representation of a spin 1/2 Heisenberg antiferromagnet.

Submitted 11 November, 2020; originally announced November 2020.

Comments: Accepted for publication in Crystal Research And Technology

arXiv:2004.13470 [pdf]

doi 10.1007/978-3-030-34110-7_44

FU-net: Multi-class Image Segmentation Using Feedback Weighted U-net

Authors: Mina Jafari, Ruizhe Li, Yue Xing, Dorothee Auer, Susan Francis, Jonathan Garibaldi, Xin Chen

Abstract: In this paper, we present a generic deep convolutional neural network (DCNN) for multi-class image segmentation. It is based on a well-established supervised end-to-end DCNN model, known as U-net. U-net is firstly modified by adding widely used batch normalization and residual block (named as BRU-net) to improve the efficiency of model training. Based on BRU-net, we further introduce a dynamically weighted cross-entropy loss function. The weighting scheme is calculated based on the pixel-wise prediction accuracy during the training process. Assigning higher weights to pixels with lower segmentation accuracies enables the network to learn more from poorly predicted image regions. Our method is named as feedback weighted U-net (FU-net). We have evaluated our method based on T1- weighted brain MRI for the segmentation of midbrain and substantia nigra, where the number of pixels in each class is extremely unbalanced to each other. Based on the dice coefficient measurement, our proposed FU-net has outperformed BRU-net and U-net with statistical significance, especially when only a small number of training examples are available. The code is publicly available in GitHub (GitHub link: https://github.com/MinaJf/FU-net).

Submitted 28 April, 2020; originally announced April 2020.

Comments: Accepted for publication at International Conference on Image and Graphics (ICIG 2019)

Journal ref: The 10th International Conference on Image and Graphics (ICIG 2019)
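
The dynamically weighted cross-entropy described in the abstract above can be sketched in a few lines. The abstract states only that pixels with lower prediction accuracy receive higher weights; the specific weight used here (`1 - p_true`, clipped below by a hypothetical `floor`) is an assumption chosen for illustration and is not taken from the paper or its repository.

```python
import math


def feedback_weighted_cross_entropy(probs, labels, floor=0.1):
    """Cross-entropy in which poorly predicted pixels weigh more.

    probs:  per-pixel class probabilities, e.g. [[0.9, 0.1], [0.4, 0.6]]
    labels: true class index for each pixel
    floor:  hypothetical minimum weight so well-predicted pixels still count
    """
    total = 0.0
    for pixel_probs, label in zip(probs, labels):
        p_true = pixel_probs[label]
        # Assumed weighting rule: the less probability on the true class,
        # the larger the weight on that pixel's loss term.
        weight = max(floor, 1.0 - p_true)
        total += -weight * math.log(p_true)
    return total / len(labels)
```

In a training loop the weights would normally be treated as constants recomputed from the current predictions, so that they steer the loss without receiving gradients themselves.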

arXiv:2004.13453 [pdf]

DRU-net: An Efficient Deep Convolutional Neural Network for Medical Image Segmentation

Authors: Mina Jafari, Dorothee Auer, Susan Francis, Jonathan Garibaldi, Xin Chen

Abstract: Residual network (ResNet) and densely connected network (DenseNet) have significantly improved the training efficiency and performance of deep convolutional neural networks (DCNNs) mainly for object classification tasks. In this paper, we propose an efficient network architecture by considering advantages of both networks. The proposed method is integrated into an encoder-decoder DCNN model for medical image segmentation. Our method adds additional skip connections compared to ResNet but uses significantly fewer model parameters than DenseNet. We evaluate the proposed method on a public dataset (ISIC 2018 grand-challenge) for skin lesion segmentation and a local brain MRI dataset. In comparison with ResNet-based, DenseNet-based and attention network (AttnNet) based methods within the same encoder-decoder network structure, our method achieves significantly higher segmentation accuracy with fewer number of model parameters than DenseNet and AttnNet. The code is available on GitHub (GitHub link: https://github.com/MinaJf/DRU-net).

Submitted 28 April, 2020; originally announced April 2020.

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2020, 5 pages, 3 figures

Journal ref: 2020 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2020)

arXiv:2004.09443 [pdf, other]

A Spatially Constrained Deep Convolutional Neural Network for Nerve Fiber Segmentation in Corneal Confocal Microscopic Images using Inaccurate Annotations

Authors: Ning Zhang, Susan Francis, Rayaz Malik, Xin Chen

Abstract: Semantic image segmentation is one of the most important tasks in medical image analysis. Most state-of-the-art deep learning methods require a large number of accurately annotated examples for model training. However, accurate annotation is difficult to obtain especially in medical applications. In this paper, we propose a spatially constrained deep convolutional neural network (DCNN) to achieve smooth and robust image segmentation using inaccurately annotated labels for training. In our proposed method, image segmentation is formulated as a graph optimization problem that is solved by a DCNN model learning process. The cost function to be optimized consists of a unary term that is calculated by cross entropy measurement and a pairwise term that is based on enforcing a local label consistency. The proposed method has been evaluated based on corneal confocal microscopic (CCM) images for nerve fiber segmentation, where accurate annotations are extremely difficult to be obtained. Based on both the quantitative result of a synthetic dataset and qualitative assessment of a real dataset, the proposed method has achieved superior performance in producing high quality segmentation results even with inaccurate labels for training.

Submitted 20 April, 2020; originally announced April 2020.

Comments: 4 pages, accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2020

Journal ref: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI 2020)

arXiv:1907.06482 [pdf, other]

The Laser Interferometer Space Antenna: Unveiling the Millihertz Gravitational Wave Sky

Authors: John Baker, Jillian Bellovary, Peter L. Bender, Emanuele Berti, Robert Caldwell, Jordan Camp, John W. Conklin, Neil Cornish, Curt Cutler, Ryan DeRosa, Michael Eracleous, Elizabeth C. Ferrara, Samuel Francis, Martin Hewitson, Kelly Holley-Bockelmann, Ann Hornschemeier, Craig Hogan, Brittany Kamai, Bernard J. Kelly, Joey Shapiro Key, Shane L. Larson, Jeff Livas, Sridhar Manthripragada, Kirk McKenzie, Sean T. McWilliams, et al. (17 additional authors not shown)

Abstract: The first terrestrial gravitational wave interferometers have dramatically underscored the scientific value of observing the Universe through an entirely different window, and of folding this new channel of information with traditional astronomical data for a multimessenger view. The Laser Interferometer Space Antenna (LISA) will broaden the reach of gravitational wave astronomy by conducting the first survey of the millihertz gravitational wave sky, detecting tens of thousands of individual astrophysical sources ranging from white-dwarf binaries in our own galaxy to mergers of massive black holes at redshifts extending beyond the epoch of reionization. These observations will inform - and transform - our understanding of the end state of stellar evolution, massive black hole birth, and the co-evolution of galaxies and black holes through cosmic time. LISA also has the potential to detect gravitational wave emission from elusive astrophysical sources such as intermediate-mass black holes as well as exotic cosmological sources such as inflationary fields and cosmic string cusps.

Submitted 26 July, 2019; v1 submitted 15 July, 2019; originally announced July 2019.

Comments: White Paper submitted to Astro2020 (2020 Decadal Survey on Astronomy and Astrophysics). v2: fixed a reference

arXiv:1907.00104 [pdf, other]

doi 10.1103/PhysRevLett.123.031101

On orbit performance of the GRACE Follow-On Laser Ranging Interferometer

Authors: Klaus Abich, Claus Braxmaier, Martin Gohlke, Josep Sanjuan, Alexander Abramovici, Brian Bachman Okihiro, David C. Barr, Maxime P. Bize, Michael J. Burke, Ken C. Clark, Glenn de Vine, Jeffrey A. Dickson, Serge Dubovitsky, William M. Folkner, Samuel Francis, Martin S. Gilbert, Mark Katsumura, William Klipstein, Kameron Larsen, Carl Christian Liebe, Jehhal Liu, Kirk McKenzie, Phillip R. Morton, Alexander T. Murray, Don J. Nguyen, et al. (58 additional authors not shown)

Abstract: The Laser Ranging Interferometer (LRI) instrument on the Gravity Recovery and Climate Experiment (GRACE) Follow-On mission has provided the first laser interferometric range measurements between remote spacecraft, separated by approximately 220 km. Autonomous controls that lock the laser frequency to a cavity reference and establish the 5 degree of freedom two-way laser link between remote spacecraft succeeded on the first attempt. Active beam pointing based on differential wavefront sensing compensates spacecraft attitude fluctuations. The LRI has operated continuously without breaks in phase tracking for more than 50 days, and has shown biased range measurements similar to the primary ranging instrument based on microwaves, but with much less noise at a level of $1\,{\rm nm}/\sqrt{\rm Hz}$ at Fourier frequencies above 100 mHz.

Submitted 28 June, 2019; originally announced July 2019.

Journal ref: Phys. Rev. Lett. 123 031101 19 July 2019

arXiv:1501.05763 [pdf, ps, other]

Mixed Effect Modelling of Single Trial Variability in Ultra-High Field fMRI

Authors: Christopher J. Brignell, William J. Browne, Ian L. Dryden, Susan T. Francis

Abstract: Neuronal brain activity in response to repeated stimuli can be perceived using functional magnetic resonance imaging (fMRI). In this paper, we develop a statistical model for fMRI data that estimates both the associated haemodynamic response function and the within and between trial variability through maximum likelihood estimation. We discuss our results in the context of other model-driven approaches, extending models already popular in the literature, while removing the need for some of their assumptions. We consider an application to the motor cortex activity caused by a subject pressing a button and observe that the response changes significantly with task and through time.

Submitted 23 January, 2015; originally announced January 2015.

arXiv:1412.6413 [pdf]

doi 10.14445/22312803/IJCTT-V17P112

Towards a Consistent, Sound and Complete Conceptual Knowledge

Authors: Gowri Shankar Ramaswamy, F Sagayaraj Francis

Abstract: Knowledge is only good if it is sound, consistent and complete. The same holds true for conceptual knowledge, which holds knowledge about concepts and its association. Conceptual knowledge no matter what format they are represented in, must be consistent, sound and complete in order to realise its practical use. This paper discusses consistency, soundness and completeness in the ambit of conceptual knowledge and the need to consider these factors as fundamental to the development of conceptual knowledge.

Submitted 24 November, 2014; originally announced December 2014.

Journal ref: International Journal of Computer Trends and Technology (IJCTT) V17(2):61-63, Nov 2014

arXiv:1412.3367 [pdf]

Experimenting with Request Assignment Simulator (RAS)

Authors: R. Arokia Paul Rajan, F. Sagayaraj Francis

Abstract: There is no existence of dedicated simulators on the Internet that studies the impact of load balancing principles of the cloud architectures. Request Assignment Simulator (RAS) is a customizable, visual tool that helps to understand the request assignment to the resources based on the load balancing principles. We have designed this simulator to fit into Infrastructure as a Service (IaaS) cloud model. In this paper, we present a working manual useful for the conduct of experiment with RAS. The objective of this paper is to instill the user to understand the pertinent parameters in the cloud, their metrics, load balancing principles, and their impact on the performance.

Submitted 10 December, 2014; originally announced December 2014.

Comments: November 2014, IJCSE

arXiv:1308.1471 [pdf]

Application of Inventory Management Principles for Efficient Data Placement in Storage Networks

Authors: R. Arokia Paul Rajan, F. Sagayaraj Francis

Abstract: The principles and strategies found in material management are comparable and analogue with the data management. This paper concentrates on the conversion of product inventory management principles into data inventory management principles. Efforts were made to enumerate various impacting parameters that would be appropriate to consider if any data inventory model could be plotted.

Submitted 7 August, 2013; originally announced August 2013.

Comments: IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 6, No 2, November 2012

arXiv:1208.4016 [pdf]

Concept driven framework for Latent Table Discovery

Authors: Gowri Shankar Ramaswamy, F Sagayaraj Francis

Abstract: Database systems have to cater to the growing demands of the information age. The growth of the new age information retrieval powerhouses like search engines has thrown a challenge to the data management community to come up with novel mechanisms for feeding information to end users. The burgeoning use of natural language query interfaces compels system designers to present meaningful and customised information. Conventional query languages like SQL do not cater to these requirements due to syntax oriented design. Providing a semantic cover over these systems was the aim of latent table discovery focusing on semantically connecting unrelated tables that were not syntactically related by design and document the discovered knowledge. This paper throws a new direction towards improving the semantic capabilities of database systems by introducing a concept driven framework over the latent table discovery method.

Submitted 20 August, 2012; originally announced August 2012.

Journal ref: JOURNAL OF COMPUTING, VOLUME 4, ISSUE 7, JULY 2012, ISSN (Online) 2151-9617

arXiv:1104.1311 [pdf]

Latent table discovery by semantic relationship extraction between unrelated sets of entity sets of structured data sources

Authors: Gowri Shankar Ramaswamy, F Sagayaraj Francis

Abstract: Querying is one of the basic functionality expected from a database system. Query efficiency is adversely affected by increase in the number of participating tables. Also, querying based on syntax largely limits the gamut of queries a database system can process. Syntactic queries rely on the database table structure, which is a cause of concern for large organisations due to incompatibility between heterogeneous systems that store data distributed across geographic locations. Solution to these problems is answered to some extent by moving towards semantic technology by making data and the database meaningful. In doing so, relationship between sets of entity sets will not be limited only to syntactic constraints but would also permit semantic connections nonetheless such relationships may be tacit, intangible and invisible. The goal of this work is to extract such hidden relationships between unrelated sets of entity sets and store them in a tangible form. A few sample cases are provided to vindicate that the proposed work improves querying significantly.

Submitted 7 April, 2011; originally announced April 2011.

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 2, March 2011

Showing 1–18 of 18 results for author: Francis, S