Showing 1–30 of 30 results for author: Dras, M

  1. arXiv:2406.19642  [pdf, other]

    cs.CL cs.CR cs.LG

    IDT: Dual-Task Adversarial Attacks for Privacy Protection

    Authors: Pedro Faustini, Shakila Mahjabin Tonni, Annabelle McIver, Qiongkai Xu, Mark Dras

    Abstract: Natural language processing (NLP) models may leak private information in different ways, including membership inference, reconstruction or attribute inference attacks. Sensitive information may not be explicit in the text, but hidden in underlying writing characteristics. Methods to protect privacy can involve using representations inside models that are demonstrated not to detect sensitive attrib…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 28 pages, 1 figure

  2. arXiv:2406.13569  [pdf, other]

    cs.LG cs.AI cs.CR cs.IT

    Bayes' capacity as a measure for reconstruction attacks in federated learning

    Authors: Sayan Biswas, Mark Dras, Pedro Faustini, Natasha Fernandes, Annabelle McIver, Catuscia Palamidessi, Parastoo Sadeghi

    Abstract: Within the machine learning community, reconstruction attacks are a principal attack of concern and have been identified even in federated learning, which was designed with privacy preservation in mind. In federated learning, it has been shown that an adversary with knowledge of the machine learning architecture is able to infer the exact value of a training element given an observation of the wei…

    Submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.00999  [pdf, other]

    cs.LG cs.CL cs.CR

    Seeing the Forest through the Trees: Data Leakage from Partial Transformer Gradients

    Authors: Weijun Li, Qiongkai Xu, Mark Dras

    Abstract: Recent studies have shown that distributed machine learning is vulnerable to gradient inversion attacks, where private training data can be reconstructed by analyzing the gradients of the models shared in training. Previous attacks established that such reconstructions are possible using gradients from all parameters in the entire models. However, we hypothesize that most of the involved modules,…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

    ACM Class: I.2.7; I.2.11

  4. arXiv:2402.19334  [pdf, other]

    cs.CL

    Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge

    Authors: Ansh Arora, Xuanli He, Maximilian Mozes, Srinibas Swain, Mark Dras, Qiongkai Xu

    Abstract: The democratization of pre-trained language models through open-source initiatives has rapidly advanced innovation and expanded access to cutting-edge technologies. However, this openness also brings significant security risks, including backdoor attacks, where hidden malicious behaviors are triggered by specific inputs, compromising natural language processing (NLP) system integrity and reliabili…

    Submitted 3 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: accepted to ACL2024 (Findings)

  5. arXiv:2309.10916  [pdf, other]

    cs.LG cs.CL

    What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples

    Authors: Shakila Mahjabin Tonni, Mark Dras

    Abstract: Adversarial examples, deliberately crafted using small perturbations to fool deep neural networks, were first studied in image processing and more recently in NLP. While approaches to detecting adversarial examples in NLP have largely relied on search over input perturbations, image processing has seen a range of techniques that aim to characterise adversarial subspaces over the learned representa…

    Submitted 10 October, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: 20 pages, Accepted in IJCNLP_AACL 2023

  6. arXiv:2306.12703  [pdf, other]

    cs.LG cs.AI

    OptIForest: Optimal Isolation Forest for Anomaly Detection

    Authors: Haolong Xiang, Xuyun Zhang, Hongsheng Hu, Lianyong Qi, Wanchun Dou, Mark Dras, Amin Beheshti, Xiaolong Xu

    Abstract: Anomaly detection plays an increasingly important role in various fields for critical tasks such as intrusion detection in cybersecurity, financial risk detection, and human health monitoring. A variety of anomaly detection methods have been proposed, and a category based on the isolation forest mechanism stands out due to its simplicity, effectiveness, and efficiency, e.g., iForest is often emplo…

    Submitted 23 June, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted by International Joint Conference on Artificial Intelligence (IJCAI-23)

  7. arXiv:2211.04686  [pdf, other]

    cs.LG cs.CR

    Directional Privacy for Deep Learning

    Authors: Pedro Faustini, Natasha Fernandes, Shakila Tonni, Annabelle McIver, Mark Dras

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. It applies isotropic Gaussian noise to gradients during training, which can perturb these gradients in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable for preserving u…

    Submitted 26 November, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  8. arXiv:2204.13853  [pdf, other]

    cs.CL cs.LG

    Detecting Textual Adversarial Examples Based on Distributional Characteristics of Data Representations

    Authors: Na Liu, Mark Dras, Wei Emma Zhang

    Abstract: Although deep neural networks have achieved state-of-the-art performance in various machine learning tasks, adversarial examples, constructed by adding small non-random perturbations to correctly classified inputs, successfully fool highly expressive deep classifiers into incorrect predictions. Approaches to adversarial attacks in natural language tasks have boomed in the last five years using cha…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: 13 pages, RepL4NLP 2022

  9. arXiv:2203.10093  [pdf, other]

    cs.LG cs.AI cs.NE q-bio.NC

    Deep reinforcement learning guided graph neural networks for brain network analysis

    Authors: Xusheng Zhao, Jia Wu, Hao Peng, Amin Beheshti, Jessica J. M. Monaghan, David McAlpine, Heivet Hernandez-Perez, Mark Dras, Qiong Dai, Yangyang Li, Philip S. Yu, Lifang He

    Abstract: Modern neuroimaging techniques, such as diffusion tensor imaging (DTI) and functional magnetic resonance imaging (fMRI), enable us to model the human brain as a brain network or connectome. Capturing brain networks' structural information and hierarchical patterns is essential for understanding brain functions and disease states. Recently, the promising network representation learning capability o…

    Submitted 24 July, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

  10. arXiv:2107.13077  [pdf, other]

    cs.CL

    Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation

    Authors: Yufei Wang, Can Xu, Huang Hu, Chongyang Tao, Stephen Wan, Mark Dras, Mark Johnson, Daxin Jiang

    Abstract: Sequence-to-Sequence (S2S) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks. However, the black-box nature of these models limits their application in tasks where specific rules (e.g., controllable constraints, prior knowledge) need to be executed. Previous works either design spec…

    Submitted 27 July, 2021; originally announced July 2021.

  11. Pick-Object-Attack: Type-Specific Adversarial Attack for Object Detection

    Authors: Omid Mohamad Nezami, Akshay Chaturvedi, Mark Dras, Utpal Garain

    Abstract: Many recent studies have shown that deep neural models are vulnerable to adversarial samples: images with imperceptible perturbations, for example, can fool image classifiers. In this paper, we present the first type-specific approach to generating adversarial examples for object detection, which entails detecting bounding boxes around multiple objects present in the image and classifying them at…

    Submitted 21 August, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

  12. arXiv:1912.10616  [pdf, other]

    cs.CL

    Siamese Networks for Large-Scale Author Identification

    Authors: Chakaveh Saedi, Mark Dras

    Abstract: Authorship attribution is the process of identifying the author of a text. Approaches to tackling it have been conventionally divided into classification-based ones, which work well for small numbers of candidate authors, and similarity-based methods, which are applicable for larger numbers of authors or for authors beyond the training set; these existing similarity-based methods have only embodie…

    Submitted 15 May, 2021; v1 submitted 22 December, 2019; originally announced December 2019.

    Comments: 28 pages. Accepted by Computer Speech and Language

  13. arXiv:1908.02943  [pdf, other]

    cs.CV cs.CL

    Towards Generating Stylized Image Captions via Adversarial Training

    Authors: Omid Mohamad Nezami, Mark Dras, Stephen Wan, Cecile Paris, Len Hamey

    Abstract: While most image captioning aims to generate objective descriptions of images, the last few years have seen work on generating visually grounded image captions which have a specific style (e.g., incorporating positive or negative sentiment). However, because the stylistic component is typically the last part of training, current models usually pay more attention to the style at the expense of accu…

    Submitted 8 August, 2019; originally announced August 2019.

  14. arXiv:1908.02923  [pdf, other]

    cs.CV cs.CL

    Image Captioning using Facial Expression and Attention

    Authors: Omid Mohamad Nezami, Mark Dras, Stephen Wan, Cecile Paris

    Abstract: Benefiting from advances in machine vision and natural language processing techniques, current image captioning systems are able to generate detailed visual descriptions. For the most part, these descriptions represent an objective characterisation of the image, although some models do incorporate subjective aspects related to the observer's view of the image, such as sentiment; current models, ho…

    Submitted 14 April, 2020; v1 submitted 8 August, 2019; originally announced August 2019.

  15. arXiv:1811.10256  [pdf, other]

    cs.CR cs.LG

    Generalised Differential Privacy for Text Document Processing

    Authors: Natasha Fernandes, Mark Dras, Annabelle McIver

    Abstract: We address the problem of how to "obfuscate" texts by removing stylistic clues which can identify authorship, whilst preserving (as much as possible) the content of the text. In this paper we combine ideas from "generalised differential privacy" and machine learning techniques for text processing to model privacy for text documents. We define a privacy mechanism that operates at the level of text…

    Submitted 5 February, 2019; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: Typos corrected

  16. arXiv:1811.09789  [pdf, other]

    cs.CV

    Senti-Attend: Image Captioning using Sentiment and Attention

    Authors: Omid Mohamad Nezami, Mark Dras, Stephen Wan, Cecile Paris

    Abstract: There has been much recent work on image captioning models that describe the factual aspects of an image. Recently, some models have incorporated non-factual aspects into the captions, such as sentiment or style. However, such models typically have difficulty in balancing the semantic aspects of the image and the non-factual dimensions of the caption; in addition, it can be observed that humans ma…

    Submitted 24 November, 2018; originally announced November 2018.

  17. arXiv:1808.02324  [pdf, other]

    cs.CV cs.HC

    Automatic Recognition of Student Engagement using Deep Learning and Facial Expression

    Authors: Omid Mohamad Nezami, Mark Dras, Len Hamey, Deborah Richards, Stephen Wan, Cecile Paris

    Abstract: Engagement is a key indicator of the quality of learning experience, and one that plays a major role in developing intelligent educational interfaces. Any such interface requires the ability to recognise the level of engagement in order to respond appropriately; however, there is very little existing data to learn from, and new data is expensive and difficult to acquire. This paper presents a deep…

    Submitted 8 July, 2019; v1 submitted 7 August, 2018; originally announced August 2018.

  18. arXiv:1807.02250  [pdf, other]

    cs.CV

    Face-Cap: Image Captioning using Facial Expression Analysis

    Authors: Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey

    Abstract: Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and interpersonal relationships represented therein. Towards developing a model that can produce human-like captions incorporating these, we use facial expression feat…

    Submitted 25 January, 2019; v1 submitted 6 July, 2018; originally announced July 2018.

  19. arXiv:1805.08866  [pdf, ps, other]

    cs.CR

    Author Obfuscation Using Generalised Differential Privacy

    Authors: Natasha Fernandes, Mark Dras, Annabelle McIver

    Abstract: The problem of obfuscating the authorship of a text document has received little attention in the literature to date. Current approaches are ad-hoc and rely on assumptions about an adversary's auxiliary knowledge which makes it difficult to reason about the privacy properties of these methods. Differential privacy is a well-known and robust privacy approach, but its reliance on the notion of adjac…

    Submitted 22 May, 2018; originally announced May 2018.

  20. VnCoreNLP: A Vietnamese Natural Language Processing Toolkit

    Authors: Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras, Mark Johnson

    Abstract: We present an easy-to-use and fast toolkit, namely VnCoreNLP---a Java NLP annotation pipeline for Vietnamese. Our VnCoreNLP supports key natural language processing (NLP) tasks including word segmentation, part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to provide rich linguistic…

    Submitted 1 April, 2018; v1 submitted 4 January, 2018; originally announced January 2018.

    Comments: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, NAACL 2018, to appear

  21. arXiv:1711.04951  [pdf, other]

    cs.CL

    From Word Segmentation to POS Tagging for Vietnamese

    Authors: Dat Quoc Nguyen, Thanh Vu, Dai Quoc Nguyen, Mark Dras, Mark Johnson

    Abstract: This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy where we consider the output of a word segmenter as the input of a POS tagger, and (ii) a joint strategy where we predict a combined segmentation and POS tag for each syllable. We also make a comparison between state-of-the-art (SOTA) feature-base…

    Submitted 14 November, 2017; originally announced November 2017.

    Comments: To appear in Proceedings of the 15th Annual Workshop of the Australasian Language Technology Association, ALTA 2017

  22. arXiv:1709.06307  [pdf, other]

    cs.CL

    A Fast and Accurate Vietnamese Word Segmenter

    Authors: Dat Quoc Nguyen, Dai Quoc Nguyen, Thanh Vu, Mark Dras, Mark Johnson

    Abstract: We propose a novel approach to Vietnamese word segmentation. Our approach is based on the Single Classification Ripple Down Rules methodology (Compton and Jansen, 1990), where rules are stored in an exception structure and new rules are only added to correct segmentation errors given by existing rules. Experimental results on the benchmark Vietnamese treebank show that our approach outperforms pre…

    Submitted 23 December, 2017; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), to appear

  23. A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing

    Authors: Dat Quoc Nguyen, Mark Dras, Mark Johnson

    Abstract: We present a novel neural network model that learns POS tagging and graph-based dependency parsing jointly. Our model uses bidirectional LSTMs to learn feature representations shared for both POS tagging and dependency parsing tasks, thus handling the feature-engineering problem. Our extensive experiments, on 19 languages from the Universal Dependencies project, show that our model outperforms the…

    Submitted 8 June, 2017; v1 submitted 16 May, 2017; originally announced May 2017.

    Comments: v2: also include universal POS tagging, UAS and LAS accuracies w.r.t gold-standard segmentation on Universal Dependencies 2.0 - CoNLL 2017 shared task test data; in CoNLL 2017

  24. arXiv:1703.06541  [pdf, other]

    cs.CL

    Native Language Identification using Stacked Generalization

    Authors: Shervin Malmasi, Mark Dras

    Abstract: Ensemble methods using multiple classifiers have proven to be the most successful approach for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of exp…

    Submitted 19 March, 2017; originally announced March 2017.

  25. arXiv:1611.00995  [pdf, other]

    cs.CL

    An empirical study for Vietnamese dependency parsing

    Authors: Dat Quoc Nguyen, Mark Dras, Mark Johnson

    Abstract: This paper presents an empirical comparison of different dependency parsers for Vietnamese, which has some unusual characteristics such as copula drop and verb serialization. Experimental results show that the neural network-based parsers perform significantly better than the traditional parsers. We report the highest parsing scores published to date for Vietnamese with the labeled attachment scor…

    Submitted 3 November, 2016; originally announced November 2016.

    Comments: To appear in Proceedings of the 14th Annual Workshop of the Australasian Language Technology Association

  26. arXiv:1610.00030  [pdf, other]

    cs.CL

    Modeling Language Change in Historical Corpora: The Case of Portuguese

    Authors: Marcos Zampieri, Shervin Malmasi, Mark Dras

    Abstract: This paper presents a number of experiments to model changes in a historical Portuguese corpus composed of literary texts for the purpose of temporal text classification. Algorithms were trained to classify texts with respect to their publication date taking into account lexical variation represented as word n-grams, and morphosyntactic variation represented by part-of-speech (POS) distribution. W…

    Submitted 30 September, 2016; originally announced October 2016.

    Comments: Proceedings of Language Resources and Evaluation (LREC)

    Journal ref: Proceedings of Language Resources and Evaluation (LREC). Portoroz, Slovenia. pp. 4098-4104 (2016)

  27. Reluctant Paraphrase: Textual Restructuring under an Optimisation Model

    Authors: Mark Dras

    Abstract: This paper develops a computational model of paraphrase under which text modification is carried out reluctantly; that is, there are external constraints, such as length or readability, on an otherwise ideal text, and modifications to the text are necessary to ensure conformance to these constraints. This problem is analogous to a mathematical optimisation problem: the textual constraints can be…

    Submitted 3 July, 1997; originally announced July 1997.

    Comments: 7 pages, LaTeX source (pacling97, examples styles)

  28. Death and Lightness: Using a Demographic Model to Find Support Verbs

    Authors: Mark Dras, Mike Johnson

    Abstract: Some verbs have a particular kind of binary ambiguity: they can carry their normal, full meaning, or they can be merely acting as a prop for the nominal object. It has been suggested that there is a detectable pattern in the relationship between a verb acting as a prop (a support verb) and the noun it supports. The task this paper undertakes is to develop a model which identifies the su…

    Submitted 2 October, 1996; originally announced October 1996.

    Comments: LaTeX, 8 pages, uses aclap.sty

    Journal ref: CSNLP-96, Sept. 2-4, Dublin, Ireland

  29. Automatic Identification of Support Verbs: A Step Towards a Definition of Semantic Weight

    Authors: Mark Dras

    Abstract: Current definitions of notions of lexical density and semantic weight are based on the division of words into closed and open classes, and on intuition. This paper develops a computationally tractable definition of semantic weight, concentrating on what it means for a word to be semantically light; the definition involves looking at the frequency of a word in particular syntactic constructions w…

    Submitted 26 October, 1995; v1 submitted 25 October, 1995; originally announced October 1995.

    Comments: 8 pages, standard LaTeX (replaced to fix LaTeX style used)

  30. arXiv:cmp-lg/9409003  [pdf, ps]

    cs.CL

    A Probabilistic Model of Compound Nouns

    Authors: Mark Lauer, Mark Dras

    Abstract: Compound nouns such as example noun compound are becoming more common in natural language and pose a number of difficult problems for NLP systems, notably increasing the complexity of parsing. In this paper we develop a probabilistic model for syntactically analysing such compounds. The model predicts compound noun structures based on knowledge of affinities between nouns, which can be acquired…

    Submitted 6 September, 1994; originally announced September 1994.

    Comments: 9 pages, uuencoded compressed postscript, please ignore any undefined command error at end

    Journal ref: 7th Australian Joint Conference on AI, 1994