Showing 1–18 of 18 results for author: Koo, R

Searching in archive cs.

  1. arXiv:2402.14146  [pdf, other]

    cs.CL

    Reinforcement Learning with Dynamic Multi-Reward Weighting for Multi-Style Controllable Generation

    Authors: Karin de Langis, Ryan Koo, Dongyeop Kang

    Abstract: Style is an integral component of text that expresses a diverse set of information, including interpersonal dynamics (e.g. formality) and the author's emotions or attitudes (e.g. disgust). Humans often employ multiple styles simultaneously. An open question is how large language models can be explicitly controlled so that they weave together target styles when generating text: for example, to prod…

    Submitted 21 February, 2024; originally announced February 2024.

  2. arXiv:2401.14698  [pdf, other]

    cs.CL cs.AI

    Under the Surface: Tracking the Artifactuality of LLM-Generated Data

    Authors: Debarati Das, Karin De Langis, Anna Martin-Boyle, Jaehyung Kim, Minhwa Lee, Zae Myung Kim, Shirley Anugrah Hayati, Risako Owan, Bin Hu, Ritik Parkar, Ryan Koo, Jonginn Park, Aahan Tyagi, Libby Ferland, Sanjali Roy, Vincent Liu, Dongyeop Kang

    Abstract: This work delves into the expanding role of large language models (LLMs) in generating artificial data. LLMs are increasingly employed to create a variety of outputs, including annotations, preferences, instruction prompts, simulated dialogues, and free text. As these forms of LLM-generated data often intersect in their application, they exert mutual influence on each other and raise significant c…

    Submitted 30 January, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Core Authors: Debarati Das, Karin De Langis, Anna Martin-Boyle, Jaehyung Kim, Minhwa Lee and Zae Myung Kim | Project lead : Debarati Das | PI : Dongyeop Kang

  3. arXiv:2309.17012  [pdf, other]

    cs.CL cs.AI cs.LG

    Benchmarking Cognitive Biases in Large Language Models as Evaluators

    Authors: Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang

    Abstract: Large Language Models (LLMs) have recently been shown to be effective as automatic evaluators with simple prompting and in-context learning. In this work, we assemble 15 LLMs of four different size ranges and evaluate their output responses by preference ranking from the other LLMs as evaluators, such as System Star is better than System Square. We then evaluate the quality of ranking outputs intr…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Under review at ICLR 2024. 26 pages, 8 figures, 7 tables

  4. arXiv:2305.09857  [pdf, other]

    cs.CL cs.AI

    CoEdIT: Text Editing by Task-Specific Instruction Tuning

    Authors: Vipul Raheja, Dhruv Kumar, Ryan Koo, Dongyeop Kang

    Abstract: We introduce CoEdIT, a state-of-the-art text editing system for writing assistance. CoEdIT takes instructions from the user specifying the attributes of the desired text, such as "Make the sentence simpler" or "Write it in a more neutral style," and outputs the edited text. We present a large language model fine-tuned on a diverse collection of task-specific instructions for text editing (a total…

    Submitted 23 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 (Findings). 18 pages, 13 tables, 2 figures

    ACM Class: I.2.7

  5. arXiv:2304.00121  [pdf, other]

    cs.CL cs.HC

    Decoding the End-to-end Writing Trajectory in Scholarly Manuscripts

    Authors: Ryan Koo, Anna Martin, Linghe Wang, Dongyeop Kang

    Abstract: Scholarly writing presents a complex space that generally follows a methodical procedure to plan and produce both rationally sound and creative compositions. Recent works involving large language models (LLM) demonstrate considerable success in text generation and revision tasks; however, LLMs still struggle to provide structural and creative feedback on the document level that is crucial to acade…

    Submitted 31 March, 2023; originally announced April 2023.

  6. arXiv:2303.09802  [pdf, other]

    cs.SE

    TypeScript's Evolution: An Analysis of Feature Adoption Over Time

    Authors: Joshua D. Scarsbrook, Mark Utting, Ryan K. L. Ko

    Abstract: TypeScript is a quickly evolving superset of JavaScript with active development of new features. Our paper seeks to understand how quickly these features are adopted by the developer community. Existing work in JavaScript shows the adoption of dynamic language features can be a major hindrance to static analysis. As TypeScript evolves the addition of features makes the underlying standard more and…

    Submitted 17 March, 2023; originally announced March 2023.

  7. arXiv:2211.05327  [pdf, other]

    cs.DB cs.SE

    Ultraverse: Efficient Retroactive Operation for Attack Recovery in Database Systems and Web Frameworks

    Authors: Ronny Ko, Chuan Xiao, Makoto Onizuka, Yihe Huang, Zhiqiang Lin

    Abstract: Retroactive operation is an operation that changes a past operation in a series of committed ones (e.g., cancelling the past insertion of '5' into a queue committed at t=3). Retroactive operation has many important security applications such as attack recovery or private data removal (e.g., for GDPR compliance). While prior efforts designed retroactive algorithms for low-level data structures (e.g…

    Submitted 2 January, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  8. arXiv:2210.08461  [pdf, other]

    cs.LG stat.ML

    Positive-Unlabeled Learning using Random Forests via Recursive Greedy Risk Minimization

    Authors: Jonathan Wilton, Abigail M. Y. Koay, Ryan K. L. Ko, Miao Xu, Nan Ye

    Abstract: The need to learn from positive and unlabeled data, or PU learning, arises in many applications and has attracted increasing interest. While random forests are known to perform well on many tasks with positive and negative data, recent PU algorithms are generally based on deep neural networks, and the potential of tree-based PU learning is under-explored. In this paper, we propose new random fores…

    Submitted 16 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022

  9. arXiv:2110.11464  [pdf, other]

    cs.LG

    FDGATII : Fast Dynamic Graph Attention with Initial Residual and Identity Mapping

    Authors: Gayan K. Kulatilleke, Marius Portmann, Ryan Ko, Shekhar S. Chandra

    Abstract: While Graph Neural Networks have gained popularity in multiple domains, graph-structured input remains a major challenge due to (a) over-smoothing, (b) noisy neighbours (heterophily), and (c) the suspended animation problem. To address all these problems simultaneously, we propose a novel graph neural network FDGATII, inspired by attention mechanism's ability to focus on selective information supp…

    Submitted 25 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 10 pages, 4 figures. Reworded section 2.1 with references. Reworded argument in section 2.3 para 2

    ACM Class: I.2.6; C.4; J.4

  10. arXiv:2104.13050  [pdf, other]

    cs.LG

    Confined Gradient Descent: Privacy-preserving Optimization for Federated Learning

    Authors: Yanjun Zhang, Guangdong Bai, Xue Li, Surya Nepal, Ryan K L Ko

    Abstract: Federated learning enables multiple participants to collaboratively train a model without aggregating the training data. Although the training data are kept within each participant and the local gradients can be securely synthesized, recent studies have shown that such privacy protection is insufficient. The global model parameters that have to be shared for optimization are susceptible to leak in…

    Submitted 27 April, 2021; originally announced April 2021.

  11. arXiv:2103.07012  [pdf, other]

    cs.CR cs.SE

    ColdPress: An Extensible Malware Analysis Platform for Threat Intelligence

    Authors: Haoxi Tan, Mahin Chandramohan, Cristina Cifuentes, Guangdong Bai, Ryan K. L. Ko

    Abstract: Malware analysis is still largely a manual task. This slow and inefficient approach does not scale to the exponential rise in the rate of new unique malware generated. Hence, automating the process as much as possible becomes desirable. In this paper, we present ColdPress - an extensible malware analysis platform that automates the end-to-end process of malware threat intelligence gathering inte…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: The code is open source at https://github.com/uqcyber/ColdPress

  12. arXiv:2101.11866  [pdf, other]

    cs.CR

    An Analytics Framework for Heuristic Inference Attacks against Industrial Control Systems

    Authors: Taejun Choi, Guangdong Bai, Ryan K L Ko, Naipeng Dong, Wenlu Zhang, Shunyao Wang

    Abstract: Industrial control systems (ICS) of critical infrastructure are increasingly connected to the Internet for remote site management at scale. However, cyber attacks against ICS - especially at the communication channels between human-machine interfaces (HMIs) and programmable logic controllers (PLCs) - are increasing at a rate which outstrips the rate of mitigation. In this paper, we introduce a ven…

    Submitted 28 January, 2021; originally announced January 2021.

  13. arXiv:2012.04405  [pdf]

    cs.CR cs.AI cs.CY cs.SE

    Cyber Autonomy: Automating the Hacker- Self-healing, self-adaptive, automatic cyber defense systems and their impact to the industry, society and national security

    Authors: Ryan K L Ko

    Abstract: This paper sets the context for the urgency for cyber autonomy, and the current gaps of the cyber security industry. A novel framework proposing four phases of maturity for full cyber autonomy will be discussed. The paper also reviews new and emerging cyber security automation techniques and tools, and discusses their impact on society, the perceived cyber security skills gap/shortage and national…

    Submitted 8 December, 2020; originally announced December 2020.

    Comments: 15 pages, 5 figures, preprint of chapter in edited book "Emerging Technologies and International Security: Machines, the State, and War" edited by Reuben Steff, Joe Burton, Simona R. Soare

    ACM Class: I.2.2; I.2.m; K.4.0; K.4.1

  14. arXiv:2009.11484  [pdf]

    cs.CR cs.AI cs.SE

    Pandora: A Cyber Range Environment for the Safe Testing and Deployment of Autonomous Cyber Attack Tools

    Authors: Hetong Jiang, Taejun Choi, Ryan K. L. Ko

    Abstract: Cybersecurity tools are increasingly automated with artificial intelligence (AI) capabilities to match the exponential scale of attacks, compensate for the relatively slower rate of training new cybersecurity talents, and improve the accuracy and performance of both tools and users. However, the safe and appropriate usage of autonomous cyber attack tools - especially at the development stages fo…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: 20 pages, 10 figures, to be published in SSCC 2020

    MSC Class: 68M25 ACM Class: D.4.6; D.2.5; K.3

  15. arXiv:2007.06953  [pdf, other]

    cs.CR

    PrivColl: Practical Privacy-Preserving Collaborative Machine Learning

    Authors: Yanjun Zhang, Guangdong Bai, Xue Li, Caitlin Curtis, Chen Chen, Ryan K L Ko

    Abstract: Collaborative learning enables two or more participants, each with their own training dataset, to collaboratively learn a joint model. It is desirable that the collaboration should not cause the disclosure of either the raw datasets of each individual owner or the local model parameters trained on them. This privacy-preservation requirement has been approached through differential privacy mechanis…

    Submitted 14 July, 2020; originally announced July 2020.

    Comments: 20 pages, 3 figures, to be published in 25th European Symposium on Research in Computer Security (ESORICS) 2020

  16. arXiv:2007.06167  [pdf, other]

    cs.DS cs.IT

    Local Editing in LZ-End Compressed Data

    Authors: Daniel Roodt, Ulrich Speidel, Vimal Kumar, Ryan K. L. Ko

    Abstract: This paper presents an algorithm for the modification of data compressed using LZ-End, a derivative of LZ77, without prior decompression. The performance of the algorithm and the impact of the modifications on the compression ratio are evaluated. Finally, we discuss the importance of this work as a first step towards local editing in Lempel-Ziv compressed data.

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: 12 pages, 1 Figure, 2 Tables

  17. arXiv:1408.2889  [pdf, other]

    cs.LG cs.NE

    A Classifier-free Ensemble Selection Method based on Data Diversity in Random Subspaces

    Authors: Albert H. R. Ko, Robert Sabourin, Alceu S. Britto Jr, Luiz E. S. Oliveira

    Abstract: The Ensemble of Classifiers (EoC) has been shown to be effective in improving the performance of single classifiers by combining their outputs, and one of the most important properties involved in the selection of the best EoC from a pool of classifiers is considered to be classifier diversity. In general, classifier diversity does not occur randomly, but is generated systematically by various ens…

    Submitted 12 August, 2014; originally announced August 2014.

    ACM Class: I.5.2; I.5.3

  18. arXiv:1103.5046  [pdf, other]

    cs.IR cs.AI cs.HC

    From Linked Data to Relevant Data -- Time is the Essence

    Authors: Markus Kirchberg, Ryan K L Ko, Bu Sung Lee

    Abstract: The Semantic Web initiative puts emphasis not primarily on putting data on the Web, but rather on creating links in a way that both humans and machines can explore the Web of data. When such users access the Web, they leave a trail as Web servers maintain a history of requests. Web usage mining approaches have been studied since the beginning of the Web given the log's huge potential for purposes…

    Submitted 25 March, 2011; originally announced March 2011.

    Comments: 1st International Workshop on Usage Analysis and the Web of Data (USEWOD2011) in the 20th International World Wide Web Conference (WWW2011), Hyderabad, India, March 28th, 2011

    Report number: WWW2011USEWOD/2011/kirkolee