-
Requirements Satisfiability with In-Context Learning
Authors:
Sarah Santos,
Travis Breaux,
Thomas Norton,
Sara Haghighi,
Sepideh Ghanavati
Abstract:
Language models that can learn a task at inference time, called in-context learning (ICL), show increasing promise in natural language inference tasks. In ICL, a model user constructs a prompt to describe a task with a natural language instruction and zero or more examples, called demonstrations. The prompt is then input to the language model to generate a completion. In this paper, we apply ICL to the design and evaluation of satisfaction arguments, which describe how a requirement is satisfied by a system specification and associated domain knowledge. The approach builds on three prompt design patterns, including augmented generation, prompt tuning, and chain-of-thought prompting, and is evaluated on a privacy problem to check whether a mobile app scenario and associated design description satisfies eight consent requirements from the EU General Data Protection Regulation (GDPR). The overall results show that GPT-4 can be used to verify requirements satisfaction with 96.7% accuracy and dissatisfaction with 93.2% accuracy. Inverting the requirement improves verification of dissatisfaction to 97.2%. Chain-of-thought prompting improves overall GPT-3.5 performance by 9.0% accuracy. We discuss the trade-offs among templates, models and prompt strategies and provide a detailed analysis of the generated specifications to inform how the approach can be applied in practice.
Submitted 18 April, 2024;
originally announced April 2024.
-
Evaluating Privacy Perceptions, Experience, and Behavior of Software Development Teams
Authors:
Maxwell Prybylo,
Sara Haghighi,
Sai Teja Peddinti,
Sepideh Ghanavati
Abstract:
With the increase in the number of privacy regulations, small development teams are forced to make privacy decisions on their own. In this paper, we conduct a mixed-method survey study, including statistical and qualitative analysis, to evaluate the privacy perceptions, practices, and knowledge of members involved in various phases of the Software Development Life Cycle (SDLC). Our survey includes 362 participants from 23 countries, encompassing roles such as product managers, developers, and testers. Our results show diverse definitions of privacy across SDLC roles, emphasizing the need for a holistic privacy approach throughout SDLC. We find that software teams, regardless of their region, are less familiar with privacy concepts (such as anonymization), relying on self-teaching and forums. Most participants are more familiar with GDPR and HIPAA than other regulations, with multi-jurisdictional compliance being their primary concern. Our results advocate the need for role-dependent solutions to address the privacy challenges, and we highlight research directions and educational takeaways to help improve privacy-aware SDLC.
Submitted 8 June, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Evaluating Privacy Questions From Stack Overflow: Can ChatGPT Compete?
Authors:
Zack Delile,
Sean Radel,
Joe Godinez,
Garrett Engstrom,
Theo Brucker,
Kenzie Young,
Sepideh Ghanavati
Abstract:
Stack Overflow and other similar forums are used commonly by developers to seek answers for their software development as well as privacy-related concerns. Recently, ChatGPT has been used as an alternative to generate code or produce responses to developers' questions. In this paper, we aim to understand developers' privacy challenges by evaluating the types of privacy-related questions asked on Stack Overflow. We then conduct a comparative analysis between the accepted responses given by Stack Overflow users and the responses produced by ChatGPT for those extracted questions to identify if ChatGPT could serve as a viable alternative. Our results show that most privacy-related questions are related to choice/consent, aggregation, and identification. Furthermore, our findings illustrate that ChatGPT generates similarly correct responses for about 56% of questions, while for the rest of the responses, the answers from Stack Overflow are slightly more accurate than ChatGPT.
Submitted 19 June, 2023;
originally announced June 2023.
-
Towards Fine-Grained Localization of Privacy Behaviors
Authors:
Vijayanta Jain,
Sepideh Ghanavati,
Sai Teja Peddinti,
Collin McMillan
Abstract:
Mobile applications are required to give privacy notices to users when they collect or share personal information. Creating consistent and concise privacy notices can be a challenging task for developers. Previous work has attempted to help developers create privacy notices through a questionnaire or predefined templates. In this paper, we propose a novel approach and a framework, called PriGen, that extends this prior work. PriGen uses static analysis to identify Android applications' code segments that process sensitive information (i.e., permission-requiring code segments) and then leverages a Neural Machine Translation model to translate them into privacy captions. We present the initial evaluation of our translation task for ~300,000 code segments.
Submitted 24 May, 2023;
originally announced May 2023.
-
A Language Model of Java Methods with Train/Test Deduplication
Authors:
Chia-Yi Su,
Aakash Bansal,
Vijayanta Jain,
Sepideh Ghanavati,
Collin McMillan
Abstract:
This tool demonstration presents a research toolkit for a language model of Java source code. The target audience includes researchers studying problems at the granularity level of subroutines, statements, or variables in Java. In contrast to many existing language models, we prioritize features for researchers including an open and easily-searchable training set, a held out test set with different levels of deduplication from the training set, infrastructure for deduplicating new examples, and an implementation platform suitable for execution on equipment accessible to a relatively modest budget. Our model is a GPT2-like architecture with 350m parameters. Our training set includes 52m Java methods (9b tokens) and 13m StackOverflow threads (10.5b tokens). To improve accessibility of research to more members of the community, we limit local resource requirements to GPUs with 16GB video memory. We provide a test set of held out Java methods that include descriptive comments, including the entire Java projects for those methods. We also provide deduplication tools using precomputed hash tables at various similarity thresholds to help researchers ensure that their own test examples are not in the training set. We make all our tools and data open source and available via Huggingface and Github.
Submitted 14 May, 2023;
originally announced May 2023.
-
PriGen: Towards Automated Translation of Android Applications' Code to Privacy Captions
Authors:
Vijayanta Jain,
Sanonda Datta Gupta,
Sepideh Ghanavati,
Sai Teja Peddinti
Abstract:
Mobile applications are required to give privacy notices to users when they collect or share personal information. Creating consistent and concise privacy notices can be a challenging task for developers. Previous work has attempted to help developers create privacy notices through a questionnaire or predefined templates. In this paper, we propose a novel approach and a framework, called PriGen, that extends this prior work. PriGen uses static analysis to identify Android applications' code segments that process sensitive information (i.e., permission-requiring code segments) and then leverages a Neural Machine Translation model to translate them into privacy captions. We present the initial evaluation of our translation task for ~300,000 code segments.
Submitted 10 May, 2023;
originally announced May 2023.
-
Understanding Developers Privacy Concerns Through Reddit Thread Analysis
Authors:
Jonathan Parsons,
Michael Schrider,
Oyebanjo Ogunlela,
Sepideh Ghanavati
Abstract:
With the growing global emphasis on regulating the protection of personal information and increasing user expectation of the same, developing with privacy in mind is becoming ever more important. In this paper, we study the concerns, questions, and solutions developers discuss on Reddit forums to enhance our understanding of their perceptions and challenges while developing applications in the current privacy-focused world. We perform various forms of Natural Language Processing (NLP) on 437,317 threads from subreddits such as r/webdev, r/androiddev, and r/iOSProgramming to identify both common points of discussion and how these points change over time as new regulations are passed around the globe. Our results show that there are common trends in privacy topics among the different subreddits while the frequency of those topics differs between web and mobile applications.
Submitted 15 April, 2023;
originally announced April 2023.
-
3D and 4D printing in dentistry and maxillofacial surgery: Recent advances and future perspectives
Authors:
Danial Khorsandi,
Amir Fahimipour,
Payam Abasian,
Sepehr Sadeghpour Saber,
Mahla Seyedi,
Sonia Ghanavati,
Amir Ahmad,
Andrea Amoretti De Stephanis,
Fatemeh Taghavinezhad,
Anna Leonova,
Reza Mohammadinejad,
Majid Shabani,
Barbara Mazzolai,
Virgilio Mattoli,
Franklin R. Tay,
Pooyan Makvandi
Abstract:
3D and 4D printing are cutting-edge technologies for precise and expedited manufacturing of objects ranging from plastic to metal. Recent advances in 3D and 4D printing technologies in dentistry and maxillofacial surgery enable dentists to custom design and print surgical drill guides, temporary and permanent crowns and bridges, orthodontic appliances and orthotics, implants, and mouthguards for drug delivery. In the present review, different 3D printing technologies available for use in dentistry are highlighted together with a critique of the materials available for printing. Recent reports of the application of these printed platforms are highlighted to enable readers to appreciate the progress in 3D/4D printing in dentistry.
Submitted 29 March, 2021;
originally announced March 2021.
-
Automated Approach to Improve IoT Privacy Policies
Authors:
Parvaneh Shayegh,
Vijayanta Jain,
Amin Rabinia,
Sepideh Ghanavati
Abstract:
The massive growth of the Internet of Things (IoT) as a network of interconnected entities [18] brings new challenges in privacy and security requirements to the traditional software engineering domain [4]. To protect individuals' privacy, the FTC's Fair Information Practice Principles (FIPPs) [6] propose that companies give consumers notice of their data practices, provide them with choices, and give them the means to control their own data. A privacy policy is the most common way to provide this type of notice. However, privacy policies are generally ineffective for two main reasons: first, they are long and full of legal jargon that a typical user cannot understand; second, there is no guarantee that an IoT device behaves as its privacy policy describes. In this technical report, we propose and discuss our methodologies for analyzing privacy policies. With the help of this analysis, we shorten a privacy policy and organize it by privacy practice to make it easier for the user to understand. We also present a method for finding inconsistencies between IoT devices and their privacy policies.
Submitted 6 October, 2019;
originally announced October 2019.