Search | arXiv e-print repository

arXiv:2406.20053 [pdf, other]

Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation

Authors: Danny Halawi, Alexander Wei, Eric Wallace, Tony T. Wang, Nika Haghtalab, Jacob Steinhardt

Abstract: Black-box finetuning is an emerging interface for adapting state-of-the-art language models to user needs. However, such access may also let malicious actors undermine model safety. To demonstrate the challenge of defending finetuning interfaces, we introduce covert malicious finetuning, a method to compromise model safety via finetuning while evading detection. Our method constructs a malicious dataset where every individual datapoint appears innocuous, but finetuning on the dataset teaches the model to respond to encoded harmful requests with encoded harmful responses. Applied to GPT-4, our method produces a finetuned model that acts on harmful instructions 99% of the time and avoids detection by defense mechanisms such as dataset inspection, safety evaluations, and input/output classifiers. Our findings question whether black-box finetuning access can be secured against sophisticated adversaries.

Submitted 28 June, 2024; originally announced June 2024.

Comments: 22 pages

arXiv:2403.13213 [pdf, other]

From Representational Harms to Quality-of-Service Harms: A Case Study on Llama 2 Safety Safeguards

Authors: Khaoula Chehbouni, Megha Roshan, Emmanuel Ma, Futian Andrew Wei, Afaf Taik, Jackie CK Cheung, Golnoosh Farnadi

Abstract: Recent progress in large language models (LLMs) has led to their widespread adoption in various domains. However, these advancements have also introduced additional safety risks and raised concerns regarding their detrimental impact on already marginalized populations. Despite growing mitigation efforts to develop safety safeguards, such as supervised safety-oriented fine-tuning and leveraging safe reinforcement learning from human feedback, multiple concerns regarding the safety and ingrained biases in these models remain. Furthermore, previous work has demonstrated that models optimized for safety often display exaggerated safety behaviors, such as a tendency to refrain from responding to certain requests as a precautionary measure. As such, a clear trade-off between the helpfulness and safety of these models has been documented in the literature. In this paper, we further investigate the effectiveness of safety measures by evaluating models on already mitigated biases. Using the case of Llama 2 as an example, we illustrate how LLMs' safety responses can still encode harmful assumptions. To do so, we create a set of non-toxic prompts, which we then use to evaluate Llama models. Through our new taxonomy of LLMs responses to users, we observe that the safety/helpfulness trade-offs are more pronounced for certain demographic groups which can lead to quality-of-service harms for marginalized populations.

Submitted 7 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: 9 pages, 4 figures. Accepted to Findings of the Association for Computational Linguistics: ACL 2024

arXiv:2403.09675 [pdf, other]

Open-Universe Indoor Scene Generation using LLM Program Synthesis and Uncurated Object Databases

Authors: Rio Aguina-Kang, Maxim Gumin, Do Heon Han, Stewart Morris, Seung Jean Yoo, Aditya Ganeshan, R. Kenny Jones, Qiuhong Anna Wei, Kailiang Fu, Daniel Ritchie

Abstract: We present a system for generating indoor scenes in response to text prompts. The prompts are not limited to a fixed vocabulary of scene descriptions, and the objects in generated scenes are not restricted to a fixed set of object categories -- we call this setting open-universe indoor scene generation. Unlike most prior work on indoor scene generation, our system does not require a large training dataset of existing 3D scenes. Instead, it leverages the world knowledge encoded in pre-trained large language models (LLMs) to synthesize programs in a domain-specific layout language that describe objects and spatial relations between them. Executing such a program produces a specification of a constraint satisfaction problem, which the system solves using a gradient-based optimization scheme to produce object positions and orientations. To produce object geometry, the system retrieves 3D meshes from a database. Unlike prior work which uses databases of category-annotated, mutually-aligned meshes, we develop a pipeline using vision-language models (VLMs) to retrieve meshes from massive databases of un-annotated, inconsistently-aligned meshes. Experimental evaluations show that our system outperforms generative models trained on 3D data for traditional, closed-universe scene generation tasks; it also outperforms a recent LLM-based layout generation method on open-universe scene generation.

Submitted 4 February, 2024; originally announced March 2024.

Comments: See ancillary files for link to supplemental material

arXiv:2307.02483 [pdf, other]

Jailbroken: How Does LLM Safety Training Fail?

Authors: Alexander Wei, Nika Haghtalab, Jacob Steinhardt

Abstract: Large language models trained for safety and harmlessness remain susceptible to adversarial misuse, as evidenced by the prevalence of "jailbreak" attacks on early releases of ChatGPT that elicit undesired behavior. Going beyond recognition of the issue, we investigate why such attacks succeed and how they can be created. We hypothesize two failure modes of safety training: competing objectives and mismatched generalization. Competing objectives arise when a model's capabilities and safety goals conflict, while mismatched generalization occurs when safety training fails to generalize to a domain for which capabilities exist. We use these failure modes to guide jailbreak design and then evaluate state-of-the-art models, including OpenAI's GPT-4 and Anthropic's Claude v1.3, against both existing and newly designed attacks. We find that vulnerabilities persist despite the extensive red-teaming and safety-training efforts behind these models. Notably, new attacks utilizing our failure modes succeed on every prompt in a collection of unsafe requests from the models' red-teaming evaluation sets and outperform existing ad hoc jailbreaks. Our analysis emphasizes the need for safety-capability parity -- that safety mechanisms should be as sophisticated as the underlying model -- and argues against the idea that scaling alone can resolve these safety failure modes.

Submitted 5 July, 2023; originally announced July 2023.

arXiv:2304.11259 [pdf, other]

Consensus Complementarity Control for Multi-Contact MPC

Authors: Alp Aydinoglu, Adam Wei, Wei-Cheng Huang, Michael Posa

Abstract: We propose a hybrid model predictive control algorithm, consensus complementarity control (C3), for systems that make and break contact with their environment. Many state-of-the-art controllers for tasks which require initiating contact with the environment, such as locomotion and manipulation, require a priori mode schedules or are too computationally complex to run at real-time rates. We present a method based on the alternating direction method of multipliers (ADMM) that is capable of high-speed reasoning over potential contact events. Via a consensus formulation, our approach enables parallelization of the contact scheduling problem. We validate our results on five numerical examples, including four high-dimensional frictional contact problems, and a physical experiment on an underactuated multi-contact system. We further demonstrate the effectiveness of our method on a physical experiment accomplishing a high-dimensional, multi-contact manipulation task with a robot arm.

Submitted 7 March, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

Comments: T-RO submission. Continuation of the work: arXiv:2109.07076v2

arXiv:2301.09629 [pdf, other]

LEGO-Net: Learning Regular Rearrangements of Objects in Rooms

Authors: Qiuhong Anna Wei, Sijie Ding, Jeong Joon Park, Rahul Sajnani, Adrien Poulenard, Srinath Sridhar, Leonidas Guibas

Abstract: Humans universally dislike the task of cleaning up a messy room. If machines were to help us with this task, they must understand human criteria for regular arrangements, such as several types of symmetry, co-linearity or co-circularity, spacing uniformity in linear or circular patterns, and further inter-object relationships that relate to style and functionality. Previous approaches for this task relied on human input to explicitly specify goal state, or synthesized scenes from scratch -- but such methods do not address the rearrangement of existing messy scenes without providing a goal state. In this paper, we present LEGO-Net, a data-driven transformer-based iterative method for LEarning reGular rearrangement of Objects in messy rooms. LEGO-Net is partly inspired by diffusion models -- it starts with an initial messy state and iteratively "de-noises" the position and orientation of objects to a regular state while reducing distance traveled. Given randomly perturbed object positions and orientations in an existing dataset of professionally-arranged scenes, our method is trained to recover a regular re-arrangement. Results demonstrate that our method is able to reliably rearrange room scenes and outperform other methods. We additionally propose a metric for evaluating regularity in room arrangements using number-theoretic machinery.

Submitted 24 March, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

Comments: Project page: https://ivl.cs.brown.edu/projects/lego-net

arXiv:2211.05910 [pdf, other]

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, et al. (71 additional authors not shown)

Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.

Submitted 7 November, 2022; originally announced November 2022.

Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

arXiv:2208.09407 [pdf, other]

Learning in Stackelberg Games with Non-myopic Agents

Authors: Nika Haghtalab, Thodoris Lykouris, Sloan Nietert, Alex Wei

Abstract: We study Stackelberg games where a principal repeatedly interacts with a long-lived, non-myopic agent, without knowing the agent's payoff function. Although learning in Stackelberg games is well-understood when the agent is myopic, non-myopic agents pose additional complications. In particular, non-myopic agents may strategically select actions that are inferior in the present to mislead the principal's learning algorithm and obtain better outcomes in the future. We provide a general framework that reduces learning in the presence of non-myopic agents to robust bandit optimization in the presence of myopic agents. Through the design and analysis of minimally reactive bandit algorithms, our reduction trades off the statistical efficiency of the principal's learning algorithm against its effectiveness in inducing near-best-responses. We apply this framework to Stackelberg security games (SSGs), pricing with unknown demand curve, strategic classification, and general finite Stackelberg games. In each setting, we characterize the type and impact of misspecifications present in near-best-responses and develop a learning algorithm robust to such misspecifications. Along the way, we improve the query complexity of learning in SSGs with $n$ targets from the state-of-the-art $O(n^3)$ to a near-optimal $\widetilde{O}(n)$ by uncovering a fundamental structural property of such games. This result is of independent interest beyond learning with non-myopic agents.

Submitted 19 August, 2022; originally announced August 2022.

Comments: An extended abstract of this work appeared at the ACM Conference on Economics and Computation (EC) 2022

arXiv:2207.06343 [pdf, other]

TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels

Authors: Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

Abstract: State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions. For neural networks, even when centralized SGD easily finds a solution that is simultaneously performant for all clients, current federated optimization methods fail to converge to a comparable solution. We show that this performance disparity can largely be attributed to optimization challenges presented by nonconvexity. Specifically, we find that the early layers of the network do learn useful features, but the final layers fail to make use of them. That is, federated optimization applied to this non-convex problem distorts the learning of the final layers. Leveraging this observation, we propose a Train-Convexify-Train (TCT) procedure to sidestep this issue: first, learn features using off-the-shelf methods (e.g., FedAvg); then, optimize a convexified problem obtained from the network's empirical neural tangent kernel approximation. Our technique yields accuracy improvements of up to +36% on FMNIST and +37% on CIFAR10 when clients have dissimilar data.

Submitted 5 October, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2022. V2 releases code

MSC Class: 68W40; 68W15; 90C25; 90C06 ACM Class: G.1.6; F.2.1; E.4

arXiv:2207.05531 [pdf, other]

Fuzzing Deep-Learning Libraries via Automated Relational API Inference

Authors: Yinlin Deng, Chenyuan Yang, Anjiang Wei, Lingming Zhang

Abstract: A growing body of research has been dedicated to DL model testing. However, there is still limited work on testing DL libraries, which serve as the foundations for building, training, and running DL models. Prior work on fuzzing DL libraries can only generate tests for APIs which have been invoked by documentation examples, developer tests, or DL models, leaving a large number of APIs untested. In this paper, we propose DeepREL, the first approach to automatically inferring relational APIs for more effective DL library fuzzing. Our basic hypothesis is that for a DL library under test, there may exist a number of APIs sharing similar input parameters and outputs; in this way, we can easily "borrow" test inputs from invoked APIs to test other relational APIs. Furthermore, we formalize the notion of value equivalence and status equivalence for relational APIs to serve as the oracle for effective bug finding. We have implemented DeepREL as a fully automated end-to-end relational API inference and fuzzing technique for DL libraries, which 1) automatically infers potential API relations based on API syntactic or semantic information, 2) synthesizes concrete test programs for invoking relational APIs, 3) validates the inferred relational APIs via representative test inputs, and finally 4) performs fuzzing on the verified relational APIs to find potential inconsistencies. Our evaluation on two of the most popular DL libraries, PyTorch and TensorFlow, demonstrates that DeepREL can cover 157% more APIs than state-of-the-art FreeFuzz.
To date, DeepREL has detected 162 bugs in total, with 106 already confirmed by the developers as previously unknown bugs. Surprisingly, DeepREL has detected 13.5% of the high-priority bugs for the entire PyTorch issue-tracking system in a three-month period. Also, besides the 162 code bugs, we have also detected 14 documentation bugs (all confirmed).

Submitted 12 July, 2022; originally announced July 2022.

Comments: Accepted at ESEC/FSE 2022

arXiv:2203.06176 [pdf, other]

More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize

Authors: Alexander Wei, Wei Hu, Jacob Steinhardt

Abstract: Of theories for why large-scale machine learning models generalize despite being vastly overparameterized, which of their assumptions are needed to capture the qualitative phenomena of generalization in the real world? On one hand, we find that most theoretical analyses fall short of capturing these qualitative phenomena even for kernel regression, when applied to kernels derived from large-scale neural networks (e.g., ResNet-50) and real data (e.g., CIFAR-100). On the other hand, we find that the classical GCV estimator (Craven and Wahba, 1978) accurately predicts generalization risk even in such overparameterized settings. To bolster this empirical finding, we prove that the GCV estimator converges to the generalization risk whenever a local random matrix law holds. Finally, we apply this random matrix theory lens to explain why pretrained representations generalize better as well as what factors govern scaling laws for kernel regression. Our findings suggest that random matrix theory, rather than just being a toy model, may be central to understanding the properties of neural representations in practice.

Submitted 11 March, 2022; originally announced March 2022.

arXiv:2202.05834 [pdf, other]

Predicting Out-of-Distribution Error with the Projection Norm

Authors: Yaodong Yu, Zitong Yang, Alexander Wei, Yi Ma, Jacob Steinhardt

Abstract: We propose a metric -- Projection Norm -- to predict a model's performance on out-of-distribution (OOD) data without access to ground truth labels. Projection Norm first uses model predictions to pseudo-label test samples and then trains a new model on the pseudo-labels. The more the new model's parameters differ from an in-distribution model, the greater the predicted OOD error. Empirically, our approach outperforms existing methods on both image and text classification tasks and across different network architectures. Theoretically, we connect our approach to a bound on the test error for overparameterized linear models. Furthermore, we find that Projection Norm is the only approach that achieves non-trivial detection performance on adversarial examples. Our code is available at https://github.com/yaodongyu/ProjNorm.

Submitted 11 February, 2022; originally announced February 2022.

arXiv:2201.06589 [pdf, other]

Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source

Authors: Anjiang Wei, Yinlin Deng, Chenyuan Yang, Lingming Zhang

Abstract: Deep learning (DL) systems can make our life much easier, and thus are gaining more and more attention from both academia and industry. Meanwhile, bugs in DL systems can be disastrous, and can even threaten human lives in safety-critical applications. To date, a huge body of research efforts have been dedicated to testing DL models. However, interestingly, there is still limited work for testing the underlying DL libraries, which are the foundation for building, optimizing, and running DL models. One potential reason is that test generation for the underlying DL libraries can be rather challenging since their public APIs are mainly exposed in Python, making it even hard to automatically determine the API input parameter types due to dynamic typing. In this paper, we propose FreeFuzz, the first approach to fuzzing DL libraries via mining from open source. More specifically, FreeFuzz obtains code/models from three different sources: 1) code snippets from the library documentation, 2) library developer tests, and 3) DL models in the wild. Then, FreeFuzz automatically runs all the collected code/models with instrumentation to trace the dynamic information for each covered API, including the types and values of each parameter during invocation, and shapes of input/output tensors. Lastly, FreeFuzz will leverage the traced dynamic information to perform fuzz testing for each covered API.
The extensive study of FreeFuzz on PyTorch and TensorFlow, two of the most popular DL libraries, shows that FreeFuzz is able to automatically trace valid dynamic information for fuzzing 1158 popular APIs, 9X more than state-of-the-art LEMON with 3.5X lower overhead than LEMON. To date, FreeFuzz has detected 49 bugs for PyTorch and TensorFlow (with 38 already confirmed by developers as previously unknown).

Submitted 25 February, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

arXiv:2108.09922 [pdf]

Subject Envelope based Multitype Reconstruction Algorithm of Speech Samples of Parkinson's Disease

Authors: Yongming Li, Chengyu Liu, Pin Wang, Hehua Zhang, Anhai Wei

Abstract: The risk of Parkinson's disease (PD) is extremely serious, and PD speech recognition is an effective method of diagnosis nowadays. However, due to the influence of the disease stage, corpus, and other factors on data collection, the ability of each sample within one subject to reflect the status of PD varies. No sample is totally useless, and no sample is 100% perfect. This characteristic means that it is not suitable simply to remove some samples or keep some samples. It is necessary to consider sample transformation for obtaining high-quality new samples. Unfortunately, existing PD speech recognition methods focus mainly on feature learning and classifier design rather than sample learning, and few methods consider sample transformation. To solve the problem above, a PD speech sample transformation algorithm based on multitype reconstruction operators is proposed in this paper. The algorithm is divided into four major steps. Three types of reconstruction operators are designed in the algorithm: types A, B and C. Concerning the type A operator, the original dataset is directly reconstructed by designing a linear transformation to obtain the first dataset. The type B operator is designed for clustering and linear transformation of the dataset to obtain the second new dataset. The third operator, namely, the type C operator, reconstructs the dataset by clustering and convolution to obtain the third dataset.
Finally, the base classifier is trained based on the three new datasets, and then the classification results are fused by decision weighting. In the experimental section, two representative PD speech datasets are used for verification. The results show that the proposed algorithm is effective. Compared with other algorithms, the proposed algorithm achieves apparent improvements in terms of classification accuracy.

Submitted 23 August, 2021; originally announced August 2021.

Comments: 11 pages, 6 tables

arXiv:2108.08843 [pdf, other]

Learning Equilibria in Matching Markets from Bandit Feedback

Authors: Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt

Abstract: Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. Classical notions of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) are unfortunately of limited value in the learning setting, given that preferences are inherently uncertain and destabilizing while they are being learned. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. We design an incentive-aware learning objective that captures the distance of a market outcome from equilibrium. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic multi-armed bandit problem. Algorithmically, we show that "optimism in the face of uncertainty," the principle underlying many bandit algorithms, applies to a primal-dual formulation of matching with transfers and leads to near-optimal regret bounds. Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.

Submitted 31 January, 2023; v1 submitted 19 August, 2021; originally announced August 2021.

Comments: Accepted to the Journal of the ACM; conference version appeared at NeurIPS 2021

arXiv:2102.09017 [pdf, other]

Designing Approximately Optimal Search on Matching Platforms

Authors: Nicole Immorlica, Brendan Lucier, Vahideh Manshadi, Alexander Wei

Abstract: We study the design of a decentralized two-sided matching market in which agents' search is guided by the platform. There are finitely many agent types, each with (potentially random) preferences drawn from known type-specific distributions. Equipped with knowledge of these distributions, the platform guides the search process by determining the meeting rate between each pair of types from the two sides. Focusing on symmetric pairwise preferences in a continuum model, we first characterize the unique stationary equilibrium that arises given a feasible set of meeting rates. We then introduce the platform's optimal directed search problem, which involves optimizing meeting rates to maximize equilibrium social welfare. We first show that incentive issues arising from congestion and cannibalization make the design problem fairly intricate. Nonetheless, we develop an efficiently computable search design whose corresponding equilibrium achieves at least 1/4 the social welfare of the optimal design. In fact, our construction always recovers at least 1/4 the first-best social welfare, where agents' incentives are disregarded. Our directed search design is simple and easy-to-implement, as its corresponding bipartite graph consists of disjoint stars. Furthermore, our design implies the platform can substantially limit choice and yet induce an equilibrium with an approximately optimal welfare.
Finally, we show that approximation is likely the best we can hope for by establishing that the problem of designing optimal directed search is NP-hard to even approximate beyond a certain constant factor.

Submitted 18 August, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

arXiv:2010.11443 [pdf, other]

Optimal Robustness-Consistency Trade-offs for Learning-Augmented Online Algorithms

Authors: Alexander Wei, Fred Zhang

Abstract: We study the problem of improving the performance of online algorithms by incorporating machine-learned predictions. The goal is to design algorithms that are both consistent and robust, meaning that the algorithm performs well when predictions are accurate and maintains worst-case guarantees. Such algorithms have been studied in a recent line of works due to Lykouris and Vassilvitskii (ICML '18) and Purohit et al. (NeurIPS '18). They provide robustness-consistency trade-offs for a variety of online problems. However, they leave open the question of whether these trade-offs are tight, i.e., to what extent such trade-offs are necessary. In this paper, we provide the first set of non-trivial lower bounds for competitive analysis using machine-learned predictions. We focus on the classic problems of ski-rental and non-clairvoyant scheduling and provide optimal trade-offs in various settings.

Submitted 22 October, 2020; originally announced October 2020.

Comments: To appear at NeurIPS 2020

arXiv:2007.09610 [pdf, other]

Self-similarity Student for Partial Label Histopathology Image Segmentation

Authors: Hsien-Tzu Cheng, Chun-Fu Yeh, Po-Chen Kuo, Andy Wei, Keng-Chi Liu, Mong-Chi Ko, Kuan-Hua Chao, Yu-Ching Peng, Tyng-Luh Liu

Abstract: Delineation of cancerous regions in gigapixel whole slide images (WSIs) is a crucial diagnostic procedure in digital pathology. This process is time-consuming because of the large search space in the gigapixel WSIs, causing chances of omission and misinterpretation at indistinct tumor lesions. To tackle this, the development of an automated cancerous region segmentation method is imperative. We frame this issue as a modeling problem with partial label WSIs, where some cancerous regions may be misclassified as benign and vice versa, producing patches with noisy labels. To learn from these patches, we propose Self-similarity Student, combining the teacher-student model paradigm with similarity learning. Specifically, for each patch, we first sample its similar and dissimilar patches according to spatial distance. A teacher-student model is then introduced, featuring the exponential moving average on both student model weights and teacher predictions ensemble. While our student model takes patches, the teacher model takes all their corresponding similar and dissimilar patches for learning robust representation against noisy label patches. Following this similarity learning, our similarity ensemble merges similar patches' ensembled predictions as the pseudo-label of a given patch to counteract its noisy label. On the CAMELYON16 dataset, our method substantially outperforms state-of-the-art noise-aware learning methods by 5% and the supervised-trained baseline by 10% in various degrees of noise. Moreover, our method is superior to the baseline on our TVGH TURP dataset with 2% improvement, demonstrating the generalizability to more clinical histopathology segmentation tasks.

Submitted 19 July, 2020; originally announced July 2020.

Comments: ECCV 2020

arXiv:2005.13716 [pdf, ps, other]

Better and Simpler Learning-Augmented Online Caching

Authors: Alexander Wei

Abstract: Lykouris and Vassilvitskii (ICML 2018) introduce a model of online caching with machine-learned advice, where each page request additionally comes with a prediction of when that page will next be requested. In this model, a natural goal is to design algorithms that (1) perform well when the advice is accurate and (2) remain robust in the worst case a la traditional competitive analysis. Lykouris and Vassilvitskii give such an algorithm by adapting the Marker algorithm to the learning-augmented setting. In a recent work, Rohatgi (SODA 2020) improves on their result with an approach also inspired by randomized marking. We continue the study of this problem, but with a somewhat different approach: We consider combining the BlindOracle algorithm, which just naïvely follows the predictions, with an optimal competitive algorithm for online caching in a black-box manner. The resulting algorithm outperforms all existing approaches while being significantly simpler. Moreover, we show that combining BlindOracle with LRU is in fact optimal among deterministic algorithms for this problem.

Submitted 27 May, 2020; originally announced May 2020.

arXiv:2004.12786 [pdf, other]

A Cascaded Learning Strategy for Robust COVID-19 Pneumonia Chest X-Ray Screening

Authors: Chun-Fu Yeh, Hsien-Tzu Cheng, Andy Wei, Hsin-Ming Chen, Po-Chen Kuo, Keng-Chi Liu, Mong-Chi Ko, Ray-Jade Chen, Po-Chang Lee, Jen-Hsiang Chuang, Chi-Mai Chen, Yi-Chang Chen, Wen-Jeng Lee, Ning Chien, Jo-Yu Chen, Yu-Sen Huang, Yu-Chien Chang, Yu-Cheng Huang, Nai-Kuan Chou, Kuan-Hua Chao, Yi-Chin Tu, Yeun-Chung Chang, Tyng-Luh Liu

Abstract: We introduce a comprehensive screening platform for the COVID-19 (a.k.a., SARS-CoV-2) pneumonia. The proposed AI-based system works on chest x-ray (CXR) images to predict whether a patient is infected with the COVID-19 disease. Despite the recent international joint effort to make all sorts of open data available, the public collection of CXR images is still relatively small for reliably training a deep neural network (DNN) to carry out COVID-19 prediction. To better address such inefficiency, we design a cascaded learning strategy to improve both the sensitivity and the specificity of the resulting DNN classification model. Our approach leverages a large CXR image dataset of non-COVID-19 pneumonia to generalize the original well-trained classification model via a cascaded learning scheme. The resulting screening system is shown to achieve good classification performance on the expanded dataset, including those newly added COVID-19 CXR images.

Submitted 30 April, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: 14 pages, 6 figures

arXiv:1807.07527 [pdf, ps, other]

Optimal Las Vegas Approximate Near Neighbors in $\ell_p$

Authors: Alexander Wei

Abstract: We show that approximate near neighbor search in high dimensions can be solved in a Las Vegas fashion (i.e., without false negatives) for $\ell_p$ ($1\le p\le 2$) while matching the performance of optimal locality-sensitive hashing. Specifically, we construct a data-independent Las Vegas data structure with query time $O(dn^{\rho})$ and space usage $O(dn^{1+\rho})$ for $(r, c r)$-approximate near neighbors in $\mathbb{R}^{d}$ under the $\ell_p$ norm, where $\rho = 1/c^p + o(1)$. Furthermore, we give a Las Vegas locality-sensitive filter construction for the unit sphere that can be used with the data-dependent data structure of Andoni et al. (SODA 2017) to achieve optimal space-time tradeoffs in the data-dependent setting. For the symmetric case, this gives us a data-dependent Las Vegas data structure with query time $O(dn^{\rho})$ and space usage $O(dn^{1+\rho})$ for $(r, c r)$-approximate near neighbors in $\mathbb{R}^{d}$ under the $\ell_p$ norm, where $\rho = 1/(2c^p - 1) + o(1)$. Our data-independent construction improves on the recent Las Vegas data structure of Ahle (FOCS 2017) for $\ell_p$ when $1 < p\le 2$. Our data-dependent construction does even better for $\ell_p$ for all $p\in [1, 2]$ and is the first Las Vegas approximate near neighbors data structure to make use of data-dependent approaches. We also answer open questions of Indyk (SODA 2000), Pagh (SODA 2016), and Ahle by showing that for approximate near neighbors, Las Vegas data structures can match state-of-the-art Monte Carlo data structures in performance for both the data-independent and data-dependent settings and across space-time tradeoffs.

Submitted 19 July, 2018; originally announced July 2018.

arXiv:1511.09113 [pdf]

doi 10.1109/ICASSP.2009.4960329

Bayesian and hybrid Cramer-Rao bounds for QAM dynamical phase estimation

Authors: Jianxiao Yang, Benoit Geller, A Wei

Abstract: In this paper, we study Bayesian and hybrid Cramer-Rao bounds for the dynamical phase estimation of QAM modulated signals. We present the analytical expressions for the various CRBs. This avoids the calculation of any matrix inversion and thus greatly reduces the computational complexity. Through simulations, we also illustrate the behaviors of the BCRB and of the HCRB with the signal-to-noise ratio. Index Terms: Bayesian Cramer-Rao Bound (BCRB), Hybrid Cramer-Rao Bound (HCRB), Synchronization Performance

Submitted 5 November, 2015; originally announced November 2015.

Journal ref: Acoustics, Speech and Signal Processing, Apr 2009, Taipei, Taiwan. 2009

arXiv:1206.1419 [pdf]

Analysis study of time synchronization protocols in wireless sensor networks

Authors: Salim el Khediri, Nejah Nasri, Mounir Samet, Anne Wei, Abdennaceur Kachouri

Abstract: One of the main pervasive problems Wireless Sensor Networks (WSN) encounter is to maintain flawless communication sharing and cooperative processing between sensors via radio links to ensure a reliable treatment of information. Many applications based on these WSNs consider local clocks at each sensor node that need to be synchronized to a common notion of time. In this context, the majority of previous research has focused on the study of protocols and algorithms that address these issues in order to resolve synchronization problems. Previous efforts and empirical studies in wireless sensor networks (WSN) proposed several solutions (algorithms). The focus of this paper is to examine and evaluate the most important synchronization algorithms based on the positions of various quantitative and qualitative synchronization protocols for energy-efficient information processing and routing in WSNs.

Submitted 7 June, 2012; originally announced June 2012.

Comments: 15 pages, 1 figure, 2 tables

Showing 1–23 of 23 results for author: Wei, A