Showing 1–11 of 11 results for author: Phan, B

  1. arXiv:2406.16829  [pdf, other]

    cs.CL cs.AI cs.LG

    Understanding and Mitigating Tokenization Bias in Language Models

    Authors: Buu Phan, Marton Havasi, Matthew Muckley, Karen Ullrich

    Abstract: State-of-the-art language models are autoregressive and operate on subword units known as tokens. Specifically, one must encode the conditioning string into a list of tokens before passing to the language models for next-token prediction. We show that, for encoding schemes such as maximum prefix matching, tokenization induces a sampling bias that cannot be mitigated with more training or data. To…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2401.02609  [pdf, other]

    cs.IT

    Importance Matching Lemma for Lossy Compression with Side Information

    Authors: Buu Phan, Ashish Khisti, Christos Louizos

    Abstract: We propose two extensions to existing importance sampling based methods for lossy compression. First, we introduce an importance sampling based compression scheme that is a variant of ordered random coding (Theis and Ahmed, 2022) and is amenable to direct evaluation of the achievable compression rate for a finite number of samples. Our second and major contribution is the importance matching lemma…

    Submitted 8 March, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  3. arXiv:2305.19301  [pdf, other]

    eess.IV cs.CV cs.IT cs.LG

    On the Choice of Perception Loss Function for Learned Video Compression

    Authors: Sadaf Salehkalaibar, Buu Phan, Jun Chen, Wei Yu, Ashish Khisti

    Abstract: We study causal, low-latency, sequential video compression when the output is subjected to both a mean squared-error (MSE) distortion loss as well as a perception loss to target realism. Motivated by prior approaches, we consider two different perception loss functions (PLFs). The first, PLF-JD, considers the joint distribution (JD) of all the video frames up to the current one, while the second m…

    Submitted 22 August, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  4. arXiv:2102.03728  [pdf, other]

    cs.CV eess.IV

    Adversarial Imaging Pipelines

    Authors: Buu Phan, Fahim Mannan, Felix Heide

    Abstract: Adversarial attacks play an essential role in understanding deep neural network predictions and improving their robustness. Existing attack methods aim to deceive convolutional neural network (CNN)-based classifiers by manipulating RGB images that are fed directly to the classifiers. However, these approaches typically neglect the influence of the camera optics and image processing pipeline (ISP)…

    Submitted 19 February, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

  5. arXiv:2012.13668  [pdf, other]

    cs.LG cs.CV cs.SD eess.AS

    Deep Learning Framework Applied for Predicting Anomaly of Respiratory Sounds

    Authors: Dat Ngo, Lam Pham, Anh Nguyen, Ben Phan, Khoa Tran, Truong Nguyen

    Abstract: This paper proposes a robust deep learning framework used for classifying anomaly of respiratory cycles. Initially, our framework starts with front-end feature extraction step. This step aims to transform the respiratory input sound into a two-dimensional spectrogram where both spectral and temporal features are well presented. Next, an ensemble of C-DNN and Autoencoder networks is then applied t…

    Submitted 25 December, 2020; originally announced December 2020.

    Comments: 5 pages, 2 figures, 8 tables

  6. Seeing Around Street Corners: Non-Line-of-Sight Detection and Tracking In-the-Wild Using Doppler Radar

    Authors: Nicolas Scheiner, Florian Kraus, Fangyin Wei, Buu Phan, Fahim Mannan, Nils Appenrodt, Werner Ritter, Jürgen Dickmann, Klaus Dietmayer, Bernhard Sick, Felix Heide

    Abstract: Conventional sensor systems record information about directly visible objects, whereas occluded scene components are considered lost in the measurement process. Non-line-of-sight (NLOS) methods try to recover such hidden objects from their indirect reflections - faint signal components, traditionally treated as measurement noise. Existing NLOS approaches struggle to record these low-signal compone…

    Submitted 31 March, 2020; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: First three authors contributed equally; Accepted at CVPR 2020

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 2068-2077

  7. arXiv:1910.10307  [pdf, other]

    cs.LG stat.ML

    Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output

    Authors: Vahdat Abdelzad, Krzysztof Czarnecki, Rick Salay, Taylor Denouden, Sachin Vernekar, Buu Phan

    Abstract: Deep neural networks achieve superior performance in challenging tasks such as image classification. However, deep classifiers tend to incorrectly classify out-of-distribution (OOD) inputs, which are inputs that do not belong to the classifier training distribution. Several approaches have been proposed to detect OOD inputs, but the detection task is still an ongoing challenge. In this paper, we p…

    Submitted 22 October, 2019; originally announced October 2019.

    Comments: 15 pages, 8 figures

  8. arXiv:1904.12220  [pdf, other]

    cs.LG cs.CV stat.ML

    Analysis of Confident-Classifiers for Out-of-distribution Detection

    Authors: Sachin Vernekar, Ashish Gaurav, Taylor Denouden, Buu Phan, Vahdat Abdelzad, Rick Salay, Krzysztof Czarnecki

    Abstract: Discriminatively trained neural classifiers can be trusted, only when the input data comes from the training distribution (in-distribution). Therefore, detecting out-of-distribution (OOD) samples is very important to avoid classification errors. In the context of OOD detection for image classification, one of the recent approaches proposes training a classifier called "confident-classifier" by min…

    Submitted 27 April, 2019; originally announced April 2019.

    Comments: SafeML 2019 ICLR workshop paper

  9. arXiv:1812.02765  [pdf, other]

    cs.LG stat.ML

    Improving Reconstruction Autoencoder Out-of-distribution Detection with Mahalanobis Distance

    Authors: Taylor Denouden, Rick Salay, Krzysztof Czarnecki, Vahdat Abdelzad, Buu Phan, Sachin Vernekar

    Abstract: There is an increasingly apparent need for validating the classifications made by deep learning systems in safety-critical applications like autonomous vehicle systems. A number of recent papers have proposed methods for detecting anomalous image data that appear different from known inlier data samples, including reconstruction-based autoencoders. Autoencoders optimize the compression of input da…

    Submitted 6 December, 2018; originally announced December 2018.

    Comments: 9 pages, 5 figures

  10. arXiv:1811.11210  [pdf, other]

    cs.LG stat.ML

    Calibrating Uncertainties in Object Localization Task

    Authors: Buu Phan, Rick Salay, Krzysztof Czarnecki, Vahdat Abdelzad, Taylor Denouden, Sachin Vernekar

    Abstract: In many safety-critical applications such as autonomous driving and surgical robots, it is desirable to obtain prediction uncertainties from object detection modules to help support safe decision-making. Specifically, such modules need to estimate the probability of each predicted object in a given region and the confidence interval for its bounding box. While recent Bayesian deep learning methods…

    Submitted 27 November, 2018; originally announced November 2018.

  11. arXiv:1805.01955   

    cs.LG stat.ML

    Improve Uncertainty Estimation for Unknown Classes in Bayesian Neural Networks with Semi-Supervised /One Set Classification

    Authors: Buu Phan

    Abstract: Although deep neural network (DNN) has achieved many state-of-the-art results, estimating the uncertainty presented in the DNN model and the data is a challenging task. Problems related to uncertainty such as classifying unknown classes (class which does not appear in the training data) data as known class with high confidence, is critically concerned in the safety domain area (e.g, autonomous dri…

    Submitted 16 May, 2018; v1 submitted 4 May, 2018; originally announced May 2018.

    Comments: Major updates for current version required, especially on section 3 and format