Non-Generative Energy Based Models

Authors: Jacob Piland, Christopher Sweet, Priscila Saboia, Charles Vardeman II, Adam Czajka

Abstract: Energy-based models (EBM) have become increasingly popular within computer vision. EBMs bring a probabilistic approach to training deep neural networks (DNN) and have been shown to enhance performance in areas such as calibration, out-of-distribution detection, and adversarial resistance. However, these advantages come at the cost of estimating input data probabilities, usually using a Langevin-based method such as Stochastic Gradient Langevin Dynamics (SGLD), which brings additional computational costs, requires parameterization and caching methods for efficiency, and can run into stability and scaling issues. EBMs use dynamical methods to draw samples from the probability density function (PDF) defined by the current state of the network and compare them to the training data using a maximum log likelihood approach to learn the correct PDF. We propose a non-generative training approach, Non-Generative EBM (NG-EBM), that utilizes the Approximate Mass, identified by Grathwohl et al., as a loss term to direct the training. We show that our NG-EBM training strategy retains many of the benefits of EBM in calibration, out-of-distribution detection, and adversarial resistance, but without the computational complexity and overhead of the traditional approaches. In particular, the NG-EBM approach improves the Expected Calibration Error by a factor of 2.5 for CIFAR10 and 7.5 for CIFAR100, when compared to traditionally trained models.

Submitted 3 April, 2023; originally announced April 2023.

Comments: 12 pages, 4 figures

arXiv:2303.11969

Explain To Me: Salience-Based Explainability for Synthetic Face Detection Models

Authors: Colton Crum, Patrick Tinsley, Aidan Boyd, Jacob Piland, Christopher Sweet, Timothy Kelley, Kevin Bowyer, Adam Czajka

Abstract: The performance of convolutional neural networks has continued to improve over the last decade. At the same time, as model complexity grows, it becomes increasingly difficult to explain model decisions. Such explanations may be of critical importance for reliable operation of human-machine pairing setups, or for model selection when the "best" model among many equally-accurate models must be established. Saliency maps represent one popular way of explaining model decisions by highlighting image regions models deem important when making a prediction. However, examining salience maps at scale is not practical. In this paper, we propose five novel methods of leveraging model salience to explain model behavior at scale. These methods ask: (a) what is the average entropy for a model's salience maps, (b) how does model salience change when fed out-of-set samples, (c) how closely does model salience follow geometrical transformations, (d) what is the stability of model salience across independent training runs, and (e) how does model salience react to salience-guided image degradations. To assess the proposed measures on a concrete and topical problem, we conducted a series of experiments for the task of synthetic face detection with two types of models: those trained traditionally with cross-entropy loss, and those guided by human salience when training to increase model generalizability. These two types of models are characterized by different, interpretable properties of their salience maps, which allows for the evaluation of the correctness of the proposed measures. We offer source code for each measure along with this paper.

Submitted 27 March, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Comments: 13 pages, 10 figures

arXiv:2303.00818

Improving Model's Focus Improves Performance of Deep Learning-Based Synthetic Face Detectors

Authors: Jacob Piland, Adam Czajka, Christopher Sweet

Abstract: Deep learning-based models generalize better to unknown data samples after being guided "where to look" by incorporating human perception into training strategies. We observed that the entropy of the salience of models trained in that way is lower than the salience entropy computed for models trained without human perceptual intelligence. Thus the question: does a further increase of the model's focus, by lowering the entropy of the model's class activation map, help in further increasing the performance? In this paper we propose and evaluate several new entropy-based loss function components controlling the model's focus, covering the full range of the level of such control, from none to its "aggressive" minimization. We show, using a problem of synthetic face detection, that improving the model's focus, through lowering entropy, leads to models that perform better in an open-set scenario, in which the test samples are synthesized by unknown generative models. We also show that optimal performance is obtained when the model's loss function blends three aspects: regular classification, low entropy of the model's focus, and human-guided saliency.

Submitted 1 March, 2023; originally announced March 2023.

Comments: 15 pages, 7 figures

arXiv:1908.01219

On the Veracity of Cyber Intrusion Alerts Synthesized by Generative Adversarial Networks

Authors: Christopher Sweet, Stephen Moskal, Shanchieh Jay Yang

Abstract: Recreating cyber-attack alert data with a high level of fidelity is challenging due to the intricate interaction between features, non-homogeneity of alerts, and potential for rare yet critical samples. Generative Adversarial Networks (GANs) have been shown to effectively learn complex data distributions with the intent of creating increasingly realistic data. This paper presents the application of GANs to cyber-attack alert data and shows that GANs not only successfully learn to generate realistic alerts, but also reveal feature dependencies within alerts. This is accomplished by reviewing the intersection of histograms for varying alert-feature combinations between the ground truth and generated datasets. Traditional statistical metrics, such as conditional and joint entropy, are also employed to verify the accuracy of these dependencies. Finally, it is shown that a Mutual Information constraint on the network can be used to increase the generation of low probability, critical, alert values. By mapping alerts to a set of attack stages it is shown that the output of these low probability alerts has a direct contextual meaning for Cyber Security analysts. Overall, this work provides the basis for generating new cyber intrusion alerts and provides evidence that synthesized alerts emulate critical dependencies from the source dataset.

Submitted 3 August, 2019; originally announced August 2019.

arXiv:1704.04251

Visual Recognition of Paper Analytical Device Images for Detection of Falsified Pharmaceuticals

Authors: Sandipan Banerjee, James Sweet, Christopher Sweet, Marya Lieberman

Abstract: Falsification of medicines is a big problem in many developing countries, where technological infrastructure is inadequate to detect these harmful products. We have developed a set of inexpensive paper cards, called Paper Analytical Devices (PADs), which can efficiently classify drugs based on their chemical composition, as a potential solution to the problem. These cards have different reagents embedded in them which produce a set of distinctive color descriptors upon reacting with the chemical compounds that constitute pharmaceutical dosage forms. If a falsified version of the medicine lacks the active ingredient or includes substitute fillers, the difference in color is perceivable by humans. However, reading the cards with accuracy takes training and practice, which may hamper their scaling and implementation in low resource settings. To deal with this, we have developed an automatic visual recognition system to read the results from the PAD images. At first, the optimal set of reagents was found by running singular value decomposition on the intensity values of the color tones in the card images. A dataset of cards embedded with these reagents is produced to generate the most distinctive results for a set of 26 different active pharmaceutical ingredients (APIs) and excipients. Then, we train two popular convolutional neural network (CNN) models with the card images. We also extract some "hand-crafted" features from the images and train a nearest neighbor classifier and a non-linear support vector machine with them. On testing, higher-level features performed much better in accurately classifying the PAD images, with the CNN models reaching the highest average accuracy of over 94%.

Submitted 13 April, 2017; originally announced April 2017.

Comments: in Proc. IEEE Winter Conference on Applications of Computer Vision (WACV), 2016

arXiv:1008.3591

doi 10.1088/0067-0049/190/1/100

Discoveries from a Near-infrared Proper Motion Survey using Multi-epoch 2MASS Data

Authors: J. Davy Kirkpatrick, Dagny L. Looper, Adam J. Burgasser, Steven D. Schurr, Roc M. Cutri, Michael C. Cushing, Kelle L. Cruz, Anne C. Sweet, Gillian R. Knapp, Travis S. Barman, John J. Bochanski, Thomas L. Roellig, Ian S. McLean, Mark R. McGovern, Emily L. Rice

Abstract: We have conducted a 4030-square-deg near-infrared proper motion survey using multi-epoch data from the Two Micron All-Sky Survey (2MASS). We find 2778 proper motion candidates, 647 of which are not listed in SIMBAD. After comparison to DSS images, we find that 107 of our proper motion candidates lack counterparts at B-, R-, and I-bands and are thus 2MASS-only detections. We present results of spectroscopic follow-up of 188 targets that include the infrared-only sources along with selected optical-counterpart sources with faint reduced proper motions or interesting colors. We also establish a set of near-infrared spectroscopic standards with which to anchor near-infrared classifications for our objects. Among the discoveries are six young field brown dwarfs, five "red L" dwarfs, three L-type subdwarfs, twelve M-type subdwarfs, eight "blue L" dwarfs, and several T dwarfs. We further refine the definitions of these exotic classes to aid future identification of similar objects. We examine their kinematics and find that both the "blue L" and "red L" dwarfs appear to be drawn from a relatively old population. This survey provides a glimpse of the kinds of research that will be possible through time-domain infrared projects such as the UKIDSS Large Area Survey, various VISTA surveys, and WISE, and also through z- or y-band enabled, multi-epoch surveys such as Pan-STARRS and LSST.

Submitted 20 August, 2010; originally announced August 2010.

Comments: To appear in the September 2010 issue of The Astrophysical Journal, Supplement Series

Showing 1–6 of 6 results for author: Sweet, C