-
Development of Machine Vision Approach for Mechanical Component Identification based on its Dimension and Pitch
Authors:
Toshit Jain,
Faisel Mushtaq,
K Ramesh,
Sandip Deshmukh,
Tathagata Ray,
Chandu Parimi,
Praveen Tandon,
Pramod Kumar Jha
Abstract:
In this work, a highly customizable and scalable vision-based system for automating mechanical assembly lines is described. The proposed system calculates the features required to classify and identify the different kinds of bolts used in the assembly line. In addition to identifying bolts and calculating their dimensions, the system introduces a novel method of calculating bolt pitch. The identification and classification system is extremely lightweight and can run on minimal hardware. Its processing time is on the order of milliseconds, so it can be used even when components are moving steadily on a conveyor. The results show that the system correctly identifies the parts in our dataset with 98% accuracy using the calculated features.
Submitted 3 October, 2023;
originally announced October 2023.
-
Towards a statistical theory of data selection under weak supervision
Authors:
Germain Kolossov,
Andrea Montanari,
Pulkit Tandon
Abstract:
Given a sample of size $N$, it is often useful to select a subsample of smaller size $n<N$ to be used for statistical estimation or learning. Such a data selection step is useful to reduce the requirements of data labeling and the computational complexity of learning. We assume to be given $N$ unlabeled samples $\{{\boldsymbol x}_i\}_{i\le N}$, and to be given access to a `surrogate model' that can predict labels $y_i$ better than random guessing. Our goal is to select a subset of the samples, to be denoted by $\{{\boldsymbol x}_i\}_{i\in G}$, of size $|G|=n<N$. We then acquire labels for this set and we use them to train a model via regularized empirical risk minimization.
By using a mixture of numerical experiments on real and synthetic data, and mathematical derivations under low- and high- dimensional asymptotics, we show that: $(i)$~Data selection can be very effective, in particular beating training on the full sample in some cases; $(ii)$~Certain popular choices in data selection methods (e.g. unbiased reweighted subsampling, or influence function-based subsampling) can be substantially suboptimal.
Submitted 4 October, 2023; v1 submitted 25 September, 2023;
originally announced September 2023.
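The abstract above names unbiased reweighted subsampling as one of the popular baselines it evaluates (and finds can be suboptimal). As a minimal sketch of that baseline only, and not of the authors' proposed selection scheme, the Python below keeps sample $i$ independently with probability `probs[i]` and weights each kept sample by `1 / probs[i]`, so that the weighted loss sum is an unbiased estimate of the full-sample sum. The function names, the independent (Poisson) sampling design, and the way probabilities are supplied are all assumptions made for illustration; in practice the probabilities would come from the surrogate model.

```python
import numpy as np


def reweighted_subsample(probs, rng):
    """Keep sample i independently with probability probs[i].

    Returns the kept indices and inverse-probability weights.
    Hypothetical helper for illustration; probs must lie in (0, 1].
    """
    probs = np.asarray(probs, dtype=float)
    keep = rng.random(probs.shape[0]) < probs
    idx = np.flatnonzero(keep)
    return idx, 1.0 / probs[idx]


def weighted_loss_sum(losses, idx, weights):
    """Inverse-probability-weighted estimate of the full-sample loss sum."""
    return float(np.sum(weights * np.asarray(losses, dtype=float)[idx]))
```

Averaged over many random draws, `weighted_loss_sum` matches the sum of losses over all $N$ samples, which is the sense in which this scheme is "unbiased".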
-
Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text
Authors:
Pulkit Tandon,
Shubham Chandak,
Pat Pataranutaporn,
Yimeng Liu,
Anesu M. Mapuranga,
Pattie Maes,
Tsachy Weissman,
Misha Sra
Abstract:
Video represents the majority of internet traffic today, driving a continual race between the generation of higher quality content, transmission of larger file sizes, and the development of network infrastructure. In addition, the recent COVID-19 pandemic fueled a surge in the use of video conferencing tools. Since videos take up considerable bandwidth (~100 Kbps to a few Mbps), improved video compression can have a substantial impact on network performance for live and pre-recorded content, providing broader access to multimedia content worldwide. We present a novel video compression pipeline, called Txt2Vid, which dramatically reduces data transmission rates by compressing webcam videos ("talking-head videos") to a text transcript. The text is transmitted and decoded into a realistic reconstruction of the original video using recent advances in deep learning based voice cloning and lip syncing models. Our generative pipeline achieves two to three orders of magnitude reduction in the bitrate as compared to the standard audio-video codecs (encoders-decoders), while maintaining equivalent Quality-of-Experience based on a subjective evaluation by users (n = 242) in an online study. The Txt2Vid framework opens up the potential for creating novel applications such as enabling audio-video communication during poor internet connectivity, or in remote terrains with limited bandwidth. The code for this work is available at https://github.com/tpulkit/txt2vid.git.
Submitted 2 April, 2022; v1 submitted 26 June, 2021;
originally announced June 2021.
-
CAMBI: Contrast-aware Multiscale Banding Index
Authors:
Pulkit Tandon,
Mariana Afonso,
Joel Sole,
Lukáš Krasula
Abstract:
Banding artifacts are artificially introduced contours arising from the quantization of a smooth region in a video. Despite the advent of higher-quality video systems with more efficient codecs, these artifacts remain conspicuous, especially on larger displays. In this work, a comprehensive subjective study is performed to understand how banding visibility depends on encoding parameters and dithering. We subsequently develop a simple and intuitive no-reference banding index called CAMBI (Contrast-aware Multiscale Banding Index), which uses insights from the Contrast Sensitivity Function of the Human Visual System to predict banding visibility. CAMBI correlates well with subjective perception of banding while using only a few visually motivated hyperparameters.
Submitted 29 January, 2021;
originally announced February 2021.
-
Suppressing Background Radiation Using Poisson Principal Component Analysis
Authors:
P. Tandon,
P. Huggins,
A. Dubrawski,
S. Labov,
K. Nelson
Abstract:
Performance of nuclear threat detection systems based on gamma-ray spectrometry often strongly depends on the ability to identify the part of measured signal that can be attributed to background radiation. We have successfully applied a method based on Principal Component Analysis (PCA) to obtain a compact null-space model of background spectra using PCA projection residuals to derive a source detection score. We have shown the method's utility in a threat detection system using mobile spectrometers in urban scenes (Tandon et al 2012). While it is commonly assumed that measured photon counts follow a Poisson process, standard PCA makes a Gaussian assumption about the data distribution, which may be a poor approximation when photon counts are low. This paper studies whether and in what conditions PCA with a Poisson-based loss function (Poisson PCA) can outperform standard Gaussian PCA in modeling background radiation to enable more sensitive and specific nuclear threat detection.
Submitted 26 May, 2016;
originally announced May 2016.
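The abstract above describes modeling background spectra with PCA and using projection residuals as a source detection score. A minimal sketch of that idea with standard (Gaussian) PCA follows; it does not implement the Poisson PCA variant the paper studies, and the function names, the squared-residual score, and the array layout (one spectrum per row) are assumptions made for illustration.

```python
import numpy as np


def fit_background_pca(background, n_components):
    """Fit a PCA model to background spectra (rows = measurements).

    Returns the mean spectrum and the top principal directions.
    Hypothetical helper; uses standard PCA via SVD.
    """
    background = np.asarray(background, dtype=float)
    mean = background.mean(axis=0)
    _, _, vt = np.linalg.svd(background - mean, full_matrices=False)
    return mean, vt[:n_components]


def residual_score(spectrum, mean, components):
    """Squared norm of the part of a spectrum the background model cannot explain."""
    centered = np.asarray(spectrum, dtype=float) - mean
    reconstruction = components.T @ (components @ centered)
    return float(np.sum((centered - reconstruction) ** 2))
```

A spectrum that resembles the background is reconstructed well and scores near zero, while a spectrum containing an extra component outside the background subspace leaves a larger residual. Any alarm threshold on the score would have to be calibrated on held-out background data.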
-
Computer Aided Design Modeling for Heterogeneous Objects
Authors:
Vikas Gupta,
K. S. Kasana,
Puneet Tandon
Abstract:
Heterogeneous object design has been an active research area in recent years. Conventional CAD modeling approaches provide only the geometry and topology of an object; they contain no information about its materials and so cannot be used to fabricate heterogeneous objects (HO) through rapid prototyping. Current research focuses on computer-aided design issues in heterogeneous object design. A new CAD modeling approach is proposed that integrates material information into geometric regions and thus models the material distributions in the heterogeneous object. Gradient references are used to represent heterogeneous objects with complex geometry, which combine geometric intricacy with accurate material distributions. The gradient references allow flexible manipulation and control of heterogeneous objects, which guarantees local control over the gradient regions of the developed objects. A systematic approach to data flow, processing, computer visualization, and slicing of heterogeneous objects for rapid prototyping is also presented.
Submitted 20 April, 2010;
originally announced April 2010.