Showing 1–29 of 29 results for author: jain, J

Searching in archive cs.

  1. arXiv:2405.05949

    cs.CV

    CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

    Authors: Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen

    Abstract: Recent advancements in Multimodal Large Language Models (LLMs) have focused primarily on scaling by increasing text-image pair data and enhancing LLMs to improve performance on multimodal tasks. However, these scaling approaches are computationally expensive and overlook the significance of improving model capabilities from the vision side. Inspired by the successful applications of Mixture-of-Exp…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2404.07467

    cs.CV

    Trashbusters: Deep Learning Approach for Litter Detection and Tracking

    Authors: Kashish Jain, Manthan Juthani, Jash Jain, Anant V. Nimkar

    Abstract: The illegal disposal of trash is a major public health and environmental concern. Disposing of trash in unplanned places poses serious health and environmental risks. We should try to restrict public trash cans as much as possible. This research focuses on automating the penalization of litterbugs, addressing the persistent problem of littering in public places. Traditional approaches relying on m…

    Submitted 11 April, 2024; originally announced April 2024.

  3. arXiv:2403.18819

    cs.CV

    Benchmarking Object Detectors with COCO: A New Path Forward

    Authors: Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai

    Abstract: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search for an answer,…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Technical report. Dataset website: https://cocorem.xyz and code: https://github.com/kdexd/coco-rem

  4. arXiv:2312.14233

    cs.CV

    VCoder: Versatile Vision Encoders for Multimodal Large Language Models

    Authors: Jitesh Jain, Jianwei Yang, Humphrey Shi

    Abstract: Humans possess the remarkable skill of Visual Perception, the ability to see and understand the seen, helping them make sense of the visual world and, in turn, reason. Multimodal Large Language Models (MLLM) have recently achieved impressive performance on vision-language tasks ranging from visual question-answering and image captioning to visual reasoning and image generation. However, when promp…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Project Page: https://praeclarumjj3.github.io/vcoder/

  5. arXiv:2306.05399

    cs.CV

    Matting Anything

    Authors: Jiachen Li, Jitesh Jain, Humphrey Shi

    Abstract: In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance. MAM offers several significant advantages over previous specialized image matting networks: (i) MAM is capable of dealing with various types of image matting, including se…

    Submitted 16 November, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Project web-page: https://chrisjuniorli.github.io/project/Matting-Anything/

  6. arXiv:2211.06220

    cs.CV

    OneFormer: One Transformer to Rule Universal Image Segmentation

    Authors: Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi

    Abstract: Universal Image Segmentation is not a new concept. Past attempts to unify image segmentation in the last decades include scene parsing, panoptic segmentation, and, more recently, new panoptic architectures. However, such panoptic architectures do not truly unify image segmentation because they need to be trained individually on the semantic, instance, or panoptic segmentation to achieve the best p…

    Submitted 26 December, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: Project Page: https://praeclarumjj3.github.io/oneformer

  7. arXiv:2208.03382

    cs.CV

    Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand

    Authors: Jitesh Jain, Yuqian Zhou, Ning Yu, Humphrey Shi

    Abstract: Deep image inpainting has made impressive progress with recent advances in image generation and processing algorithms. We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures. Structures refer to the generated object boundary or novel geometric structures within the hole, while texture refers to high-frequency details, especially man-mad…

    Submitted 2 December, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

    Comments: Project page at https://praeclarumjj3.github.io/fcf-inpainting/

  8. arXiv:2206.13395

    cs.CV cs.AI

    Gait Cycle Reconstruction and Human Identification from Occluded Sequences

    Authors: Abhishek Paul, Manav Mukesh Jain, Jinesh Jain, Pratik Chattopadhyay

    Abstract: Gait-based person identification from videos captured at surveillance sites using Computer Vision-based techniques is quite challenging since these walking sequences are usually corrupted with occlusion, and a complete cycle of gait is not always available. In this work, we propose an effective neural network-based model to reconstruct the occluded frames in an input sequence before carrying out g…

    Submitted 20 June, 2022; originally announced June 2022.

  9. arXiv:2201.07788

    cs.CV cs.AI cs.LG

    ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes

    Authors: Rahul Sajnani, Adrien Poulenard, Jivitesh Jain, Radhika Dua, Leonidas J. Guibas, Srinath Sridhar

    Abstract: Progress in 3D object understanding has relied on manually canonicalized shape datasets that contain instances with consistent position and orientation (3D pose). This has made it hard to generalize these methods to in-the-wild shapes, e.g., from internet model collections or depth sensors. ConDor is a self-supervised method that learns to Canonicalize the 3D orientation and position for full and p…

    Submitted 14 April, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: Accepted to CVPR 2022, New Orleans, Louisiana. For project page and code, see https://ivl.cs.brown.edu/ConDor/

  10. arXiv:2112.12782

    cs.CV cs.LG

    SeMask: Semantically Masked Transformers for Semantic Segmentation

    Authors: Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

    Abstract: Finetuning a pretrained backbone in the encoder part of an image transformer network has been the traditional approach for the semantic segmentation task. However, such an approach leaves out the semantic context that an image provides during the encoding stage. This paper argues that incorporating semantic information of the image into pretrained hierarchical transformer-based backbones while fin…

    Submitted 13 April, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: Updated experiments with Mix-Transformer (MiT) on ADE20K and added an analysis section

  11. arXiv:2110.12780

    cs.CL

    Battling Hateful Content in Indic Languages HASOC '21

    Authors: Aditya Kadam, Anmol Goel, Jivitesh Jain, Jushaan Singh Kalra, Mallika Subramanian, Manvith Reddy, Prashant Kodali, T. H. Arjun, Manish Shrivastava, Ponnurangam Kumaraguru

    Abstract: The extensive rise in consumption of online social media (OSMs) by a large number of people poses a critical problem of curbing the spread of hateful content on these platforms. With the growing usage of OSMs in multiple languages, the task of detecting and characterizing hate becomes more complex. The subtle variations of code-mixed texts along with switching scripts only add to the complexity. T…

    Submitted 5 November, 2021; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: 12 pages, 6 figures, 2 tables, Accepted at FIRE 2021, CEUR Workshop Proceedings (http://fire.irsi.res.in/fire/2021/home)

  12. What's Kooking? Characterizing India's Emerging Social Network, Koo

    Authors: Asmit Kumar Singh, Chirag Jain, Jivitesh Jain, Rishi Raj Jain, Shradha Sehgal, Tanisha Pandey, Ponnurangam Kumaraguru

    Abstract: Social media has grown exponentially in a short period, coming to the forefront of communications and online interactions. Despite their rapid growth, social media platforms have been unable to scale to different languages globally and remain inaccessible to many. In this paper, we characterize Koo, a multilingual micro-blogging site that rose in popularity in 2021, as an Indian alternative to Twi…

    Submitted 20 January, 2022; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: 9 pages, 11 figures, 6 tables

  13. arXiv:2101.06914

    cs.CY cs.SI

    Capitol (Pat)riots: A comparative study of Twitter and Parler

    Authors: Hitkul, Avinash Prabhu, Dipanwita Guhathakurta, Jivitesh Jain, Mallika Subramanian, Manvith Reddy, Shradha Sehgal, Tanvi Karandikar, Amogh Gulati, Udit Arora, Rajiv Ratn Shah, Ponnurangam Kumaraguru

    Abstract: On 6 January 2021, a mob of right-wing conservatives stormed the USA Capitol Hill interrupting the session of congress certifying 2020 Presidential election results. Immediately after the start of the event, posts related to the riots started to trend on social media. A social media platform which stood out was a free speech endorsing social media platform Parler; it is being claimed as the platfo…

    Submitted 18 January, 2021; originally announced January 2021.

  14. arXiv:2009.09206

    cs.OS cs.AI cs.LG stat.ML

    DEAP Cache: Deep Eviction Admission and Prefetching for Cache

    Authors: Ayush Mangal, Jitesh Jain, Keerat Kaur Guliani, Omkar Bhalerao

    Abstract: Recent approaches for learning policies to improve caching, target just one out of the prefetching, admission and eviction processes. In contrast, we propose an end to end pipeline to learn all three policies using machine learning. We also take inspiration from the success of pretraining on large corpora to learn specialized embeddings for the task. We model prefetching as a sequence prediction t…

    Submitted 19 September, 2020; originally announced September 2020.

  15. arXiv:1808.09964

    cs.LG cs.CV stat.ML

    Semi-Metrification of the Dynamic Time Warping Distance

    Authors: Brijnesh J. Jain

    Abstract: The dynamic time warping (dtw) distance fails to satisfy the triangle inequality and the identity of indiscernibles. As a consequence, the dtw-distance is not warping-invariant, which in turn results in peculiarities in data mining applications. This article converts the dtw-distance to a semi-metric and shows that its canonical extension is warping-invariant. Empirical results indicate that the n…

    Submitted 2 September, 2018; v1 submitted 29 August, 2018; originally announced August 2018.

  16. arXiv:1711.09156

    cs.LG

    Warped-Linear Models for Time Series Classification

    Authors: Brijnesh J. Jain

    Abstract: This article proposes and studies warped-linear models for time series classification. The proposed models are time-warp invariant analogues of linear models. Their construction is in line with time series averaging and extensions of k-means and learning vector quantization to dynamic time warping (DTW) spaces. The main theoretical result is that warped-linear models correspond to polyhedral class…

    Submitted 24 November, 2017; originally announced November 2017.

  17. WebPol: Fine-grained Information Flow Policies for Web Browsers

    Authors: Abhishek Bichhawat, Vineet Rajani, Jinank Jain, Deepak Garg, Christian Hammer

    Abstract: In the standard web browser programming model, third-party scripts included in an application execute with the same privilege as the application's own code. This leaves the application's confidential data vulnerable to theft and leakage by malicious code and inadvertent bugs in the third-party scripts. Security mechanisms in modern browsers (the same-origin policy, cross-origin resource sharing an…

    Submitted 26 June, 2017; v1 submitted 21 June, 2017; originally announced June 2017.

    Comments: ESORICS '17

  18. arXiv:1705.05681

    cs.LG cs.AI stat.ML

    Optimal Warping Paths are unique for almost every Pair of Time Series

    Authors: Brijnesh J. Jain, David Schultz

    Abstract: Update rules for learning in dynamic time warping spaces are based on optimal warping paths between parameter and input time series. In general, optimal warping paths are not unique resulting in adverse effects in theory and practice. Under the assumption of squared error local costs, we show that no two warping paths have identical costs almost everywhere in a measure-theoretic sense. Two direct…

    Submitted 2 March, 2018; v1 submitted 16 May, 2017; originally announced May 2017.

  19. arXiv:1610.04460

    cs.CV math.OC stat.ML

    On the Existence of a Sample Mean in Dynamic Time Warping Spaces

    Authors: Brijnesh J. Jain, David Schultz

    Abstract: The concept of sample mean in dynamic time warping (DTW) spaces has been successfully applied to improve pattern recognition systems and generalize centroid-based clustering algorithms. Its existence has neither been proved nor challenged. This article presents sufficient conditions for existence of a sample mean in DTW spaces. The proposed result justifies prior work on approximate mean algorithm…

    Submitted 5 March, 2018; v1 submitted 14 October, 2016; originally announced October 2016.

  20. arXiv:1604.07711

    stat.ML cs.LG

    Condorcet's Jury Theorem for Consensus Clustering and its Implications for Diversity

    Authors: Brijnesh J. Jain

    Abstract: Condorcet's Jury Theorem has been invoked for ensemble classifiers to indicate that the combination of many classifiers can have better predictive performance than a single classifier. Such a theoretical underpinning is unknown for consensus clustering. This article extends Condorcet's Jury Theorem to the mean partition approach under the additional assumptions that a unique ground-truth partition…

    Submitted 10 October, 2016; v1 submitted 26 April, 2016; originally announced April 2016.

  21. arXiv:1604.06626

    cs.LG cs.CV stat.ML

    The Mean Partition Theorem of Consensus Clustering

    Authors: Brijnesh J. Jain

    Abstract: To devise efficient solutions for approximating a mean partition in consensus clustering, Dimitriadou et al. [3] presented a necessary condition of optimality for a consensus function based on least square distances. We show that their result is pivotal for deriving interesting properties of consensus clustering beyond optimization. For this, we present the necessary condition of optimality in a s…

    Submitted 26 April, 2016; v1 submitted 22 April, 2016; originally announced April 2016.

  22. arXiv:1602.02543

    cs.LG cs.CV

    Homogeneity of Cluster Ensembles

    Authors: Brijnesh J. Jain

    Abstract: The expectation and the mean of partitions generated by a cluster ensemble are not unique in general. This issue poses challenges in statistical inference and cluster stability. In this contribution, we state sufficient conditions for uniqueness of expectation and mean. The proposed conditions show that a unique mean is neither exceptional nor generic. To cope with this issue, we introduce homogen…

    Submitted 8 February, 2016; originally announced February 2016.

  23. arXiv:1511.00871

    cs.CV cs.LG stat.ML

    Properties of the Sample Mean in Graph Spaces and the Majorize-Minimize-Mean Algorithm

    Authors: Brijnesh J. Jain

    Abstract: One of the most fundamental concepts in statistics is the concept of sample mean. Properties of the sample mean that are well-defined in Euclidean spaces become unwieldy or even unclear in graph spaces. Open problems related to the sample mean of graphs include: non-existence, non-uniqueness, statistical inconsistency, lack of convergence results of mean algorithms, non-existence of midpoints, and…

    Submitted 3 November, 2015; originally announced November 2015.

  24. arXiv:1505.08071

    cs.CV math.MG

    Geometry of Graph Edit Distance Spaces

    Authors: Brijnesh J. Jain

    Abstract: In this paper we study the geometry of graph spaces endowed with a special class of graph edit distances. The focus is on geometrical results useful for statistical pattern recognition. The main result is the Graph Representation Theorem. It states that a graph is a point in some geometrical space, called orbit space. Orbit spaces are well investigated and easier to explore than the original graph…

    Submitted 29 May, 2015; originally announced May 2015.

  25. arXiv:1403.2295

    cs.LG cs.CV

    Sublinear Models for Graphs

    Authors: Brijnesh J. Jain

    Abstract: This contribution extends linear models for feature vectors to sublinear models for graphs and analyzes their properties. The results are (i) a geometric interpretation of sublinear classifiers, (ii) a generic learning rule based on the principle of empirical risk minimization, (iii) a convergence theorem for the margin perceptron in the sublinearly separable case, and (iv) the VC-dimension of sub…

    Submitted 10 March, 2014; originally announced March 2014.

  26. arXiv:1204.4294

    cs.LG cs.AI cs.CV

    Learning in Riemannian Orbifolds

    Authors: Brijnesh J. Jain, Klaus Obermayer

    Abstract: Learning in Riemannian orbifolds is motivated by existing machine learning algorithms that directly operate on finite combinatorial structures such as point patterns, trees, and graphs. These methods, however, lack statistical justification. This contribution derives consistency results for learning problems in structured domains and thereby generalizes learning in vector spaces and manifolds.

    Submitted 19 April, 2012; originally announced April 2012.

    Comments: arXiv admin note: substantial text overlap with arXiv:1001.0921

  27. arXiv:1001.0927

    cs.CV

    Accelerating Competitive Learning Graph Quantization

    Authors: Brijnesh J. Jain, Klaus Obermayer

    Abstract: Vector quantization (VQ) is a lossy data compression technique from signal processing for which simple competitive learning is one standard method to quantize patterns from the input space. Extending competitive learning VQ to the domain of graphs results in competitive learning for quantizing input graphs. In this contribution, we propose an accelerated version of competitive learning graph quan…

    Submitted 6 January, 2010; originally announced January 2010.

    Comments: 17 pages; submitted to CVIU

  28. arXiv:1001.0921

    cs.AI

    Graph Quantization

    Authors: Brijnesh J. Jain, Klaus Obermayer

    Abstract: Vector quantization (VQ) is a lossy data compression technique from signal processing, which is restricted to feature vectors and therefore inapplicable for combinatorial structures. This contribution presents a theoretical foundation of graph quantization (GQ) that extends VQ to the domain of attributed graphs. We present the necessary Lloyd-Max conditions for optimality of a graph quantizer and…

    Submitted 6 January, 2010; originally announced January 2010.

    Comments: 24 pages; submitted to CVIU

  29. arXiv:0912.4598

    cs.AI

    Elkan's k-Means for Graphs

    Authors: Brijnesh J. Jain, Klaus Obermayer

    Abstract: This paper extends k-means algorithms from the Euclidean domain to the domain of graphs. To recompute the centroids, we apply subgradient methods for solving the optimization-based formulation of the sample mean of graphs. To accelerate the k-means algorithm for graphs without trading computational time against solution quality, we avoid unnecessary graph distance calculations by exploiting the…

    Submitted 23 December, 2009; originally announced December 2009.

    Comments: 21 pages; submitted to MLJ