Showing 1–22 of 22 results for author: Vora, S

Searching in archive cs.
  1. arXiv:2407.00101  [pdf, other]

    cs.LG cs.AI cs.CC cs.DC cs.NE

    Hybrid Approach to Parallel Stochastic Gradient Descent

    Authors: Aakash Sudhirbhai Vora, Dhrumil Chetankumar Joshi, Aksh Kantibhai Patel

    Abstract: Stochastic Gradient Descent is used for large datasets to train models to reduce the training time. On top of that, data parallelism is widely used as a method to efficiently train neural networks using multiple worker nodes in parallel. Synchronous and asynchronous approaches to data parallelism are used by most systems to train the model in parallel. However, both of them have their drawbacks. We pr…

    Submitted 27 June, 2024; originally announced July 2024.

  2. arXiv:2405.19338  [pdf, other]

    eess.SP cs.AI cs.CV

    Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images

    Authors: Yuzhen Ding, Jason M. Holmes, Hongying Feng, Baoxin Li, Lisa A. McGee, Jean-Claude M. Rwigema, Sujay A. Vora, Daniel J. Ma, Robert L. Foote, Samir H. Patel, Wei Liu

    Abstract: In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D on-board imaging (OBI) is unavailable. But tumor visibility is constrained due to the projection of the patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In a treatment room with 3D-OBI such as cone beam CT (CBCT), the field of view (FOV) of CBCT is limited with unnecessarily high imag…

    Submitted 1 April, 2024; originally announced May 2024.

    Comments: 17 pages, 8 figures and tables

  3. arXiv:2312.00975  [pdf]

    physics.med-ph cs.LG

    Noisy probing dose facilitated dose prediction for pencil beam scanning proton therapy: physics enhances generalizability

    Authors: Lian Zhang, Jason M. Holmes, Zhengliang Liu, Hongying Feng, Terence T. Sio, Carlos E. Vargas, Sameer R. Keole, Kristin Stützer, Sheng Li, Tianming Liu, Jiajian Shen, William W. Wong, Sujay A. Vora, Wei Liu

    Abstract: Purpose: Prior AI-based dose prediction studies in photon and proton therapy often neglect underlying physics, limiting their generalizability to handle outlier clinical cases, especially for pencil beam scanning proton therapy (PBSPT). Our aim is to design a physics-aware and generalizable AI-based PBSPT dose prediction method that has the underlying physics considered to achieve high generalizab…

    Submitted 1 December, 2023; originally announced December 2023.

  4. arXiv:2311.16052  [pdf, other]

    cs.CV

    Exploring Attribute Variations in Style-based GANs using Diffusion Models

    Authors: Rishubh Parihar, Prasanna Balaji, Raghav Magazine, Sarthak Vora, Tejan Karmali, Varun Jampani, R. Venkatesh Babu

    Abstract: Existing attribute editing methods treat semantic attributes as binary, resulting in a single edit per attribute. However, attributes such as eyeglasses, smiles, or hairstyles exhibit a vast range of diversity. In this work, we formulate the task of diverse attribute editing by modeling the multidimensional nature of attribute edits. This enables users to generate multiple plausible edits…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: NeurIPS Workshop on Diffusion Models 2023

  5. arXiv:2310.03874  [pdf, other]

    physics.med-ph cs.CL

    Benchmarking a foundation LLM on its ability to re-label structure names in accordance with the AAPM TG-263 report

    Authors: Jason Holmes, Lian Zhang, Yuzhen Ding, Hongying Feng, Zhengliang Liu, Tianming Liu, William W. Wong, Sujay A. Vora, Jonathan B. Ashman, Wei Liu

    Abstract: Purpose: To introduce the concept of using large language models (LLMs) to re-label structure names in accordance with the American Association of Physicists in Medicine (AAPM) Task Group (TG)-263 standard, and to establish a benchmark for future studies to reference. Methods and Materials: The Generative Pre-trained Transformer (GPT)-4 application programming interface (API) was implemented as…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: 20 pages, 5 figures, 1 table

  6. arXiv:2309.06987  [pdf, other]

    cs.CV

    Instance Adaptive Prototypical Contrastive Embedding for Generalized Zero Shot Learning

    Authors: Riti Paul, Sahil Vora, Baoxin Li

    Abstract: Generalized zero-shot learning (GZSL) aims to classify samples from seen and unseen labels, assuming unseen labels are not accessible during training. Recent advancements in GZSL have been expedited by incorporating contrastive-learning-based (instance-based) embedding in generative networks and leveraging the semantic relationship between data points. However, existing embedding architectures suff…

    Submitted 14 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 7 pages, 4 figures. Accepted at the IJCAI 2023 Workshop on Generalizing from Limited Resources in the Open World

  7. arXiv:2307.02644  [pdf, ps, other]

    cs.IT

    Achievable Rates for Information Extraction from a Strategic Sender

    Authors: Anuj S. Vora, Ankur A. Kulkarni

    Abstract: We consider a setting of non-cooperative communication where a receiver wants to recover randomly generated sequences of symbols that are observed by a strategic sender. The sender aims to maximize an average utility that may not align with the recovery criterion of the receiver, whereby the received signals may not be truthful. We pose this problem as a sequential game between the sender and the…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: Submitted to IEEE Transactions on Information Theory

  8. arXiv:2302.00833  [pdf, other]

    cs.CV cs.LG

    RobustNeRF: Ignoring Distractors with Robust Losses

    Authors: Sara Sabour, Suhani Vora, Daniel Duckworth, Ivan Krasin, David J. Fleet, Andrea Tagliasacchi

    Abstract: Neural radiance fields (NeRF) excel at synthesizing new views given multi-view, calibrated images of a static scene. When scenes include distractors, which are not persistent during image capture (moving objects, lighting variations, shadows), artifacts appear as view-dependent effects or 'floaters'. To cope with distractors, we advocate a form of robust estimation for NeRF training, modeling dist…

    Submitted 1 February, 2023; originally announced February 2023.

  9. arXiv:2204.08744  [pdf, other]

    cs.CV

    Proposal-free Lidar Panoptic Segmentation with Pillar-level Affinity

    Authors: Qi Chen, Sourabh Vora

    Abstract: We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize both semantic segmentation and class-agnostic instance classification in a single network using a pillar-based bird's-eye view representation. The instance classification head learns pairwise affinity between pillars to determine whether the pillars belong to the same instance or not.…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: CVPRW 2022 Workshop on Autonomous Driving

  10. arXiv:2203.03570  [pdf, other]

    cs.CV cs.GR cs.LG

    Kubric: A scalable dataset generator

    Authors: Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, et al. (10 additional authors not shown)

    Abstract: Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details. But collecting, processing and annotating real data at scale is difficult, expensive, and frequently raises additional privacy, fairness and legal concerns. Synthetic data is a powerful tool with the potential…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 21 pages, CVPR2022

  11. arXiv:2111.13260  [pdf, other]

    cs.CV cs.RO

    NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes

    Authors: Suhani Vora, Noha Radwan, Klaus Greff, Henning Meyer, Kyle Genova, Mehdi S. M. Sajjadi, Etienne Pot, Andrea Tagliasacchi, Daniel Duckworth

    Abstract: We present NeSF, a method for producing 3D semantic fields from posed RGB images alone. In place of classical 3D representations, our method builds on recent work in implicit neural scene representations wherein 3D structure is captured by point-wise functions. We leverage this methodology to recover 3D density fields upon which we then train a 3D semantic segmentation model supervised by posed 2D…

    Submitted 2 December, 2021; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: Project website: https://nesf3d.github.io/. Updated with minor edits to text

  12. arXiv:2111.13152  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.RO

    Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

    Authors: Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi

    Abstract: A classical problem in computer vision is to infer a 3D scene representation from few images that can be used to render novel views at interactive rates. Previous work focuses on reconstructing pre-defined 3D representations, e.g. textured meshes, or implicit representations, e.g. radiance fields, and often requires input images with precise camera poses and long processing times for each novel sc…

    Submitted 29 March, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: Accepted to CVPR 2022, Project website: https://srt-paper.github.io/

    Journal ref: CVPR 2022

  13. arXiv:2106.07545  [pdf, other]

    cs.CV cs.RO

    PolarStream: Streaming Lidar Object Detection and Segmentation with Polar Pillars

    Authors: Qi Chen, Sourabh Vora, Oscar Beijbom

    Abstract: Recent works recognized lidars as an inherently streaming data source and showed that the end-to-end latency of lidar perception models can be reduced significantly by operating on wedge-shaped point cloud sectors rather than the full point cloud. However, due to the use of Cartesian coordinate systems, these methods represent the sectors as rectangular regions, wasting memory and compute. In this work…

    Submitted 23 March, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021; code and pretrained models available at https://github.com/motional/polarstream

  14. arXiv:2010.15008  [pdf, ps, other]

    cs.IR cs.GT math.OC

    Optimal Questionnaires for Screening of Strategic Agents

    Authors: Anuj S. Vora, Ankur A. Kulkarni

    Abstract: During the COVID-19 pandemic, the health authorities at airports and train stations try to screen and identify the travellers possibly exposed to the virus. However, many individuals avoid getting tested and hence may misreport their travel history. This is a challenge for the health authorities who wish to ascertain the truly susceptible cases in spite of this strategic misreporting. We investig…

    Submitted 28 October, 2020; originally announced October 2020.

    Comments: Longer version of our paper submitted to ICASSP 2021

    MSC Class: 91A28; 94D99

  15. arXiv:2006.10641  [pdf, ps, other]

    cs.IT cs.GT eess.SY

    Shannon meets Myerson: Information Extraction from a Strategic Sender

    Authors: Anuj S. Vora, Ankur A. Kulkarni

    Abstract: We study a setting where a receiver must design a questionnaire to recover a sequence of symbols known to a strategic sender, whose utility may not be incentive compatible. We allow the receiver the possibility of selecting the alternatives presented in the questionnaire, and thereby linking decisions across the components of the sequence. We show that, despite the strategic sender and the noise in…

    Submitted 15 September, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Submitted to Games and Economic Behaviour

  16. arXiv:1911.10150  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    PointPainting: Sequential Fusion for 3D Object Detection

    Authors: Sourabh Vora, Alex H. Lang, Bassam Helou, Oscar Beijbom

    Abstract: Camera and lidar are important sensor modalities for robotics in general and self-driving cars in particular. The sensors provide complementary information offering an opportunity for tight sensor-fusion. Surprisingly, lidar-only methods outperform fusion methods on the main benchmark datasets, suggesting a gap in the literature. In this work, we propose PointPainting: a sequential fusion method t…

    Submitted 6 May, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: 11 pages, 6 figures, 8 tables. v1 is initial submission to CVPR 2020. v2 is final version accepted for publication at CVPR 2020

  17. arXiv:1907.05324  [pdf, ps, other]

    cs.IT cs.GT math.OC

    Minimax Theorems for Finite Blocklength Lossy Joint Source-Channel Coding over an AVC

    Authors: Anuj S. Vora, Ankur A. Kulkarni

    Abstract: Motivated by applications in the security of cyber-physical systems, we pose the finite blocklength communication problem in the presence of a jammer as a zero-sum game between the encoder-decoder team and the jammer, by allowing the communicating team as well as the jammer only locally randomized strategies. The communicating team's problem is non-convex under locally randomized codes, and hence,…

    Submitted 11 July, 2019; originally announced July 2019.

    Comments: Under review with Problems of Information Transmission

    MSC Class: 94A15; 91A99

  18. arXiv:1903.11027  [pdf, other]

    cs.LG cs.CV cs.RO stat.ML

    nuScenes: A multimodal dataset for autonomous driving

    Authors: Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom

    Abstract: Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine learning base…

    Submitted 5 May, 2020; v1 submitted 26 March, 2019; originally announced March 2019.

    Comments: CVPR 2020 camera ready incl. supplementary material

  19. arXiv:1812.05784  [pdf, other]

    cs.LG cs.CV stat.ML

    PointPillars: Fast Encoders for Object Detection from Point Clouds

    Authors: Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom

    Abstract: Object detection in point clouds is an important aspect of many robotics applications such as autonomous driving. In this paper we consider the problem of encoding a point cloud into a format appropriate for a downstream detection pipeline. Recent literature suggests two types of encoders; fixed encoders tend to be fast but sacrifice accuracy, while encoders that are learned from data are more acc…

    Submitted 6 May, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: 9 pages. v1 is initial submission to CVPR 2019. v2 is final version accepted for publication at CVPR 2019

  20. arXiv:1811.11358  [pdf, other]

    cs.CV

    Future Segmentation Using 3D Structure

    Authors: Suhani Vora, Reza Mahjourian, Soeren Pirk, Anelia Angelova

    Abstract: Predicting the future to anticipate the outcome of events and actions is a critical attribute of autonomous agents; particularly for agents which must rely heavily on real time visual data for decision making. Working towards this capability, we address the task of predicting future frame segmentation from a stream of monocular video by leveraging the 3D structure of the scene. Our framework is ba…

    Submitted 27 November, 2018; originally announced November 2018.

  21. arXiv:1802.02690  [pdf, other]

    cs.CV

    Driver Gaze Zone Estimation using Convolutional Neural Networks: A General Framework and Ablative Analysis

    Authors: Sourabh Vora, Akshay Rangesh, Mohan M. Trivedi

    Abstract: Driver gaze has been shown to be an excellent surrogate for driver attention in intelligent vehicles. With the recent surge of highly autonomous vehicles, driver gaze can be useful for determining the handoff time to a human driver. While there has been significant improvement in personalized driver gaze zone estimation systems, a generalized system which is invariant to different subjects, perspe…

    Submitted 24 April, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

  22. arXiv:1802.00066  [pdf, other]

    cs.CV

    Dynamics of Driver's Gaze: Explorations in Behavior Modeling & Maneuver Prediction

    Authors: Sujitha Martin, Sourabh Vora, Kevan Yuen, Mohan M. Trivedi

    Abstract: The study and modeling of driver's gaze dynamics is important because if and how the driver is monitoring the driving environment is vital for driver assistance in manual mode, for take-over requests in highly automated mode and for semantic perception of the surround in fully autonomous mode. We developed a machine vision based framework to classify driver's gaze into context rich zones of inter…

    Submitted 31 January, 2018; originally announced February 2018.