Search | arXiv e-print repository

arXiv:2403.12092 [pdf, other]

Methods for Matching English Language Addresses

Abstract: Addresses occupy a niche location within the landscape of textual data, due to the positional importance carried by every word, and the geographical scope it refers to. The task of matching addresses happens everyday and is present in various fields like mail redirection, entity resolution, etc. Our work defines, and formalizes a framework to generate matching and mismatching pairs of addresses in the English language, and use it to evaluate various methods to automatically perform address matching. These methods vary widely from distance based approaches to deep learning models. By studying the Precision, Recall and Accuracy metrics of these approaches, we obtain an understanding of the best suited method for this setting of the address matching task.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2311.12521 [pdf, other]

Classification of Tabular Data by Text Processing

Authors: Keshav Ramani, Daniel Borrajo

Abstract: Natural Language Processing technology has advanced vastly in the past decade. Text processing has been successfully applied to a wide variety of domains. In this paper, we propose a novel framework, Text Based Classification (TBC), that uses state of the art text processing techniques to solve classification tasks on tabular data. We provide a set of controlled experiments where we present the benefits of using this approach against other classification methods. Experimental results on several data sets also show that this framework achieves comparable performance to that of several state of the art models in accuracy, precision and recall of predicted classes.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2310.13167 [pdf, other]

Visualizing Causality in Mixed Reality for Manual Task Learning: An Exploratory Study

Authors: Rahul Jain, Jingyu Shi, Andrew Benton, Moiz Rasheed, Hyungjun Doh, Subramanian Chidambaram, Karthik Ramani

Abstract: Mixed Reality (MR) is gaining prominence in manual task skill learning due to its in-situ, embodied, and immersive experience. To teach manual tasks, current methodologies break the task into hierarchies (tasks into subtasks) and visualize the current subtask and future in terms of causality. Existing psychology literature also shows that humans learn tasks by breaking them into hierarchies. In order to understand the design space of information visualized to the learner for better task understanding, we conducted a user study with 48 users. The study was conducted using a complex assembly task, which involves learning of both actions and tool usage. We aim to explore the effect of visualization of causality in the hierarchy for manual task learning in MR by four options: no causality, event level causality, interaction level causality, and gesture level causality. The results show that the user understands and performs best when all the level of causality is shown to the user. Based on the results, we further provide design recommendations and in-depth discussions for future manual task learning systems.

Submitted 31 January, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.13149 [pdf, other]

Understanding Generative AI in Art: An Interview Study with Artists on G-AI from an HCI Perspective

Authors: Jingyu Shi, Rahul Jain, Runlin Duan, Karthik Ramani

Abstract: The emergence of Generative Artificial Intelligence (G-AI) has changed the landscape of creative arts with its power to compose novel artwork and thus brought ethical concerns. Despite the efforts by prior works to address these concerns from technical and societal perspectives, there exists little discussion on this topic from an HCI point of view, considering the artists as human factors. We sought to investigate the impact of G-AI on artists, understanding the relationship between artists and G-AI, in order to motivate the underlying HCI research. We conducted semi-structured interviews with artists ($N=25$) from diverse artistic disciplines involved with G-AI in their artistic creation. We found (1) a dilemma among the artists, (2) a disparity in the understanding of G-AI between the artists and the AI developers, and (3) a tendency to oppose G-AI among the artists. We discuss the future opportunities of HCI research to tackle the problems identified from the interviews.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.10547 [pdf, other]

InfoGCN++: Learning Representation by Predicting the Future for Online Human Skeleton-based Action Recognition

Authors: Seunggeun Chi, Hyung-gun Chi, Qixing Huang, Karthik Ramani

Abstract: Skeleton-based action recognition has made significant advancements recently, with models like InfoGCN showcasing remarkable accuracy. However, these models exhibit a key limitation: they necessitate complete action observation prior to classification, which constrains their applicability in real-time situations such as surveillance and robotic systems. To overcome this barrier, we introduce InfoGCN++, an innovative extension of InfoGCN, explicitly developed for online skeleton-based action recognition. InfoGCN++ augments the abilities of the original InfoGCN model by allowing real-time categorization of action types, independent of the observation sequence's length. It transcends conventional approaches by learning from current and anticipated future movements, thereby creating a more thorough representation of the entire sequence. Our approach to prediction is managed as an extrapolation issue, grounded on observed actions. To enable this, InfoGCN++ incorporates Neural Ordinary Differential Equations, a concept that lets it effectively model the continuous evolution of hidden states. Following rigorous evaluations on three skeleton-based action recognition benchmarks, InfoGCN++ demonstrates exceptional performance in online action recognition. It consistently equals or exceeds existing techniques, highlighting its significant potential to reshape the landscape of real-time action recognition applications.
Consequently, this work represents a major leap forward from InfoGCN, pushing the limits of what's possible in online, skeleton-based action recognition. The code for InfoGCN++ is publicly available at https://github.com/stnoah1/infogcn2 for further exploration and validation.

Submitted 16 October, 2023; originally announced October 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.07127 [pdf, other]

An HCI-Centric Survey and Taxonomy of Human-Generative-AI Interactions

Authors: Jingyu Shi, Rahul Jain, Hyungjun Doh, Ryo Suzuki, Karthik Ramani

Abstract: Generative AI (GenAI) has shown remarkable capabilities in generating diverse and realistic content across different formats like images, videos, and text. In Generative AI, human involvement is essential, thus HCI literature has investigated how to effectively create collaborations between humans and GenAI systems. However, the current literature lacks a comprehensive framework to better understand Human-GenAI Interactions, as the holistic aspects of human-centered GenAI systems are rarely analyzed systematically. In this paper, we present a survey of 291 papers, providing a novel taxonomy and analysis of Human-GenAI Interactions from both human and Gen-AI perspectives. The dimensions of design space include 1) Purposes of Using Generative AI, 2) Feedback from Models to Users, 3) Control from Users to Models, 4) Levels of Engagement, 5) Application Domains, and 6) Evaluation Strategies. Our work is also timely at the current development stage of GenAI, where the Human-GenAI interaction design is of paramount importance. We also highlight challenges and opportunities to guide the design of Gen-AI systems and interactions towards the future design of human-centered Generative AI applications.

Submitted 12 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2306.05562 [pdf, other]

AircraftVerse: A Large-Scale Multimodal Dataset of Aerial Vehicle Designs

Authors: Adam D. Cobb, Anirban Roy, Daniel Elenius, F. Michael Heim, Brian Swenson, Sydney Whittington, James D. Walker, Theodore Bapty, Joseph Hite, Karthik Ramani, Christopher McComb, Susmit Jha

Abstract: We present AircraftVerse, a publicly available aerial vehicle design dataset. Aircraft design encompasses different physics domains and, hence, multiple modalities of representation. The evaluation of these cyber-physical system (CPS) designs requires the use of scientific analytical and simulation models ranging from computer-aided design tools for structural and manufacturing analysis, computational fluid dynamics tools for drag and lift computation, battery models for energy estimation, and simulation models for flight control and dynamics. AircraftVerse contains 27,714 diverse air vehicle designs - the largest corpus of engineering designs with this level of complexity. Each design comprises the following artifacts: a symbolic design tree describing topology, propulsion subsystem, battery subsystem, and other design details; a STandard for the Exchange of Product (STEP) model data; a 3D CAD design using a stereolithography (STL) file format; a 3D point cloud for the shape of the design; and evaluation results from high fidelity state-of-the-art physics models that characterize performance metrics such as maximum flight distance and hover-time. We also present baseline surrogate models that use different modalities of design representation to predict design performance metrics, which we provide as part of our dataset release. Finally, we discuss the potential impact of this dataset on the use of learning in aircraft design and, more generally, in CPS.
AircraftVerse is accompanied by a data card, and it is released under Creative Commons Attribution-ShareAlike (CC BY-SA) license. The dataset is hosted at https://zenodo.org/record/6525446, baseline models and code at https://github.com/SRI-CSL/AircraftVerse, and the dataset description at https://aircraftverse.onrender.com/.

Submitted 8 June, 2023; originally announced June 2023.

Comments: The dataset is hosted at https://zenodo.org/record/6525446, baseline models and code at https://github.com/SRI-CSL/AircraftVerse, and the dataset description at https://aircraftverse.onrender.com/

arXiv:2305.15257 [pdf, other]

doi 10.1016/j.cma.2023.116229

Deep Ritz Method with Adaptive Quadrature for Linear Elasticity

Authors: Min Liu, Zhiqiang Cai, Karthik Ramani

Abstract: In this paper, we study the deep Ritz method for solving the linear elasticity equation from a numerical analysis perspective. A modified Ritz formulation using the $H^{1/2}(Γ_D)$ norm is introduced and analyzed for linear elasticity equation in order to deal with the (essential) Dirichlet boundary condition. We show that the resulting deep Ritz method provides the best approximation among the set of deep neural network (DNN) functions with respect to the "energy" norm. Furthermore, we demonstrate that the total error of the deep Ritz simulation is bounded by the sum of the network approximation error and the numerical integration error, disregarding the algebraic error. To effectively control the numerical integration error, we propose an adaptive quadrature-based numerical integration technique with a residual-based local error indicator. This approach enables efficient approximation of the modified energy functional. Through numerical experiments involving smooth and singular problems, as well as problems with stress concentration, we validate the effectiveness and efficiency of the proposed deep Ritz method with adaptive quadrature.

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2109.03783 [pdf, other]

Egocentric View Hand Action Recognition by Leveraging Hand Surface and Hand Grasp Type

Authors: Sangpil Kim, Jihyun Bae, Hyunggun Chi, Sunghee Hong, Byoung Soo Koh, Karthik Ramani

Abstract: We introduce a multi-stage framework that uses mean curvature on a hand surface and focuses on learning interaction between hand and object by analyzing hand grasp type for hand action recognition in egocentric videos. The proposed method does not require 3D information of objects including 6D object poses which are difficult to annotate for learning an object's behavior while it interacts with hands. Instead, the framework synthesizes the mean curvature of the hand mesh model to encode the hand surface geometry in 3D space. Additionally, our method learns the hand grasp type which is highly correlated with the hand action. From our experiment, we notice that using hand grasp type and mean curvature of hand increases the performance of the hand action recognition.

Submitted 8 September, 2021; originally announced September 2021.

arXiv:2105.09878 [pdf, other]

doi 10.1016/j.addma.2021.102290

Software Compensation of Undesirable Racking Motion of H-frame 3D Printers using Filtered B-Splines

Authors: Nosakhare Edoimioya, Keval S. Ramani, Chinedum E. Okwudire

Abstract: The H-frame (also known as H-Bot) architecture is a simple and elegant two-axis parallel positioning system used to construct the XY stage of 3D printers. It holds potential for high speed and excellent dynamic performance due to the use of frame-mounted motors that reduce the moving mass of the printer while allowing for the use of (heavy) higher torque motors. However, the H-frame's dynamic accuracy is limited during high-acceleration and high-speed motion due to racking -- i.e., parasitic torsional motions of the printer's gantry due to a force couple. Mechanical solutions to the racking problem are either costly or detract from the simplicity of the H-frame. In this paper, we introduce a feedforward software compensation algorithm, based on the filtered B-splines (FBS) method, that rectifies errors due to racking. The FBS approach expresses the motion command to the machine as a linear combination of B-splines. The B-splines are filtered through an identified model of the machine dynamics and the control points of the B-spline based motion command are optimized such that the tracking error is minimized. To compensate racking using the FBS algorithm, an accurate frequency response function of the racking motion is obtained and coupled to the H-frame's x- and y-axis dynamics with a kinematic model. The result is a coupled linear parameter varying model of the H-frame that is utilized in the FBS framework to compensate racking.
An approximation of the proposed racking compensation algorithm, that decouples the x- and y-axis compensation, is developed to significantly improve its computational efficiency with almost no loss of compensation accuracy. Experiments on an H-frame 3D printer demonstrate a 43 percent improvement in the shape accuracy of a printed part using the proposed algorithm compared to the standard FBS approach without racking compensation.

Submitted 20 May, 2021; originally announced May 2021.

Comments: 12 pages, 11 figures, pending journal publication

arXiv:1907.12022 [pdf, other]

DAR-Net: Dynamic Aggregation Network for Semantic Scene Segmentation

Authors: Zongyue Zhao, Min Liu, Karthik Ramani

Abstract: Traditional grid/neighbor-based static pooling has become a constraint for point cloud geometry analysis. In this paper, we propose DAR-Net, a novel network architecture that focuses on dynamic feature aggregation. The central idea of DAR-Net is generating a self-adaptive pooling skeleton that considers both scene complexity and local geometry features. Providing variable semi-local receptive fields and weights, the skeleton serves as a bridge that connect local convolutional feature extractors and a global recurrent feature integrator. Experimental results on indoor scene datasets show advantages of the proposed approach compared to state-of-the-art architectures that adopt static pooling methods.

Submitted 25 December, 2019; v1 submitted 28 July, 2019; originally announced July 2019.

MSC Class: I.2.10 ACM Class: I.2.10

arXiv:1902.09714 [pdf, other]

doi 10.1109/MILCOM.2018.8599774

NAC: Automating Access Control via Named Data

Authors: Zhiyi Zhang, Yingdi Yu, Sanjeev Kaushik Ramani, Alex Afanasyev, Lixia Zhang

Abstract: In this paper we present the design of Name-based Access Control (NAC) scheme, which supports data confidentiality and access control in Named Data Networking (NDN) architecture by encrypting content at the time of production, and by automating the distribution of encryption and decryption keys. NAC achieves the above design goals by leveraging specially crafted NDN naming conventions to define and enforce access control policies, and to automate the cryptographic key management. The paper also explains how NDN's hierarchically structured namespace allows NAC to support fine-grained access control policies, and how NDN's Interest-Data exchange can help NAC to function in case of intermittent connectivity. Moreover, we show that NAC design can be further extended to support Attribute-based Encryption (ABE), which supports access control with additional levels of flexibility and scalability.

Submitted 25 February, 2019; originally announced February 2019.

Comments: Originally published in MILCOM 2018. This version includes some writing improvements

Journal ref: IEEE Military Communications Conference (MILCOM), 2018, 626-633

arXiv:1807.04812 [pdf, other]

Latent Transformations for Object View Points Synthesis

Authors: Sangpil Kim, Nick Winovich, Guang Lin, Karthik Ramani

Abstract: We propose a fully-convolutional conditional generative model, the latent transformation neural network (LTNN), capable of view synthesis using a light-weight neural network suited for real-time applications. In contrast to existing conditional generative models which incorporate conditioning information via concatenation, we introduce a dedicated network component, the conditional transformation unit (CTU), designed to learn the latent space transformations corresponding to specified target views. In addition, a consistency loss term is defined to guide the network toward learning the desired latent space mappings, a task-divided decoder is constructed to refine the quality of generated views, and an adaptive discriminator is introduced to improve the adversarial training process. The generality of the proposed methodology is demonstrated on a collection of three diverse tasks: multi-view reconstruction on real hand depth images, view synthesis of real and synthetic faces, and the rotation of rigid objects. The proposed model is shown to exceed state-of-the-art results in each category while simultaneously achieving a reduction in the computational demand required for inference by 30% on average.

Submitted 28 November, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

arXiv:1712.04426 [pdf, other]

3D Object Classification via Spherical Projections

Authors: Zhangjie Cao, Qixing Huang, Karthik Ramani

Abstract: In this paper, we introduce a new method for classifying 3D objects. Our main idea is to project a 3D object onto a spherical domain centered around its barycenter and develop neural network to classify the spherical projection. We introduce two complementary projections. The first captures depth variations of a 3D object, and the second captures contour-information viewed from different angles. Spherical projections combine key advantages of two main-stream 3D classification methods: image-based and 3D-based. Specifically, spherical projections are locally planar, allowing us to use massive image datasets (e.g., ImageNet) for pre-training. Also spherical projections are similar to voxel-based methods, as they encode complete information of a 3D object in a single neural network capturing dependencies across different views. Our novel network design can fully utilize these advantages. Experimental results on ModelNet40 and ShapeNetCore show that our method is superior to prior methods.

Submitted 12 December, 2017; originally announced December 2017.

arXiv:1703.04079 [pdf, other]

SurfNet: Generating 3D shape surfaces using deep residual networks

Authors: Ayan Sinha, Asim Unmesh, Qixing Huang, Karthik Ramani

Abstract: 3D shape models are naturally parameterized using vertices and faces, i.e., composed of polygons forming a surface. However, current 3D learning paradigms for predictive and generative tasks using convolutional neural networks focus on a voxelized representation of the object. Lifting convolution operators from the traditional 2D to 3D results in high computational overhead with little additional benefit as most of the geometry information is contained on the surface boundary. Here we study the problem of directly generating the 3D shape surface of rigid and non-rigid shapes using deep convolutional neural networks. We develop a procedure to create consistent 'geometry images' representing the shape surface of a category of 3D objects. We then use this consistent representation for category-specific shape surface generation from a parametric representation or an image by developing novel extensions of deep residual networks for the task of geometry image generation. Our experiments indicate that our network learns a meaningful representation of shape surfaces allowing it to interpolate between shape orientations and poses, invent new shape surfaces and reconstruct 3D shape surfaces from previously unseen images.

Submitted 12 March, 2017; originally announced March 2017.

Comments: CVPR 2017 paper

arXiv:1703.01049 [pdf, ps, other]

Deconvolving Feedback Loops in Recommender Systems

Authors: Ayan Sinha, David F. Gleich, Karthik Ramani

Abstract: Collaborative filtering is a popular technique to infer users' preferences on new content based on the collective information of all users preferences. Recommender systems then use this information to make personalized suggestions to users. When users accept these recommendations it creates a feedback loop in the recommender system, and these loops iteratively influence the collaborative filtering algorithm's predictions over time. We investigate whether it is possible to identify items affected by these feedback loops. We state sufficient assumptions to deconvolve the feedback loops while keeping the inverse solution tractable. We furthermore develop a metric to unravel the recommender system's influence on the entire user-item rating matrix. We use this metric on synthetic and real-world datasets to (1) identify the extent to which the recommender system affects the final rating matrix, (2) rank frequently recommended items, and (3) distinguish whether a user's rated item was recommended or an intrinsic preference. Our results indicate that it is possible to recover the ratings matrix of intrinsic user preferences using a single snapshot of the ratings matrix without any temporal information.

Submitted 3 March, 2017; originally announced March 2017.

Comments: Neural Information Processing Systems, 2016

Showing 1–16 of 16 results for author: Ramani, K