Showing 1–15 of 15 results for author: Dickens, C

  1. arXiv:2404.01182  [pdf, other]

    cs.CL cs.SC

    A Neuro-Symbolic Approach to Monitoring Salt Content in Food

    Authors: Anuja Tayal, Barbara Di Eugenio, Devika Salunke, Andrew D. Boyd, Carolyn A Dickens, Eulalia P Abril, Olga Garcia-Bedoya, Paula G Allen-Meares

    Abstract: We propose a dialogue system that enables heart failure patients to inquire about salt content in foods and helps them monitor and reduce salt intake. Addressing the lack of specific datasets for food-based salt content inquiries, we develop a template-based conversational dataset. The dataset is structured to ask clarification questions to identify food items and their salt content. Our findings i…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted in CL4Health workshop in LREC-COLING'24

  2. arXiv:2401.09651  [pdf, other]

    cs.LG cs.AI math.OC

    Convex and Bilevel Optimization for Neuro-Symbolic Inference and Learning

    Authors: Charles Dickens, Changyu Gao, Connor Pryor, Stephen Wright, Lise Getoor

    Abstract: We leverage convex and bilevel optimization techniques to develop a general gradient-based parameter learning framework for neural-symbolic (NeSy) systems. We demonstrate our framework with NeuPSL, a state-of-the-art NeSy architecture. To achieve this, we propose a smooth primal and dual formulation of NeuPSL inference and show learning gradients are functions of the optimal dual variables. Additi…

    Submitted 3 June, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  3. arXiv:2312.15520  [pdf, other]

    cs.LG

    Graph Coarsening via Convolution Matching for Scalable Graph Neural Network Training

    Authors: Charles Dickens, Eddie Huang, Aishwarya Reganti, Jiong Zhu, Karthik Subbian, Danai Koutra

    Abstract: Graph summarization as a preprocessing step is an effective and complementary technique for scalable graph neural network (GNN) training. In this work, we propose the Coarsening Via Convolution Matching (CONVMATCH) algorithm and a highly scalable variant, A-CONVMATCH, for creating summarized graphs that preserve the output of graph convolution. We evaluate CONVMATCH on six real-world link predicti…

    Submitted 24 December, 2023; originally announced December 2023.

  4. arXiv:2312.08981  [pdf, other]

    cs.DS cs.DM

    Matching Noisy Keys for Obfuscation

    Authors: Charlie Dickens, Eric Bax

    Abstract: Data sketching has emerged as a key infrastructure for large-scale data analysis on streaming and distributed data. Merging sketches enables efficient estimation of cardinalities and frequency histograms over distributed data. However, merging sketches can require that each sketch stores hash codes for identifiers in different data sets or partitions, in order to perform effective matching. This c…

    Submitted 14 December, 2023; originally announced December 2023.

  5. arXiv:2305.09887  [pdf, other]

    cs.LG cs.DC

    Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model Aggregation

    Authors: Jiong Zhu, Aishwarya Reganti, Edward Huang, Charles Dickens, Nikhil Rao, Karthik Subbian, Danai Koutra

    Abstract: Distributed training of GNNs enables learning on massive graphs (e.g., social and e-commerce networks) that exceed the storage and computational capacity of a single machine. To reach performance comparable to centralized training, distributed frameworks focus on maximally recovering cross-instance node dependencies with either communication across instances or periodic fallback to centralized tra…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: 14 pages, 3 figures

  6. arXiv:2207.07238  [pdf, other]

    cs.LG cs.CL

    Emotion Recognition in Conversation using Probabilistic Soft Logic

    Authors: Eriq Augustine, Pegah Jandaghi, Alon Albalak, Connor Pryor, Charles Dickens, William Wang, Lise Getoor

    Abstract: Creating agents that can both appropriately respond to conversations and understand complex human linguistic tendencies and social cues has been a long-standing challenge in the NLP community. A recent pillar of research revolves around emotion recognition in conversation (ERC), a sub-field of emotion recognition that focuses on conversations or dialogues that contain two or more utterances. In th…

    Submitted 14 July, 2022; originally announced July 2022.

  7. arXiv:2205.14268  [pdf, other]

    cs.LG

    NeuPSL: Neural Probabilistic Soft Logic

    Authors: Connor Pryor, Charles Dickens, Eriq Augustine, Alon Albalak, William Wang, Lise Getoor

    Abstract: In this paper, we introduce Neural Probabilistic Soft Logic (NeuPSL), a novel neuro-symbolic (NeSy) framework that unites state-of-the-art symbolic reasoning with the low-level perception of deep neural networks. To model the boundary between neural and symbolic representations, we propose a family of energy-based models, NeSy Energy-Based Models, and show that they are general enough to include N…

    Submitted 23 May, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

  8. arXiv:2203.15400  [pdf, ps, other]

    cs.DS

    Order-Invariant Cardinality Estimators Are Differentially Private

    Authors: Charlie Dickens, Justin Thaler, Daniel Ting

    Abstract: We consider privacy in the context of streaming algorithms for cardinality estimation. We show that a large class of algorithms all satisfy $ε$-differential privacy, so long as (a) the algorithm is combined with a simple down-sampling procedure, and (b) the cardinality of the input stream is $Ω(k/ε)$. Here, $k$ is a certain parameter of the sketch that is always at most the sketch size in bits, bu…

    Submitted 3 February, 2023; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Changed title and updated with camera ready version from conference

  9. arXiv:2101.07546  [pdf, other]

    cs.DS cs.CC

    Subspace exploration: Bounds on Projected Frequency Estimation

    Authors: Graham Cormode, Charlie Dickens, David P. Woodruff

    Abstract: Given an $n \times d$ dimensional dataset $A$, a projection query specifies a subset $C \subseteq [d]$ of columns which yields a new $n \times |C|$ array. We study the space complexity of computing data analysis functions over such subspaces, including heavy hitters and norms, when the subspaces are revealed only after observing the data. We show that this important class of problems is typically…

    Submitted 19 January, 2021; originally announced January 2021.

  10. arXiv:2011.03607  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Ridge Regression with Frequent Directions: Statistical and Optimization Perspectives

    Authors: Charlie Dickens

    Abstract: Despite its impressive theory & practical performance, Frequent Directions (FD) has not been widely adopted for large-scale regression tasks. Prior work has shown randomized sketches (i) perform worse in estimating the covariance matrix of the data than FD; (ii) incur high error when estimating the bias and/or variance on sketched ridge regression. We give the first constant…

    Submitted 6 November, 2020; originally announced November 2020.

  11. arXiv:2009.08952  [pdf, other]

    cs.IR cs.LG stat.ML

    HyperFair: A Soft Approach to Integrating Fairness Criteria

    Authors: Charles Dickens, Rishika Singh, Lise Getoor

    Abstract: Recommender systems are being employed across an increasingly diverse set of domains that can potentially make a significant social and individual impact. For this reason, considering fairness is a critical step in the design and evaluation of such systems. In this paper, we introduce HyperFair, a general framework for enforcing soft fairness constraints in a hybrid recommender system. HyperFair m…

    Submitted 5 September, 2020; originally announced September 2020.

  12. arXiv:2008.01505  [pdf, other]

    cs.LG stat.ML

    Interpretable Anomaly Detection with Mondrian Pólya Forests on Data Streams

    Authors: Charlie Dickens, Eric Meissner, Pablo G. Moreno, Tom Diethe

    Abstract: Anomaly detection at scale is an extremely challenging problem of great practicality. When data is large and high-dimensional, it can be difficult to detect which observations do not fit the expected behaviour. Recent work has coalesced on variations of (random) $k$d-trees to summarise data for anomaly detection. However, these methods rely on ad-hoc score functions that are not easy to int…

    Submitted 4 August, 2020; originally announced August 2020.

  13. arXiv:2003.11498  [pdf, other]

    cs.LG stat.ML

    Similarity of Neural Networks with Gradients

    Authors: Shuai Tang, Wesley J. Maddox, Charlie Dickens, Tom Diethe, Andreas Damianou

    Abstract: A suitable similarity index for comparing learnt neural networks plays an important role in understanding the behaviour of the highly-nonlinear functions, and can provide insights on further theoretical analysis and empirical studies. We define two key steps when comparing models: firstly, the representation abstracted from the learnt model, where we propose to leverage both feature vectors and gr…

    Submitted 25 March, 2020; originally announced March 2020.

  14. arXiv:1910.14166  [pdf, other]

    cs.LG cs.DS stat.ML

    Iterative Hessian Sketch in Input Sparsity Time

    Authors: Graham Cormode, Charlie Dickens

    Abstract: Scalable algorithms to solve optimization and regression tasks, even approximately, are needed to work with large datasets. In this paper we study efficient techniques from matrix sketching to solve a variety of convex constrained regression problems. We adopt "Iterative Hessian Sketching" (IHS) and show that the fast CountSketch and sparse Johnson-Lindenstrauss Transforms yield state-of-the-art ac…

    Submitted 30 October, 2019; originally announced October 2019.

  15. arXiv:1807.02571  [pdf, other]

    cs.DS

    Leveraging Well-Conditioned Bases: Streaming & Distributed Summaries in Minkowski $p$-Norms

    Authors: Graham Cormode, Charlie Dickens, David P. Woodruff

    Abstract: Work on approximate linear algebra has led to efficient distributed and streaming algorithms for problems such as approximate matrix multiplication, low rank approximation, and regression, primarily for the Euclidean norm $\ell_2$. We study other $\ell_p$ norms, which are more robust for $p < 2$, and can be used to find outliers for $p > 2$. Unlike previous algorithms for such norms, we give algor…

    Submitted 6 July, 2018; originally announced July 2018.