Showing 1–8 of 8 results for author: Kender, J R

  1. arXiv:2207.03554  [pdf, other]

    cs.LG cs.AI

    G2L: A Geometric Approach for Generating Pseudo-labels that Improve Transfer Learning

    Authors: John R. Kender, Bishwaranjan Bhattacharjee, Parijat Dube, Brian Belgodere

    Abstract: Transfer learning is a deep-learning technique that ameliorates the problem of learning when human-annotated labels are expensive and limited. In place of such labels, it uses the previously trained weights from a well-chosen source model as the initial weights for the training of a base model for a new target dataset. We demonstrate a novel but general technique for automatically creating…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: 21 pages, 6 figures

    MSC Class: 68T07

  2. arXiv:1908.07630  [pdf, other]

    cs.LG cs.AI cs.CV

    P2L: Predicting Transfer Learning for Images and Semantic Relations

    Authors: Bishwaranjan Bhattacharjee, John R. Kender, Matthew Hill, Parijat Dube, Siyu Huo, Michael R. Glass, Brian Belgodere, Sharath Pankanti, Noel Codella, Patrick Watson

    Abstract: Transfer learning enhances learning across tasks, by leveraging previously learned representations -- if they are properly chosen. We describe an efficient method to accurately estimate the appropriateness of a previously trained model for use in a new learning task. We use this measure, which we call "Predict To Learn" ("P2L"), in the two very different domains of images and semantic relations, w…

    Submitted 15 October, 2020; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: 10 pages, 8 figures, 4 tables

  3. arXiv:cs/0612139  [pdf]

    cs.SD cs.MM

    Alignment of Speech to Highly Imperfect Text Transcriptions

    Authors: Alexander Haubold, John R. Kender

    Abstract: We introduce a novel and inexpensive approach for the temporal alignment of speech to highly imperfect transcripts from automatic speech recognition (ASR). Transcripts are generated for extended lecture and presentation videos, which in some cases feature more than 30 speakers with different accents, resulting in highly varying transcription qualities. In our approach we detect a subset of phone…

    Submitted 28 December, 2006; originally announced December 2006.

    ACM Class: H.3.1; H.5.1; H.5.5

  4. arXiv:cs/0612138  [pdf]

    cs.SD cs.MM

    Accommodating Sample Size Effect on Similarity Measures in Speaker Clustering

    Authors: Alexander Haubold, John R. Kender

    Abstract: We investigate the symmetric Kullback-Leibler (KL2) distance in speaker clustering and its unreported effects for differently-sized feature matrices. Speaker data is represented as Mel Frequency Cepstral Coefficient (MFCC) vectors, and features are compared using the KL2 metric to form clusters of speech segments for each speaker. We make two observations with respect to clustering based on KL2:…

    Submitted 28 December, 2006; originally announced December 2006.

    ACM Class: H.3.3; H.5.1; H.5.5

  5. arXiv:cs/0501044  [pdf]

    cs.MM cs.IR

    Augmented Segmentation and Visualization for Presentation Videos

    Authors: Alexander Haubold, John R. Kender

    Abstract: We investigate methods of segmenting, visualizing, and indexing presentation videos by separately considering audio and visual data. The audio track is segmented by speaker, and augmented with key phrases which are extracted using an Automatic Speech Recognizer (ASR). The video track is segmented by visual dissimilarities and augmented by representative key frames. An interactive user interface…

    Submitted 20 January, 2005; originally announced January 2005.

    ACM Class: H.2.4; H.3.1

  6. Analysis and Visualization of Index Words from Audio Transcripts of Instructional Videos

    Authors: Alexander Haubold, John R. Kender

    Abstract: We introduce new techniques for extracting, analyzing, and visualizing textual contents from instructional videos of low production quality. Using Automatic Speech Recognition, approximate transcripts (about 75% Word Error Rate) are obtained from the originally highly compressed videos of university courses, each comprising between 10 and 30 lectures. Text material in the form of books or papers that…

    Submitted 27 August, 2004; originally announced August 2004.

    Comments: 2004 IEEE International Workshop on Multimedia Content-based Analysis and Retrieval; 20 pages, 8 figures, 7 tables

    ACM Class: H.3.1; H.3.3; I.2.10

  7. arXiv:cs/0302024  [pdf]

    cs.IR cs.CV

    Analysis and Interface for Instructional Video

    Authors: Alexander Haubold, John R. Kender

    Abstract: We present a new method for segmenting, and a new user interface for indexing and visualizing, the semantic content of extended instructional videos. Using various visual filters, key frames are first assigned a media type (board, class, computer, illustration, podium, and sheet). Key frames of media type board and sheet are then clustered based on contents via an algorithm with near-linear cost…

    Submitted 30 August, 2003; v1 submitted 16 February, 2003; originally announced February 2003.

    Comments: 4 pages, 8 figures, ICME 2003

    ACM Class: H.3.1; H.3.3; I.4.8; I.5.3

    Journal ref: Proceedings of 2003 IEEE International Conference on Multimedia & Expo, Volume II, pages 705-708, July 2003

  8. arXiv:cs/0302023  [pdf]

    cs.IR cs.CV

    Segmentation, Indexing, and Visualization of Extended Instructional Videos

    Authors: Alexander Haubold, John R. Kender

    Abstract: We present a new method for segmenting, and a new user interface for indexing and visualizing, the semantic content of extended instructional videos. Given a series of key frames from the video, we generate a condensed view of the data by clustering frames according to media type and visual similarities. Using various visual filters, key frames are first assigned a media type (board, class, comp…

    Submitted 16 February, 2003; originally announced February 2003.

    Comments: 8 pages, 13 figures

    ACM Class: H.3.1; H.3.3; I.4.8; I.5.3