Showing 51–88 of 88 results for author: Laptev, I

  1. arXiv:1912.06617  [pdf, other]

    cs.CV

    Action Modifiers: Learning from Adverbs in Instructional Videos

    Authors: Hazel Doughty, Ivan Laptev, Walterio Mayol-Cuevas, Dima Damen

    Abstract: We present a method to learn a representation for adverbs from instructional videos using weak supervision from the accompanying narrations. Key to our method is the fact that the visual representation of the adverb is highly dependent on the action to which it applies, although the same adverb will modify multiple actions in a similar way. For instance, while 'spread quickly' and 'mix quickly' wi…

    Submitted 24 March, 2020; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: CVPR 2020

  2. arXiv:1912.06430  [pdf, other]

    cs.CV

    End-to-End Learning of Visual Representations from Uncurated Instructional Videos

    Authors: Antoine Miech, Jean-Baptiste Alayrac, Lucas Smaira, Ivan Laptev, Josef Sivic, Andrew Zisserman

    Abstract: Annotating videos is cumbersome, expensive and not scalable. Yet, many strong video models still rely on manually annotated data. With the recent introduction of the HowTo100M dataset, narrated videos now offer the possibility of learning video representations without manual supervision. In this work we propose a new learning approach, MIL-NCE, capable of addressing misalignments inherent to narra…

    Submitted 23 August, 2020; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: CVPR'2020 Oral

  3. Synthetic Humans for Action Recognition from Unseen Viewpoints

    Authors: Gül Varol, Ivan Laptev, Cordelia Schmid, Andrew Zisserman

    Abstract: Although synthetic training data has been shown to be beneficial for tasks such as human pose estimation, its use for RGB human action recognition is relatively unexplored. Our goal in this work is to answer the question whether synthetic humans can improve the performance of human action recognition, with a particular focus on generalization to unseen viewpoints. We make use of the recent advance…

    Submitted 23 May, 2021; v1 submitted 9 December, 2019; originally announced December 2019.

    Comments: 21 pages

    Journal ref: International Journal of Computer Vision (2021)

  4. arXiv:1908.00722  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Learning to combine primitive skills: A step towards versatile robotic manipulation

    Authors: Robin Strudel, Alexander Pashevich, Igor Kalevatykh, Ivan Laptev, Josef Sivic, Cordelia Schmid

    Abstract: Manipulation tasks such as preparing a meal or assembling furniture remain highly challenging for robotics and vision. Traditional task and motion planning (TAMP) methods can solve complex tasks but require full state observability and are not adapted to dynamic scene changes. Recent learning methods can operate directly on visual inputs but typically require many demonstrations and/or task-specif…

    Submitted 20 June, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

    Comments: ICRA 2020. See the project webpage at https://www.di.ens.fr/willow/research/rlbc/

    Journal ref: IEEE ROBOTICS AND AUTOMATION LETTERS, JULY 2020. 4637-4643

  5. arXiv:1906.03327  [pdf, other]

    cs.CV

    HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips

    Authors: Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, Josef Sivic

    Abstract: Learning text-video embeddings usually requires a dataset of video clips with manually provided captions. However, such datasets are expensive and time consuming to create and therefore difficult to obtain on a large scale. In this work, we propose instead to learn such embeddings from video data with readily available natural language annotations in the form of automatically transcribed narration…

    Submitted 31 July, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: Accepted at ICCV 2019

  6. arXiv:1904.10348  [pdf, other]

    cs.RO cs.CV

    Monte-Carlo Tree Search for Efficient Visually Guided Rearrangement Planning

    Authors: Yann Labbé, Sergey Zagoruyko, Igor Kalevatykh, Ivan Laptev, Justin Carpentier, Mathieu Aubry, Josef Sivic

    Abstract: We address the problem of visually guided rearrangement planning with many movable objects, i.e., finding a sequence of actions to move a set of objects from an initial arrangement to a desired one, while relying on visual inputs coming from an RGB camera. To do so, we introduce a complete pipeline relying on two key contributions. First, we introduce an efficient and scalable rearrangement planni…

    Submitted 1 April, 2020; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: Accepted for publication in IEEE Robotics and Automation Letters (RA-L)

  7. arXiv:1904.09626  [pdf, other]

    cs.CV cs.LG

    Deep Metric Learning Beyond Binary Supervision

    Authors: Sungyeon Kim, Minkyo Seo, Ivan Laptev, Minsu Cho, Suha Kwak

    Abstract: Metric Learning for visual similarity has mostly adopted binary supervision indicating whether a pair of images are of the same class or not. Such a binary indicator covers only a limited subset of image relations, and is not sufficient to represent semantic similarity between images described by continuous and/or structured labels such as object poses, image captions, and scene graphs. Motivated…

    Submitted 21 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019 (Oral)

  8. arXiv:1904.05767  [pdf, other]

    cs.CV

    Learning joint reconstruction of hands and manipulated objects

    Authors: Yana Hasson, Gül Varol, Dimitrios Tzionas, Igor Kalevatykh, Michael J. Black, Ivan Laptev, Cordelia Schmid

    Abstract: Estimating hand-object manipulations is essential for interpreting and imitating human actions. Previous work has made significant progress towards reconstruction of hand poses and object shapes in isolation. Yet, reconstructing hands and objects during manipulation is a more challenging task due to significant occlusions of both the hand and object. While presenting challenges, manipulations may…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: CVPR 2019

  9. Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video

    Authors: Zongmian Li, Jiri Sedlar, Justin Carpentier, Ivan Laptev, Nicolas Mansard, Josef Sivic

    Abstract: In this paper, we introduce a method to automatically reconstruct the 3D motion of a person interacting with an object from a single RGB video. Our method estimates the 3D poses of the person and the object, contact positions, and forces and torques actuated by the human limbs. The main contributions of this work are three-fold. First, we introduce an approach to jointly estimate the motion and th…

    Submitted 17 June, 2019; v1 submitted 4 April, 2019; originally announced April 2019.

  10. arXiv:1903.08225  [pdf, other]

    cs.CV

    Cross-task weakly supervised learning from instructional videos

    Authors: Dimitri Zhukov, Jean-Baptiste Alayrac, Ramazan Gokberk Cinbis, David Fouhey, Ivan Laptev, Josef Sivic

    Abstract: In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: 'pour egg' should be tra…

    Submitted 29 April, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: 18 pages, 17 figures, to be published in proceedings of the CVPR, 2019

  11. arXiv:1903.07740  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning to Augment Synthetic Images for Sim2Real Policy Transfer

    Authors: Alexander Pashevich, Robin Strudel, Igor Kalevatykh, Ivan Laptev, Cordelia Schmid

    Abstract: Vision and learning have made significant progress that could improve robotics policies for complex tasks and environments. Learning deep neural networks for image understanding, however, requires large amounts of domain-specific visual data. While collecting such data from real robots is possible, such an approach limits the scalability as learning policies typically requires thousands of trials.…

    Submitted 30 July, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: 7 pages

    Journal ref: IROS 2019

  12. arXiv:1812.05736  [pdf, other]

    cs.CV

    Detecting unseen visual relations using analogies

    Authors: Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic

    Abstract: We seek to detect visual relations in images of the form of triplets t = (subject, predicate, object), such as "person riding dog", where training examples of the individual entities are available but their combinations are unseen at training. This is an important set-up due to the combinatorial nature of visual relations: collecting sufficient training data for all possible triplets would be ver…

    Submitted 22 September, 2019; v1 submitted 13 December, 2018; originally announced December 2018.

  13. arXiv:1812.02619  [pdf, other]

    cs.CV

    Tube-CNN: Modeling temporal evolution of appearance for object detection in video

    Authors: Tuan-Hung Vu, Anton Osokin, Ivan Laptev

    Abstract: Object detection in video is crucial for many applications. Compared to images, video provides additional cues which can help to disambiguate the detection problem. Our goal in this paper is to learn discriminative models for the temporal evolution of object appearance and to use such models for object detection. To model temporal evolution, we introduce space-time tubes corresponding to temporal…

    Submitted 6 December, 2018; originally announced December 2018.

    Comments: 13 pages, 8 figures, technical report

  14. arXiv:1809.08809  [pdf, other]

    cs.CV

    MobileFace: 3D Face Reconstruction with Efficient CNN Regression

    Authors: Nikolai Chinaev, Alexander Chigorin, Ivan Laptev

    Abstract: Estimation of facial shapes plays a central role for face transfer and animation. Accurate 3D face reconstruction, however, often deploys iterative and costly methods preventing real-time applications. In this work we design a compact and fast CNN model enabling real-time face reconstruction on mobile devices. For this purpose, we first study more traditional but slow morphable face models and use…

    Submitted 24 September, 2018; originally announced September 2018.

    Comments: ECCV Workshops (PeopleCap) 2018

  15. arXiv:1809.08381  [pdf, other]

    cs.CV

    Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

    Authors: Meera Hahn, Nataniel Ruiz, Jean-Baptiste Alayrac, Ivan Laptev, James M. Rehg

    Abstract: Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision. The task is challenging due to the difficulty of bridging the semantic gap between the visual and natural language domains. This paper addresses the task of automatically generating an alignment between a set of instructions and a first person video demonstrating…

    Submitted 22 September, 2018; originally announced September 2018.

  16. arXiv:1806.11328  [pdf, other]

    cs.CV

    A flexible model for training action localization with varying levels of supervision

    Authors: Guilhem Chéron, Jean-Baptiste Alayrac, Ivan Laptev, Cordelia Schmid

    Abstract: Spatio-temporal action detection in videos is typically addressed in a fully-supervised setup with manual annotation of training videos required at every frame. Since such annotation is extremely tedious and prohibits scalability, there is a clear need to minimize the amount of manual supervision. In this work we propose a unifying framework that can handle and combine varying types of less-demand…

    Submitted 27 November, 2018; v1 submitted 29 June, 2018; originally announced June 2018.

  17. arXiv:1806.11008  [pdf, other]

    cs.CV

    Modeling Spatio-Temporal Human Track Structure for Action Localization

    Authors: Guilhem Chéron, Anton Osokin, Ivan Laptev, Cordelia Schmid

    Abstract: This paper addresses spatio-temporal localization of human actions in video. In order to localize actions in time, we propose a recurrent localization network (RecLNet) designed to model the temporal structure of actions on the level of person tracks. Our model is trained to simultaneously recognize and localize action classes in time and is based on two layer gated recurrent units (GRU) applied s…

    Submitted 28 June, 2018; originally announced June 2018.

  18. arXiv:1804.04875  [pdf, other]

    cs.CV

    BodyNet: Volumetric Inference of 3D Human Body Shapes

    Authors: Gül Varol, Duygu Ceylan, Bryan Russell, Jimei Yang, Ersin Yumer, Ivan Laptev, Cordelia Schmid

    Abstract: Human shape estimation is an important task for video editing, animation and fashion industry. Predicting 3D human body shape from natural images, however, is highly challenging due to factors such as variation in human bodies, clothing and viewpoint. Prior methods addressing this problem typically attempt to fit parametric body models with certain priors on pose and shape. In this work we argue f…

    Submitted 18 August, 2018; v1 submitted 13 April, 2018; originally announced April 2018.

    Comments: Appears in: European Conference on Computer Vision 2018 (ECCV 2018). 27 pages

  19. arXiv:1804.02516  [pdf, other]

    cs.CV

    Learning a Text-Video Embedding from Incomplete and Heterogeneous Data

    Authors: Antoine Miech, Ivan Laptev, Josef Sivic

    Abstract: Joint understanding of video and language is an active research area with many applications. Prior work in this domain typically relies on learning text-video embeddings. One difficulty with this approach, however, is the lack of large-scale annotated video-caption datasets for training. To address this issue, we aim at learning text-video embeddings from heterogeneous data sources. To this end, w…

    Submitted 16 January, 2020; v1 submitted 7 April, 2018; originally announced April 2018.

    Comments: The paper had a major update in January 2020 after a bug we found in the codebase that affected many results

  20. arXiv:1707.09472  [pdf, other]

    cs.CV

    Weakly-supervised learning of visual relations

    Authors: Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic

    Abstract: This paper introduces a novel approach for modeling visual relations between pairs of objects. We call relation a triplet of the form (subject, predicate, object) where the predicate is typically a preposition (e.g., 'under', 'in front of') or a verb ('hold', 'ride') that links a pair of objects (subject, object). Learning such relations is challenging as the objects have different spatial configura…

    Submitted 29 July, 2017; originally announced July 2017.

  21. arXiv:1707.09074  [pdf, other]

    cs.CV

    Learning from Video and Text via Large-Scale Discriminative Clustering

    Authors: Antoine Miech, Jean-Baptiste Alayrac, Piotr Bojanowski, Ivan Laptev, Josef Sivic

    Abstract: Discriminative clustering has been successfully applied to a number of weakly-supervised learning tasks. Such applications include person and action recognition, text-to-video alignment, object co-segmentation and colocalization in videos and images. One drawback of discriminative clustering, however, is its limited scalability. We address this issue and propose an online optimization algorithm ba…

    Submitted 27 July, 2017; originally announced July 2017.

    Comments: To appear in ICCV 2017

  22. arXiv:1706.06905  [pdf, other]

    cs.CV

    Learnable pooling with Context Gating for video classification

    Authors: Antoine Miech, Ivan Laptev, Josef Sivic

    Abstract: Current methods for video analysis often extract frame-level features using pre-trained convolutional neural networks (CNNs). Such features are then aggregated over time e.g., by simple temporal averaging or more sophisticated recurrent neural networks such as long short-term memory (LSTM) or gated recurrent units (GRU). In this work we revise existing video representations and study alternative m…

    Submitted 5 March, 2018; v1 submitted 21 June, 2017; originally announced June 2017.

    Comments: Presented at Youtube 8M CVPR17 Workshop. Kaggle Winning model. Under review for TPAMI

  23. arXiv:1702.02738  [pdf, other]

    cs.CV cs.LG

    Joint Discovery of Object States and Manipulation Actions

    Authors: Jean-Baptiste Alayrac, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien

    Abstract: Many human activities involve object manipulations aiming to modify the object state. Examples of common state changes include full/empty bottle, open/closed door, and attached/detached car wheel. In this work, we seek to automatically discover the states of objects and the associated manipulation actions. Given a set of videos for a particular task, we propose a joint model that learns to identif…

    Submitted 28 August, 2017; v1 submitted 9 February, 2017; originally announced February 2017.

    Comments: Appears in: International Conference on Computer Vision 2017 (ICCV 2017). 15 pages

    ACM Class: I.5.1; I.5.4; I.2

  24. Learning from Synthetic Humans

    Authors: Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev, Cordelia Schmid

    Abstract: Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In…

    Submitted 19 January, 2018; v1 submitted 5 January, 2017; originally announced January 2017.

    Comments: Appears in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). 9 pages

  25. arXiv:1609.04331  [pdf, other]

    cs.CV

    ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

    Authors: Vadim Kantorov, Maxime Oquab, Minsu Cho, Ivan Laptev

    Abstract: We aim to localize objects in images using image-level supervision only. Previous approaches to this problem mainly focus on discriminative object regions and often fail to locate precise object boundaries. We address this problem by introducing two types of context-aware guidance models, additive and contrastive models, that leverage their surrounding context regions to improve localization. The…

    Submitted 14 September, 2016; originally announced September 2016.

    Comments: Accepted paper at ECCV2016. The website and code is at http://www.di.ens.fr/willow/research/contextlocnet

  26. arXiv:1607.07429  [pdf, other]

    cs.HC cs.CV

    Much Ado About Time: Exhaustive Annotation of Temporal Data

    Authors: Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta

    Abstract: Large-scale annotated datasets allow AI systems to learn from and build upon the knowledge of the crowd. Many crowdsourcing techniques have been developed for collecting image annotations. These techniques often implicitly rely on the fact that a new input image takes a negligible amount of time to perceive. In contrast, we investigate and determine the most cost-effective way of obtaining high-qu…

    Submitted 2 October, 2016; v1 submitted 25 July, 2016; originally announced July 2016.

    Comments: HCOMP 2016 Camera Ready

  27. The THUMOS Challenge on Action Recognition for Videos "in the Wild"

    Authors: Haroon Idrees, Amir R. Zamir, Yu-Gang Jiang, Alex Gorban, Ivan Laptev, Rahul Sukthankar, Mubarak Shah

    Abstract: Automatically recognizing and localizing wide ranges of human actions has crucial importance for video understanding. Towards this goal, the THUMOS challenge was introduced in 2013 to serve as a benchmark for action recognition. Until then, video action recognition, including THUMOS challenge, had focused primarily on the classification of pre-segmented (i.e., trimmed) videos, which is an artifici…

    Submitted 21 April, 2016; originally announced April 2016.

    Comments: Preprint submitted to Computer Vision and Image Understanding

  28. arXiv:1604.04494  [pdf, other]

    cs.CV

    Long-term Temporal Convolutions for Action Recognition

    Authors: Gül Varol, Ivan Laptev, Cordelia Schmid

    Abstract: Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames failing to model actions at their full temporal extent. In this work we learn video representatio…

    Submitted 2 June, 2017; v1 submitted 15 April, 2016; originally announced April 2016.

  29. arXiv:1604.01753  [pdf, other]

    cs.CV

    Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

    Authors: Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta

    Abstract: Computer vision has a great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to be trained from real and diverse examples of our daily dynamic scenes. While most of such scenes are not particularly exciting, they typically do not appear on YouTube, in movies or TV broadcasts. So h…

    Submitted 26 July, 2016; v1 submitted 6 April, 2016; originally announced April 2016.

  30. arXiv:1511.07917  [pdf, other]

    cs.CV cs.LG

    Context-aware CNNs for person head detection

    Authors: Tuan-Hung Vu, Anton Osokin, Ivan Laptev

    Abstract: Person detection is a key problem for many computer vision tasks. While face detection has reached maturity, detecting people under a full variation of camera view-points, human poses, lighting conditions and occlusions is still a difficult challenge. In this work we focus on detecting human heads in natural scenes. Starting from the recent local R-CNN object detector, we extend it with two types…

    Submitted 24 November, 2015; originally announced November 2015.

    Comments: To appear in International Conference on Computer Vision (ICCV), 2015

  31. arXiv:1506.09215  [pdf, other]

    cs.CV cs.LG

    Unsupervised Learning from Narrated Instruction Videos

    Authors: Jean-Baptiste Alayrac, Piotr Bojanowski, Nishant Agrawal, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien

    Abstract: We address the problem of automatically learning the main steps to complete a certain task, such as changing a car tire, from a set of narrated instruction videos. The contributions of this paper are three-fold. First, we develop a new unsupervised learning approach that takes advantage of the complementary nature of the input video and the associated narration. The method solves two clustering pr…

    Submitted 28 June, 2016; v1 submitted 30 June, 2015; originally announced June 2015.

    Comments: Appears in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016). 21 pages

    ACM Class: I.5.1; I.5.4; I.2

  32. arXiv:1506.03607  [pdf, other]

    cs.CV

    P-CNN: Pose-based CNN Features for Action Recognition

    Authors: Guilhem Chéron, Ivan Laptev, Cordelia Schmid

    Abstract: This work targets human action recognition in video. While recent methods typically represent actions by statistics of local video features, here we argue for the importance of a representation derived from human pose. To this end we propose a new Pose-based Convolutional Neural Network descriptor (P-CNN) for action recognition. The descriptor aggregates motion and appearance information along tra…

    Submitted 23 September, 2015; v1 submitted 11 June, 2015; originally announced June 2015.

    Comments: ICCV, December 2015, Santiago, Chile

  33. arXiv:1505.06027  [pdf, other]

    cs.CV cs.CL

    Weakly-Supervised Alignment of Video With Text

    Authors: Piotr Bojanowski, Rémi Lajugie, Edouard Grave, Francis Bach, Ivan Laptev, Jean Ponce, Cordelia Schmid

    Abstract: Suppose that we are given a set of videos, along with natural language descriptions in the form of multiple sentences (e.g., manual annotations, movie scripts, sport summaries etc.), and that these sentences appear in the same temporal order as their visual counterparts. We propose in this paper a method for aligning the two modalities, i.e., automatically providing a time stamp for every sentence…

    Submitted 21 December, 2015; v1 submitted 22 May, 2015; originally announced May 2015.

    Comments: ICCV 2015 - IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile

  34. arXiv:1505.03825  [pdf, other]

    cs.CV

    Unsupervised Object Discovery and Tracking in Video Collections

    Authors: Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid

    Abstract: This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision. We formulate the problem as a combination of two complementary processes: discovery and tracking. The first one establishes correspondences between prominent regions across videos, and the second one associates successive simila…

    Submitted 14 May, 2015; originally announced May 2015.

  35. arXiv:1408.3304  [pdf, other]

    cs.CV math.OC

    On Pairwise Costs for Network Flow Multi-Object Tracking

    Authors: Visesh Chari, Simon Lacoste-Julien, Ivan Laptev, Josef Sivic

    Abstract: Multi-object tracking has been recently approached with the min-cost network flow optimization techniques. Such methods simultaneously resolve multiple object tracks in a video and enable modeling of dependencies among tracks. Min-cost network flow methods also fit well within the "tracking-by-detection" paradigm where object trajectories are obtained by connecting per-frame outputs of an object d…

    Submitted 5 May, 2015; v1 submitted 14 August, 2014; originally announced August 2014.

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 5537-5545

  36. arXiv:1407.1208  [pdf, other]

    cs.CV cs.LG

    Weakly Supervised Action Labeling in Videos Under Ordering Constraints

    Authors: Piotr Bojanowski, Rémi Lajugie, Francis Bach, Ivan Laptev, Jean Ponce, Cordelia Schmid, Josef Sivic

    Abstract: We are given a set of video clips, each one annotated with an ordered list of actions, such as "walk" then "sit" then "answer phone" extracted from, for example, the associated text script. We seek to temporally localize the individual actions in each clip as well as to learn a discriminative classifier for each action. We formulate the problem as a weakly supervised temporal assignment with…

    Submitted 4 July, 2014; originally announced July 2014.

    Comments: 17 pages, completed version of an ECCV 2014 conference paper

  37. arXiv:1203.6811  [pdf]

    physics.gen-ph

    Thermodynamic Stability of Medium with an Alternating-Sign Pressure

    Authors: V. I. Laptev

    Abstract: The thermodynamic medium is treated as a homogeneous phase equilibrium with a special feature: the pressure of one phase is positive and the pressure of the other is negative. From this point of view such a medium is neither a mixture, a usual solution, nor a gel (or a sol as a whole), regardless of its components (solvent and dissolved substances). To support this statement, the theoretical investigation of the condi…

    Submitted 30 March, 2012; originally announced March 2012.

    Comments: 9 pages, 2 figures

  38. arXiv:1203.6752  [pdf]

    physics.gen-ph physics.bio-ph q-bio.SC

    Living Cell Cytosol Stability to Segregation and Freezing-Out: Thermodynamic aspect

    Authors: Viktor I. Laptev

    Abstract: The cytosol state in a living cell is treated as a homogeneous phase equilibrium with a special feature: the pressure of one phase is positive and the pressure of the other is negative. From this point of view the cytosol is neither a solution nor a gel (or a sol as a whole), regardless of its components (water and dissolved substances). This is its unique capability for selecting, sorting and transporting reag…

    Submitted 30 March, 2012; originally announced March 2012.

    Comments: 5 pages, no figures