-
Dynamic network analysis improves protein 3D structural classification
Authors:
Khalique Newaz,
Jacob Piland,
Patricia L. Clark,
Scott J. Emrich,
Jun Li,
Tijana Milenkovic
Abstract:
Protein structural classification (PSC) is a supervised problem of assigning proteins into pre-defined structural (e.g., CATH or SCOPe) classes based on the proteins' sequence or 3D structural features. We recently proposed PSC approaches that model protein 3D structures as protein structure networks (PSNs) and analyze PSN-based protein features, which performed better than or comparable to state-of-the-art sequence or other 3D structure-based approaches in the task of PSC. However, existing PSN-based PSC approaches model the whole 3D structure of a protein as a static PSN. Because folding of a protein is a dynamic process, where some parts of a protein fold before others, modeling the 3D structure of a protein as a dynamic PSN might further help improve the existing PSC performance. Here, we propose for the first time a way to model 3D structures of proteins as dynamic PSNs, with the hypothesis that this will improve upon the current state-of-the-art PSC approaches that are based on static PSNs (and thus upon the existing state-of-the-art sequence and other 3D structural approaches). Indeed, we confirm this on 71 datasets spanning ~44,000 protein domains from CATH and SCOPe.
Submitted 14 May, 2021;
originally announced May 2021.
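The static versus dynamic PSN distinction in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that residues are nodes, that two residues are linked when their representative atoms lie within a distance cutoff, and that a dynamic PSN is a series of snapshots built from growing sequence prefixes. The function names, the 6.0 cutoff, and the prefix step are hypothetical choices, not the authors' published parameters.

```python
from itertools import combinations
from math import dist


def build_psn(coords, cutoff=6.0):
    """Static PSN as an edge set: link residues i < j whose points lie within `cutoff`."""
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(coords), 2)
        if dist(a, b) <= cutoff
    }


def build_dynamic_psn(coords, step=2, cutoff=6.0):
    """Dynamic PSN as a list of snapshots over growing sequence prefixes (assumed scheme)."""
    sizes = list(range(step, len(coords), step)) + [len(coords)]
    return [build_psn(coords[:k], cutoff) for k in sizes]


# Toy coordinates (hypothetical), one 3D point per residue in sequence order.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]
snapshots = build_dynamic_psn(coords)
for k, edges in enumerate(snapshots):
    print(f"snapshot {k}: {len(edges)} edges")
```

The last snapshot equals the static PSN of the whole structure, so under these assumptions the dynamic representation contains the static one and adds the order in which contacts appear.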
-
AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild
Authors:
Daniel Joska,
Liam Clark,
Naoya Muramatsu,
Ricardo Jericevich,
Fred Nicolls,
Alexander Mathis,
Mackenzie W. Mathis,
Amir Patel
Abstract:
Animals are capable of extreme agility, yet understanding their complex dynamics, which have ecological, biomechanical and evolutionary implications, remains challenging. Being able to study this incredible agility will be critical for the development of next-generation autonomous legged robots. In particular, the cheetah (Acinonyx jubatus) is supremely fast and maneuverable, yet quantifying its whole-body 3D kinematic data during locomotion in the wild remains a challenge, even with new deep learning-based methods. In this work we present an extensive dataset of free-running cheetahs in the wild, called AcinoSet, that contains 119,490 frames of multi-view synchronized high-speed video footage, camera calibration files and 7,588 human-annotated frames. We utilize markerless animal pose estimation to provide 2D keypoints. Then, we use three methods that serve as strong baselines for 3D pose estimation tool development: traditional sparse bundle adjustment, an Extended Kalman Filter, and a trajectory optimization-based method we call Full Trajectory Estimation. The resulting 3D trajectories, human-checked 3D ground truth, and an interactive tool to inspect the data are also provided. We believe this dataset will be useful for a diverse range of fields such as ecology, neuroscience, robotics, biomechanics as well as computer vision.
Submitted 24 March, 2021;
originally announced March 2021.
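As a rough illustration of the filtering idea behind one of the baselines named in the abstract above, the sketch below runs a plain linear Kalman filter with a constant-velocity model on a single coordinate. This is a deliberately simplified stand-in, not the Extended Kalman Filter used for AcinoSet, which would involve a nonlinear camera and skeleton model. The function name and all noise parameters are hypothetical.

```python
def kalman_cv_1d(measurements, dt=1.0, q=1e-4, r=1.0):
    """Filter 1D position measurements with a constant-velocity model.

    Returns the final (position, velocity) estimate. `q` is the process
    noise added to the covariance diagonal, `r` the measurement variance.
    """
    p, v = 0.0, 0.0                                  # state estimate
    p00, p01, p10, p11 = 100.0, 0.0, 0.0, 100.0      # covariance, vague prior
    for z in measurements:
        # Predict: x <- F x, P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        p = p + v * dt
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        p00, p01, p10, p11 = n00, n01, n10, n11
        # Update with a position-only measurement (H = [1, 0]).
        s = p00 + r
        k0, k1 = p00 / s, p10 / s
        y = z - p
        p += k0 * y
        v += k1 * y
        p00, p01, p10, p11 = (
            (1 - k0) * p00,
            (1 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
    return p, v


# Hypothetical noise-free track moving 2 units per frame.
measurements = [2.0 * k for k in range(50)]
pos, vel = kalman_cv_1d(measurements)
print(pos, vel)
```

The velocity is never measured directly; the filter infers it from successive positions, which is the property that makes this family of filters useful for smoothing keypoint trajectories.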
-
Network approach integrates 3D structural and sequence data to improve protein structural comparison
Authors:
Fazle E. Faisal,
Julie L. Chaney,
Khalique Newaz,
Jun Li,
Scott J. Emrich,
Patricia L. Clark,
Tijana Milenkovic
Abstract:
Initial protein structural comparisons were sequence-based. Since amino acids that are distant in the sequence can be close in the 3-dimensional (3D) structure, 3D contact approaches can complement sequence approaches. Traditional 3D contact approaches study 3D structures directly. Instead, 3D structures can be modeled as protein structure networks (PSNs). Then, network approaches can compare proteins by comparing their PSNs. Network approaches may improve upon traditional 3D contact approaches. We cannot use existing PSN approaches to test this, because they: 1) rely on naive measures of network topology, 2) are not robust to PSN size, 3) cannot integrate multiple PSN measures, and 4) cannot integrate PSN data with sequence data, although such integration could help because the different data types capture complementary biological knowledge. We address these limitations by: 1) exploiting well-established graphlet measures via a new network approach, 2) introducing normalized graphlet measures to remove the bias of PSN size, 3) allowing for integrating multiple PSN measures, and 4) using ordered graphlets to combine the complementary PSN data and sequence data. We compare both synthetic networks and real-world PSNs more accurately and faster than existing network, 3D contact, or sequence approaches. Our approach finds PSN patterns that may be biochemically interesting.
Submitted 27 February, 2017; v1 submitted 23 May, 2016;
originally announced May 2016.
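The idea of normalizing graphlet measures to remove the bias of PSN size, mentioned in the abstract above, can be sketched on the smallest case. The example counts the two connected 3-node graphlets (induced paths and triangles) and reports them as fractions of the total. This is a simplified illustration under assumed definitions, not the authors' normalized or ordered graphlet measures; the function name is hypothetical.

```python
from itertools import combinations


def three_node_graphlet_freqs(edges):
    """Return relative frequencies of induced 3-node paths and triangles."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    paths = triangles = 0
    # Brute force over node triples: fine for a sketch, too slow for large PSNs.
    for a, b, c in combinations(sorted(adj), 3):
        k = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if k == 3:
            triangles += 1
        elif k == 2:
            paths += 1
    total = paths + triangles
    if total == 0:
        return {"path": 0.0, "triangle": 0.0}
    return {"path": paths / total, "triangle": triangles / total}


# A triangle with one pendant node, and two disjoint copies of it.
small = [(0, 1), (1, 2), (0, 2), (2, 3)]
large = small + [(u + 4, v + 4) for u, v in small]
freqs_small = three_node_graphlet_freqs(small)
freqs_large = three_node_graphlet_freqs(large)
print(freqs_small, freqs_large)
```

Raw counts double when the network is duplicated, while the normalized frequencies stay the same, which is the kind of size independence the abstract refers to.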