Computer Science > Computer Vision and Pattern Recognition
[Submitted on 5 Aug 2017]
Title:A Novel data Pre-processing method for multi-dimensional and non-uniform data
View PDFAbstract:We are in the era of data analytics and data science which is on full bloom. There is abundance of all kinds of data for example biometrics based data, satellite images data, chip-seq data, social network data, sensor based data etc. from a variety of sources. This data abundance is the result of the fact that storage cost is getting cheaper day by day, so people as well as almost all business or scientific organizations are storing more and more data. Most of the real data is multi-dimensional, non-uniform, and big in size, such that it requires a unique pre-processing before analyzing it. In order to make data useful for any kind of analysis, pre-processing is a very important step. This paper presents a unique and novel pre-processing method for multi-dimensional and non-uniform data with the aim of making it uniform and reduced in size without losing much of its value. We have chosen biometric signature data to demonstrate the proposed method as it qualifies for the attributes of being multi-dimensional, non-uniform and big in size. Biometric signature data does not only captures the structural characteristics of a signature but also its behavioral characteristics that are captured using a dynamic signature capture device. These features like pen pressure, pen tilt angle, time taken to sign a document when collected in real-time turn out to be of varying dimensions. This feature data set along with the structural data needs to be pre-processed in order to use it to train a machine learning based model for signature verification purposes. We demonstrate the success of the proposed method over other methods using experimental results for biometric signature data but the same can be implemented for any other data with similar properties from a different domain.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.