-
Mask-up: Investigating Biases in Face Re-identification for Masked Faces
Authors:
Siddharth D Jaiswal,
Ankit Kr. Verma,
Animesh Mukherjee
Abstract:
AI-based Face Recognition Systems (FRSs) are now widely distributed and deployed as MLaaS solutions all over the world, more so since the COVID-19 pandemic, for tasks ranging from validating individuals' faces while buying SIM cards to surveillance of citizens. Extensive biases have been reported against marginalized groups in these systems and have led to highly discriminatory outcomes. The post-pandemic world has normalized wearing face masks, but FRSs have not kept up with the changing times. As a result, these systems are susceptible to mask-based face occlusion. In this study, we audit four commercial and nine open-source FRSs for the task of face re-identification between different varieties of masked and unmasked images across five benchmark datasets (total 14,722 images). These simulate a realistic validation/surveillance task as deployed in all major countries around the world. Three of the commercial and five of the open-source FRSs are highly inaccurate; they further perpetuate biases against non-White individuals, with the lowest accuracy being 0%. A survey for the same task with 85 human participants also results in a low accuracy of 40%. Thus a human-in-the-loop moderation in the pipeline does not alleviate the concerns, as has been frequently hypothesized in literature. Our large-scale study shows that developers, lawmakers and users of such services need to rethink the design principles behind FRSs, especially for the task of face re-identification, taking cognizance of observed biases.
Submitted 21 February, 2024;
originally announced February 2024.
-
Auditing Gender Analyzers on Text Data
Authors:
Siddharth D Jaiswal,
Ankit Kumar Verma,
Animesh Mukherjee
Abstract:
AI models have become extremely popular and accessible to the general public. However, they are continuously under the scanner due to their demonstrable biases toward various sections of the society like people of color and non-binary people. In this study, we audit three existing gender analyzers -- uClassify, Readable and HackerFactor -- for biases against non-binary individuals. These tools are designed to predict only the cisgender binary labels, which leads to discrimination against non-binary members of the society. We curate two datasets -- Reddit comments (660k) and Tumblr posts (2.05M) -- and our experimental evaluation shows that the tools are highly inaccurate, with the overall accuracy being ~50% on all platforms. Predictions for non-binary comments on all platforms are mostly female, thus propagating the societal bias that non-binary individuals are effeminate. To address this, we fine-tune a BERT multi-label classifier on the two datasets in multiple combinations, and observe an overall performance of ~77% on the most realistically deployable setting and a surprisingly higher performance of 90% for the non-binary class. We also audit ChatGPT using zero-shot prompts on a small dataset (due to high pricing) and observe an average accuracy of 58% for Reddit and Tumblr combined (with overall better results for Reddit).
Thus, we show that existing systems, including highly advanced ones like ChatGPT, are biased and need better audits and moderation, and that such societal biases can be addressed and alleviated through simple off-the-shelf models like BERT trained on more gender-inclusive datasets.
Submitted 9 October, 2023;
originally announced October 2023.
-
A One class Classifier based Framework using SVDD : Application to an Imbalanced Geological Dataset
Authors:
Soumi Chaki,
Akhilesh Kumar Verma,
Aurobinda Routray,
William K. Mohanty,
Mamata Jenamani
Abstract:
Evaluation of a hydrocarbon reservoir requires classification of petrophysical properties from the available dataset. However, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of the subsurface physical properties. In this context, the present study proposes a generalized one-class classification framework based on Support Vector Data Description (SVDD) to classify a reservoir characteristic, water saturation, into two classes (Class high and Class low) from four logs, namely gamma ray, neutron porosity, bulk density, and P sonic, using an imbalanced dataset. A comparison is carried out between the proposed framework and different supervised classification algorithms in terms of G-metric means and execution time. Experimental results show that the proposed framework outperforms the other classifiers in terms of these performance evaluators. It is envisaged that the classification analysis performed in this study will be useful in further reservoir modeling.
Submitted 2 December, 2016;
originally announced December 2016.
-
A Novel Framework based on SVDD to Classify Water Saturation from Seismic Attributes
Authors:
Soumi Chaki,
Akhilesh Kumar Verma,
Aurobinda Routray,
William K. Mohanty,
Mamata Jenamani
Abstract:
Water saturation is an important property in the reservoir engineering domain. Thus, satisfactory classification of water saturation from seismic attributes is beneficial for reservoir characterization. However, the diverse and non-linear nature of subsurface attributes makes the classification task difficult. In this context, this paper proposes a generalized Support Vector Data Description (SVDD) based novel classification framework to classify water saturation into two classes (Class high and Class low) from three seismic attributes: seismic impedance, amplitude envelope, and seismic sweetness. G-metric means and program execution time are used to quantify the performance of the proposed framework along with established supervised classifiers. The documented results imply that the proposed framework is superior to existing classifiers. The present study is envisioned to contribute to further reservoir modeling.
Submitted 2 December, 2016;
originally announced December 2016.
-
Well Tops Guided Prediction of Reservoir Properties using Modular Neural Network Concept A Case Study from Western Onshore, India
Authors:
Soumi Chaki,
Akhilesh K Verma,
Aurobinda Routray,
William K Mohanty,
Mamata Jenamani
Abstract:
This paper proposes a complete framework consisting of pre-processing, modeling, and post-processing stages to carry out well tops guided prediction of a reservoir property (sand fraction) from three seismic attributes (seismic impedance, instantaneous amplitude, and instantaneous frequency) using the concept of modular artificial neural network (MANN). The data set used in this study, comprising three seismic attributes and well log data from eight wells, is acquired from a western onshore hydrocarbon field of India. Firstly, the acquired data set is integrated and normalized. Then, well log analysis and segmentation of the total depth range into three different units (zones) separated by well tops are carried out. Secondly, three different networks are trained corresponding to the three zones using the combined data set of seven wells, and the trained networks are then validated using the remaining test well. The target property of the test well is predicted using the three tuned networks corresponding to the three zones, and the estimated values obtained from the three networks are then concatenated to represent the predicted log along the complete depth range of the test well. The application of multiple simpler networks instead of a single one improves the prediction accuracy in terms of performance metrics such as correlation coefficient, root mean square error, absolute error mean and program execution time.
Submitted 23 September, 2015;
originally announced September 2015.
-
Quantification of sand fraction from seismic attributes using Neuro-Fuzzy approach
Authors:
Akhilesh K Verma,
Soumi Chaki,
Aurobinda Routray,
William K Mohanty,
Mamata Jenamani
Abstract:
In this paper, we illustrate the modeling of a reservoir property (sand fraction) from seismic attributes, namely seismic impedance, seismic amplitude, and instantaneous frequency, using a Neuro-Fuzzy (NF) approach. The input dataset includes 3D post-stacked seismic attributes and six well logs acquired from a hydrocarbon field located on the western coast of India. The presence of thin sand and shale layers in the basin area makes the modeling of reservoir characteristics a challenging task. Though seismic data is helpful in extrapolating reservoir properties away from boreholes, it can be challenging to delineate thin sand and shale reservoirs using seismic data due to its limited resolvability. Therefore, it is important to develop state-of-the-art intelligent methods for calibrating a nonlinear mapping between seismic data and target reservoir variables. Neural networks have shown their potential to model such nonlinear mappings; however, uncertainties associated with the model and datasets are still a concern. Hence, the introduction of Fuzzy Logic (FL) is beneficial for handling these uncertainties. More specifically, hybrid variants of Artificial Neural Network (ANN) and fuzzy logic, i.e., NF methods, are capable of modeling reservoir characteristics by integrating the explicit knowledge representation power of FL with the learning ability of neural networks. The documented results in this study demonstrate acceptable resemblance between target and predicted variables, and hence encourage the application of integrated machine learning approaches such as Neuro-Fuzzy in the reservoir characterization domain. Furthermore, visualization of the variation of sand probability in the study area would assist in identifying placement of potential wells for future drilling operations.
Submitted 23 September, 2015;
originally announced September 2015.
-
Evolving Social Networks via Friend Recommendations
Authors:
Amit Kumar Verma,
Manjish Pal
Abstract:
A social network grows over a period of time with the formation of new connections and relations. In recent years we have witnessed a massive growth of online social networks like Facebook, Twitter etc. So it has become a problem of extreme importance to know the destiny of these networks. Thus predicting the evolution of a social network is a question of extreme importance. A good model for the evolution of a social network can help in understanding the properties responsible for the changes occurring in a network structure. In this paper we propose such a model for the evolution of social networks. We model the social network as an undirected graph where nodes represent people and edges represent the friendship between them. We define the evolution process as a set of rules which closely resembles how a social network grows in real life. We simulate the evolution process and show how, starting from an initial network, a network evolves using this model. We also discuss how our model can be used to model various complex social networks other than online social networks, like political networks, various organizations etc.
Submitted 24 September, 2015; v1 submitted 17 September, 2015;
originally announced September 2015.
-
Agent enabled Mining of Distributed Protein Data Banks
Authors:
G. S. Bhamra,
A. K. Verma,
R. B. Patel
Abstract:
Mining biological data is an emergent area at the intersection between bioinformatics and data mining (DM). The intelligent agent based model is a popular approach in constructing Distributed Data Mining (DDM) systems to address scalable mining over large scale distributed data. The nature of associations between different amino acids in proteins has also been a subject of great anxiety. There is a strong need to develop new models and exploit and analyze the available distributed biological data sources. In this study, we have designed and implemented a multi-agent system (MAS) called Agent enriched Quantitative Association Rules Mining for Amino Acids in distributed Protein Data Banks (AeQARM-AAPDB). Such globally strong association rules enhance understanding of protein composition and are desirable for synthesis of artificial proteins. A real protein data bank is used to validate the system.
Submitted 19 June, 2015;
originally announced September 2015.
-
Green WSN- Optimization of Energy Use Through Reduction in Communication Workload
Authors:
Vandana Jindal,
A. K. Verma,
Seema Bawa
Abstract:
Applications of Wireless Sensor Networks (WSNs) are growing day by day due to the ease of deployment, reduction in costs to affordable levels and versatility of these networks. Besides developing advanced micro fabrication technologies, means are being devised to reduce energy consumption to bring the network setup and operational costs down. With increasing applications, the amount of energy consumed in these networks is enormous. Even a small saving in energy consumption will result in huge benefits globally. The bulk of the energy is consumed in the communication activity of these networks. It is our endeavor to optimize this activity to make these networks energy efficient, thereby reducing their impact on the overall environment in line with the principle Go Green.
Submitted 16 June, 2015;
originally announced June 2015.
-
Secure Multipath Routing Scheme using Key Pre-Distribution in Wireless Sensor Networks
Authors:
Kamal Kumar,
A. K. Verma,
R. B. Patel
Abstract:
Multipath routing in WSNs has long been desired in security scenarios where next-hop nodes may be targeted for compromise. Many multipath routing schemes have been proposed for ad hoc networks, but most ignore the constraints of the keying environment. In WSNs, where crucial data is reported by nodes in the deployment area to their securely located sink, route security has to be guaranteed. Under dynamic load and selective attacks, the availability of multiple secure paths is a boon and increases the attacker's effort manyfold. We propose to build a subset of neighbors as our front towards the destination node. We also identify forwarders for queries from the base station. The front is optimally calculated to maintain the security credentials and provide multiple paths. To our knowledge, ours is the first secure multipath routing protocol for WSNs. We establish the effectiveness of our proposal with mathematical analysis.
Submitted 12 August, 2014;
originally announced August 2014.
-
A Review of Power Aware Routing Protocols in Wireless Sensor Networks
Authors:
Sukhchandan Randhawa,
Anil Kumar Verma
Abstract:
WSNs are envisioned to consist of many small devices that can sense the environment and communicate the data as required. The most critical requirement for widespread sensor networks is power efficiency since battery replacement is not viable. Many protocols are proposed to minimize the power consumption by using complex algorithms. However, it is difficult to perform these complex methods since an individual sensor node in sensor networks does not have high computational capacity. On the other hand, many sensor nodes should transfer the data packet to the sink node that collects the required data. Therefore, the operations of the sensor nodes over the route are terminated. It is difficult to deliver the data packet to the sink node even if some sensor nodes are active. In this paper, an introduction to WSNs is presented with a deep insight into power-aware routing protocols for sensor networks. The protocols considered are LEACH, VGA and PEGASIS. In addition, a comparison of these protocols is also presented.
Submitted 5 April, 2014;
originally announced April 2014.
-
List Sort: A New Approach for Sorting List to Reduce Execution Time
Authors:
Adarsh Kumar Verma,
Prashant Kumar
Abstract:
In this paper we propose a new sorting algorithm, List Sort, which is based on dynamic memory allocation. We also compare various efficient sorting techniques with List Sort. Due to the dynamic nature of List Sort, it is much faster than some conventional comparison sorting techniques and comparable to Quick Sort and Merge Sort. List Sort takes advantage of data that is already sorted in either ascending or descending order.
Submitted 26 October, 2013;
originally announced October 2013.
-
Implementation of DYMO routing protocol
Authors:
Anuj K. Gupta,
Harsh Sadawarti,
Anil K. Verma
Abstract:
Mobile ad hoc networks communicate without any fixed infrastructure or any centralized domain. All the nodes are free to move randomly within the network and share information dynamically. To achieve efficient routing, various protocols have been developed so far, which vary in their nature and have their own salient properties. In this paper, we discuss one of the latest protocols, i.e. the Dynamic MANET On-demand (DYMO) routing protocol, implement it, and analyse its performance against other similar protocols over different parameters. Finally, a comparison between all of them is presented.
Submitted 6 June, 2013;
originally announced June 2013.