Encoding Hierarchical Schema via Concept Flow for
Multifaceted Ideology Detection
Abstract
Multifaceted ideology detection (MID) aims to detect the ideological leanings of texts towards multiple facets. Previous studies on ideology detection mainly focus on one generic facet and ignore label semantics and explanatory descriptions of ideologies, which are a kind of instructive information and reveal the specific concepts of ideologies. In this paper, we develop a novel concept semantics-enhanced framework for the MID task. Specifically, we propose a bidirectional iterative concept flow (BICo) method to encode multifaceted ideologies. BICo enables the concepts to flow across levels of the schema tree and enriches concept representations with multi-granularity semantics. Furthermore, we explore concept attentive matching and concept-guided contrastive learning strategies to guide the model to capture ideology features with the learned concept semantics. Extensive experiments on the benchmark dataset show that our approach achieves state-of-the-art performance in MID, including in the cross-topic scenario.111 The source code is available at https://github.com/LST1836/BICo
Encoding Hierarchical Schema via Concept Flow for
Multifaceted Ideology Detection
Songtao Liu1, Bang Wang1,11footnotemark: 1, Wei Xiang1, Han Xu2 and Minghua Xu2,††thanks: Corresponding author: B. Wang and M. Xu 1School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China 2School of Journalism and Information Communication, Huazhong University of Science and Technology, Wuhan, China {liusongtao, wangbang, xiangwei, xuh, xuminghua}@hust.edu.cn
1 Introduction
Multifaceted ideology detection (MID) aims to identify the ideological leanings (e.g., Left, Center, Right, etc.) expressed in texts towards multiple facets, as shown in Figure 1. It is crucial for understanding public opinion and detecting potential extremism (Kannangara, 2018; Grover and Mark, 2019; Demszky et al., 2019), which is helpful for governments and cybersecurity organizations (Stefanov et al., 2020; Aldera et al., 2021). It can also facilitate downstream research and applications in social sciences (Kabir and Madria, 2022).
In most of related work, researches generally focus on modeling the text content with diversified cues, such as sentiment polarities (Bhatia and P, 2018; Kabir and Madria, 2022), named entities (Liu et al., 2022), and discourse structure (Devatine et al., 2023; Hong et al., 2023), or jointly learning with other related tasks (Baly et al., 2019). There are also approaches that incorporate information sources beyond text to facilitate ideology mining. Hyperlink structure (Kulkarni et al., 2018), social networks (Stefanov et al., 2020; Li and Goldwasser, 2021), external knowledge from knowledge graphs (Zhang et al., 2022) as well as information from other modalities (Qiu et al., 2022), are introduced in the task of ideology detection.
![Refer to caption](x1.png)
Although achieving promising performance, those methods limit the ideology prediction to a generic facet. In other words, they only label a text as ideologically left- or right-leaning as a whole, regardless whether the text containing one or more different facets. Furthermore, they ignore a crucial clue, label semantics, that is, what exactly does an ideology mean? In this case, ideological categories are represented as one-hot vectors without any semantic information, and models can only rely on the training data distribution to analyze latent ideology features, which could be unfavorable for the generalization ability of models (Wang et al., 2021; Wen and Hauptmann, 2023).
So how can we effectively detect multifaceted ideologies? And what exactly does an ideology mean? Liu et al. (2023) propose the multifaceted ideology detection task for the first time and design a multifaceted ideology schema which contains 12 facets covering 5 domains in a tree-like hierarchical structure (see Figure 1, details in Appendix A). Each facet, as well as the ideological attributes under each facet, are defined using natural texts, which can be regarded as concepts. These concepts describe the meaning of facets and ideologies, thus making it natural to represent the label semantics. In addition, in the hierarchical schema, higher-level concepts (like Domain and Facet) have general semantics shared by their child concepts, while lower-level concepts (like Ideology and Facet) describe their parents from various views, which can be seen as the semantic divisions of higher-level concepts. This meaningful hierarchical structure can be utilized to enrich the concept semantics.
Based on the motivation above, to incorporate the concept semantics and leverage the hierarchical structure of the schema in MID, we propose a novel Bidirectional Iterative Concept Flow (BICo) method to encode the hierarchical schema. Specifically, BICo allows concepts to flow in two directions on the schema tree, enabling them to perceive both high-level general semantics and low-level specific perspectives. On the one hand, inspired by the relation rotation in complex space (Sun et al., 2018), we design Concept Metapath Diffusion to perform message passing from root to leaf. On the other hand, in the direction of leaf to root, we propose Concept Hierarchy Aggregation to aggregate concept semantics in lower levels to the ones in higher levels based on the parent-child relation. Concept flow in the two directions is iterated multiple times and the final concept representations are enriched by multi-granularity semantics. For example, the Facet representations capture the meanings of different ideologies in the corresponding facet, while the Ideology representations also perceive information about the Facet and Domain they belong to. We match the text and Facet representations based on the attention mechanism to recognize text-related facets. Furthermore, we explore a Concept-Guided Contrastive Learning strategy to learn more distinguishable text representations under the guidance of Ideology concepts.
The main contributions of our work are summarized as follows:
(1) We propose a concept semantics-enhanced MID framework. To our best knowledge, this is the first work that incorporates label semantics and explanatory descriptions in the MID task.
(2) We propose a Bidirectional Iterative Concept Flow (BICo) method to encode the hierarchical schema. Concepts flow on the schema tree in two directions iteratively to capture multi-granularity concept semantics.
(3) We design Concept Attentive Matching and Concept-Guided Contrastive Learning strategies to enable the model to extract ideology features with the help of concept semantics.
(4) Extensive experiments on the MITweet benchmark demonstrate the effectiveness of our approach, including in the cross-topic scenario.
![Refer to caption](x2.png)
2 Task Description
Given an input text and a set of facets, Multifaceted Ideology Detection (MID) is divided into two sub-tasks: (1) Relevance Recognition aims to recognize the facets that the text is related to; (2) Ideology Analysis predicts which ideology the text holds towards the related facets. Formally, a sample instance can be considered as a triple , where is the input text, Related, Unrelated represents the relevance label of -th facet, is the number of given facets. For each facet that the text is related to, we have an ideology label Left, Center, Right, is the number of related facets.
3 Approach
In this section, we first introduce the proposed Bidirectional Iterative Concept Flow (BICo) for encoding the hierarchical schema, and then discuss how we augment multifaceted ideology detection based on the learned concept encodings. Figure 2 illustrates the overall structure of our model.
3.1 Bidirectional Iterative Concept Flow
3.1.1 Concept Hierarchy Tree
Liu et al. (2023) define the first hierarchical schema of multifaceted ideology, which contains 12 facets covering 5 domains. We construct a concept hierarchy tree based on the schema, as shown in Figure 2. The node set contains four types of nodes, i.e., Root, Domain, Facet and Ideology. The edge set indicates the subordination relation between nodes. The Ideology, or leaf, nodes represent the three ideologies (Left, Center, Right) of each facet.
To initialize node embeddings in the concept hierarchy tree, we leverage the concepts of facets and ideologies in the schema. Specifically, we adopt a pre-trained language model as the concept encoder and feed the concepts of facets and ideologies in the schema into the encoder. We then extract the hidden state of [CLS] token as initial representations of Facet and Ideology nodes, i.e., and . For Root and Domain nodes, we obtain their initial embeddings ( and ) by average-pooling their child node embeddings.
3.1.2 Concept Metapath Diffusion
In the concept hierarchy tree, higher-level nodes (like Root and Domain) have general and abstract concepts, which are shared by their child nodes and could be beneficial for enriching the representations of lower-level nodes (like Ideology and Facet). In order to allow lower-level nodes to perceive higher-level abstract semantics, we adopt the relation rotation in complex space (Sun et al., 2018), which is effective for information transfer along edges in a sequential structure.
Specifically, we define concept metapath as a path from root to leaf . Given node representations in a metapath , let be the representation of edge between node and , the concept metapath diffusion from root to leaf through relation rotation is formulated as:
(1) | ||||
(2) | ||||
(3) |
where , and are all complex vectors, is the updated embedding, is the element-wise complex product and performs vector rotation in complex space. Here we can easily interpret a real vector of dimension as a complex vector of dimension by treating the first half of the vector as the real part and the second half as the imaginary part. We perform concept diffusion on all metapaths in the tree and is shared between each two consecutive levels of nodes. Note that in relation rotation, edges represent the rotation angles of vectors in complex space. Therefore, is first randomly initialized in the range of , and then its real and imaginary parts are obtained by the Euler’s formula.
3.1.3 Concept Hierarchy Aggregation
In contrast to metapath diffusion, concept hierarchy aggregation enables concept flow from leaf to root. In the concept hierarchy tree, child nodes describe their parent node from different views, and thus can be regarded as more fine-grained concepts. Through concept hierarchy aggregation, we aggregate concept semantics of child nodes to their parent node, so as to enrich the representations of higher-level nodes.
We utilize the graph attention network (GAT), which aggregates features through attention mechanism in a graph. Considering the characteristics of concept hierarchy tree, we modify it to explicitly model the hierarchical structure and quantitatively measure the compatibility between hierarchies in the tree. Specifically, we only establish aggregation between the parent node and its own child nodes, which is different from aggregating over all one-hop neighbors in GAT. Additionally, we use different attention parameters at different levels to distinguish the aggregation features of each hierarchy.
Formally, for a parent node with embedding , we compute an aggregation weight for each child node and then weighted sum all child nodes’ embeddings:
(4) | ||||
(5) | ||||
(6) |
where is the learnable parameter for aggregation of nodes in level , is the child node set of , is the updated representation for .
3.1.4 Bidirectional Iteration
The root-to-leaf metapath diffusion and leaf-to-root hierarchy aggregation are iterated multiple times to update node encodings. Finally, The new generated concept representations can be fully aware of higher-level general semantics and constructed with concepts from different aspects. Next we will enhance the MID task with the enriched Facet and Ideology representations, and .
3.2 Concept-Enhanced MID
3.2.1 Text Encoder
We select a pre-trained language model as the text encoder. In the subtask of Relevance Recognition, the encoder processes input sequence and outputs a hidden representation for each token: , where is the length of text. For Ideology Analysis, we concatenate the text and its related facet concept, and then feed the sequence into text encoder to acquire the hidden state of [CLS] as text representation .
3.2.2 Concept Attentive Matching
In Relevance Recognition subtask, to enable the text to be aware of label semantics (i.e., Facet concepts) and measure the importance of each token in relevance feature extraction, we adopt the cross-attention mechanism (Vaswani et al., 2017) to match the Facet and input token representations:
(7) |
where is the dimension of vectors in the equation, the superscript represents -th facet, is -th facet representation and is -th facet-aware text representation.
3.2.3 Concept-Guided Contrastive Learning
To inject label semantics (i.e., Ideology concepts) into Ideology Analysis subtask, we further explore a Concept-Guided Contrastive Learning strategy (CGCL), which tries to make intra-ideology representations more compact in the feature space and inter-ideology ones more distinguishable with the ideology concepts as anchors. The motivation is that ideology concepts describe the general meaning of ideologies. In the embedding space, this property can be interpreted as clustering, where an ideology concept anchor is the semantic center of samples with that ideological category.
Specifically, given text representations in a batch ( is the batch size), and three Ideology representations (corresponding to Left, Center and Right respectively) which will be used as concept anchors in the vector space, the concept-guided contrastive loss is formulated as:
(8) | |||
(9) |
where is the ideology label of , is the cosine similarity function, is temperature parameter. Note that is computed for each facet, and we omit the facet superscript for clarity.
3.2.4 Classification and Training
Considering the varying ideology features among different facets, we set up a classification head with a softmax function for each facet in both subtasks:
(10) |
where the superscript represents -th facet, and are trainable parameters.
Note that in Relevance Recognition, we also incorporate contrastive learning (CL), which is similar to the concept-guided CL in Sec. 3.2.3, but the anchors here are text representations themselves:
(11) |
where is the facet-aware text representation, , is the relevance label of , is the batch size, is temperature parameter. Here is also computed for each facet, and we omit the facet superscript for clarity.
Finally, the training loss of both subtasks is the weighted sum of cross-entropy classification loss and contrastive learning loss across all facets:
(12) |
where is the cross-entropy loss of -th facet, is a hyper-parameter controlling the weight of contrastive loss, is the total number of facets.
4 Experiments
4.1 Dataset and Evaluation Metrics
We conduct experiments on the MITweet (Liu et al., 2023) dataset, which contains 12,594 English tweets and covers 14 highly controversial topics in recent years. Each instance in MITweet is annotated with a relevance label and an ideology label (if the relevance label is “Related”) for each of the 12 facets in the multifaceted ideology schema. The statistics of MITweet is shown in Table 6.
We follow the original training/validation/test split and use the same evaluation metrics as Liu et al. (2023). First we calculate the Accuracy (Acc) and F1 score for each facet. Then we utilize both Macro and Micro methods to integrate metrics from all facets to obtain overall results of model performance. Macro-F1 and Macro-Acc are calculated by averaging F1 and Acc across all facets. Micro-F1 and Micro-Acc are the aggregated F1 and Acc scores obtained by concatenating the predictions of all facets. Note that, following existing work, we only report F1-related metrics for Relevance Recognition due to the highly imbalanced data distribution in this subtask.
4.2 Implement Details
The pre-trained BERTweet-base (Nguyen et al., 2020) is used as the concept and text encoder, and the two encoders share weights as this gave better results in preliminary experiments. We train the Relevance Recognition model and the Ideology Analysis model independently. Each model includes the BICo module and is trained end-to-end. We use AdamW (Loshchilov and Hutter, 2018) as the optimizer. The learning rate is set to 2e-5. The batch size is set to 64. The iteration number of BICo is set to 4 for relevance recognition and 2 for ideology analysis. For contrastive loss, we set the temperature parameter to 0.5 for relevance recognition and 0.1 for ideology analysis. The contrastive loss weight is set to 0.3 for both subtasks. The classification head is a two-layer fully connected network, in which the hidden size is 512. The above parameters are selected based on the validation set. We report the average results of 5 runs with different random seeds.
4.3 Comparison Models
We compare our approach with the latest benchmark in the MID task, BERTweetInd (Liu et al., 2023), which uses BERTweet as the backbone and detects indicator words from training set as the textual descriptions of facets. In addition, we test the zero/few-shot performance of advanced large language models (LLMs) in this task. Specifically, we select two popular LLMs, LLaMA2 (Touvron et al., 2023) and ChatGPT 222https://openai.com/blog/chatgpt, which exhibit superior capacities in communicating with humans, including solving a wide range of complex tasks without further training. We use the Llama-2-13b-chat and gpt-3.5-turbo-1106 versions. The prompts designed for LLMs can be found in Appendix B.
We also provide variants of our proposed approach in the ablation study:
Model | Macro-F1 | Micro-F1 | Macro-Acc | Micro-Acc |
Subtask 1: Relevance Recognition | ||||
BERTweetInd | 57.48 | 70.32 | - | - |
LLaMA2-13B∘ | 27.45 | 32.28 | - | - |
ChatGPT∘ | 33.11 | 40.07 | - | - |
LLaMA2-13B△ | 29.35 | 38.17 | - | - |
ChatGPT△ | 38.83 | 44.78 | - | - |
Our approach | - | - | ||
w/o CL | 58.56 | 71.42 | - | - |
w/o BICo | 58.14 | 70.41 | - | - |
w/o CL&BICo | 57.85 | 70.42 | - | - |
Subtask 2: Ideology Analysis | ||||
BERTweetInd | 42.68 | 69.28 | 65.88 | 76.38 |
LLaMA2-13B∘ | 35.60 | 47.33 | 45.98 | 49.69 |
ChatGPT∘ | 37.11 | 53.41 | 48.57 | 57.95 |
LLaMA2-13B△ | 38.51 | 47.22 | 46.13 | 48.90 |
ChatGPT△ | 42.64 | 60.54 | 58.44 | 68.25 |
Our approach | 66.79 | |||
w/o BICo | 46.02 | 68.58 | 66.18 | 76.63 |
w/o Concept anchors | 45.08 | 68.38 | 66.04 | 77.30 |
w/o CGCL | 44.21 | 67.54 | 65.15 | 76.79 |
• Relevance Recognition
(1) “w/o CL” denotes without contrastive learning.
(2) “w/o BICo” denotes without bidirectional iterative concept flow, in which case the facet representations in Concept Attentive Matching are directly from the concept encoder.
(3) “w/o CL&BICo” denotes the combination of the above two cases.
• Ideology Analysis
(1) “w/o BICo” denotes without bidirectional iterative concept flow. In this case, the concept anchors (i.e., ideology representations) are directly from the concept encoder.
(2) “w/o concept anchors” denotes performing the contrastive learning without the guidance of concept anchors, i.e., the anchors are text representations themselves, which is the case of Eq. (11).
(3) “w/o CGCL” denotes discarding the concept-guided contrastive learning.
Model | PoR | SS | EO | EE | EP | CSR | CV | DS | MF | SD | JO | PeR |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Subtask 1: Relevance Recognition | ||||||||||||
BERTweetInd | 46.92 | 32.71 | 71.05 | 63.29 | 82.26 | 35.04 | 19.52 | 62.73 | 85.99 | 44.07 | 75.55 | 70.71 |
LLaMA2-13B△ | 3.33 | 9.41 | 31.30 | 20.48 | 47.23 | 5.19 | 4.33 | 30.82 | 56.42 | 26.32 | 56.06 | 61.34 |
ChatGPT△ | 6.48 | 10.45 | 54.27 | 37.33 | 53.04 | 10.27 | 7.96 | 47.02 | 78.87 | 33.52 | 63.92 | 62.79 |
Our approach | 83.00 | 31.58 | 19.35 | |||||||||
Subtask 2: Ideology Analysis | ||||||||||||
BERTweetInd | 24.40 | 27.59 | 52.26 | 41.25 | 52.43 | 49.93 | 43.37 | 57.00 | 48.39 | 43.92 | 36.55 | 35.04 |
LLaMA2-13B△ | 37.23 | 44.47 | 52.28 | 36.97 | 45.56 | 44.81 | 29.60 | 41.60 | 29.87 | 39.90 | 33.47 | 26.40 |
ChatGPT△ | 24.44 | 43.91 | 50.79 | 43.73 | 54.06 | 33.33 | 51.59 | 52.92 | 33.03 | 36.03 | 41.95 | 45.87 |
Our approach | 33.16† | 50.25 | 35.56 | 49.31† | 47.25 | 37.10 | 41.24† |
4.4 Main Results
We present the overall results of our approach and other models in Table 1. First, we can observe that our concept-enhanced method performs consistently better than other baseline models, including the advanced large language models, indicating the superiority of our approach for the MID task. Second, compared with BERTweetInd, which is also a BERTweet-based model, our approach achieves significant improvements in both subtasks. This suggests that the application of concept semantics in the hierarchical schema helps the model to capture the correlation between text and labels, thus improving the performance. Third, for the LLMs, although ChatGPT performs better than LLaMA2-13B, and the few in-context demonstrations improve the results, there is still a large gap between LLMs and other task-specific models. This indicates that the MID task remains challenging for current LLMs. One possible reason is that, this task requires not only strong text understanding and semantic reasoning abilities, but also the integration of specialized sociological knowledge and background information on relevant topics, which is difficult for general-purpose LLMs.
In more detail, F1 scores of different models on each facet are shown in Table 2. In the subtask of Relevance Recognition, our approach achieves the best results on 10 out of 12 facets, surpassing the second-place by over 4 points on 4 facets (PoR, EE, DS, SD). This again demonstrates the effectiveness of our concept-enhanced framwork in the MID task. However, on the facets of CSR and CV, our approach is inferior to BERTweetInd, especially on CSR. We think this is likely because there are too few related samples in CSR (as shown in Table 6), and our method uses a separate classification head for each facet, resulting in even more insufficient training for CSR. Although this issue affects the results, it is only an edge case. The two LLMs still perform poorly, especially on PoR, SS and CV. By analyzing the responses generated by LLMs, we find that LLMs are more likely to ignore or generalize the definitions in prompts on these facets. For the Ideology Analysis subtask, the baseline models achieve the best or second-best result on some facets. Nevertheless, our approach ranks in the top two on 10 out of 12 facets and shows overall superior performance.
4.5 Ablation Study
We conduct ablation studies to inspect the importance of major components in our model and the results are reported in Table 1. It is clear that the removal of either one of the modules causes a drop in performance. The Micro-F1 decreases by 1.73 and 2.32 points on the two subtasks, respectively, when BICo is removed, which validates that it is important to further model the schema hierarchy and concept interactions on top of the concept encoder. BICo iteratively performs concept diffusion and aggregation on the hierarchy tree, and the updated concept representations are enriched by higher-level general semantics and lower-level concrete perspectives, which are helpful for the model to understand the deep meaning of facet and ideology labels.
In Ideology Analysis, the removal of concept anchors leads to noticeable performance degradation. This suggests that relying solely on text content to identify ideology is insufficient, and injecting label semantics can guide the model to capture ideology features and distinguish among different ideologies more accurately, so as to improve the performance of MID. Moreover, the results of “w/o CL” in Relevance Recognition and “w/o CGCL” in Ideology Analysis verify the effectiveness of contrastive learning strategies in two subtasks.
We also conduct ablation study for the modules of Concept Metapath Diffusion and Concept Hierarchy Aggregation in BICo. The results are presented in Appendix C.
Test Topics | Model | Relevance Recognition | Ideology Analysis | |
---|---|---|---|---|
Micro-F1 | Micro-Acc | Micro-F1 | ||
CHR&GF | BERTweetInd | 59.60 | 70.20 | 52.41 |
LLaMA2-13B | 28.29 | 56.79 | 44.22 | |
ChatGPT | 36.20 | 69.70 | 51.87 | |
Our approach | ||||
BLM&Dm | BERTweetInd | 54.69 | 80.64 | 58.89 |
LLaMA2-13B | 31.90 | 58.93 | 46.27 | |
ChatGPT | 39.54 | 73.45 | 54.09 | |
Our approach |
![Refer to caption](x3.png)
![Refer to caption](x4.png)
![Refer to caption](x5.png)
![Refer to caption](x6.png)
![Refer to caption](x7.png)
4.6 Cross-Topic Generalization
In our approach, label concepts are incorporated to enhance the model and they are enriched by multi-granularity concepts from different levels in the hierarchical schema through BICo. Intuitively, concepts provide a general description of a label. Therefore, our model should have better generalization to new topics with the help of concept semantics. To validate this viewpoint, we test and compare the cross-topic generalization ability of different models.
In the cross-topic scenario, the models are trained on some topics and then tested on the rest topics. To reduce randomness, we conduct experiments on two sets of test topics and the results are shown in Table 3. It can be observed that our approach consistently outperforms other models in both subtasks, which verifies that our approach can better generalize the learning ability to deal with cross-topic scenarios. LLMs lag behind other models by a significant margin. This shows that task-specific models still have advantages even in cross-topic scenarios. However, for the test topics of CHR&GF, ChatGPT performs closely to task-specific models in the Ideology Analysis subtask, indicating that ChatGPT may have practical value in specific cross-topic scenarios.
4.7 Effect of Number of Iterations
To analyze the effect of using different numbers of iterations in BICo, we conduct experiments on both subtasks and present the results in Figure 4. We can observe a clear upward and then downward trend in model performance as the number of iterations increases. The optimal number of iterations for Relevance Recognition is 4 and for Ideology Analysis is 2. One possible reason for this trend is that, when the number of iterations is too small, the concept diffusion and aggregation are insufficient, and the concept representations do not fully perceive the semantics of different granularities in the hierarchical structure. In contrast, when the number of iterations is too large, there will be redundancy in information transfer, and the semantic features of the concept itself will be lost.
4.8 Visualization
To qualitatively examine the role of label semantics (concept anchors) in the concept-guided contrastive learning, we randomly select a facet (Diplomatic Strategy) and show the t-SNE projections of text representations from test set in Figure 3. As observed, for the case of “w/o CGCL”, all samples are almost scattered without separations. There is a similar but better distribution for the model trained with CL. While for our CGCL (i.e., the full model), instances are well clustered by labels with only a slight overlap and the concept anchors are approximately cluster centers. This confirms that concept representations learned from BICo guide the model to better distinguish among different ideologies in the embedding space, which is helpful for subsequent classification.
5 Related Work
Ideology Detection
This task detects the ideology of texts in a generic facet. Many studies rely on text analysis techniques and try to leverage various textual cues (Bhatia and P, 2018; Baly et al., 2019, 2020; Chen et al., 2020; Kabir and Madria, 2022; Liu et al., 2022; Kim and Johnson, 2022; Devatine et al., 2023; Hong et al., 2023; Chen et al., 2023). In addition to text content, social networks (Li and Goldwasser, 2019; Stefanov et al., 2020; Xiao et al., 2020; Li and Goldwasser, 2021), external knowledge (Kulkarni et al., 2018; Zhang et al., 2022) and multimodal information (Dinkov et al., 2019; Qiu et al., 2022) are utilized to identify the ideology of online texts.
Multifaceted Ideology Detection
Considering that some texts may contain descriptions of different issues and reflect the author’s ideology from various aspects, some recent work study ideology detection on multiple facets. Sinno et al. (2022) investigate the political ideology of news articles from three facets, social, economic and foreign. Liu et al. (2023) first propose the MID task and design the first multifaceted ideology schema which defines 5 domains and 12 facets in a hierarchical structure. They also manually annotate a high-quality MITweet dataset and build baselines for MID. We follow Liu et al. (2023) and introduce label semantics into models through encoding the hierarchical schema.
6 Conclusion
In this paper, we have proposed a concept semantics-enhanced framework for the MID task. We have also designed a novel bidirectional iterative concept flow method to capture multi-granularity concept semantics. Moreover, we have explored concept attentive matching and concept-guided contrastive learning strategies to enable the model to extract ideology features with the help of concept semantics. Experiment results have validated the superiority of our approach.
Acknowledgement
This work is supported in part by Major Project of National Social Science Foundation of China: “AI and Precise International Communication” (Grant No. 22&ZD317) and National Natural Science Foundation of China (Grant No. 62172167). The computation is supported by the HPC Platform of Huazhong University of Science and Technology.
Limitations
-
•
Following Liu et al. (2023), we divide multifaceted ideology detection into two subtasks in a pipeline manner. However, this modeling approach increases the computational cost in both training and inference stages. In addition, error propagation in this pipeline mode is also a problem that cannot be ignored. We will investigate how to solve this task in an end-to-end manner in future work.
-
•
While we attempt to tune the concepts defined in the schema to better fit our approach, we are constrained by computational resources and time, so we directly adopt the concepts in the schema. Although these concepts are representative, there may be better ones that could lead to better performance.
Ethical Considerations
We carry out this work and conduct the experiments in accordance with the general ethics in social science research. The proposed concept-enhanced framework could automatically detect the multifaceted ideology of given texts, which is helpful for policy-makers and social statisticians. However, the algorithm is not perfect and may make incorrect predictions. Therefore, researches should realize the potential harm from the misuse of the ideology detection system, and cannot rely solely on the system to make judgments.
References
- Aldera et al. (2021) Saja Aldera, Ahmad Emam, Muhammad Al-Qurishi, Majed Alrubaian, and Abdulrahman Alothaim. 2021. Online extremism detection in textual content: a systematic literature review. IEEE Access, 9:42384–42396.
- Baly et al. (2020) Ramy Baly, Giovanni Da San Martino, James Glass, and Preslav Nakov. 2020. We can detect your bias: Predicting the political ideology of news articles. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4982–4991, Online. Association for Computational Linguistics.
- Baly et al. (2019) Ramy Baly, Georgi Karadzhov, Abdelrhman Saleh, James Glass, and Preslav Nakov. 2019. Multi-task ordinal regression for jointly predicting the trustworthiness and the leading political ideology of news media. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2109–2116, Minneapolis, Minnesota. Association for Computational Linguistics.
- Bhatia and P (2018) Sumit Bhatia and Deepak P. 2018. Topic-specific sentiment analysis can help identify political ideology. In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 79–84, Brussels, Belgium. Association for Computational Linguistics.
- Chen et al. (2023) Chen Chen, Dylan Walker, and Venkatesh Saligrama. 2023. Ideology prediction from scarce and biased supervision: Learn to disregard the “what” and focus on the “how”! In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9529–9549, Toronto, Canada. Association for Computational Linguistics.
- Chen et al. (2020) Wei-Fan Chen, Khalid Al Khatib, Henning Wachsmuth, and Benno Stein. 2020. Analyzing political bias and unfairness in news articles at different levels of granularity. In Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science, pages 149–154, Online. Association for Computational Linguistics.
- Demszky et al. (2019) Dorottya Demszky, Nikhil Garg, Rob Voigt, James Zou, Jesse Shapiro, Matthew Gentzkow, and Dan Jurafsky. 2019. Analyzing polarization in social media: Method and application to tweets on 21 mass shootings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2970–3005, Minneapolis, Minnesota. Association for Computational Linguistics.
- Devatine et al. (2023) Nicolas Devatine, Philippe Muller, and Chloé Braud. 2023. An integrated approach for political bias prediction and explanation based on discursive structure. In Findings of the Association for Computational Linguistics: ACL 2023, pages 11196–11211, Toronto, Canada. Association for Computational Linguistics.
- Dinkov et al. (2019) Yoan Dinkov, Ahmed Ali, Ivan Koychev, and Preslav Nakov. 2019. Predicting the leading political ideology of youtube channels using acoustic, textual, and metadata information.
- Grover and Mark (2019) Ted Grover and Gloria Mark. 2019. Detecting potential warning behaviors of ideological radicalization in an alt-right subreddit. In Proceedings of the International AAAI Conference on Web and Social Media, volume 13, pages 193–204.
- Hong et al. (2023) Jiwoo Hong, Ye** Cho, Jiyoung Han, Jaemin Jung, and James Thorne. 2023. Disentangling structure and style: Political bias detection in news by inducing document hierarchy. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5664–5686, Singapore. Association for Computational Linguistics.
- Kabir and Madria (2022) Md Yasin Kabir and Sanjay Madria. 2022. A deep learning approach for ideology detection and polarization analysis using covid-19 tweets. In International Conference on Conceptual Modeling, pages 209–223. Springer.
- Kannangara (2018) Sandeepa Kannangara. 2018. Mining twitter for fine-grained political opinion polarity classification, ideology detection and sarcasm detection. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pages 751–752.
- Kim and Johnson (2022) Michelle Young** Kim and Kristen Marie Johnson. 2022. CLoSE: Contrastive learning of subframe embeddings for political bias classification of news media. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2780–2793, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Kulkarni et al. (2018) Vivek Kulkarni, Junting Ye, Steve Skiena, and William Yang Wang. 2018. Multi-view models for political ideology detection of news articles. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3518–3527, Brussels, Belgium. Association for Computational Linguistics.
- Li and Goldwasser (2019) Chang Li and Dan Goldwasser. 2019. Encoding social information with graph convolutional networks forPolitical perspective detection in news media. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2594–2604, Florence, Italy. Association for Computational Linguistics.
- Li and Goldwasser (2021) Chang Li and Dan Goldwasser. 2021. Using social and linguistic information to adapt pretrained representations for political perspective identification. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 4569–4579, Online. Association for Computational Linguistics.
- Liu et al. (2023) Songtao Liu, Ziling Luo, Minghua Xu, Lixiao Wei, Ziyao Wei, Han Yu, Wei Xiang, and Bang Wang. 2023. Ideology takes multiple looks: A high-quality dataset for multifaceted ideology detection. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 4200–4213, Singapore. Association for Computational Linguistics.
- Liu et al. (2022) Yujian Liu, Xinliang Frederick Zhang, David Wegsman, Nicholas Beauchamp, and Lu Wang. 2022. POLITICS: Pretraining with same-story article comparison for ideology prediction and stance detection. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1354–1374, Seattle, United States. Association for Computational Linguistics.
- Loshchilov and Hutter (2018) Ilya Loshchilov and Frank Hutter. 2018. Decoupled weight decay regularization. In International Conference on Learning Representations.
- Nguyen et al. (2020) Dat Quoc Nguyen, Thanh Vu, and Anh Tuan Nguyen. 2020. BERTweet: A pre-trained language model for English tweets. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 9–14, Online. Association for Computational Linguistics.
- Qiu et al. (2022) Changyuan Qiu, Winston Wu, Xinliang Frederick Zhang, and Lu Wang. 2022. Late fusion with triplet margin objective for multimodal ideology prediction and analysis. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9720–9736, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Sinno et al. (2022) Barea Sinno, Bernardo Oviedo, Katherine Atwell, Malihe Alikhani, and Junyi Jessy Li. 2022. Political ideology and polarization: A multi-dimensional approach. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 231–243, Seattle, United States. Association for Computational Linguistics.
- Stefanov et al. (2020) Peter Stefanov, Kareem Darwish, Atanas Atanasov, and Preslav Nakov. 2020. Predicting the topical stance and political leaning of media using tweets. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 527–537, Online. Association for Computational Linguistics.
- Sun et al. (2018) Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang. 2018. Rotate: Knowledge graph embedding by relational rotation in complex space. In International Conference on Learning Representations.
- Touvron et al. (2023) Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
- Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Ł ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc.
- Wang et al. (2021) Xuepeng Wang, Li Zhao, Bing Liu, Tao Chen, Feng Zhang, and Di Wang. 2021. Concept-based label embedding via dynamic routing for hierarchical text classification. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5010–5019, Online. Association for Computational Linguistics.
- Wen and Hauptmann (2023) Haoyang Wen and Alexander Hauptmann. 2023. Zero-shot and few-shot stance detection on varied topics via conditional generation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 1491–1499, Toronto, Canada. Association for Computational Linguistics.
- Xiao et al. (2020) Zhi** ** Song, Haoyan Xu, Zhicheng Ren, and Yizhou Sun. 2020. Timme: Twitter ideology-detection via multi-task multi-relational embedding. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’20, page 2258–2268, New York, NY, USA. Association for Computing Machinery.
- Zhang et al. (2022) Wenqian Zhang, Shangbin Feng, Zilong Chen, Zhenyu Lei, Jundong Li, and Minnan Luo. 2022. KCD: Knowledge walks and textual cues enhanced political perspective detection in news media. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4129–4140, Seattle, United States. Association for Computational Linguistics.
Domain | Facet | Left | Right |
---|---|---|---|
Politics | Political Regime (PoR) | Socialism | Capitalism |
State Structure (SS) | Centralism | Federalism | |
Economy | Economic Orientation (EO) | Command Economy | Market Economy |
Economic Equality (EE) | Outcome Equality | Opportunity Equality | |
Culture | Ethical Pursuit (EP) | Ethical Liberalism | Ethical Conservatism |
Church-State Relations (CSR) | Secularism | Caesaropapism | |
Cultural Value (CV) | Collectivism | Individualism | |
Diplomacy | Diplomatic Strategy (DS) | Globalism | Isolationism |
Military Force (MF) | Militarism | Pacifism | |
Society | Social Development (SD) | Revolutionism | Reformism |
Justice Orientation (JO) | Result Justice | Procedural Justice | |
Personal Right (PeR) | Social Responsibility | Individual Right |
Appendix A Multifaceted Ideology Schema
We present the multifaceted ideology schema in Table 4. Concepts of facets and ideologies defined in the schema can be found in Liu et al. (2023). Note that the original schema does not give the concepts of “Center”, so we define them based on the concepts of “Left” and “Right”, as follows:
A.1 Domain 1: Politics
-
•
Political Regime (PoR)
Center: A moderate stance advocating for a mix of public and private ownership, seeking a balanced approach to property control and means of production.
-
•
State Structure (SS)
Center: A moderate stance advocating for a balanced power structure, combining elements of central authority and power distribution.
A.2 Domain 2: Economy
-
•
Economic Orientation (EO)
Center: A moderate stance advocating for combining government intervention in important economic decisions with the role of individuals, organizations, and market interactions.
-
•
Economic Equality (EE)
Center: A moderate position advocating for an economic system that balances equal treatment and access to resources with considerations for distribution outcomes among different groups.
A.3 Domain 3: Culture
-
•
Ethical Pursuit (EP)
Center: The mainstream culture should consider individual freedoms and cultural norms while promoting inclusivity dialogue on controversial issues.
-
•
Church-State Relations (CSR)
Center: A moderate position advocating for a balanced and cooperative relationship between the church and state, respecting both religious autonomy and the principles of secular governance.
-
•
Cultural Value (CV)
Center: A moderate stance that recognizes the importance of both social collectives and individual autonomy in sha** and preserving a diverse and inclusive society.
A.4 Domain 4: Diplomacy
-
•
Diplomatic Strategy (DS)
Center: A moderate position that balances international cooperation and national interests, recognizing the value of engagement while cautiously managing political and economic entanglements with other countries.
-
•
Military Force (MF)
Center: A moderate stance that recognizes the need for armed defense and security while prioritizing non-violent resolution for conflicts.
A.5 Domain 5: Society
-
•
Social Development (SD)
Center: A moderate position that advocates combining direct action when necessary with a recognition of the value of gradual and sustainable change to achieve social goals.
-
•
Justice Orientation (JO)
Center: A moderate stance that seeks a balance between fair distribution and fair decision-making, considering both the outcomes and procedure of justice.
-
•
Personal Right (PeR)
Center: A moderate position that recognizes the importance of both fulfilling individual responsibilities and protecting individual rights in an equitable manner.
Appendix B Prompts for LLMs
The prompt templates designed for LLMs in two subtasks are as follows. We fill the templates with the facet names and definitions in the multifaceted ideology schema. In few-shot experiments, we provide LLMs with a few in-context demonstrations, which are manually selected for each facet to ensure diversity. We also provide a brief analysis as chain-of-thought for each demonstration. In zero-shot experiments, the demonstrations in the prompts will be removed.
B.1 Relevance Recognition
-
•
System prompt
You will be provided with a piece of text. Determine if the text is related to "{facet}".
{facet} is defined as: {facet_def}
First give your analysis briefly and then select your answer from ["Related", "Unrelated"].
Here are some demonstrations:
{demonstrations} -
•
User prompt
Text: """{text}"""
B.2 Ideology Analysis
-
•
System prompt
You will be provided with a piece of text. Determine the orientation of the text towards "{facet}".
The orientation towards "{facet}" can be divided into ["Left", "Right", "Center"]. The definitions are as follows:
-Left: {left_def}
-Right: {right_def}
-Center: {center_def}First give your analysis briefly and then select your answer from ["Left", "Right", "Center"].
Here are some demonstrations:
{demonstrations} -
•
User prompt
Text: """{text}"""
Appendix C Additional Ablation Study
As shown in Table 5, the removal of Concept Metapath Diffusion or Concept Hierarchy Aggregation causes a drop in performance. And removing both of them (w/o BICo) leads to a more significant performance degradation. The concept diffusion from root to leaf enables the high-level general semantics to propagate to lower-level nodes, while the concept aggregation from leaf to root allows the high-level nodes to perceive multifaceted concepts from lower levels. Both contribute to enriching label representations. The results further validate the effectiveness of both modules.
Model | Macro-F1 | Micro-F1 | Macro-Acc | Micro-Acc |
Subtask 1: Relevance Recognition | ||||
Our Approach | 59.22 | 72.14 | - | - |
w/o CMD | 57.92 | 70.81 | - | - |
w/o CHA | 58.73 | 71.03 | - | - |
w/o BICo | 58.14 | 70.41 | - | - |
Subtask 2: Ideology Analysis | ||||
Our Approach | 47.32 | 70.90 | 66.79 | 78.60 |
w/o CMD | 46.38 | 68.93 | 65.90 | 77.71 |
w/o CHA | 46.13 | 69.48 | 66.75 | 77.85 |
w/o BICo | 46.02 | 68.58 | 66.18 | 76.63 |
Domain | Facet | Relevance | Ideology | ||
---|---|---|---|---|---|
#Related | #Left | #Center | #Right | ||
Politcs | PoR | 112 | 39 | 14 | 59 |
SS | 291 | 67 | 88 | 136 | |
Economy | EO | 759 | 294 | 297 | 168 |
EE | 672 | 520 | 119 | 33 | |
Culture | EP | 2935 | 1976 | 465 | 494 |
CSR | 68 | 33 | 17 | 18 | |
CV | 154 | 95 | 11 | 48 | |
Diplomacy | DS | 1572 | 711 | 421 | 440 |
MF | 1837 | 132 | 575 | 1130 | |
Society | SD | 1737 | 1236 | 287 | 214 |
JO | 3452 | 3058 | 281 | 113 | |
PeR | 3516 | 171 | 241 | 3104 |