A Comprehensive Survey of Artificial Intelligence Techniques for Talent Analytics

Chuan Qin,  Le Zhang, Yihang Cheng, Rui Zha, Dazhong Shen,
Qi Zhang, Xi Chen, Ying Sun,  Chen Zhu, 
Hengshu Zhu*,  Hui Xiong
This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. C. Qin, Y. Cheng, C. Zhu, and H. Zhu are with the Career Science Lab, BOSS Zhipin, Beijing, China. E-mail: [email protected], [email protected], [email protected], [email protected]. L. Zhang is with the Business Intelligence Lab, Baidu Inc., Beijing, China. E-mail: [email protected]. R. Zha and X. Chen are with the University of Science and Technology of China, Anhui, China. E-mail: [email protected], [email protected]. D. Shen and Q. Zhang are with the Shanghai Artificial Intelligence Laboratory. E-mail: [email protected], [email protected]. Y. Sun and H. Xiong are with the Hong Kong University of Science and Technology (Guangzhou), China. E-mail: [email protected], [email protected]. H. Zhu and H. Xiong are the corresponding authors.
Abstract

In today’s competitive and fast-evolving business environment, it is a critical time for organizations to rethink how to make talent-related decisions in a quantitative manner. Indeed, the recent development of Big Data and Artificial Intelligence (AI) techniques has revolutionized human resource management. The availability of large-scale talent and management-related data provides unparalleled opportunities for business leaders to comprehend organizational behaviors and gain tangible knowledge from a data science perspective, which in turn delivers intelligence for real-time decision-making and effective talent management in their organizations. In the last decade, talent analytics has emerged as a promising field in applied data science for human resource management, garnering significant attention from AI communities and inspiring numerous research efforts. To this end, we present an up-to-date and comprehensive survey of AI technologies used for talent analytics in the field of human resource management. Specifically, we first provide the background knowledge of talent analytics and categorize various pertinent data. Subsequently, we offer a comprehensive taxonomy of relevant research efforts, categorized by three distinct application-driven scenarios at different levels: talent management, organization management, and labor market analysis. In conclusion, we summarize the open challenges and potential prospects for future research directions in the domain of AI-driven talent analytics.

Index Terms:
Artificial intelligence, talent analytics, talent management, organization management, labor market analysis

1 Introduction

In a world of volatility, uncertainty, complexity, and ambiguity (VUCA), talent is a precious asset and plays an important role in business success. To cope with the fast-evolving business environment and maintain a competitive edge, it is critical for organizations to rethink how to make talent-related decisions in a quantitative manner. In the era of big data, the availability of large-scale talent data provides unparalleled opportunities for business leaders to understand the rules of talent and management, which in turn delivers intelligence for effective decision-making and management in their organizations [374, 361]. Along this line, as an emerging applied data science direction in human resource management, talent analytics has attracted wide attention from both academia and industry. Specifically, talent analytics, also known as workforce analysis or people analytics, focuses on leveraging data science technologies to analyze extensive sets of talent-related data, empowering organizations with informed decision-making capabilities that enhance their organizational and operational effectiveness [206]. In practice, talent analytics plays a pivotal role in strategic human resource management (HRM), encompassing diverse applications such as talent acquisition, development, and retention, as well as the examination of organizational behaviors and external labor market dynamics. Generally, the research directions of talent analytics can be divided into three categories, as illustrated in Figure 1: talent management, organization management, and labor market analysis.

Figure 1: Graphical abstract of this survey from data to the proposed methods.

To be specific, first, talent management is a continuous strategic process of attracting and hiring high-potential employees, training their skills, motivating them to improve their performance, and retaining them to maintain organizational competitiveness. In this scenario, talent analytics primarily focuses on individual-level analysis. For instance, it can help human resource managers find the right talents for different jobs in a practical way [334, 140, 444], and it can predict employee performance or turnover [6, 233]. Second, organization management is the art of fostering collaboration among talents and guiding organizations toward success. In this scenario, talent analytics can diagnose the health of an organization and measure organizational performance by leveraging various kinds of relationship information between talents or organizations, such as organizational structure, communication patterns, and project collaborations [446, 114]. It can also assist the organization in effectively structuring and optimizing teams [414, 13]. Third, talent analytics can be applied from an external, macro perspective, i.e., to the labor market analysis scenario, which is crucial for devising talent and organizational strategies. For instance, by analyzing talent demands within the labor market, managers can effectively craft recruitment strategies [484, 467].

Historically, talent analytics was proposed within the conception of HRM around the 1920s [191]. In the early stages of HRM, before 1970, talent analytics was usually manual [280]. In this stage, human resource systems (HRS), or human resource management systems (HRMS), were the main method for talent analytics [51]. These systems came in various types, such as commitment systems and control systems [25], and the assessment of talent performance with them was conducted by human resource managers [138]. At the same time, talent assessment-related theories took shape, such as human capital theory and resource-based theory [191]. With the arrival of mechanical automation, human resource information systems (HRIS) arose around the 1940s. Before 1970, however, HRIS relied on sorting and tabulating equipment without computer support, and its main function was keeping employee records automatically [111]. Nevertheless, from this stage onward, talent analytics could be supported by aggregated information. As the growing volume of data generated during management drove the development of information systems, computer-enhanced HRIS were widely adopted in the 1970s [59]. In this stage, an HRIS is a combination of database, computer applications, software, and hardware used to record, manage, and operate human resource data. This trend is confirmed by a survey reporting that 60% of Fortune 500 companies use HRIS to support daily HRM operations [36]. Besides, in this stage, people analytics, which refers to a novel, quantitative, evidence-based, and data-driven approach to managing the workforce, was proposed to raise the efficiency of core human resource functions such as recruiting [142]. Standard statistical methods, such as correlation analysis and simple regression, were adopted at this stage [229]; this approach is generally called descriptive analytics. 
At the same time, due to the rich functions of HRIS, many evaluation studies on the specific functions of software have been carried out, including organizational performance, turnover and so on [126, 21].

With the development of artificial intelligence (AI) algorithms, advanced regression techniques, data mining, text mining, web mining, and forecasting methods came into use in talent analytics around the 2010s [117]; these approaches are generally called predictive analytics. Recent literature highlights the importance of more advanced analytical methods and emerging technologies in talent analytics. For instance, Dahlbom et al. [94] emphasize that “new types of data and different algorithms used in AI and machine learning solutions utilized in HRA [Human Resource Analytics]” (p. 123) will transform the field of people analytics. Two main factors facilitate this transformation. On the one hand, talent analytics is facing digital disruption, which makes large-scale relevant data available. For instance, Indeed, a world-renowned job search site, had 11.3 million active jobs as of January 2022 [61]. Meanwhile, LinkedIn, the largest online professional network, had 774 million members from around 200 countries as of March 2022 [248], building up a wealth of labor market data. Moreover, numerous enterprises are setting up their Digital Human Resource Management Systems (Digital HRMS), enabling the collection, storage, and processing of a huge amount of talent and organizational information in a digital environment [393]. On the other hand, with the advent of talent-related big data, advanced AI techniques have rapidly revolutionized a series of research and practices in this field, which in turn delivers intelligence for organizational decision-making and management. In this stage, deep learning methods have enabled a new paradigm in person-job fit [334, 485, 46, 444] and person-organization fit [381, 383], so as to achieve efficient and accurate talent selection and development. 
Text mining methods have been adopted in employer brand analysis based on large-scale labor market data [246, 245], which enables forward-looking strategic planning for the business. At the same time, several high-tech companies are gradually incorporating AI technologies into their HRMS. As an illustration, IBM leverages AI technology to achieve a remarkable 95 percent accuracy in predicting which employees are considering leaving their positions, which has saved IBM $300 million in retention costs [345]. With the strong automation capability of large language models (LLMs), autonomous people analytics has emerged around the 2020s [315].

Accordingly, AI in talent analytics in this paper includes supervised learning, unsupervised learning, deep learning, reinforcement learning, knowledge representation, natural language processing, and so on [43]. These techniques build different business capabilities in people analytics, namely the automation of structured (or semi-structured) work processes, engagement with employees and managers, decision-making through extensive analysis of large amounts of data, and the creation of novel outcomes [43]. This survey attempts to provide a comprehensive review of the rapidly evolving AI techniques for talent analytics. Based on our investigation, we first provide a detailed taxonomy of relevant data, laying a data foundation for leveraging AI techniques to better understand talents, organizations, and the corresponding management. Generally, talents’ behaviors are reflected at three levels: the individual level, the organizational level, and the market level. Accordingly, we review the research efforts on AI techniques for talent analytics from three corresponding aspects: talent management, organization management, and labor market analysis. Finally, we identify challenges for future AI-based talent analytics and suggest potential research directions.

Moreover, to help readers navigate this survey more effectively, we highlight the systematic resources it provides as follows:

  • Table I summarizes the data for talent analytics.

  • Table III summarizes the recent AI-based talent analytics efforts in the talent management scenario.

  • Table IV summarizes the recent AI-based talent analytics efforts in the organization management scenario.

  • Table V summarizes the recent AI-based talent analytics efforts in the labor market analysis scenario.

2 Data for Talent Analytics

Nowadays, as enterprises undergo an accelerated digital transformation, a large amount of talent analytics-related data has been accumulated. In this section, we will introduce the data collected across various scenarios, providing readers with a foundational understanding of the related research data and the motivation for model design. Generally, the data can be divided into internal data, which are collected from the internal enterprise management system, and external data, which are collected from the external labor market.

2.1 Internal Data

Based on the described objects, internal data can be broadly divided into three categories: recruitment data, employee data, and organizational data.

2.1.1 Recruitment Data

Recruitment data in pre-employment mainly includes the following types:

Resume: A resume or Curriculum Vitae (CV) is a document that outlines a person’s background, skills, and accomplishments. It plays a vital role in the recruitment process, as it facilitates talent screening and assessment [334, 363], and it is an important tool for job seekers to showcase their qualifications and suitability for a job position. Recently, a large amount of resume data, in either Word or PDF format, has accumulated with the development of online recruitment. As shown in Figure 2, a resume typically comprises structured information such as gender, age, and education, as well as semi-structured information like educational experience, work experience, and project experience. Accordingly, several resume parsing techniques have been developed to extract this information [74, 75, 443]. On this basis, substantial efforts have been devoted to talent analytics with resume data from different perspectives. For instance, Yao et al. [442] introduced a keyphrase extraction approach to explore job seekers’ skills in resumes, and Pena et al. [323] used image information in resume data to improve screening performance and explore fairness issues. Moreover, several studies have proposed leveraging text mining techniques to determine the matching degree between jobs and job seekers based on their resumes [334, 485, 46]. In addition, resumes also encompass the career trajectories of job seekers. As illustrated on the right side of Figure 2, the candidate’s profile showcases three job experiences, including a job change at Microsoft and work experience at Google. In this vein, Zhang et al. introduced the ResumeVis system to visualize individual career trajectories and mobility across different organizations [460]. Researchers have further analyzed the sequential patterns of career trajectories and proposed personalized career development recommendations [289, 465, 400].

Figure 2: An example of parsing the resume.
Figure 3: An example of parsing the job posting.

Job Posting: A job posting is an advertisement for a vacant position that provides job seekers with information on the job description and requirements. The posting offers applicants a clear understanding of what the position is responsible for and what qualifications are necessary. Recently, the proliferation of online recruitment services has made it increasingly common to publish job postings as web pages. Figure 3 illustrates a typical job posting that comprises structured information, such as the salary range and education requirements, as well as semi-structured content that includes descriptions of job duties and ability requirements. Nevertheless, it is still difficult for Human Resource (HR) experts to handle such a large corpus manually. To this end, researchers have attempted to reduce the dependence on manual labor by applying neural network-based techniques, particularly NLP, to voluminous job postings. As mentioned before, considerable effort has been devoted to Person-Job Fit [334, 335, 46, 485, 263], which aims to match job postings with suitable resumes. Moreover, Shen et al. leveraged a latent variable model to jointly model the job description, candidate resume, and interview assessment, which can further benefit several downstream applications such as person-job fit and interview question recommendation [364, 363]. To reduce the expense of manual screening, researchers have also extracted job entities from postings and generated interview questions automatically [336, 365, 333, 239]. Apart from these in-firm applications, some studies provide comprehensive insights into the global labor market. For instance, researchers have proposed several data-driven methods for salary analysis across different companies and positions [210, 82, 288]. Zhang et al. utilized large-scale job postings from one of the largest Chinese online recruitment websites to forecast fine-grained talent demand in the recruitment market [467]. Moreover, some studies aim to measure the popularity of job skills and forecast their evolving trends [426, 422]. Along this line, Sun et al. further focused on measuring the value of job skills based on massive job postings, contributing to the quantitative assessment of job skills [382].

Interview-related Data: Interview-related data is typically collected during the interview process and serves to evaluate applicants’ overall qualifications for the position they are applying for. In general, an interview can be conducted either in person or through video, resulting in textual or video-based assessments, respectively. Both kinds of data enable comprehensive evaluation of candidates and facilitate the integration of AI in HR. To address the subjectivity of traditional interviews, Shen et al. utilized a latent variable model to explore the relationship among job descriptions, candidate resumes, and textual interview assessments [364, 363]. The results provide an interpretable understanding of job interview assessments. However, textual interview assessments within a company are usually private and sensitive, so video assessments have drawn more attention. For instance, several studies extract multimodal features from videos for the automatic analysis of job interviews [78, 80]. In addition, Hemamou et al. proposed a hierarchical attention model to predict the employability of candidates using multimodal information, including text, audio, and video [172]. Along this line, Chen et al. leveraged a hierarchical reasoning graph neural network to automatically score candidate competencies using textual features in asynchronous video interviews [77].

2.1.2 Employee Data

Regarding the development of employees within a company, a significant amount of employee data has been accumulated, including training records and individual work outcomes. An overview of employee-related data is shown in Figure 4.

Employee Profiles: Employee profiles typically describe an individual based on two main aspects: demographic characteristics and individual work outcomes. Specifically, the former includes characteristics such as age, gender, and education level [322], which can be used to enhance employee representations and benefit various downstream analyses, such as career mobility prediction [337, 477] and performance forecasting [271]. In addition to these static variables, individual work outcomes depict dynamic career development along different dimensions, such as performance appraisals, promotions, and turnover records. In particular, a performance appraisal is a systematic evaluation of an employee’s job performance and productivity that is typically conducted by line managers, while promotion and turnover records show employee movements within and across companies, respectively. All of this information contributes to further insights into employee dynamics. For instance, researchers have leveraged performance appraisals to identify high-potential talents within a company [447]. Li et al. utilized static profiles, performance appraisals, and reporting lines of employees to model career development within a company, focusing on turnover and career progression [233]. Sun et al. proposed to capture the dynamic nature of person-organization fit based on individual profiles, reporting lines, and communication records [381]. To investigate the contagious effect of turnover, researchers have utilized both employee profiles and turnover data [387, 386]. Furthermore, Hang et al. leveraged five kinds of standardized data, including employee turnover records, to predict the turnover probability and period [165].

Figure 4: An overview of employee-related data.

Training Records: Employee training is a program designed to improve the performance of employees by equipping them with specific skills. Ongoing employee training has proven to be crucial in attracting and retaining top talent [351]. Typically, the training record describes the learning path of an employee, which is a sequence of different skills. Based on these training records, considerable effort has been devoted to exploring the learning patterns of employees. For instance, Wang et al. utilized both learning records and skill profiles of employees from a high-tech company in China to develop a personalized online course recommendation system [402, 401]. Along this line, Srivastava et al. collected employees’ training and work history from a large multinational IT organization to provide personalized next training recommendations [377]. In addition, some researchers have provided insights into employee competency [234, 421, 252, 177]. For instance, multi-dimensional features, including learning and training dimensions, were collected from a Chinese state-owned enterprise to provide competency assessment for employees [252].

2.1.3 Organizational Data

An organizational structure is a system that outlines how activities are directed toward the achievement of organizational aims [330], and it plays an important role in decision-making and knowledge management. An organization is commonly represented as a hierarchical tree structure, but it can also take on diverse forms, such as matrix, flat, and network structures. Figure 5 shows several common types of organizational structures. Typically, existing studies explore these complex structures from various dimensions, such as reporting lines and in-firm social networks.

Reporting lines are generally the most representative aspect of an organizational structure, as they delineate how authority and responsibility are allocated in an organization. In this regard, Sun et al. developed an organization structure-aware convolutional neural network to hierarchically extract compatibility features for measuring person-organization fit and its impact on talent management [381, 383]. Nevertheless, due to privacy concerns, mainstream studies utilize in-firm social networks for human resource management. In general, an in-firm social network can be formed from email or Instant Messaging (IM) records among employees. For example, text-based communication has been used by several machine learning classifiers to identify group mood [213]. Besides, Cao et al. leveraged the lasso regression model to explore team viability using text conversations of online teams [62]. In addition to social networks, researchers have also taken other information into account. For example, Ye et al. utilized both email communication and a high-potential talent list to identify employees with high potential [447]. Along this line, Teng et al. further utilized datasets from three sources for organizational turnover prediction, including profile and turnover, social network, and job levels [386].

Figure 5: Three common types of organization structure.
TABLE I: Collected papers related to talent analytics data.
Category | Data | References
Internal: Recruitment | Resume | [442, 323, 460, 289, 465, 400]
Internal: Recruitment | Job Posting | [334, 46, 485, 467, 426, 422, 382]
Internal: Recruitment | Interview-related | [364, 363, 78, 80, 172, 77]
Internal: Employee | Employee Profiles | [447, 233, 381, 387, 386, 165, 465]
Internal: Employee | Training Records | [402, 401, 377, 234, 421, 252, 177]
Internal: Organization | Reporting Lines | [381, 383]
Internal: Organization | In-firm Social Network | [213, 62, 447, 386]
External | Social Media | [188, 376]
External | Job Search Websites | [246, 245, 31, 357, 318]

2.2 External Data

Apart from the aforementioned internal data, external sources also contribute to a comprehensive understanding of the labor market, which can be broadly classified into two categories: social media platforms and job search websites.

Social Media: Widely used social media sources that contribute to a comprehensive understanding of the labor market include Twitter (https://twitter.com), Facebook (https://www.facebook.com), and news reports. With the help of NLP and topic model techniques [49], numerous studies have been carried out to explore the semantic information in this corpus. For example, more than 60,000 tweets related to nine energy companies were collected to analyze the sentiment expressed on Twitter [188]. To gain further insight into the impact of public opinion, Spears et al. [376] collected earnings reports and news articles spanning eight years from four companies. The results indicate that companies may face a decline in valuation when they receive negative publicity.

Job Search Websites: Recent years have witnessed the rapid growth of job search websites, such as Indeed (https://www.indeed.com), LinkedIn (https://www.linkedin.com), and Glassdoor (https://www.glassdoor.com). Specifically, Indeed and Glassdoor allow users to comment on a company, providing an overall understanding of the employer brand. For instance, Lin et al. [246, 245] collaboratively modeled both textual information (i.e., reviews) and numerical information (i.e., salaries and ratings) to learn latent structural patterns of employer brands. In addition, Bajpai [31] leveraged data from Glassdoor to perform aspect-level sentiment analysis. Along this line, large-scale reviews of Fortune 500 companies have been collected to identify topics that matter to employees [357]. In contrast, LinkedIn provides a wide range of business services, including job listings, professional profile creation, and career development services, with personal profiles being the most analyzed, as they describe users’ employment history. For example, Park et al. [318] used LinkedIn’s employment history data from more than 500 million users over 25 years to construct a labor flow network of over 4 million firms worldwide, demonstrating a strong association between the influx of educated workers and financial performance in detected geo-industrial clusters.

Furthermore, there are also several third-party business investigation platforms that offer detailed information about companies and their board members’ relationships, such as Crunchbase (https://www.crunchbase.com), Owler (https://www.owler.com), Tianyancha (https://www.tianyancha.com), and Aiqicha (https://www.aiqicha.com). These details can be viewed as complementary information to the job search websites. Building upon this foundation, one can gain deeper insights into the aligned companies and conduct more relevant research, such as analyzing cooperative and competitive relationships [96] and providing investment target recommendations [81].

2.3 Data Processing

2.3.1 Data Collection

The source of recruitment data can be categorized into two types: internal data and external data. Correspondingly, data collection methodologies align with these source categories.

Internal Data. The current business environment is typically dependent on data systems [219]. Internal data are collected from internal enterprise management systems, such as enterprise resource planning (ERP) systems [362], customer relationship management (CRM) systems [127], and applicant tracking systems (ATS) [300]. ERP systems are comprehensive business management tools integrating various functions such as finance, sales, materials management, HR, production planning, and supply chain. CRM systems facilitate customer interaction and communication, encompassing customer information management, sales opportunities, and customer service. An ATS is computer software that human resource departments use to process the overwhelming number of applications they receive for job openings. These systems store recruitment, employee, and organizational data in an orderly manner, facilitating efficient collection and processing.

External Data. As online services rapidly evolve, a growing number of individuals are turning to social media and job search websites to exchange job-related information and explore employment opportunities. These interactive platforms host a vast array of talent information due to their extensive user base. Additionally, third-party business investigation platforms provide detailed insights into companies and the relationships of their board members. Data acquisition methods on these platforms vary. Third-party data collection websites typically aggregate data from their participating members. Meanwhile, web crawlers [215] can extract rich information from website pages, provided legal and regulatory compliance is ensured. Moreover, many job search websites retain substantial amounts of user data that is not publicly disclosed. Typically, this data can be utilized for scientific research purposes following encryption and other privacy safeguards.

2.3.2 Data Preprocessing

After collecting a large amount of recruitment data, it is essential to preprocess the data for downstream applications, especially removing noisy, redundant, irrelevant, and potentially toxic data. In this part, we review the detailed data preprocessing strategies to improve the quality of the collected data according to various data types.

Structured Data. Structured data, in simple terms, is data held in a database, such as that of an ERP system [120], with a standardized format for efficient access by software and humans alike. Information about employees and companies is collected into such databases in an orderly way. On some data collection websites, a large amount of user information is also stored strictly in databases, such as records of interactions between users and the platform, including clicking and browsing. Since these data are stored in a standardized format, filtering and integrating them can generally be achieved by joining different tables and setting filter conditions on them [32].
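As a minimal sketch of this join-and-filter step, the following example uses Python's built-in sqlite3 module with an in-memory database. The table names, columns, and sample rows are hypothetical stand-ins for an ERP export and are not taken from any cited system.

```python
import sqlite3


def top_rated_in_department(conn, department, min_score):
    """Join employees with appraisals, then keep rows that pass the filters."""
    query = """
        SELECT e.name, e.department, a.score
        FROM employees AS e
        JOIN appraisals AS a ON a.employee_id = e.employee_id
        WHERE e.department = ? AND a.score >= ?
        ORDER BY a.score DESC, e.name
    """
    return conn.execute(query, (department, min_score)).fetchall()


# Hypothetical toy tables; real schemas will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT);
    CREATE TABLE appraisals (employee_id INTEGER, year INTEGER, score REAL);
""")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", "Engineering"), (2, "Bo", "Sales"), (3, "Cy", "Engineering")],
)
conn.executemany(
    "INSERT INTO appraisals VALUES (?, ?, ?)",
    [(1, 2023, 4.5), (2, 2023, 4.8), (3, 2023, 3.1)],
)

rows = top_rated_in_department(conn, "Engineering", min_score=4.0)
print(rows)
```

The join integrates the two tables on a shared key, and the WHERE clause expresses the filter conditions. Parameter placeholders keep the filter values separate from the query text.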

Semi-Structured Data. Semi-structured data, or partially structured data, diverges from the conventional tabular format characteristic of relational databases or other tabular data forms [390]. Instead, it incorporates tags and metadata to delineate semantic elements and establish hierarchical relationships among records and fields. This type of data is prevalent in the labor market [370], encompassing employee resumes [461], individual interviews [56], job postings [90], and web pages [106]. Due to the absence of standardized formats and the diversity of semi-structured data types, it necessitates thorough exploration of commonalities, extraction of multidimensional information, and comprehensive filtering to transform it into structured data. For example, Sun et al. collected job postings from an online recruitment website [382]. On this website, each job opening is displayed in HTML, which contains information of salary range, company, location, time, and job description text. They parsed the HTML and obtained structured job posting information.
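The HTML-to-record step can be sketched with Python's standard html.parser module. This is an illustration under stated assumptions, not the parser used in [382]: the class names (salary, company, location, date, description) and the sample markup are hypothetical, and each field is assumed to be a flat element without nested tags.

```python
from html.parser import HTMLParser


class JobPostingParser(HTMLParser):
    """Collect the text of elements whose class attribute names a known field."""

    # Hypothetical field names; a real site would need its own mapping.
    FIELDS = {"salary", "company", "location", "date", "description"}

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.FIELDS:
            self._current = cls
            self.record[cls] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] += data

    def handle_endtag(self, tag):
        # Assumes flat fields: any closing tag ends the current field.
        self._current = None


def parse_job_posting(html):
    """Return a dict of field name to whitespace-normalized text."""
    parser = JobPostingParser()
    parser.feed(html)
    parser.close()
    return {key: " ".join(value.split()) for key, value in parser.record.items()}


sample = """
<div class="posting">
  <span class="company">Acme Corp</span>
  <span class="location">Beijing</span>
  <span class="salary">15k-25k</span>
  <p class="description">Build   data pipelines
     with Python and SQL.</p>
</div>
"""
record = parse_job_posting(sample)
print(record)
```

Each posting becomes one flat dictionary, so a collection of postings can be loaded into a table with one column per field.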

Unstructured Data. Unstructured data [396] refers to information that does not conform to the conventional row-column structure found in traditional databases. Within the labor market context, data primarily manifests as text, comprising a blend of structured and unstructured fields. Structured fields denote specific categories like job titles (e.g., occupation), location, etc., while unstructured fields provide a broader description of vacancy content. Approximately 80% of data held by firms today is unstructured [35], expanding at a rate fifteen times faster than structured data. While some approaches skip the processing of natural text, opting instead for direct classification tasks on such text, as observed in the classification of web job vacancies [55], others require extraction and subsequent processing of information from unstructured data into structured data. For instance, extracting skill information from job descriptions [83] involves the identification of skill words through regular expression matching. These identified words are then subjected to expert evaluation for further refinement, resulting in a set of relevant skill words for each job description. Subsequent statistical analysis across all jobs yields a comprehensive frequency distribution of skill words as the structured data, facilitating downstream tasks such as skill demand forecasting.
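The regular-expression route from free text to a skill frequency distribution can be sketched as follows. The skill vocabulary and job descriptions are hypothetical, and the expert review step mentioned above is omitted; this is an illustration of the general idea rather than the procedure of [83].

```python
import re
from collections import Counter

# Hypothetical vocabulary standing in for an expert-curated skill list.
SKILL_VOCAB = ["python", "sql", "machine learning", "c++"]


def build_pattern(vocab):
    """Compile one alternation; longer skills come first so they win ties."""
    alternatives = sorted((re.escape(s) for s in vocab), key=len, reverse=True)
    # Lookarounds replace \b so that skills ending in symbols ("c++") still match.
    return re.compile(
        r"(?<![\w+#])(" + "|".join(alternatives) + r")(?![\w+#])",
        re.IGNORECASE,
    )


def extract_skills(description, pattern):
    """Return the set of vocabulary skills mentioned in one description."""
    return {m.group(1).lower() for m in pattern.finditer(description)}


def skill_frequencies(descriptions, vocab):
    """Count, for each skill, the number of descriptions that mention it."""
    pattern = build_pattern(vocab)
    counts = Counter()
    for text in descriptions:
        counts.update(extract_skills(text, pattern))
    return counts


jobs = [
    "We need Python and SQL; machine learning experience is a plus.",
    "C++ developer with some Python scripting.",
    "Analyst role: SQL, Excel, and NoSQL exposure.",
]
freq = skill_frequencies(jobs, SKILL_VOCAB)
print(freq.most_common())
```

Each skill is counted once per posting, so the resulting table reflects how many jobs demand a skill rather than how often the word is repeated.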

2.3.3 Data Cleaning and Debias

After initial data preprocessing, standard structured data is obtained. However, given the potential for noise introduced during the data acquisition process, coupled with non-uniform acquisition methods, the resulting data may not be of high quality. Therefore, cleaning and debiasing the data becomes imperative. In this part, we first introduce several data quality issues commonly seen in AI-driven talent analytics. Subsequently, several cleaning and debiasing methods are introduced to address these issues.

Data Quality Issues. There are various data quality issues: missing data, duplicated data, extraneous data, and inconsistent data [93]. These issues can introduce biases into analyses and lead to inaccurate conclusions. We introduce these issues as follows:

  • Missing Data: Missing data occurs when essential data is absent from a dataset, which can result from factors like data corruption, or failure to capture specific information. Within the talent market, missing values often stem from inconsistent information sources. For instance, job postings that should contain recruitment requirements and descriptions may be missing key fields [184]. Resumes also frequently omit crucial information such as email addresses and physical addresses [360].

  • Duplicated Data: Duplicated data refers to the presence of identical or nearly identical records within a dataset, which can arise from data entry errors, erroneous dataset merging, or technical malfunctions. Duplicated data can distort statistical analyses and exaggerate the significance of certain data points. In the talent market, this issue can arise due to data sources providing overly homogeneous information. Repetitive job postings may be erroneously interpreted as multiple postings for the same position in certain analysis and prediction scenarios [382].

  • Extraneous Data: Extraneous data comprises irrelevant or unnecessary information within a dataset, often included mistakenly due to human errors or incorrect data integration processes. Such data can complicate analyses, waste computational resources, and yield inaccurate results. This data often requires further filtering to retain only relevant fields and eliminate irrelevant information, as noted in [34, 44]. Redundancies in job offers can hinder the accuracy of downstream classification tasks [44].

  • Inconsistent Data: Inconsistent data encompasses conflicting or contradictory information within a dataset, stemming from sources like data entry errors, incompatible formats, or changes in data collection methods over time [184, 34, 131]. Such inconsistencies impede meaningful insights and necessitate thorough validation and cleansing to ensure data integrity and accuracy. Common inconsistencies include misspellings and variations in job titles, which require resolution to maintain consistency across professional documents [34, 131]. Besides, erroneous labeling of job postings as job titles is also a common problem [131].
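A small audit pass can surface two of the issues above, missing and duplicated data, before any cleaning is attempted. The sketch below is illustrative only: the record fields and the rule that duplicates are detected on case-insensitive, whitespace-trimmed values are assumptions, not a method from the cited works.

```python
def audit_records(records, required_fields):
    """Return (missing, duplicates): indices of records with an empty required
    field, and indices of records that repeat an earlier record."""
    missing, duplicates, seen = [], [], set()
    for index, record in enumerate(records):
        values = [str(record.get(field) or "").strip() for field in required_fields]
        if any(value == "" for value in values):
            missing.append(index)
        key = tuple(value.lower() for value in values)  # normalized comparison key
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return missing, duplicates


# Hypothetical job postings.
postings = [
    {"title": "Data Engineer", "company": "Acme", "description": "Build pipelines."},
    {"title": "data engineer ", "company": "ACME", "description": "Build pipelines."},
    {"title": "Analyst", "company": "Globex", "description": ""},
    {"title": "Analyst", "company": "Globex"},
]
missing, duplicates = audit_records(postings, ["title", "company", "description"])
print(missing, duplicates)
```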

TABLE II: The table of collected talent analytic-related open datasets.
Categories | Dataset | Link | Note
Internal: Recruitment | Kaggle-Entity_Recognition_Resumes | https://www.kaggle.com/datasets/dataturks/resume-entities-for-ner | 220 resumes; 10 categories; Resume Understanding
Internal: Recruitment | LinkedIn-Job-Scraper | https://www.kaggle.com/datasets/arshkon/linkedin-job-postings/data | 33,000+ job postings; 27 valuable attributes including the title, job description, salary, location, etc.
Internal: Recruitment | Job Dataset | https://www.kaggle.com/datasets/ravindrasinghrana/job-description-dataset | synthetic job postings; 23 attributes including job title, job salary, job skill, etc.
Internal: Recruitment | Linkedin Jobs & Skills (2024) | https://www.kaggle.com/datasets/asaniczka/1-3m-linkedin-jobs-and-skills-2024 | 1,296,381 job postings; skills mapping, job recommendation systems
Internal: Employee | HR Analytics | https://www.kaggle.com/datasets/colara/hr-analytics | 14,999 employees; 10 attributes, including satisfaction_level, salary, etc.; turnover prediction
Internal: Employee | Human Resources Data Set | https://www.kaggle.com/datasets/rhuebner/human-resources-data-set/data | 311 records; 36 attributes, including name, salary, etc.

Cleaning and Debias Methods. Data cleaning and debias is an iterative process tailored to the requirements and semantics of specific analysis tasks [218]. This process consists of transforming raw data into consistent data that can be analyzed. Herein, we delineate several frequently employed methodologies in talent analytics for data cleaning and debias.

  • Data Selection: Data selection involves the meticulous identification and extraction of pertinent data subsets from a broader dataset, guided by specific criteria or requisites [251]. This approach serves to streamline the data analysis process by honing in on the most relevant information, thereby mitigating noise and extraneous data. For instance, Zhang et al. [467] opted to discard company-position pairs with a monthly averaged talent demand of less than 2, considering scenarios where certain companies may not extensively recruit for particular research positions, resulting in consistently low demand. Shao et al. conducted data cleaning by eliminating job posts and resumes with incomplete attributes [360]. Balaji et al. focused on extracting useful fields from job titles to curate the most pertinent and distinctive set of work activities corresponding to selected company job titles [34].

  • Data Filtering: Data filtering encompasses the systematic elimination or exclusion of undesirable or extraneous data from a dataset, thereby diminishing noise and refining its quality. In typical text cleaning tasks, the compilation of a stop word list proves indispensable for sieving out irrelevant information [156]. Moreover, the adoption of regular expressions to delineate fields and detect anomalies is widely embraced in personalized resume-job matching systems [156] and job search engines [301].

  • Data Clustering: Data clustering involves grouping similar data points together based on certain characteristics or features. Wakchaure et al. [398] leveraged the Levenshtein edit distance, a metric gauging string similarity, to designate matches exceeding 0.85 as identical individuals. Gaikwad et al. [131] employed Levenshtein edit distance for duplicate detection within XML documents. Sun et al. [382] utilized text embeddings and edit distance to identify similar job descriptions. Zhang et al. [467] employed a classification approach to categorize original job titles into 16 distinct categories, thereby mitigating noise and standardizing job titles.

  • Data Synthesis: Data synthesis encompasses the generation of fresh data points to augment extant datasets, proving instrumental in addressing gaps or amplifying the efficacy of analysis and model functionalities [184]. For instance, Magron et al. [267] devised synthetic job postings to refine skill alignment, while Skondras et al. [375] utilized Large Language Models to craft synthetic resume data, thereby bolstering job description classification.

  • Data Normalization: Data normalization is a crucial procedure aimed at standardizing the scale or distribution of data values within a dataset, thereby facilitating precise comparisons and analyses while ensuring uniformity across variables. It finds widespread application in various domains, particularly in time series forecasting. For instance, Liu et al. conducted normalization operations, which involve subtracting the minimum value of each node and dividing it by the difference between the maximum and minimum values of the node [249]. Similarly, Zhang et al. normalized the number of job transitions along the company axis to achieve consistency [466].
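Two of the methods listed above lend themselves to a compact sketch: edit-distance-based duplicate detection and min-max normalization. The code below is a hedged illustration. In particular, rescaling the Levenshtein distance by the longer string's length to obtain a similarity in [0, 1] is an assumption, since the text does not say how the 0.85 threshold in [398] is normalized, and the sample names and counts are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def deduplicate(names, threshold=0.85):
    """Keep a name only if it is not too similar to any name already kept."""
    kept = []
    for name in names:
        if all(similarity(name.lower(), k.lower()) <= threshold for k in kept):
            kept.append(name)
    return kept


def min_max_normalize(values):
    """Subtract the minimum and divide by the range (max minus min)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


unique_names = deduplicate(["Jonathan Smith", "Jonathon Smith", "Maria Lopez"])
scaled = min_max_normalize([2, 4, 6])
print(unique_names, scaled)
```

The greedy deduplication compares each candidate against all kept items, which is quadratic in the number of records; for large corpora a blocking or indexing step would normally precede it.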

2.3.4 Limitations

Existing data and processing methods in the field of talent analytics continue to face many limitations and challenges, which constrain the development of AI-based approaches. We summarize these issues as follows.

  • Lack of benchmark datasets: Most current research in talent analytics heavily depends on proprietary data related to job applicants, employees, and organizations. The sensitive nature of this data often prevents it from being publicly accessible, severely restricting the development of publicly available benchmark datasets. This limitation impedes the standardization of problem definitions and comparative analyses of methodologies, thus slowing down technological advancements in the field. While some job-related data has been made open-source, it remains limited in scale and temporal coverage. Consequently, there is an urgent need to create comprehensive open-source benchmark datasets for pivotal tasks within this field. We present several open datasets, as shown in Table II, primarily related to resume parsing and employee turnover prediction. These datasets can help researchers in developing standardized datasets for uniform comparisons. Additionally, the data anonymization and de-identification methods require more extensive consideration to facilitate the construction of more open-source datasets for various scenarios.

  • Diversity deficiencies in datasets: The datasets commonly used in AI-based talent analytics often lack adequate consideration of diverse populations, especially underrepresented minorities. This deficiency hampers efforts to evaluate and ensure algorithmic fairness. Models developed from such datasets may exhibit biases, leading to inequitable outcomes among various demographic groups. Therefore, it is essential to integrate more diverse datasets to promote fairness and enhance the overall performance of AI algorithms in talent analytics.

  • Challenges with subjective data: In many talent analytics scenarios, such as employee promotion decisions, outcomes are frequently based on the subjective judgment of supervisors. This practice can introduce inherent biases and noise into the real datasets collected from these decisions. Traditional data preprocessing techniques often fail to detect and address these anomalies effectively. Therefore, there is a pressing need for specialized noise detection algorithms or the development of robust AI-based talent analytics models that can handle such irregularities. These approaches are essential to ensure the accuracy and fairness of the analytics outcomes.

3 Talent Management

Talent management, which focuses on placing the right person in the right job at the right time, has emerged as a predominant human capital topic in the early twenty-first century [66, 63]. Specifically, the management process covers the entire journey of talent, from entering the organization through subsequent development. Accordingly, we first describe various intelligent talent recruitment scenarios, such as job posting generation [254, 332] and talent searching [162, 274, 160]. After talent enters the organization, timely and accurate feedback is critical to ensuring its sustainable development. We therefore next discuss two primary issues in talent assessment: interview question recommendation [363, 336, 365] and assessment scoring [304, 171, 77, 234]. Finally, career development is important both for managing human capital resources and for individual growth. Accordingly, we outline several post-employment career development problems, including course recommendations [377, 402, 401] for employee training and employee dynamics analysis [477, 259, 386]. In the following sections, we will delve into these issues in detail.

TABLE III: The table of collected papers related to talent management.
Task | Method | Data | Reference
Talent Recruitment
Job Posting Generation | RNN | Job posting | [254, 332]
Job Posting Generation | LLMs | Job posting | [260, 52]
Resume Understanding | Rule-based method | Resume | [216]
Resume Understanding | HMM | Resume | [449]
Resume Understanding | SVM | Resume | [449, 89]
Resume Understanding | CRF | Resume | [321, 74]
Resume Understanding | LSTM, CNN | Resume | [27]
Resume Understanding | LSTM, CRF | Resume | [327, 240, 262]
Resume Understanding | RoBERTa, GCN | Resume | [411]
Resume Understanding | Multimodal pre-trained model | Resume | [443, 196]
Talent Searching | Keywords matching | Query, Resume | [274]
Talent Searching | Keywords matching, knowledge graph | Query, Resume | [407]
Talent Searching | Traditional classifiers | Query, Resume | [19, 311]
Talent Searching | Topic model, multi-armed bandit algorithm | Query, Resume | [139]
Talent Searching | Learning-to-rank algorithm | Query, Resume | [161, 162]
Talent Searching | DNN, learning-to-rank algorithm | Query, Resume | [340, 440]
Person-Job Fitting | Latent variable model | Job posting, resume | [273]
Person-Job Fitting | CNN | Job posting, resume | [485, 475, 169, 481]
Person-Job Fitting | RNN | Job posting, resume | [334, 335, 435]
Person-Job Fitting | CNN, RNN | Job posting, resume | [46, 263, 197]
Person-Job Fitting | BERT | Job posting, resume | [226, 2, 250, 360, 208, 123, 152, 147]
Person-Job Fitting | GNN, BERT | Job posting, resume | [45, 444, 409, 437, 73]
Person-Job Fitting | Attention mechanism | Job posting, resume | [168, 129, 473, 178]
Person-Job Fitting | Reinforcement learning | Job posting, resume | [128]
Person-Job Fitting | Federated learning | Job posting, resume | [472]
Person-Job Fitting | LLMs | Job posting, resume | [479, 141, 419, 115, 450]
Person-Job Fitting | Ranking | Job posting, resume, social media | [122, 121]
Person-Job Fitting | K-means | Social media | [145]
Person-Job Fitting | Traditional classifiers | Social media | [290]
Person-Job Fitting | Gamma-Poisson model | User behavior | [53]
Talent Assessment
Interview Question Recommendation | Topic model | Job posting, resume, assessment report | [364, 363]
Interview Question Recommendation | Knowledge graph | Job posting, resume, search engine log | [336]
Interview Question Recommendation | Knowledge graph, integer linear programming | Question bank | [102]
Interview Question Recommendation | BERT | Job posting | [365]
Interview Question Recommendation | GCN | KSC, search engine log | [333]
Assessment Scoring | Regression models | Interview videos | [307, 303, 304]
Assessment Scoring | Doc2Vec | Interview videos | [78, 80]
Assessment Scoring | GNN | Interview records | [77]
Assessment Scoring | Attention mechanism | Interview videos | [172, 171, 174]
Assessment Scoring | Transformer | Interview videos | [372, 395]
Assessment Scoring | Adversarial learning | Interview videos | [173]
Assessment Scoring | Traditional classifiers | Employee profiles | [252, 177, 234]
Career Development
Course Recommendation | Collaborative filtering | Trainees’ profiles | [402, 401]
Course Recommendation | Markov decision process | Learning records | [377]
Course Recommendation | Neural network | Learning records | [320]
Course Recommendation | Reinforcement learning | Learning records | [480]
Course Recommendation | KG-based Transformer | Learning records, knowledge graph | [439]
Promotion Prediction | Traditional classifiers | Social network | [453]
Promotion Prediction | Traditional classifiers | Personal profile, job posting log | [259]
Promotion Prediction | Multiple classification | Employee’s detail record | [253]
Promotion Prediction | Survival analysis | Personal profile, career paths | [233]
Turnover Prediction | Traditional classifiers | HR dataset | [373, 302, 10]
Turnover Prediction | GNN, RNN | Profile, turnover records | [387]
Turnover Prediction | Neural network | Profile, turnover records | [386]
Turnover Prediction | Traditional classifiers | HR Information Systems | [6]
Turnover Prediction | GNN, RNN, survival analysis | Job description, organizational tree, profile, turnover records | [165]
Job Satisfaction | Traditional classifiers | Personal profile, job profile | [22]
Job Satisfaction | Traditional classifiers | Social media | [346]
Career Mobility Prediction | RNN | Career paths | [236, 170]
Career Mobility Prediction | Attention mechanism | Career paths, employee profile | [289]
Career Mobility Prediction | Transformer | Career paths, employee profile | [433]
Career Mobility Prediction | Collaborative filtering | Career paths, employee profile | [400]
Career Mobility Prediction | GNN | Career paths, employee profile | [465]
Career Mobility Prediction | Representation learning | Career paths, employee profile | [108, 459]
Career Mobility Prediction | Reinforcement learning | Career paths, employee profile | [154, 155, 26]
  • Traditional classifiers represent one or multiple traditional classifiers, including SVM, Logistic regression, GBDT, XGBoost, Decision tree, Adaboost, KNN and so on.

3.1 Talent Recruitment

Talent recruitment is a critical component of talent management, as it involves identifying the right candidates for positions within an organization. The quality of this function can significantly impact the organization’s future development, which is why considerable human and material resources have been invested in ensuring the efficiency and effectiveness of related procedures. According to a Forbes article, US corporations spend nearly $72 billion annually on various recruiting services, and the global amount is likely three times larger (footnote 10: https://www.forbes.com/sites/joshbersin/2013/05/23/corporate-recruitmenttransformed-new-breed-of-service-providers/). However, traditional talent recruitment methods rely heavily on the personal knowledge and experience of recruiters, which may introduce bias due to the subjective nature of the process. This potential bias can be exacerbated by varying levels of experience and personal qualities among different recruiters. Fortunately, the rapid development of online recruitment platforms, such as LinkedIn and Lagou, has ushered in a new era of data-driven talent recruitment, empowered by AI technologies.

3.1.1 Job Posting Generation

A job posting comprises both job duties, which describe the responsibilities and tasks of the role to candidates, and job requirements, which outline the professional experience, skills, and domain knowledge that an employer expects from the ideal candidate to perform the role. Job duties are tailored to each specific position. However, identifying the necessary capabilities and aligning them with job duties can be challenging, particularly when recruiters have limited experience and domain knowledge. A research report from Allegis Group indicates that many organizations face difficulties when writing job descriptions, as only 39% of candidates find these descriptions clear and easy to understand [149, 332].

Job Requirement Generation. To improve the quality of job postings, many researchers are focusing on generating job requirements from job duties to more accurately attract suitable candidates [254, 332]. They approach this task from a data-driven perspective by collecting high-quality training data and framing it as a text generation problem. Specifically, the task of generating job postings from the perspective of job requirements can be formalized as follows:

Figure 6: The illustration of job requirement generation.
Definition 3.1 (Job Requirement Generation)

Given a set of job postings, denoted as $\mathcal{C}$, where each posting $C_i \in \mathcal{C}$ includes a job duty $X_i$ and a job requirement $Y_i$, the goal of job requirement generation is to train a model $M$. This model should be capable of generating a fluent and rational job requirement $Y_{new}$ when provided with a new job duty $X_{new}$.

Technically, this task can be viewed as a text-to-text generation problem, where the job duties and job requirements are typically long sequences of text. To address this task, sequence-to-sequence models using an encoder-decoder architecture are commonly employed, as illustrated in Figure 6. For instance, Liu et al. applied two Long Short-Term Memory (LSTM) layers as the encoder and decoder, respectively, to extract the key information from job duties and generate the job requirement [254]. Since it is important to precisely use and organize skill-related keywords in job requirements, they implemented the decoder in a two-pass manner. The first-pass decoder focuses on predicting skill-related keywords, while the second-pass decoder uses these predicted skills to guide the generation of fluent text. In particular, the attention mechanism is employed to integrate hidden states from the LSTM layer of the encoder with context information, which enhances skill prediction in the decoder. Furthermore, Qin et al. trained a neural topic model on job duties to capture the global topic information [332]. The topic distribution of each job duty is used as the context information to guide the generation of each word in the LSTM-based decoder.
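As a hedged formalization (the text does not state the training loss), encoder-decoder generators of this kind are commonly trained by maximizing the conditional log-likelihood of each job requirement given its job duty. Writing $\theta$ for the parameters of $M$ and $y_{i,t}$ for the $t$-th token of $Y_i$, both notational assumptions introduced here, the objective can be sketched as:

```latex
\theta^{*} = \arg\max_{\theta} \sum_{C_i \in \mathcal{C}} \sum_{t=1}^{|Y_i|}
    \log P_{\theta}\left(y_{i,t} \mid y_{i,<t},\, X_i\right)
```

Under this reading, a two-pass decoder can be viewed as first predicting a set of skill keywords from $X_i$ and then generating $Y_i$ conditioned on both the duty and the predicted skills; the exact objectives used in [254] and [332] may differ from this sketch.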

Writing Assistants. Additionally, unlike the approaches that require models to learn the relationship between job duties and job requirements, some researchers focus on developing job posting writing assistants with an emphasis on content quality [260, 14]. Most existing research adopts a data-to-text perspective to construct job posting generators. These models convert structured inputs, such as salary and working hours [260], into natural language text that adheres to the conventional style of job postings, which can significantly alleviate the workload of human resources personnel. For instance, with inputs like “MINSALARY = 12k, MAXSALARY = 15k, SALARYTYPE = per month”, the model can generate a sentence for a job posting such as “You will receive a monthly salary between $12,000 and $15,000.” Leveraging the advanced text understanding and generation capabilities of large language models (LLMs), Lorincz et al. fine-tuned the mT5 model [431] to generate the benefits section of job postings from structured data [260]. They transformed structured data into a format compatible with mT5 by concatenating feature names with their corresponding values.
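The linearization step, concatenating feature names with their values so that a text-to-text model can consume them, can be sketched in a few lines. The " = " and " | " separators and the rule of skipping empty values are assumptions made for illustration; the exact serialization used in [260] is not specified in the text.

```python
def linearize(features, separator=" | "):
    """Concatenate each feature name with its value into one input string.

    Features whose value is None or an empty string are skipped.
    """
    return separator.join(
        f"{name} = {value}"
        for name, value in features.items()
        if value not in (None, "")
    )


# Hypothetical structured input for the benefits section of a posting.
benefits = {
    "MINSALARY": "12k",
    "MAXSALARY": "15k",
    "SALARYTYPE": "per month",
    "BONUS": None,
}
source_text = linearize(benefits)
print(source_text)
```

The resulting string would serve as the encoder input, with the human-written benefits sentence as the target during fine-tuning.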

Moreover, Borchers et al. attempted to generate job postings directly from job names using LLMs [52]. Furthermore, the authors employed prompt engineering and gathered high-quality, gender-unbiased training data to enable the generator to produce unbiased job postings.

Evaluation. The performance of job posting generation models can be assessed primarily from several perspectives: validity, fluency, realism, and bias.

  • Validity. Does the generated job posting include appropriate recruitment information, such as skill requirements and benefits? Qin et al. evaluated the performance of various models through human assessments [332], while Liu et al. assessed the skills listed in generated job postings against a predefined skill vocabulary, measuring discrepancies with ground truth job postings [254]. They measured in terms of precision, recall, and F1-value.

  • Fluency. Does the generated job posting read smoothly and fluently? This dimension is typically assessed through human evaluation [332].

  • Realism. Can the generated job postings be distinguished from those written by humans? Realism, a concept first introduced by Borchers et al., is employed to evaluate the quality of generated job postings [52]. The authors developed a Multinomial Naive-Bayes (MNB) classifier to serve as a discriminator, assessing whether it can distinguish between job postings written by humans and those generated by machines.

  • Bias. Does the generated job posting contain potential biases, such as gender bias? Borchers et al. leveraged several existing lists of gender-laden words, including GenderCoded Word Prevalence [135], Superlative Prevalence [356, 394], GenderLaden Scoring [352], Connotation Frames [352], and NRC VAD Lexicon [293], to determine whether generated job postings exhibit a bias toward one gender over another.
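The skill-based validity check in the list above can be illustrated with a short sketch that extracts vocabulary skills from a generated posting and from its ground truth, then computes precision, recall, and F1 over the two sets. The vocabulary, the example sentences, and the whole-word matching rule are hypothetical; they are not the resources used in [254].

```python
import re


def skills_in(text, vocabulary):
    """Vocabulary skills that occur in the text as whole words (case-insensitive)."""
    lowered = text.lower()
    return {
        skill for skill in vocabulary
        if re.search(r"(?<!\w)" + re.escape(skill.lower()) + r"(?!\w)", lowered)
    }


def precision_recall_f1(predicted, reference):
    """Set-based precision, recall, and F1; zero when a denominator is zero."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


vocab = {"python", "sql", "spark", "communication"}
generated = "Proficiency in Python and Spark; strong communication skills."
ground_truth = "Solid Python and SQL skills with good communication."
scores = precision_recall_f1(skills_in(generated, vocab), skills_in(ground_truth, vocab))
print(scores)
```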

Furthermore, as job posting generation constitutes a text generation task, most existing studies [254, 332, 260] evaluate the quality of generated postings by comparing them to ground truth job postings. These ground truth postings, typically crafted by human resources experts and refined through data filtering, serve as benchmarks for automated assessment. Specifically, the standard ROUGE metric [244], including ROUGE-1, ROUGE-2, and ROUGE-L, can be used. These measure the overlap of unigrams, bigrams, and the Longest Common Subsequence (LCS) between the ground truth and generated job postings. Additionally, the BLEU metric [316] can be employed for evaluation, which assesses the co-occurrences of n-grams between the ground truth and the generation.
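To make the overlap-based metrics concrete, the following sketch computes ROUGE-N from clipped n-gram overlap and ROUGE-L from the LCS length. It reports the F-measure variant and assumes lowercase, whitespace-tokenized input; published ROUGE toolkits add stemming and other options, and BLEU (with its brevity penalty) is not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f_measure(overlap, reference_total, candidate_total):
    """Harmonic mean of precision and recall; zero when there is no overlap."""
    if overlap == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n):
    """ROUGE-N F-measure from clipped n-gram overlap."""
    ref, cand = ngrams(reference.split(), n), ngrams(candidate.split(), n)
    overlap = sum((ref & cand).values())  # Counter intersection clips counts
    return f_measure(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, start=1):
            current.append(previous[j - 1] + 1 if x == y
                           else max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(reference, candidate):
    """ROUGE-L F-measure based on the LCS of the two token sequences."""
    ref_tokens, cand_tokens = reference.split(), candidate.split()
    return f_measure(lcs_length(ref_tokens, cand_tokens),
                     len(ref_tokens), len(cand_tokens))


reference = "experience with python and sql"
candidate = "experience in python and sql"
print(rouge_n(reference, candidate, 1),
      rouge_n(reference, candidate, 2),
      rouge_l(reference, candidate))
```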

Recent studies have demonstrated that ChatGPT and other LM-based models can be utilized for text evaluation, achieving performance close to that of human evaluators [143, 298]. This advancement opens new possibilities for the automated assessment of fluency, realism, and other dimensions in generated job postings.

◇ Takeaway.

  • Advantages of AI technology: (1) Existing AI-based methods for generating job postings not only effectively reduce the workload of human resources staff but also produce high-quality job postings that help attract suitable candidates. (2) With the rapid advancement of LLM technology, there has been a notable enhancement in both the ease of implementation and the quality of generated job postings. (3) AI-based generative models can potentially create unbiased job postings.

  • Limitations and future directions: (1) Existing methods enhance the readability of generated job postings, such as fluency, from a data quality perspective. However, there remains a deficiency in solutions from a modeling perspective. (2) Existing research has focused only on specific individual scenarios in job posting generation, lacking a unified model that can handle multiple generation scenarios simultaneously. Recent advancements in instruction learning have equipped LLMs with robust multitasking capabilities [445, 76], making the development of more powerful job posting writing assistants feasible. (3) Existing research lacks an analysis of how job seekers’ behaviors are influenced by reading AI-generated job postings.

3.1.2 Resume Understanding

Resume understanding, also known as resume parsing, aims to extract semantically structured information from resume documents, which can facilitate a wide range of intelligent talent analysis applications, such as talent searching [161], person-job fitting [334, 473], and talent assessment [239]. As illustrated in Figure 3, a resume typically includes the candidate’s personal information (e.g., name, phone number), educational background (e.g., major), work experience (e.g., company name), and various other relevant details. Specifically, this task can be formalized as follows:

Definition 3.2 (Resume Understanding)

Given a set of resume documents $\mathcal{R}$, the target of resume understanding is to learn a model $M$ that can extract the relevant segment $S_{i,j}$, corresponding to a specific type of factual information $Y_j$ (such as names and experience), from each resume $R_i \in \mathcal{R}$.

Typically, resume understanding begins by converting resume files in various formats into text. This conversion is accomplished using technologies or tools such as Optical Character Recognition (OCR) [69] and PDF parsers [443]. Consequently, resume understanding becomes a text-mining task, enabling the extraction of structured information from the textual data. The advancement of resume understanding technology is closely linked to innovations in information extraction techniques. Initially, Kopparapu utilized keyword searches and rule-based matching to extract six major fields of information from resumes, including “name”, “software skills”, “qualification”, “experience”, and “email” [216]. Resume information is also hierarchical: for example, the “personal information” block usually includes details such as “name” and “gender”. Exploiting this structure, Yu et al. first divided a resume into different blocks (e.g., “personal information” and “education information”) using a Hidden Markov Model (HMM) [449]. They subsequently employed both Support Vector Machines (SVM) and HMMs to extract detailed information from these blocks. By leveraging traditional machine learning models such as HMM [449], SVM [449, 89], and Conditional Random Fields (CRF) [321, 74], resume understanding has achieved an accuracy rate of over 80% across most fields.
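The keyword-search and rule-based style of extraction can be illustrated with a minimal sketch. The email pattern, the skill keyword list, and the heuristic that the first non-empty line is the candidate's name are all simplifying assumptions made here; they are not the rules used in [216].

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SKILL_KEYWORDS = ("python", "java", "sql", "linux")  # hypothetical keyword list


def extract_fields(resume_text):
    """Extract a few fields from plain resume text with hand-written rules."""
    lines = [line.strip() for line in resume_text.splitlines() if line.strip()]
    email = EMAIL_RE.search(resume_text)
    lowered = resume_text.lower()
    skills = [
        keyword for keyword in SKILL_KEYWORDS
        if re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
    ]
    return {
        "name": lines[0] if lines else None,  # heuristic: first non-empty line
        "email": email.group(0) if email else None,
        "software skills": skills,
    }


resume = """
Jane Doe
Email: jane.doe@example.com
Experience: 4 years as a data engineer
Skills: Python, SQL, Linux
"""
fields = extract_fields(resume)
print(fields)
```

Rules like these are easy to audit but brittle: a resume that opens with a title line or lists skills under an unexpected heading would break the heuristics, which is part of the motivation for the learned models discussed next.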

Although these traditional models achieve good performance, they incur the cost of extensive feature engineering. With the development of deep learning and its broad application in information extraction, many researchers have shifted to utilizing deep learning to construct more effective resume understanding models [27, 327, 198, 240, 262]. Most deep-learning approaches view resume understanding as a Named Entity Recognition (NER) task. Ayishathahira et al. utilized a Bidirectional LSTM-CNN architecture to perform resume information extraction [27]. Van et al. leveraged Convolutional Neural Network (CNN) to learn the character-level representations of words in resumes, and utilized the classic neural NER architecture, BiLSTM-CRF, to enhance resume information extraction performance [327]. Meanwhile, Jin et al. integrated a highway network and self-attention mechanism into BiLSTM-CRF [198]. Subsequently, the introduction of pretrained language models such as BERT has significantly improved the representation of resume text, further enhancing the performance of deep learning-based resume understanding models [240, 262]. In experiments with a Chinese resume dataset, the methods have achieved an F1 score of up to 96%.
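When resume understanding is cast as NER, a sequence tagger (for example a BiLSTM-CRF) emits one tag per token, and the tags must be decoded into field instances. The sketch below assumes the common BIO tagging scheme, which the text does not name explicitly, and uses hypothetical tokens and labels.

```python
def decode_bio(tokens, tags):
    """Convert per-token BIO tags into a list of (label, text) entity spans."""
    spans, label, words = [], None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label  # stray I- tag opens a span
        )
        if starts_new:
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-"):
            words.append(token)  # continue the current span
        else:  # "O" closes any open span
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


tokens = ["Jane", "Doe", "studied", "Computer", "Science", "at", "MIT"]
tags = ["B-NAME", "I-NAME", "O", "B-MAJOR", "I-MAJOR", "O", "B-SCHOOL"]
entities = decode_bio(tokens, tags)
print(entities)
```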

Prior research has primarily transformed resume documents into plain text inputs, thereby overlooking the multimodal information inherent in these files, such as layout information, which is crucial for resume understanding. Wei et al. combined the pre-trained model RoBERTa with a Graph Convolutional Network (GCN) to extract structured information from resumes, taking into account layout and positional features [411]. Recently, the integration of multi-modal features with pretrained models for document understanding has gained widespread adoption [196, 428, 427, 429]. Inspired by the BERT architecture, LayoutLM was the first to enhance the masked language modeling task by incorporating the 2D coordinates of each token as layout embeddings [428]. This approach enables the model to capture interactions between text and layout information, leading to an improvement of approximately 15% compared to text-only methods. Subsequently, advancements in methods such as DocFormer [20], LayoutLMv2 [427], LayoutLMv3 [185], and LayoutXLM [429] have progressively enhanced the performance of multimodal pretrained models in document understanding. These technologies have also been extensively adapted for resume understanding. Given that resume documents are more text-centric and voluminous compared to traditional document types like receipts, Yao et al. proposed a hierarchical multi-modal pre-training model tailored for long document understanding [443]. Additionally, to tackle the scarcity of annotated training data for resumes, the authors developed a distantly supervised sequence labeling method. This method, trained within a self-distillation based self-training learning framework, significantly improved the model’s performance in resume understanding with limited training data. Jiang et al. introduced a novel layout-aware multi-modal fusion transformer that encodes resume segments by integrating textual, visual, and layout information [196]. Additionally, they developed a multi-granularity sequence labeling task to address the hierarchical relationships among the fields to be parsed.

Evaluation. The performance of resume understanding models is typically evaluated at the instance level [327, 443, 196]. Under this protocol, an extracted instance $S_{i,j}$ for a specific type of factual information $Y_j$ is considered correct only when it is identical to a hand-annotated instance. Precision, recall, and the $F_1$ score are then computed as follows:

$$
\begin{split}
\text{Precision} &= \frac{\text{Number of true positive predictions}}{\text{Total number of positive predictions}}, \\
\text{Recall} &= \frac{\text{Number of true positive predictions}}{\text{Total number of actual positives}}, \\
F_{1} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.
\end{split}
\tag{1}
$$
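As a minimal sketch of the instance-level protocol above, the following Python snippet computes precision, recall, and $F_1$ by exact matching. The representation of an instance as a `(field_type, value)` tuple and the toy data are assumptions made for illustration only; they are not taken from any of the cited systems.

```python
def instance_level_prf(predicted, gold):
    """Exact-match precision, recall, and F1 over extracted instances.

    Each instance is a hypothetical (field_type, value) tuple; a prediction
    counts as a true positive only if it is identical to a gold instance.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: hand-annotated instances versus model output for one resume.
gold = {("name", "Jane Doe"), ("degree", "MSc"), ("employer", "Acme")}
predicted = {
    ("name", "Jane Doe"),
    ("degree", "BSc"),      # wrong value, so not an exact match
    ("employer", "Acme"),
    ("skill", "SQL"),       # not annotated, so a false positive
}
precision, recall, f1 = instance_level_prf(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Because matching is exact, a partially correct extraction (the degree above) is penalized twice: once as a false positive and once as a missed gold instance.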

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) AI-based resume understanding technology has become a foundational technology for ATS and online recruitment platforms, enabling a range of intelligent recruitment services. (2) Multimodal pre-trained models have emerged as the SOTA solution paradigm.

  • Limitations and future directions: (1) Although Yao et al. considered performance optimization of resume understanding models under limited annotated data [443], there is still a lack of systematic research in low-resource scenarios. (2) Recent advancements have demonstrated that generative LLMs possess strong capabilities for zero/few-shot learning and text annotation [412, 143]. The potential of LLM-based resume parsing models remains an area ripe for exploration.

3.1.3 Talent Searching

Talent searching is designed to identify suitable candidates based on search queries provided by recruiters or hiring managers [209]. Formally, consider the candidate set $\mathcal{R}=\{R_1,\dots,R_n\}$, where each candidate $R_i$ is represented by his/her resume, consisting of work experience, educational experience, and so on (here, $R_i$ usually refers to the structured data derived from the process of resume understanding). Talent searching can then be approached as an information retrieval task (when the query directly corresponds to a job position, it aligns with the person-job fitting task, which is detailed in the following subsection and will not be further discussed here):

Definition 3.3 (Talent Searching)

Given the candidate set $\mathcal{R}$ and a search query $q$ consisting of search criteria, the goal of talent searching is to determine a subset of candidates $\mathcal{U}\subset\mathcal{R}$ satisfying the search criteria, and to rank those candidates based on their fitness.

The key to talent search lies in measuring the fitness between the search query $q$ and each resume $R_i$. Manad et al. established connections between queries and candidates through skill matching. They extracted skill information from both queries and resumes, then ranked candidates based on scores that reflect the level of skill proficiency demonstrated in the resumes [274]. Wang et al. focused on queries composed of competency keywords, such as skill and activity keywords. They utilized the BERT model to extract competency keywords from resumes and applied a weighted average method to calculate candidates’ scores for different competency keywords. Meanwhile, the authors enhanced the effectiveness of talent search by leveraging a structured Competence Map (CMAP) to explore the relationships between various competency keywords [407]. In [19], Anca et al. developed several independent classifiers based on traditional machine learning models, such as KNN and LDA, to categorize candidate attributes in areas such as Education (e.g., University and Faculty), Programming Languages (e.g., JavaScript and Python), and Foreign Languages (e.g., English, French, German), thus enabling recruiters to search for suitable candidates based on these aspects.
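To make the filter-then-rank structure of Definition 3.3 concrete, the sketch below scores candidates by skill matching. It is a deliberately simplified illustration and not the method of any cited work: the integer proficiency levels, the requirement that every queried skill be present, and the mean-proficiency score are all assumptions.

```python
def search_talent(candidates, required_skills):
    """Return (candidate_id, score) pairs for candidates covering every
    required skill, ranked by descending mean proficiency on those skills.

    `candidates` maps a candidate id to a {skill: proficiency} dict, i.e. a
    hypothetical structured resume produced by resume understanding.
    """
    required = set(required_skills)
    if not required:
        raise ValueError("the query must contain at least one skill")
    results = []
    for cand_id, skills in candidates.items():
        if required.issubset(skills):  # filtering step: the subset U of R
            score = sum(skills[s] for s in required) / len(required)
            results.append((cand_id, score))
    # Ranking step: higher fitness first, ties broken by candidate id.
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))


candidates = {
    "r1": {"python": 5, "sql": 3},
    "r2": {"python": 4},            # lacks SQL, so it is filtered out
    "r3": {"python": 4, "sql": 5},
}
ranking = search_talent(candidates, ["python", "sql"])
print(ranking)
```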

As talent search systems accumulate substantial historical data, researchers have attempted to enhance retrieval performance through supervised training. For instance, Ozcaglar et al. proposed a two-level ranking system to integrate structured candidate features by combining Generalized Linear Mixed (GLMix) models and Gradient Boosted Decision Tree (GBDT) models, using recruiter actions as supervision signals to learn candidate ranking [311]. Geyik et al. incorporated recruiters’ immediate feedback on recommendation results to cluster candidates according to recruiters’ intents [139]. Subsequently, they developed a multi-armed bandit-based approach to select the appropriate intent cluster for each recruiter, which was then used to rank the candidates within that cluster. Considering that queries in talent search scenarios can be quite complex, combining several structured fields (such as canonical skills and company names) and unstructured fields (such as free text and keywords), Ramanath et al. developed deep learning-based embedding models for both queries and candidates, and explored learning-to-rank approaches with DNN models [340]. Yang et al. further proposed a three-stage cascaded ranking model, integrating DNN and BERT, to enhance a deep learning-based talent search system, and accounted for the personalized preferences of different users in the final stage [440].
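The idea of choosing an intent cluster with a multi-armed bandit can be illustrated with a short sketch. The text above does not specify which bandit algorithm was used, so the snippet assumes the standard UCB1 rule, treats each intent cluster as an arm, and uses a hypothetical deterministic reward (1 if the recruiter reacts positively to candidates from the cluster, 0 otherwise).

```python
import math


class UCB1:
    """UCB1 bandit; each arm stands for one (hypothetical) intent cluster."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of times each arm was chosen
        self.values = [0.0] * n_arms  # running mean reward of each arm

    def select(self):
        # Play every arm once before applying the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            self.values[a] + math.sqrt(2 * math.log(total) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the mean reward.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Simulated recruiter who only responds positively to cluster 1.
true_rewards = [0.0, 1.0, 0.0]
bandit = UCB1(n_arms=3)
for _ in range(100):
    arm = bandit.select()
    bandit.update(arm, true_rewards[arm])
print(bandit.counts)
```

In this toy run most selections concentrate on the rewarding cluster, while the other clusters are still revisited occasionally as their confidence bounds widen.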

Considering the challenge recruiters face in crafting effective search queries, Ha et al. developed a novel talent search system on the LinkedIn platform that allows users to select ideal candidate examples as queries [161, 162]. They extracted keywords related to titles, skills, and companies from these candidates to reconstruct queries, built training data for a Query-By-Example retrieval system using historical data from a Query-By-Keyword system, and employed the Coordinate Ascent algorithm to optimize the model.
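The query reconstruction step can be sketched as follows. The snippet assumes, purely for illustration, that each example candidate is a dict with `titles`, `skills`, and `companies` lists and that the most frequent terms per field form the query; the actual keyword selection used in [161, 162] may differ.

```python
from collections import Counter


def build_query_from_examples(example_candidates, top_k=2):
    """Reconstruct a keyword query from ideal candidate examples by keeping
    the top_k most frequent terms in each field (an assumed heuristic)."""
    query = {}
    for field in ("titles", "skills", "companies"):
        counts = Counter(
            term for cand in example_candidates for term in cand.get(field, [])
        )
        # Sort by descending frequency, then alphabetically for determinism.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        query[field] = [term for term, _ in ranked[:top_k]]
    return query


examples = [
    {"titles": ["data scientist"], "skills": ["python", "sql"], "companies": ["Acme"]},
    {"titles": ["data scientist"], "skills": ["python", "spark"], "companies": ["Globex"]},
    {"titles": ["ml engineer"], "skills": ["python", "sql"], "companies": ["Acme"]},
]
query = build_query_from_examples(examples, top_k=2)
print(query)
```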

Evaluation. Since talent searching is a classic retrieval task, its performance is typically evaluated using ranking metrics such as Precision@N and NDCG@N [311, 161, 162], where N denotes the top-N results produced by each model.
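For reference, both metrics can be computed from a ranked list of relevance labels, as in the sketch below. It assumes linear gain and a base-2 logarithmic discount for DCG, and it derives the ideal ranking from the labels of the returned list only; other gain definitions are also common.

```python
import math


def precision_at_n(ranked_relevance, n):
    """Fraction of the top-n results that are relevant (label > 0)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for rel in ranked_relevance[:n] if rel > 0) / n


def dcg_at_n(ranked_relevance, n):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(
        rel / math.log2(rank + 2)
        for rank, rel in enumerate(ranked_relevance[:n])
    )


def ndcg_at_n(ranked_relevance, n):
    """DCG normalized by the DCG of the ideal ordering of the same labels."""
    ideal = dcg_at_n(sorted(ranked_relevance, reverse=True), n)
    return dcg_at_n(ranked_relevance, n) / ideal if ideal > 0 else 0.0


# Relevance of the candidates returned for one query, in ranked order.
relevance = [1, 0, 1, 0, 0]
p_at_3 = precision_at_n(relevance, 3)
ndcg_at_3 = ndcg_at_n(relevance, 3)
print(f"Precision@3={p_at_3:.3f} NDCG@3={ndcg_at_3:.3f}")
```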

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) AI-based talent search models can significantly enhance recruiters’ efficiency in selecting candidates and are crucial components in intelligent ATS, online recruitment platforms, and employment-focused social media platforms. (2) Existing AI-based studies enable talent search tasks that incorporate queries for jobs, keywords, and various complex requirements.

  • Limitations and future directions: (1) Existing supervised talent search models heavily depend on abundant training data and struggle in cold-start or data-sparse scenarios. Potential solutions, such as data augmentation and transfer learning techniques, merit further exploration. (2) Technologies related to novel talent search scenarios, such as conversational search, require further investigation. (3) With the rapid development of LLMs recently, there has been increasing research aimed at leveraging LLMs to enhance information retrieval systems, such as search engines [487, 488, 276]. However, there is still limited research in the context of talent search scenarios.

3.1.4 Person-Job Fitting

Person-job fitting (PJF) has always been a crucial task in the recruitment process, focusing on measuring the degree of match between a job posting—which includes job duties and requirements—and a candidate’s resume, which details their work and educational experiences. Effective PJF assists employers in selecting the right talent for appropriate positions, thereby enhancing employee performance and job satisfaction. This ultimately contributes to the mutual success of both the organization and its employees. However, the traditional solution relies heavily on recruiters’ domain expertise and subjective judgment, making it challenging to achieve both efficiency and objectivity [334]. Recently, with the accumulation of extensive historical recruitment data through HRIS like ATS, many researchers have leveraged AI technology to develop supervised models for PJF. This advancement has significantly improved the efficiency and effectiveness of the recruitment process [334, 435, 286, 472]. Specifically, the task of Person-Job Fitting can be formally defined as follows:

Definition 3.4 (Person-Job Fitting)

Given a set of job applications $\mathcal{S}$, where each application $S_{i,j}\in\mathcal{S}$ contains a resume $R_i$ and a job posting $J_j$, along with a corresponding recruitment result label $y_{i,j}$, the target of PJF is to learn a predictive model $M$ for measuring the matching degree between $J_j$ and $R_i$, enabling the prediction of the result label.

Note that the recruitment result label $y_{i,j}$ varies across different recruitment contexts. When building intelligent recruitment systems for companies, the recruitment result label can be set based on the full progression of a job applicant’s process, including the interview, offer, and onboarding stages. For instance, in [334], $y_{i,j}$ is a binary label, where $y_{i,j}=1$ indicates that the candidate has been selected for further interviews. Conversely, within online recruitment platforms, the recruitment result label $y_{i,j}$ is typically configured to indicate whether a user $R_i$ has applied for a specific position $J_j$ [348, 349]. Additionally, some online recruitment scenarios account for the reciprocal outcomes between employers and job seekers. This encompasses whether the employer of a job $J_j$ intends to advance the recruitment process after receiving an application from job seeker $R_i$, and whether a job seeker $R_i$ accepts an invitation to interview for a specific position $J_j$ after being approached by the employer [129, 181].

Early research efforts in the field of PJF using AI-based technologies can be traced back to [273]. Malinowski et al. proposed that the compatibility between a job and a candidate often hinges on underlying factors not explicitly stated in the job posting or the candidate’s resume. To address this issue, they developed a latent variable model to represent job requirements and candidate abilities, and formulated PJF as a bilateral matching problem from a demands-abilities perspective. Since then, several studies have explored the use of AI-based technologies to extract job and candidate profiles from textual data. For instance, Zhu et al. introduced a CNN-based neural network to extract representation vectors from job postings and resumes [485]. They then evaluated the fit between a candidate’s qualifications and the job requirements by measuring the similarity between these vectors, achieving an average AUC of around 75% on real-world data from 2013 to 2016. In [475], the authors utilized a CNN combined with an attention layer to enhance the representation of textual information for both positions and candidates. Qin et al. used an LSTM to model the text sequence and applied ability-aware attention strategies to measure the importance of each job requirement or candidate ability to the final PJF decision, resulting in an improvement of approximately 10% compared to previous methods [334, 335]. Similarly, in [46] the authors treated job postings and resumes as multi-sentence documents and utilized Bidirectional Gated Recurrent Units (BiGRU) and word-level attention to model sentences and documents. Wang et al. utilized co-attention mechanisms and Graph Neural Networks (GNN) to better leverage historical recruitment data, thereby enhancing the representations of jobs and resumes and subsequently improving PJF performance [409]. Luo et al. expanded upon these efforts by integrating LSTM, CNN, and attention models to process various types of structured textual information, such as skills and experiences [263]. They proposed an adversarial learning-based framework to enhance the expressive representations of this data.
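The representation-then-similarity pattern shared by several of these models can be sketched in a few lines. The snippet below assumes that job and resume vectors have already been produced by some encoder (the three-dimensional vectors are toy placeholders) and uses cosine similarity as the matching function; this specific choice of similarity is an assumption rather than a detail confirmed for any cited model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_resumes(job_vector, resume_vectors):
    """Rank resumes by similarity of their representation to the job's."""
    scored = [
        (resume_id, cosine_similarity(job_vector, vec))
        for resume_id, vec in resume_vectors.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Placeholder vectors standing in for learned text representations.
job_vector = [1.0, 0.0, 1.0]
resume_vectors = {
    "r1": [1.0, 0.0, 1.0],
    "r2": [0.0, 1.0, 0.0],
    "r3": [1.0, 1.0, 0.0],
}
ranking = rank_resumes(job_vector, resume_vectors)
print(ranking)
```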

Recently, the rapid advancement of pretrained language models like BERT has significantly enhanced textual data representation, enabling researchers to develop more effective PJF models [226, 2, 250, 360, 208, 123, 152, 147]. For instance, Abdollahnejad et al. serially concatenated each pair of job $J_j$ and resume $R_i$ as input to BERT, using the representation of the $[\text{CLS}]$ token at the end of the sequence to signify the joint representation of the application $S_{i,j}$ [2]. Subsequently, they employed a fully-connected layer to predict the recruitment label $y_{i,j}$, ultimately fine-tuning the BERT-based PJF model with historical interaction data. A similar model structure has also been employed by Kaya et al. [208]. Lavi et al. fine-tuned the BERT model using Siamese networks and matched jobs and candidates with both classification and similarity-based objectives [226]. Distinct from previous approaches that directly utilized BERT’s architecture, Fang et al. proposed a skill-aware, prompt-based pretraining framework that significantly enhanced the representation of recruitment domain corpora [123]. Compared to BERT and other backbone models, this framework effectively improved performance on several downstream tasks including PJF.

Based on a wealth of high-quality historical data, the aforementioned supervised PJF models are powerful. However, in many recruitment scenarios, job-resume interaction data is sparse and noisy, which significantly undermines the effectiveness of PJF models [46, 45, 450, 73]. To address this, researchers have approached the problem from various perspectives. For instance, in [45], the authors constructed a Job-Resume Relation Graph to enhance the representation of jobs and resumes, where the edges in the graph were created based on historical interactions, category labels, and keywords. Additionally, they designed a Multi-View Co-Teaching Network to learn text-based and relation-based matching modules simultaneously. Bian et al. addressed this issue from the perspective of transfer learning, utilizing potential semantic relations among different job categories and domain adaptation technologies to enhance the performance of PJF models in target domains with sparse interactions [46]. In [450], the authors employed data augmentation techniques based on EDA [410] and ChatGPT [95] to construct additional synthesized job-resume pairs, effectively addressing the issue of data sparsity.
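To give a flavor of EDA-style augmentation, the sketch below implements two of its token-level operations, random swap and random deletion, on a whitespace-tokenized resume sentence. The tokenization, the parameter values, and the sample sentence are illustrative assumptions; EDA's synonym-based operations and the ChatGPT-based synthesis mentioned above are not shown.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Swap the positions of two randomly chosen tokens, n_swaps times."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p; if every token is
    dropped, keep one random token so the output is never empty."""
    tokens = list(tokens)
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() > p]
    return kept if kept else [rng.choice(tokens)]


rng = random.Random(0)  # fixed seed so the augmentation is reproducible
tokens = "experienced python developer with strong sql skills".split()
swapped = random_swap(tokens, n_swaps=2, rng=rng)
deleted = random_deletion(tokens, p=0.2, rng=rng)
print(" ".join(swapped))
print(" ".join(deleted))
```

Each augmented sentence would be paired with the original label of its job-resume pair, which is what makes such augmentation attractive when interactions are sparse.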

In addition to capturing textual features from application data, various studies have incorporated additional related information to enhance performance on the PJF task. For instance, it has been noted that certain numerical and categorical attributes of jobs and candidates (such as career level, education, company name, tag, and region) are also crucial and beneficial [197, 169, 168]. Consequently, several neural network-based approaches have been designed to capture the comprehensive interaction of different types of data, such as Factorization Machine [197], CNN [169], and Self-Attention [168]. Furthermore, many researchers have enhanced the performance of PJF models by incorporating knowledge graphs from the recruitment domain as side information [444, 47, 481]. Moreover, user interaction records on online recruitment services serve as important complementary features to represent both jobs and candidates. Specifically, Yan et al. and Jiang et al. integrated the historically interviewed applicants for a job posting and the historically applied jobs for a particular talent to complement and enhance the representation learning for jobs and candidates [435, 197]. The underlying idea is that historical interview choices and job applications reveal the preferences of jobs and candidates for each other. Meanwhile, in [178], the authors leveraged job seekers’ click behaviors and search histories on recruitment websites to better model their job-seeking intentions, thereby enhancing the performance of the PJF task. Inspired by existing multi-behavior recommendation methods, Saito et al. utilized job seekers’ interactions (including viewing, favoriting, and applying) to implement a multi-behavior job recommendation system [348, 349]. In [128], the authors integrated the supply-demand situation of various companies at specific times as supplementary labor market information into the PJF model. They utilized hierarchical reinforcement learning to address the dynamic components of the PJF task, particularly by incorporating the temporal factors of the application process.

Applications for Talent Recommendation. One intuitive application of person-job fit is talent recommendation, which involves finding suitable candidates for a specific job. Given a specific job posting $J_j$ and a set of candidates $\{R_1,R_2,\dots,R_n\}$, a well-trained PJF architecture can be used to measure the fitness of each pair $(J_j,R_i)$ and rank the candidates. While the PJF models mentioned previously can all be applied in this context, some studies have attempted to enhance candidate ranking with a more comprehensive evaluation. For instance, personality traits have been identified as critical success factors for job performance and organizational effectiveness [179]. Researchers have mined these traits through linguistic analysis of social media text data [122, 121, 145, 290]. Moreover, by integrating various measurements and data sources, researchers and companies have developed electronic recruitment (e-recruitment) systems for more effective and efficient recruitment, particularly in the talent pre-screening stage [122, 121]. These systems have been demonstrated to impact recruitment processes positively [392].

Applications for Job Recommendation. As the dual problem of talent recommendation, job recommendation aims to recommend jobs for a specific candidate. The PJF architecture, which can be built on any of the PJF models mentioned previously, can output the fitness of each job in the job set $\{J_1,J_2,\dots,J_n\}$ for the given candidate $R_i$. This can be valuable in recruitment scenarios where job application redistribution is necessary, such as position assignments for candidates in campus recruitment or for employees in internal position adjustment. Job recommendation can also be used in online recruitment services to assist job seekers in finding suitable job opportunities [9, 107, 182].

Furthermore, some online recruitment systems offer recommendation services to both employers and job seekers. In response, researchers have developed joint models to address these dual recommendation scenarios [437, 129, 181, 478]. For instance, Yang et al. proposed a unified dual-perspective interaction graph that incorporates two distinct nodes for each candidate (or job) to characterize both successful and failed matching [437]. Subsequently, the authors employed BERT and GCNs to construct the PJF model. It was trained using a quadruple-based loss and a dual-perspective contrastive learning loss to enable bidirectional recommendation. In [129], the authors investigated the dynamic preferences of users, including browsing, clicking, and online chat behaviors, within dual recommendation scenarios. They introduced a Bilateral Cascade Multi-Task Learning framework to implement dynamic PJF effectively.

Evaluation. The performance of PJF is evaluated using two categories of metrics: classification (Accuracy, F1-score, AUC) and ranking (NDCG@N, Precision@N, Recall@N).
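Among the classification metrics, AUC is perhaps the least self-explanatory: it equals the probability that a randomly chosen positive application receives a higher score than a randomly chosen negative one. The sketch below computes it directly from this pairwise definition, counting ties as one half; the labels and scores are toy values, and the quadratic loop is chosen for clarity rather than efficiency.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative examples (ties contribute 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy recruitment labels y_{i,j} and model scores for four applications.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc_value = auc(labels, scores)
print(f"AUC={auc_value:.2f}")
```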

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) The use of AI technology to solve PJF tasks has garnered extensive attention from both academia and industry, leading to its widespread application in corporate recruitment systems and online recruitment platforms, effectively achieving precise matching between job seekers and positions. (2) Current mainstream AI-based PJF models can consider not only the match between job seekers and positions based on rich textual information but also historical interaction data and personalized preferences. Consequently, advancements in PJF models are influenced by developments in representation techniques and recommendation system technologies. (3) AI-based PJF models play a vital role in addressing some social challenges, such as integrating migrants and refugees into society [308].

  • Limitations and future directions: (1) Most current PJF models prioritize enhancing model accuracy, often overlooking interpretability. While some researchers employ strategies like attention mechanisms to highlight key features [473, 41], the actual benefits to recruitment system users remain uncertain. Consequently, there is a significant gap in systematic research focused on improving the interpretability of PJF models to boost the operational efficiency of recruitment systems. (2) Cardoso et al. identified a Matching Scarcity Problem (MSP) in talent/job recommendation systems, characterized by candidates or jobs experiencing a lack of matches within the system [64]. Mashayekhi et al. associated this issue with congestion problems prevalent in recommendation systems [278]. Indeed, while intelligent PJF services facilitate new avenues for information exchange between labor supply and demand, they also pose the risk of creating imbalances in information distribution. There is a notable gap in the existing research regarding the systematic integration of intelligent PJF with resource allocation strategies in the labor market. (3) With the gradual implementation of AI regulatory frameworks in regions like Europe, the inherent risks of AI-based PJF models—including concerns related to fairness [222] and user privacy [472]—are increasingly under scrutiny by researchers. As a result, the development of responsible AI practices in PJF technology has become a critical focus of research. (4) With the rapid advancement of generative LLM technologies, recent studies have begun to leverage their robust text understanding and generation capabilities to develop more powerful PJF models [479, 141, 419, 115]. However, this field is still in its infancy and requires further research. Additionally, issues such as hallucinations and low training and inference efficiency in LLM technology also impact the effectiveness of current LLM-based PJF models.

3.2 Talent Assessment

Talent assessment is a crucial process for companies to identify the competency of candidates. In this paper, we discuss two primary branches of talent assessment, i.e., interview question recommendation and assessment scoring.

3.2.1 Interview Question Recommendation

Job interviews aim to assess the fit between candidates and job positions by evaluating candidates’ skills and experiences. A critical aspect of this process is designing appropriate questions to comprehensively assess the competencies of potential employees. In this phase, personalized question recommendation has emerged as a feasible approach that selects the right questions from a question set based on the candidate and the job position. Along this line, Qin et al. proposed to recommend personalized question sets for various applications, taking into consideration both job requirements and candidates’ experiences [336]. In particular, to enhance the performance of the recommendation system, a knowledge graph of job skills was built using query logs from the biggest search engine, Baidu. Datta et al. further took into account the difficulty of interview questions and designed an interview assistant system using knowledge graphs and Integer Linear Programming techniques [102]. To cope with the substantial risk of bias arising from the subjective nature of traditional in-person interviews, Shen et al. proposed to learn the representative perspectives of in-person interviews from historical records of successful job interviews [364]. With the help of topic models, they represented job postings, resumes, and interview assessment reports in an interpretable way. The potential relationships among them were also mined to recommend questions or skills that should be evaluated during interviews [364, 363]. Subsequently, Shi et al. proposed an automated system for recommending personalized screening questionnaires based on job postings [365]. They encoded the job posting with the BERT model, selected the question templates with a Multi-Layer Perceptron (MLP) classifier, and extracted necessary parameters from the templates using feature-based regression models. In [333], the authors introduced a scalable, skill-oriented interview question generation model that leverages external knowledge from an online Knowledge-Sharing Community (KSC). Additionally, they developed a GNN-based interview question recommendation system that tailors interview questions to user queries.

Evaluation. Generally, the effectiveness of interview question recommendation systems is validated through online experiments, which examine whether recommended questions improve talent screening [336] or are adopted by interviewers and recruiters [365]. Moreover, offline experiments can evaluate the performance of recommendation algorithms by constructing a standardized test set using historical assessment reports from interviewers on various candidates [364, 363].

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: AI algorithms can assist interviewers in efficiently preparing personalized interview questions for candidates, facilitating tailored talent assessment.

  • Limitations and future directions: (1) Recently, some studies have furthered the development of fully automated interview robots by employing LLMs to devise algorithms that generate follow-up questions in conversational interview settings [312, 354]. Nonetheless, research that integrates a comprehensive database of interview questions with conversational interviewing techniques is still insufficient. (2) In the field of intelligent education, Computerized Adaptive Testing (CAT) is widely studied for assessing students’ knowledge mastery using a minimal number of test items [265]. Despite its applicability to talent assessment scenarios, with which it shares several methodological parallels, there is a notable deficiency in studies exploring CAT methodologies specifically tailored for interviewing contexts. (3) Given the scarcity of high-quality interview-related data [239], there is a lack of systematic research into effectively addressing the recommendation of interview questions in low-resource scenarios using data augmentation and transfer learning techniques. (4) Currently, due to significant variations in data types across specific scenarios, there is no unified definition for interview question recommendation. In the future, the development of standardized datasets will significantly advance research in this field.

3.2.2 Assessment Scoring

Assessment scoring is another critical problem in talent management, evaluating the competency of candidates or employees based on their performance. Typically, this problem can be formulated as a binary classification problem as follows:

Definition 3.5 (Assessment Scoring)

Based on the observed attributes $x_i$ of the employee $R_i$, the target is to predict whether he/she is competent for the current job, i.e., to estimate $P(Y_i=1\mid x_i)$.
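One simple way to instantiate $P(Y_i=1\mid x_i)$ is a logistic model, sketched below. This is only an illustrative assumption: the studies surveyed in this subsection use other models (e.g., SVM, Random Forest, and attention-based networks), and the feature values, weights, and bias here are hypothetical rather than learned from data.

```python
import math


def competence_probability(features, weights, bias):
    """Logistic model of P(Y_i = 1 | x_i): sigmoid of a weighted sum.

    Uses a numerically stable form of the sigmoid for large |z|.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical attributes x_i and hand-picked (not learned) parameters.
x_i = [1.0, 2.0, 0.0]
weights = [0.8, 0.5, -0.3]
bias = -1.0
probability = competence_probability(x_i, weights, bias)
is_competent = probability >= 0.5  # binary decision at a 0.5 threshold
print(f"P(Y=1|x)={probability:.3f}, competent={is_competent}")
```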

Substantial effort has been invested in assessment scoring during the interviewing phase [303, 78, 80, 172, 171, 77, 174, 264]. Within this context, a major area of research involves using recorded face-to-face job interviews or simulated asynchronous video interview (AVI) data to assess candidates’ competencies in various aspects. For instance, Nguyen et al. were the first to focus on the automated prediction of employment interview outcomes based on both audio and visual nonverbal cues from the interviewee and interviewer [307]. Naim et al. extracted 82 features from 138 recorded interview videos, covering three dimensions: prosodic, lexical, and facial information [303, 304]. They employed regression models such as Support Vector Regression (SVR) and Lasso to predict overall interview scores. Chen et al. focused on monologue-style responses to Structured Interview (SI) questions and constructed an AVI dataset [78, 80]. Subsequently, they utilized the Linguistic Inquiry and Word Count (LIWC) [324] to extract lexical features from text-based interview content, used automated speech analysis and transcription [79, 456] to generate features assessing various dimensions of speaking, such as fluency, rhythm, intonation & stress, and pronunciation, and extracted visual features related to facial expressions using Visage SDK’s FaceTrack and FACET SDK. Finally, several shallow classification models, including Support Vector Machine (SVM) and Random Forest (RF), were employed to predict interviewee traits such as agreeableness, conscientiousness, emotional stability, extraversion, and openness. Hemamou et al. collected a corpus of over 7,000 candidates who participated in asynchronous video job interviews for real positions, recording themselves answering a set of questions [172]. Utilizing this extensive dataset, they designed a hierarchical attention model to predict candidates’ hireability. Additionally, to pinpoint the relevant segments of an answer, the authors developed attention mechanisms that extract fine-grained temporal information from the responses [171].

With the European Union’s approval of the world’s first AI technology regulation, researchers are now reconsidering the ethical risks associated with AI-based assessment scoring methods used in interview processes. For instance, unlike previous work that utilized multimodal information from interview recordings, Chen et al. solely employed automatic speech recognition transcriptions from AVI data [77]. They introduced GNNs to construct dependency relations between questions, enabling the learning of a model that scores multiple question-answer pairs. Besides, Singhania et al. were the first to investigate fairness concerning gender and race in video interviews [372]. Moreover, in [173], the authors proposed an approach using adversarial methods to remove sensitive information from the latent representations of neural networks, which was applied to interview assessments to promote fairness in job selection.

In addition to the interviewing phase, there is still a focus on predicting competency based on employee profile information. For instance, Liu et al. and Hong et al. applied the SVM model to predict the competence level of civil servants and highway construction foremen, respectively [252, 177]. Furthermore, Li et al. evaluated the performance of traditional classifiers, including SVM, RF, and Adaboost, in competency evaluation and found that prediction results based solely on structured personal static data were suboptimal [234]. For improved performance, incorporating more unstructured and dynamic data, such as textual data or social networks, may represent a significant direction for future research.

Evaluation. Generally, the performance of assessment scoring models is evaluated by calculating the difference between predicted results and expert scores. Therefore, classification metrics (e.g., F1-score) and regression metrics (e.g., RMSE) are commonly used.
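As a minimal illustration of the two metric families mentioned above, the following sketch computes RMSE against expert scores and the F1-score against expert labels. The function names and the toy inputs are illustrative assumptions and are not taken from any of the surveyed papers.

```python
import math


def rmse(expert_scores, predicted_scores):
    """Root-mean-square error between expert scores and model predictions."""
    if len(expert_scores) != len(predicted_scores) or not expert_scores:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - p) ** 2 for e, p in zip(expert_scores, predicted_scores)]
    return math.sqrt(sum(squared) / len(squared))


def f1_score(expert_labels, predicted_labels, positive=1):
    """F1-score of the positive class (harmonic mean of precision and recall)."""
    pairs = list(zip(expert_labels, predicted_labels))
    tp = sum(1 for e, p in pairs if e == positive and p == positive)
    fp = sum(1 for e, p in pairs if e != positive and p == positive)
    fn = sum(1 for e, p in pairs if e == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical interview scores on a 1-5 scale.
    print(rmse([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))
    # Hypothetical "competent" (1) vs. "not competent" (0) labels.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

RMSE suits models that regress a continuous interview score, whereas F1 suits models that classify a candidate into discrete competency levels.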

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) Existing AI-based assessment scoring models offer an automated and objective method for evaluating candidates’ abilities in professional skills, social skills, and other overall competencies. (2) AI-based assessment scoring models can analyze candidates’ performances in simulated interview scenarios, aiding them in honing their interview skills and increasing their chances of success in actual job interviews.

  • Limitations and future directions: (1) Although some researchers are increasingly aware of the potential ethical risks associated with the use of data and models in AI-based assessment scoring, there is still a lack of comprehensive research in this area. (2) Most existing research focuses on training supervised models with the original scores given by interviewers or managers for automated scoring predictions. However, these studies often overlook the inherent biases embedded within the data. Therefore, further research is needed on how to train unbiased assessment scoring models from biased data. (3) Cognitive diagnosis has been widely validated in computational education for learning the knowledge profiles of students and predicting their future exercise performance [474, 451]. However, research concerning the application of these techniques to competency diagnostics in talent assessment contexts, such as interviews, remains limited.

3.3 Career Development

Career development is the process of acquiring and experiencing planned and unplanned activities that support the attainment of life and work goals [283]. In this paper, we view career development as a process in which talent training is the first step and behavior management is the next. Therefore, we investigate the training process, e.g., course recommendation, and behavior management, including promotion, turnover, and career mobility.

3.3.1 Course Recommendation

Organizations offer a diverse array of training programs, including technical, project management, quality, leadership, specialized, and soft skills training. Hence, it is crucial to assist employees in selecting the training courses that align closely with their background and objectives [377]. Course recommendation aims to provide personalized courses based on the different preferences and needs of users in various aspects. Much research in this direction is concerned with student education [317, 118, 247, 110, 97, 491]. Recently, some effort has also been devoted to the talent management field [377, 402, 401, 320, 480, 439]. There is a growing focus on the advancement of enterprise Learning Management Systems (LMS), designed to enhance both individual and organizational performance through customized online training programs aimed at augmenting employees’ skills and knowledge [480]. Typically, we formalize this problem as follows:

Definition 3.6 (Course Recommendation)

Given the sequential history learning record $\mathcal{H}_i$ of the employee $R_i$ and her/his profile $\mathcal{S}_i$, the target is to predict the rating $Y_{i,j}$ of employee $R_i$ on course $C_j$, i.e., $E[Y_{i,j} \mid \mathcal{H}_i, \mathcal{S}_i, C_j]$.

As a recommendation task, course recommendation can draw on most recommender-system methods. However, unlike traditional item recommendation, where the decision is determined by the users’ rating, more attention should be paid to mining employees’ competencies and their needs for further development from side information such as employee profiles. Therefore, Wang et al. used a topic model to extract latent interpretable representations of employees’ current competencies from their skill profiles, as well as a recognition mechanism to explore personal demands from learning records [402]. Then, they integrated the collaborative filtering algorithm with the Variational AutoEncoder (VAE) to develop an explainable course recommendation system. In addition to learning records, Srivastava et al. also introduced work history into the learning content recommendation and defined a Markov Decision Process (MDP) to extract the past training patterns [377]. Moreover, Sharvesh et al. proposed an adaptive recommendation model that incorporates employees’ styles and goals to suggest the most suitable courses, considering their interactions and performance [320]. Zheng et al. proposed a novel generative recommendation framework called Generative Learning for Adaptive Recommendations (GLAD) [480], which integrates reinforcement learning. It concurrently considers enhancing employee performance and ensuring the rationality of generated recommendations. Yang et al. introduced a contextualized knowledge graph embedding to recommend training courses to talents in an explainable manner [439].
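To make the rating-prediction target of Definition 3.6 concrete, the sketch below estimates an employee's rating of an unseen course as a similarity-weighted average of colleagues' ratings. This is plain user-based collaborative filtering, shown only as a baseline under simplifying assumptions. It is not the VAE-based, MDP-based, or knowledge-graph method of any cited work, and the `ratings` dictionary is hypothetical data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {course: rating}."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, employee, course):
    """Similarity-weighted average of other employees' ratings of `course`.

    Returns None when no similar employee has rated the course.
    """
    target = ratings[employee]
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == employee or course not in other_ratings:
            continue
        sim = cosine(target, other_ratings)
        if sim <= 0:
            continue
        weighted_sum += sim * other_ratings[course]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


if __name__ == "__main__":
    # Hypothetical course ratings on a 1-5 scale.
    ratings = {
        "e1": {"c1": 5, "c2": 4},
        "e2": {"c1": 5, "c2": 4, "c3": 2},
        "e3": {"c1": 1, "c3": 5},
    }
    print(predict_rating(ratings, "e1", "c3"))
```

Because this baseline uses only ratings, it ignores the profile $\mathcal{S}_i$ and the order of the learning history $\mathcal{H}_i$, which is the side information the surveyed methods are designed to exploit.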

Evaluation. Typically, the performance evaluation metrics for course recommendation models align with those utilized in recommender systems, including AUC, Recall@N, Precision@N, F1@N, Hit@N, NDCG@N and MAP@N. The N denotes the top-N results produced by each model.
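For reference, three of the top-N metrics listed above can be computed as in the following sketch. It assumes binary relevance (a course is either relevant to the employee or not) and a ranked list without duplicates. The function names are illustrative.

```python
import math


def recall_at_n(ranked, relevant, n):
    """Fraction of the relevant courses that appear in the top-n list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:n]) & relevant) / len(relevant)


def hit_at_n(ranked, relevant, n):
    """1.0 if at least one relevant course appears in the top-n list, else 0.0."""
    return 1.0 if set(ranked[:n]) & set(relevant) else 0.0


def ndcg_at_n(ranked, relevant, n):
    """Normalized discounted cumulative gain with binary relevance."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:n])
        if item in relevant
    )
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranked = ["c1", "c2", "c3", "c4"]  # hypothetical model output
    relevant = {"c2", "c4"}            # hypothetical ground truth
    print(recall_at_n(ranked, relevant, 2))
    print(hit_at_n(ranked, relevant, 2))
    print(ndcg_at_n(ranked, relevant, 2))
```

In practice these per-employee values are averaged over all employees in the test set.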

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) The AI-based course recommendation system integrates employee and course profiles to provide tailored course recommendations, thereby enhancing learning outcomes for employees. (2) Several AI-based course recommendation systems not only offer personalized course suggestions to employees but also offer suitable justifications for these recommendations, thus facilitating employees’ decision-making in selecting training courses. (3) The AI-based course recommendation system recommends courses to employees and also assesses their competency level, thereby facilitating their subsequent career development.

  • Limitations and future directions: (1) While many AI-based course recommendation models prioritize courses based on employee preferences, these preferences may not always align with the employee’s career development needs. As a result, the recommended courses may not effectively improve work performance [480]. (2) While existing AI-based course recommendation systems offer personalized recommendations to employees, they often fail to consider the complex motivations driving employees’ course selections [439]. Further research is needed to understand the diverse motivations behind employees’ course choices. (3) Most current course recommendation systems prioritize student course recommendations, leaving limited research on recommending courses for employee training. As employee training advances, there is a pressing need for further research on employee course recommendation systems.

3.3.2 Promotion Prediction

Promotions serve two essential roles in the organization: assigning individuals to the jobs for which they are best suited and providing incentives for lower-level employees [33]. Traditionally, promotion is decided solely by managers. Promotion prediction is mainly used to better judge the development potential of employees, so as to improve work allocation efficiency and output. This judgment is generally made through questionnaires, which is labor-intensive. To this end, AI-based research is concerned with identifying the features that are correlated with promotion and applying machine learning methods to predict promotion. Typically, the problem can be formulated as a classification task:

Definition 3.7 (Promotion Prediction)

Given the history record $\mathcal{H}_i^t$ in a certain period $t$ of the employee $R_i$, the target is to predict the probability of a promotion $P(Y_i = 1 \mid \mathcal{H}_i^t)$.

In this phase, basic machine learning algorithms such as KNN, SVM, random forest, and Adaboost have been adopted to solve this problem [189, 192]. Generally, the majority of machine learning methods achieve an accuracy of over 90% in promotion prediction. At the same time, to address the data imbalance issue in promotion data, various sampling methods have been adopted, such as random oversampling, under-sampling, or hybrid sampling that combines multiple sampling methods [347, 358]. With increasingly complex features introduced, Yuan et al. suggested that work-related interactions and online social connections are strongly predictive of, and correlate with, promotion and resignation [453]. Along this line, Long et al. used Random Forest to predict promotions based on demographic and job features [259]. However, these methods may not fully capture the complexity and dynamism of career development. To address this, Li et al. proposed a novel survival analysis approach, where predicting promotion events is transformed into estimating the expected duration of time until promotion occurs [233]. Liu et al. defined development potential along various dimensions, such as the number of areas of expertise [253]. Using the distributions of these dimensions and clustering algorithms, the model divides development status into four categories, from which promotion prediction results are generated. Guarino et al. designed a method for optimizing the growth path with Deep Q-Learning [153]. In this case, the reward value of an experience is evaluated by the resulting competence development. Through this system, employees’ development potential is judged in order to determine whether an employee should be promoted.
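Random oversampling, the simplest of the sampling methods mentioned above, can be sketched as follows. Minority-class records (for example, promoted employees) are duplicated at random until every class matches the size of the largest one. The function name and the toy data are assumptions for illustration only.

```python
import random


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size.

    Raises ValueError (from max) when the input is empty.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Sample with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


if __name__ == "__main__":
    # Hypothetical features: [tenure in years, last performance rating].
    features = [[2, 3], [5, 4], [1, 2], [3, 3], [7, 5]]
    labels = [0, 0, 0, 0, 1]  # 1 = promoted (minority class)
    balanced_x, balanced_y = random_oversample(features, labels)
    print(balanced_y.count(0), balanced_y.count(1))
```

Resampling should be applied to the training split only. Otherwise duplicated records can leak into the test set and inflate the reported accuracy.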

Evaluation. Generally, promotion prediction is a classification problem. On the one hand, when the target is whether a promotion occurs, it is a binary classification problem, so accuracy, precision, recall, and F1-score are often used [259, 189, 192]. On the other hand, when the target is development potential, promotion prediction is transformed into a judgment over several development-potential categories, which differ across works. Hence, accuracy, macro-recall, macro-precision, and macro-F1 are used [253]. Moreover, in this case, offline experiments on the development potential or development path optimization system could be used to evaluate the managers’ or employees’ satisfaction [153].

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) AI-based algorithms can assist managers in assigning employees to suitable jobs, facilitating the improvement of organizational efficiency. (2) AI-based methods can accurately judge the development potential of employees, which is helpful for many downstream tasks in the field of HR, such as high-potential talent assessment and position system design.

  • Limitations and future directions: (1) There are many kinds of promotion, such as horizontal or vertical [189], so how to judge development potential for different types of promotion process should be considered. (2) Given the inherent bias in promotion data, addressing data imbalance using data mining methods has been a continuously explored research direction [266, 214]. (3) Promotion is a continuous and dynamic process. How to account for changes in potential over time after promotion, that is, how to sustain development potential, is an important issue. (4) LLM-based methods could be more extensively considered for promotion assessments, such as evaluating experience completion.

3.3.3 Turnover Prediction

In the stable development of any organization, employee turnover stands as a critical influencing factor. Whenever a fully integrated employee departs from the organization for any reason, it leads to varying degrees of structural incompleteness, thereby resulting in a decline in productivity [1]. However, early identification of employee turnover tendencies, combined with strategic preparation, can substantially alleviate the productivity losses linked to employee turnover. In recent years, an expanding body of literature on talent management and organizational behavior has emerged, with a plethora of machine learning methods being employed to predict employee turnover, yielding remarkable results [146, 7]. Mathematically, the turnover prediction problem can be formulated as follows:

Definition 3.8 (Turnover Prediction)

Given the history record $\mathcal{H}_i^t$ in a certain period $t$ of the employee $R_i$, the target is to predict the probability of a turnover $P(Y_i = 1 \mid \mathcal{H}_i^t)$.

Many data mining techniques have been introduced to investigate employee turnover prediction [10, 373, 477, 133, 144]. For instance, Nagadevara et al. used five data mining techniques to predict turnover, including MLP, Logistic Regression, Decision Tree, etc. [302]. The results reveal that absenteeism and lateness, job content, demographics, and experience in the current team are strong predictors of turnover. Considering the potential noise introduced by a multitude of employee characteristics, some studies employ techniques such as feature filtering algorithms [432, 200] and feature weighting [452, 72] to eliminate or reduce unnecessary information. Meanwhile, oversampling techniques such as the synthetic minority over-sampling technique (SMOTE), adaptive synthetic sampling (ADASYN), and borderline SMOTE have been validated to effectively address the issue of data imbalance in turnover prediction [266, 214]. Additionally, some scholars classify employees before conducting turnover prediction to effectively enhance predictive accuracy by accounting for inter-individual variability [193, 416]. Tailoring feature modeling to the specific characteristics of particular industries has also yielded favorable outcomes in turnover prediction, spanning industries such as apparel [344], IT [38, 292], and hospitality [203].
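The core idea behind the synthetic minority over-sampling technique can be sketched in a few lines: each synthetic record is a random interpolation between a minority-class record (e.g., an employee who left) and one of its nearest minority-class neighbours. The code below is a simplified illustration with hypothetical names and data. It omits details of the full algorithm and of the adaptive and borderline variants cited above.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority points.

    For each synthetic point: pick a random minority point, pick one of its
    k nearest minority neighbours, and sample a point on the segment between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority points, sorted by distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical features of employees who left: [tenure, overtime hours per week].
    leavers = [[1.0, 10.0], [2.0, 12.0], [1.5, 15.0]]
    for point in smote_like(leavers, n_new=3):
        print(point)
```

Unlike random oversampling, the generated records are new points rather than exact duplicates, which reduces the risk that a classifier simply memorizes the few observed leavers.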

Besides predicting turnover based on static features, considerable attention has been devoted to capturing dynamic factors, especially with neural network-based methods, which have demonstrated significant expressive ability [233]. In this phase, Teng et al. investigated the contagious effect of employee turnover at the individual and organizational levels, respectively [387, 386]. Specifically, they developed two LSTM cells to process peers’ turnover sequence and environmental change, as well as a global attention mechanism to evaluate the heterogeneous impact on potential turnover behavior. Experiments conducted on a dataset provided by a high-tech company in China, which includes profile information and turnover records, demonstrate the effectiveness of their proposed framework. Similarly, the authors in [465] utilized vast career trajectory data to build a heterogeneous company job network, integrating macro-level company position information into individual career transition prediction using the Dual-GRU model. On this basis, Hang et al. further modeled employee turnover from both internal and external views [165]. For the internal component, they captured the influence of close collaborators and colleagues with similar skills using a graph convolutional network. From the external-market view, they connected employees and external job markets through shared job skills. Finally, both internal and external information is fed into BiLSTM and survival analysis for turnover predictions. Furthermore, Subha et al. employed principal component analysis for feature extraction, followed by turnover prediction using CRF-BiLSTM-CNN, yielding significantly improved performance [380].

Job Satisfaction. In this phase, another problem also receives considerable attention, namely, how to measure employees’ job satisfaction. Job satisfaction is a significant determinant of employee turnover. Effectively measuring it helps reduce the likelihood of turnover and maintains organizational stability [285]. Traditional approaches in this direction are based on self-reported questionnaire surveys, which are time-consuming and cannot be applied to large organizations. Recently, AI-based technologies have been introduced to automatically analyze job satisfaction from various aspects. Overall, the prediction of job satisfaction can be formulated as a binary classification problem, defined as follows:

Definition 3.9 (Job Satisfaction Prediction)

Given a set of independent variables $X_i$ of the employee $R_i$, the target is to predict the probability of employee job satisfaction $P(Y_i = 1 \mid X_i)$.

For instance, Arambepola et al. explored the influence of job-specific factors on job satisfaction level by combining both the employee’s background data and company-related factors, where several classifiers, including Random Forest, Logistic Regression, and SVM, were used for the prediction of the job satisfaction level of software developers [22]. Saha et al. proposed to assess job satisfaction by leveraging large-scale social media, i.e., employees’ Twitter post dataset, where word frequency statistics, lexical analysis, and sentiment analysis have been conducted to extract features from textual data [346]. Accordingly, multiple classifiers, such as SVM and MLP, were used to predict employees’ job satisfaction. Furthermore, Saleh et al. utilized an online questionnaire for data collection and employed the artificial neural network and decision tree for predicting employee job satisfaction [350]. In addition to the aforementioned classification models, Mcjames et al. developed a causal inference machine learning approach to identify practical interventions for improving job satisfaction [284]. This approach was based on the TALIS 2018 dataset, which provides a representative sample of school teachers. Devi et al. achieved improved prediction of employee job satisfaction by employing a weighted averaging ensemble method with logistic regression [112].

Evaluation. Generally, both turnover prediction and job satisfaction prediction are binary classification tasks. Common evaluation metrics include accuracy, precision, recall, F1-score, and ROC-AUC.
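Since accuracy can be misleading when leavers are rare, ROC-AUC is often the most informative of these metrics. A minimal sketch of its pairwise (Mann-Whitney) formulation is given below: the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and example values are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values (1 = turnover or satisfied).
    scores: predicted probabilities or scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC is undefined without both classes present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    labels = [0, 0, 1, 1]           # hypothetical turnover outcomes
    scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical predicted probabilities
    print(roc_auc(labels, scores))
```

This quadratic-time version is adequate for illustration. Rank-based implementations compute the same quantity in O(n log n) time.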

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) AI-based algorithms can help organizations prepare for employee turnover in advance, thus reducing the resulting productivity loss. (2) They can help organizations perceive employee satisfaction, enabling them to adapt talent strategies to retain valuable employees, thereby maintaining the stability of organizational development.

  • Limitations and future directions: (1) Current turnover prediction research relies on limited real-world data, creating a gap between research and industrial application that significantly limits its practical value for organizations [60, 130]. (2) Current research on this topic is limited in the factors it considers. Future investigations could explore the impact of globalization, digitalization, and online business factors on individual turnover rates and job satisfaction for a more comprehensive understanding [146].

3.3.4 Career Mobility Prediction

In the rapidly evolving labor market, the deployment of AI for career mobility prediction significantly enhances personalized career planning. By analyzing extensive datasets on career trajectories in the labor market, AI-based methods can provide tailored recommendations that align with future employment opportunities. Following [465, 433], we define the career mobility prediction problem as follows:

Definition 3.10 (Career Mobility Prediction)

Given the career path $\{\mathcal{H}_i, \mathcal{I}_i\}$ of employee $R_i$, where $\mathcal{H}^t_i = \{c^t_i, p^t_i, d^t_i\}$ records the work experience of $R_i$ at company $c^t_i$ in position $p^t_i$ with duration $d^t_i$, and $\mathcal{I}_i$ stands for the personal information, the goal of career mobility prediction is to learn a model $M$ that can predict the next $K$-step career move of employee $R_i$, including the future company $c^{t+K}_i$, position $p^{t+K}_i$, duration $d^{t+K}_i$, and other relevant information.

Figure 7: The illustration of Career Mobility Prediction.

Generally, as illustrated in Figure 7, the task of career mobility prediction can be addressed through time series analysis, where the career path (also known as career trajectory) is treated as a sequence of events [236, 289, 433]. Most existing work focuses on predicting next-step career mobility, i.e., $K=1$. Specifically, Li et al. were the first to design a contextual LSTM model that integrates the profile context and career path dynamics simultaneously to predict the next company/position of talents [236]. To provide a fine-grained prediction, researchers have developed several methods for modeling career trajectories that predict both the next employer and the corresponding job duration. Meng et al. employed a hierarchical neural network structure with an embedded attention mechanism to characterize internal and external job mobility [289]. Moreover, Wang et al. introduced a temporal encoding mechanism that handles dynamic temporal information [400]. Macro-level job transition behavior may also impact individual career choices. To address this, Zhang et al. constructed a heterogeneous company-position network based on massive career trajectory data and integrated macro information from the company-position into personal career move prediction [465]. Meanwhile, He et al. expanded the prediction tasks to include forecasting salary levels and company sizes, utilizing LSTM and CNN to construct the model [170]. Recently, several studies have focused on employing pre-training techniques to model career trajectory representations, aiming to enhance performance in career mobility prediction [108, 459]. For instance, Decorte et al. developed CareerBERT, a model built upon the BERT framework, to analyze sequences of work experiences (including job titles and descriptions) alongside corresponding ESCO occupation sequences within career trajectory data [108]. They utilized contrastive learning to fine-tune this pre-trained model.
Following this, the authors introduced a mapping network designed to predict future ESCO occupations based on sequences of work experiences.
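To illustrate the next-step ($K=1$) setting, the sketch below represents each work experience as a (company, position, duration) record, as in Definition 3.10, and predicts the next move with a first-order Markov baseline that simply counts observed transitions. This is a deliberately simple stand-in and not any of the LSTM, attention, or pre-training models surveyed above. All class names, function names, and data are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """One step of a career path: company, position, and duration in months."""
    company: str
    position: str
    duration_months: int


def fit_transitions(career_paths):
    """Count transitions between consecutive (company, position) states."""
    counts = defaultdict(Counter)
    for path in career_paths:
        for current, following in zip(path, path[1:]):
            state = (current.company, current.position)
            next_state = (following.company, following.position)
            counts[state][next_state] += 1
    return counts


def predict_next(counts, path):
    """Return the most frequent next (company, position), or None if unseen."""
    if not path:
        return None
    last = path[-1]
    followers = counts.get((last.company, last.position))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    paths = [
        [Experience("A", "engineer", 24), Experience("B", "senior engineer", 36)],
        [Experience("A", "engineer", 12), Experience("B", "senior engineer", 20)],
        [Experience("A", "engineer", 30), Experience("C", "team lead", 10)],
    ]
    model = fit_transitions(paths)
    print(predict_next(model, [Experience("A", "engineer", 6)]))
```

The baseline ignores durations, personal information $\mathcal{I}_i$, and everything before the most recent job, which is exactly the context that sequence models are introduced to capture.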

Unlike previous work that focused solely on predicting the “immediate” next career move, Yamashita et al. utilized the transformer architecture to predict one’s future career pathway as a “sequence”, specifically, the next $K$ steps of career movement [433]. Furthermore, some researchers have incorporated various trajectory rewards, such as company ratings, staying probabilities, and salary ranges, into the prediction of future career paths [154, 155, 26]. For instance, Guo et al. proposed an intelligent sequential career planning system via stochastic subsampling reinforcement learning, which is capable of finding globally optimal career paths for talents [154].

$\diamondsuit$ Takeaway.

  • Advantages of AI technology: (1) AI-based methods can leverage the career trajectory data inherent in the labor market to provide personalized career path predictions and recommendations for individuals. (2) Existing AI-based career mobility prediction models can forecast career movements over short and long-term time spans.

  • Limitations and future directions: (1) As emerging events like COVID-19 significantly disrupt the labor market, the challenge of developing cross-industry or cross-domain career planning has emerged as a pressing issue. Given that career trajectory data are often sparse in cross-industry mobility contexts, existing methods may struggle to handle such scenarios. Consequently, further exploration of career mobility prediction techniques, particularly those based on transfer learning and few-shot learning, is necessary. (2) Due to unique events or reasons, some career trajectories often lack generalizability. Addressing the noise from these special factors in data is a critical research direction. (3) Existing research often overlooks the interpretability of recommended results. Although some studies [355] have discussed this issue, a significant gap exists in systematic research on this topic. (4) After analyzing career mobility, existing methods fall short in providing relevant upskilling pathways that could assist institutions and employers in developing proactive training programs.

3.4 Summary

In summary, AI-based approaches have been applied in talent management for recruitment, assessment, and career development. Specifically, the majority of research efforts have focused on talent recruitment, which may be attributed to the rapidly growing demand brought about by the development of HRIS like ATS and online recruitment platforms. Moreover, the application of AI-based approaches to comprehensively assess talent and plan career development has also attracted increasing attention in this era of information. Compared to traditional methods, AI-based models enhance the efficiency, objectivity, and accuracy of various decision-making scenarios in talent management by leveraging not only extensive historical data but also powerful technologies such as deep learning and pretrained language models. Although AI-based models are rapidly progressing in a range of talent management applications, there remains a substantial gap in research concerning the fairness and interpretability of these models, as well as issues related to data privacy protection and the challenges associated with data sparsity and bias in some specific contexts.

4 Organization Management

Organization management is arguably the art of getting talents to cooperate and of leading the entire organization toward a common predefined goal. Roughly, organization management covers the management of organizational processes, structures, technologies, identities, forms, and people. Among these components, people’s behavior is mainly reflected in the study of processes. Since the organizational network is an important element of such studies, we first summarize AI-related analysis of organizational networks. Generally, the complex relationships among employees and organizations naturally form a network structure, and AI-related techniques for organizational network analysis aim to help understand the importance of critical connections and flows in an organization by modeling this special network structure [446], thus serving downstream management applications, such as organizational turnover prediction [386] and high-potential talent identification [447]. Next, we focus on AI-related analysis of organizational structures. Because the stability and development of the organizational structure are of primary importance, we mainly focus on organizational stability analysis. To study the stability of the organization, some studies propose to analyze the composition of the organization from the formation and optimization perspectives [99, 116]. Besides, several studies explore the compatibility between employees and organizations [381, 383]. Finally, we summarize an important aspect of organizational forms, namely incentives. To motivate the talents in the organization to perform well, some AI-related studies conduct organizational incentive analysis, which mainly focuses on two important tasks in human resources, namely job title benchmarking [463] and job salary benchmarking [288].

4.1 Organizational Network Analysis

In modern organizations, it is common for employees to build informal “go-to” teams to facilitate business collaboration beyond the formal organizational structure. Organizational social networks often emerge spontaneously, forming communicative and socio-technical connections. In this context, Organizational Network Analysis (ONA) plays a crucial role. It helps reveal the significance of these critical connections and information flows within an organization, so that leaders become aware of the importance of vibrant communities and key employees and can act in a more targeted and effective way in business operations. In this subsection, we introduce AI-related techniques for organizational network modeling and a classic application of ONA, namely high-potential talent identification.

TABLE IV: The table of collected papers related to organizational management.

| Task | Method | Data | Reference |
|---|---|---|---|
| Organizational Network Analysis | | | |
| Organizational Network Modeling | Representation learning | Organizational network dataset | [446] |
| Organizational Network Modeling | GCN, RNN | Working related records | [386] |
| Organizational Network Modeling | Community Search | Public data, in-firm data | [114] |
| High-potential Talent Identification | GNN, LSTM | Organizational network dataset | [447] |
| Organizational Stability | | | |
| Team formation | Heuristic algorithm | DBLP, IMDb | [205] |
| Team formation | Greedy algorithm | DBLP | [489] |
| Team formation | Variational Bayes Network | DBLP | [164] |
| Team formation | Hybrid approach | Crowdsourcing data | [256] |
| Team formation | Greedy algorithm | Freelancer dataset | [40] |
| Team formation | Adaptive algorithm | Crowdsourcing data | [18] |
| Team optimization | Graph kernel | DBLP, Movie and NBA | [237, 242] |
| Team optimization | Deep learning | GitHub, DBLP | [476] |
| Team optimization | Reinforcement learning | Movie, DBLP | [482] |
| Person-Organization Fit | CNN, LSTM | In-firm data | [381, 383] |
| Person-Organization Fit | KNN | Talent data | [24] |
| Organizational Incentive Analysis | | | |
| Job title benchmarking | Representation Learning | OPNs | [463] |
| Job title benchmarking | Semantic relatedness modeling | CEDEFOP classify Online Job Vacancies | [272] |
| Job title benchmarking | Graph learning | Working experiences and resume data | [486] |
| Job title benchmarking | BiLSTM, Unsupervised representation learning | A labeled job similarity dataset | [455] |
| Job title benchmarking | Transformer, RNN | Employee career paths in IT field | [250] |
| Job title benchmarking | BERT, Graph embedding | Resume data | [434] |
| Job salary benchmarking | Matrix completion | Job posting | [288, 287] |
| Job salary benchmarking | Matrix equation | Job post and review data | [186] |

4.1.1 Organizational Network Modeling

In real-world scenarios, abundant talent data can be utilized to construct an organizational network, unveiling the intricate relationships among employees as they form project teams or forge alliances across different groups. An illustration of an organizational network is shown in Figure 8. However, extracting useful information to construct the network and obtaining meaningful knowledge to support managerial decisions remain difficult. Traditionally, organizational networks are constructed from questionnaires about formal or informal relationships, which is labor-intensive and subjective. To improve the accuracy and efficiency of organizational network construction, many researchers use network embedding techniques to model the organizational network and capture knowledge from it [326, 150]. Without loss of generality, a generalized organizational network can be defined as follows:

Definition 4.1 (Organizational Network Modeling)

An organizational network is defined as $G=(V,E)$, where $V$ denotes the node set representing employees and departments (or organizations), and $E$ denotes the edge set representing the relationships between nodes, such as the belonging relationship between employees and departments, the frequency of communication (e.g., email or instant message), and the reporting lines between employees. Besides, each employee node has a specific profession, which indicates the type of their work (e.g., engineer and product). Meanwhile, each employee and department node has some work-related attributes (e.g., length of service, job level). Based on the organizational network, organizational network modeling aims to capture knowledge from this network for supporting talent management-related tasks.

Figure 8: The illustration of the organizational network.

Network embedding represents the network as low-dimensional vectors that then serve downstream applications. For instance, Ye et al. proposed a multiplex attentive network embedding approach for modeling organizational networks in a holistic way [446]. In their work, the organizational network is composed of multiple communication interactions among employees. They generated embeddings for employees based on the random walk strategy [326] with k-core and an approximated shortest path algorithm. Furthermore, they proposed a relational transition-based approach to represent each department. In this way, the learned representations can be leveraged for several talent management tasks, including employee turnover prediction, employee performance prediction, and department performance prediction. Besides, Teng et al. exploited a network fusion technique for organizational turnover prediction [386]. They concentrated on modeling the relationships among organizations. Specifically, they demonstrated the correlation between the topology of organizational networks and organizational turnover. To this end, they constructed a turnover similarity network based on multiple organizational social networks and used a GNN to learn comprehensive knowledge from these topological structures, which was further used for organizational turnover prediction. Apart from that, there are also some other applications. For example, Dong et al. studied the problem of cross-group community search on labeled graphs, namely Butterfly-Core Community (BCC) search [114]. Specifically, they showed that the BCC problem is NP-hard and proposed an approximation algorithm to solve it. The proposed algorithm was evaluated on a real-world organizational network from Baidu Inc., where it effectively found communities formed by cross-group collaborations given two employees with different positions.
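To make the random-walk step behind such embeddings concrete, the following is a minimal Python sketch under simplifying assumptions: a single undirected communication layer, uniform transition probabilities, and a made-up toy network (the node names and edges are hypothetical). It omits the k-core, approximated shortest path, and multiplex attention components of [446]; the sampled walks would normally be passed to a skip-gram style model to obtain the vectors.

```python
import random

# Hypothetical toy organizational network: undirected communication edges
# between employees (names and edges are invented for illustration).
EDGES = [("ann", "bob"), ("bob", "cai"), ("cai", "dee"), ("ann", "cai")]


def build_adjacency(edges):
    """Return {node: sorted list of neighbours} for an undirected edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return {node: sorted(neighbours) for node, neighbours in adjacency.items()}


def random_walk(adjacency, start, length, rng):
    """Uniform random walk containing `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adjacency[walk[-1]]))
    return walk


def generate_walks(adjacency, walks_per_node, length, seed=0):
    """Sample several walks from every node; they play the role of sentences."""
    rng = random.Random(seed)
    return [
        random_walk(adjacency, node, length, rng)
        for _ in range(walks_per_node)
        for node in sorted(adjacency)
    ]


adjacency = build_adjacency(EDGES)
walks = generate_walks(adjacency, walks_per_node=2, length=5)
```

Employees who frequently co-occur in walks end up with similar vectors once the walks are embedded, which is what lets the representation carry indirect social information rather than only direct contacts.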

Evaluation. The performance of organizational network modeling is generally assessed through downstream tasks in human resource management, such as employee turnover prediction or department performance prediction [446]. These are binary classification problems, so the usual evaluation metrics are Accuracy, F1-score, and AUC. Organizational turnover prediction is another common application scenario; there the target is the percentage of employees who quit the organization, and the evaluation metrics can be MAE and MSE [386].
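As a quick reference for the metrics named above, the sketch below implements Accuracy and F1 for binary labels and MAE and MSE for predicted turnover rates in plain Python. The label and rate values are invented for illustration only and do not come from any cited study.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true binary labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mae(y_true, y_pred):
    """Mean absolute error, e.g. between true and predicted turnover rates."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error, which penalizes large deviations more strongly."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented example values: 1 = employee left, 0 = employee stayed.
labels, predictions = [1, 0, 1, 1, 0], [1, 0, 0, 1, 1]
# Invented organization-level turnover rates (fraction of employees who quit).
true_rates, predicted_rates = [0.10, 0.20, 0.30], [0.15, 0.20, 0.20]

classification_report = (accuracy(labels, predictions), f1_score(labels, predictions))
regression_report = (mae(true_rates, predicted_rates), mse(true_rates, predicted_rates))
```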

◇ Takeaway.

  • Advantages of AI technology: (1) Existing AI-based methods for organizational network modeling can efficiently represent employees or departments in the organization, which helps human resources staff make suitable decisions. (2) AI-based methods can capture the complete social information of employees or departments, rather than only their direct communication targets as in traditional approaches. (3) AI-based modeling methods can avoid subjective biases due to individual preference.

  • Limitations and future directions: (1) Existing methods mostly represent the edges of an organizational network as communication, whereas organizations contain various relationships, such as formal and informal communication, project cooperation, and so on. (2) Existing evaluation of modeling is mainly tied to classic downstream tasks such as turnover prediction, but an important application is the visual presentation of the modeled network, whose evaluation should involve more expert-based metrics. (3) In the era of human-AI symbiosis, interaction and its evaluation deserve more consideration [469]. (4) Although dynamic networks are considered in some research, the environment inside and outside the organization changes rapidly, so more fine-grained time slicing or continuous-time organizational network modeling deserves more attention in the future.

4.1.2 High-potential Talents Identification

High-potential talents (HIPOs) possess leadership abilities, business acumen, and a strong drive for success, making them more likely than their peers to emerge as future leaders within organizations [231]. Identifying HIPOs has always been a major issue in human resource management, as it plays a significant role in the execution of organizational strategy and the optimization of organizational structure [371]. At the same time, HIPOs are strategic assets that help companies achieve sustainable competitive advantages [457]. Due to this strategic importance, high-potential talent identification and retention are considered among the most critical components of business strategy.

Traditional methods for HIPOs identification usually rely on subjective selection by HR experts. They primarily focus on evaluating certain talent factors, such as communication skills, teamwork, and self-learning [325, 137]. However, these manually selected factors may lead to unintentional bias and inconsistencies [313]. Recently, with the development of ONA, objective data-driven HIPOs identification has become possible. The rationale is that HIPOs are usually more active and more competent than their peers in accumulating social capital during their daily work [447], so HIPOs can be detected implicitly through social information. Formally, the HIPOs identification problem based on the organizational network can be formulated as:

Definition 4.2 (HIPOs Identification)

Given a new employee $v$, who joined the company in the $t$-th time slice, and a set of organizational networks $G=\{G_{t},G_{t+1},\ldots,G_{t+k}\}$, where $G_{t}$ represents the organizational network of time slice $t$, the objective is to develop a model $f(v,G)=y$ to predict whether $v$ is a HIPO (i.e., $y=1$) or not (i.e., $y=0$).

To solve this problem, Ye et al. proposed a neural network-based dynamic social profiling approach for quantitative identification of HIPOs, which focuses on modeling the dynamics of employees’ behaviors within the organizational network [447]. In particular, they applied GCN and social centrality analysis to extract both local and global information in the organizational network as social profiles for each employee. Then they adopted an LSTM with a global attention mechanism to capture the profile dynamics of employees during their early careers. Finally, they evaluated their model on real-world talent data, which validated the effectiveness and interpretability of the proposed model. This method uses longitudinal social network data to embed employees’ work performance during different periods, reflecting their development and growth over time, such as increasing social capital. Furthermore, there are other network construction methods based on cooperation experience. For example, for high-potential scholar identification, Yin et al. measured innovation potential through the disruption of technology and science of each paper, and then used the yearly co-authorship network to model the development of scholars. Combining social capital theory and LSTM, they measured social capital in different years to predict future innovative potential [448].
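The idea of a dynamic social profile can be illustrated with a deliberately small Python sketch. It tracks only one hand-picked indicator, the degree centrality of employee $v$ in each snapshot $G_t,\ldots,G_{t+k}$, and returns the resulting sequence. This is an assumption-laden simplification: the approach in [447] combines GCN features with several centrality measures and feeds them to an attention-based LSTM, none of which is reproduced here, and the snapshots below are invented.

```python
def degree_centrality(edges, node, num_nodes):
    """Fraction of the other nodes that `node` is directly connected to."""
    neighbours = set()
    for u, v in edges:
        if u == node:
            neighbours.add(v)
        elif v == node:
            neighbours.add(u)
    neighbours.discard(node)  # ignore self-loops
    return len(neighbours) / (num_nodes - 1)


def social_profile_sequence(snapshots, node, num_nodes):
    """One centrality value per time slice: the input to a sequence model."""
    return [degree_centrality(edges, node, num_nodes) for edges in snapshots]


# Invented snapshots: employee "v" gains contacts over three time slices.
SNAPSHOTS = [
    [("v", "a")],
    [("v", "a"), ("v", "b")],
    [("v", "a"), ("v", "b"), ("v", "c"), ("a", "b")],
]
profile = social_profile_sequence(SNAPSHOTS, "v", num_nodes=4)
```

A rising sequence like this one is the kind of signal of accumulating social capital that a downstream classifier $f(v,G)$ could learn from.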

Evaluation. As introduced in Definition 4.2, HIPOs identification is a binary classification problem; therefore, the evaluation metrics are usually Accuracy, Precision, Recall, and F1 score [448, 447].

◇ Takeaway.

  • Advantages of AI technology: (1) Existing AI-based methods can significantly reduce the labor cost of 360-degree feedback on HIPOs and speed up decisions. (2) HIPOs identification is usually affected by leaders’ biases, whereas AI-based modeling methods can avoid such subjective biases by relying on fixed features or networks.

  • Limitations and future directions: (1) Existing methods mostly concentrate on extracting social information; however, there are many other indicators of HIPOs, such as abilities, competencies, traits, and leadership [201], and these soft skills should be incorporated into the model. (2) As with organizational network modeling, although dynamic networks are constructed over months, years, and other time slices, more fine-grained time slicing or continuous-time social network information extraction deserves more attention in the future. (3) As for evaluation, HIPOs can be assessed from various aspects, such as promotion, turnover, or performance after selection [84]. Selection results are therefore usually not the final ground truth of whether these employees are HIPOs, and their actual work performance or expert evaluation should be used as the evaluation target to optimize the prediction model.

4.2 Organizational Stability Analysis

Generally, organizational stability means the stability of the structure of the organization itself and the compatibility between employees and organizations. An organizational structure defines how activities such as task allocation, coordination, and supervision are directed toward the achievement of organizational aims [331]. Considering the availability of data, existing AI-related research has largely explored the formation and optimization of organizations (e.g., teams) under certain goals. In this part, we will introduce AI-related techniques for organizational stability from three perspectives, namely team formation, team optimization, and person-organization fit.

Figure 9: The illustration of team formation.

4.2.1 Team Formation

Team formation is an important issue in organizational management. On the one hand, team formation is effective for responding to urgent tasks; on the other hand, it helps improve the efficiency of the entire organization. In brief, team formation can be defined as discovering a team of experts who collectively cover all the required skills for a given project, as shown in Figure 9. Although this problem has been proven to be NP-hard [224], it still needs to be solved in many real-world scenarios, such as team discovery in a social network of professionals who provide specialized skills or services.

Definition 4.2 (Team Formation). Given a team collaboration in the $t$-th time slice as $\mathcal{C}_{t}=\left\{(s,e)_{t}\mid s\subseteq\mathcal{S},e\subseteq\mathcal{E}\right\}$, where $\mathcal{S}$ and $\mathcal{E}$ represent the sets of skills and experts, the target of Team Formation is to minimize costs in teamwork (e.g., communication costs) or to succeed in the team's goals.

For instance, Kargar et al. proposed a method to find a team that minimizes both the communication cost and the personnel cost of the project [205]. Specifically, they used a graph to model a social network in which nodes represent experts and formulated the task as a constrained bi-criteria optimization problem. Since minimizing the combined cost function was proven to be NP-hard as well, the authors solved the problem in polynomial time with an approximation algorithm and three heuristic algorithms. Later, Zihayat et al. took both communication cost and experts’ authority into account and proposed greedy algorithms to solve the optimization problem [489]. Since these team formation algorithms are based on very different criteria and performance metrics, Wang et al. implemented them on a common platform and evaluated their performance with several real datasets [405]. However, these studies have limitations in terms of scalability and fail to effectively manage the dynamic nature of expert networks. To this end, instead of searching over the graph representation of the expert network, Hamidi et al. searched for variational distributions of experts and skills in the context of a team [164]. To be specific, they employed a variational Bayesian neural network to form the optimal team, achieving better performance than the prior state of the art.
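Because exact team formation is NP-hard, greedy heuristics are a natural baseline. The sketch below is a generic greedy weighted set-cover heuristic in Python, not the specific algorithms of [205] or [489]: at each step it adds the expert with the lowest personnel cost per newly covered skill. The expert names, skills, and costs are hypothetical, and communication cost is ignored for brevity.

```python
def greedy_team(required_skills, experts, costs):
    """Greedily cover all required skills.

    experts: {name: set of skills}; costs: {name: personnel cost}.
    Picks, at every step, the expert with the lowest cost per newly covered
    skill. This is a heuristic and does not guarantee a minimum-cost team.
    """
    uncovered = set(required_skills)
    team = []
    while uncovered:
        best, best_ratio = None, None
        for name in sorted(experts):  # sorted for deterministic tie-breaking
            if name in team:
                continue
            gain = len(experts[name] & uncovered)
            if gain == 0:
                continue
            ratio = costs[name] / gain
            if best_ratio is None or ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError(f"no expert covers: {sorted(uncovered)}")
        team.append(best)
        uncovered -= experts[best]
    return team


# Hypothetical experts, skills, and costs.
EXPERTS = {
    "ann": {"python", "ml"},
    "bob": {"ml", "nlp", "viz"},
    "cai": {"viz"},
    "dee": {"python"},
}
COSTS = {"ann": 4.0, "bob": 6.0, "cai": 1.0, "dee": 1.5}
REQUIRED = {"python", "ml", "nlp", "viz"}

team = greedy_team(REQUIRED, EXPERTS, COSTS)
team_cost = sum(COSTS[member] for member in team)
```

On this toy input the greedy choice covers every skill but is not the cheapest possible team (bob and dee alone would cost less), which illustrates why the cited works study approximation guarantees and richer heuristics.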

In many scenarios, the purpose of team formation is to anticipate future expert teams by combining sequences of expert skills and the evolution of collaborative relationships over time. Hamidi et al. defined an optimal team as a team observed in the past, and proposed a variational Bayesian neural network to estimate the mapping function from the skill power set to the expert power set, finally forming an optimal expert team $e$ for a required subset $s$ of skills [163]. Fani et al. presented a streaming-based neural model training strategy to estimate the mapping function from the stream of the collaborative set and the skill subset to the expert subset in order to find the best possible team [124]. When recommending future teams by learning the expert and skill distributions of past successful teams, it may happen that a few experts account for most of the successful collaborations while the majority of experts rarely participate. Such a long-tailed distribution of the training data tends to cause the model to overfit. To address this problem, Dashti et al. proposed three negative sampling heuristics [100].

Another scenario that draws wide attention is the fast-growing online labor marketplaces, which sharply reduce communication costs. For example, Liu et al. first studied team formation in crowdsourcing markets with consideration of the impact of teamwork [256]. Their study designs a mechanism that combines a greedy selection rule with a special payment scheme, obtaining various desirable properties such as efficiency, profitability, and truthfulness. In addition, Barnabo et al. considered the fairness of algorithms related to these online marketplaces [40]. They formalized Fair Team Formation as the problem of finding the cheapest team that can complete the task and, at the same time, contains the same number of people from two non-overlapping classes. They designed four algorithms to solve the problem and ran experiments on real-world data to confirm their effectiveness. Yet, most of these works focus on the offline version of the team formation problem, i.e., the tasks to be completed are known a priori. To this end, Anagnostopoulos et al. studied the problem of online cost minimization, where the goal is to minimize the overall cost (hiring, outsourcing, and salary costs) of maintaining a team that can complete the arriving tasks [18]. Moreover, the study considers a more complex case involving outsourcing, in which hiring, firing, and outsourcing decisions are taken by an online algorithm, leading to cost savings with respect to alternatives.

Evaluation. Broadly speaking, the performance of team formation is assessed in two main ways: communication cost and collaboration prediction.

  • Communication Cost: Nemec et al. formed teams by minimizing communication cost, i.e., a distance function in a graph network composed of experts, and used this cost to evaluate the results [306].

  • Collaboration Prediction: Fani et al. predicted the experts needed for a successful collaboration by learning from past teams that collaborated successfully, following a mapping function from collaborations and skills to experts [124].
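One common way to instantiate a graph-based communication cost is the sum of pairwise shortest-path distances between team members; the team diameter is another frequent choice. The Python sketch below uses the former on an unweighted, hypothetical expert graph and should be read as an illustrative assumption rather than the exact distance function used in [306].

```python
from collections import deque
from itertools import combinations


def shortest_path_length(adjacency, source, target):
    """Number of hops between two experts, found by breadth-first search."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return float("inf")  # the two experts are not connected


def communication_cost(adjacency, team):
    """Sum of pairwise hop distances between all members of the team."""
    return sum(
        shortest_path_length(adjacency, a, b) for a, b in combinations(team, 2)
    )


# Hypothetical expert graph: a simple chain ann - bob - cai - dee.
ADJACENCY = {
    "ann": ["bob"],
    "bob": ["ann", "cai"],
    "cai": ["bob", "dee"],
    "dee": ["cai"],
}

spread_out_cost = communication_cost(ADJACENCY, ["ann", "cai", "dee"])
close_cost = communication_cost(ADJACENCY, ["bob", "cai"])
```

Lower values indicate teams whose members are closer in the collaboration graph, so the second team here would be preferred on communication cost alone.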

◇ Takeaway.

  • Advantages of AI technology: (1) Existing methods can help teams accurately match candidates’ skills and experience with their needs, leading to intelligent hiring and talent recommendations. By analyzing large amounts of data, AI-based methods can identify the best candidates and provide personalized recruiting solutions that improve the efficiency and quality of recruiting. (2) Based on historical data and trend analysis, AI-based approaches can predict team member turnover and team expansion needs, helping teams make timely staffing adjustments and expansion plans. This can help teams better cope with changes and challenges and maintain a competitive edge.

  • Limitations and future directions: (1) While AI technology can provide efficient tools and systems, it may lack the human touch and human interaction. Human emotions and communication are crucial in teamwork, and AI technology cannot completely replace this aspect. (2) Current research may have difficulty explaining its results and decision-making process, which may reduce the team’s trust in and acceptance of the algorithm. The lack of interpretability may make it difficult for team members to understand why particular team members were chosen or particular actions taken. Future technologies will need to focus more on human emotions and emotional intelligence. This may include developing algorithms that understand and respond to human emotions, as well as designing user interfaces and interactions that are more human and approachable. It is also important to develop algorithms that provide explanations and transparency, so that team members can understand and trust the algorithm’s results and decision-making processes.

4.2.2 Team Optimization

Team optimization refers to the optimization of existing teams, which is beneficial for responding to changes in team tasks. There are two ways to optimize teams, namely team member replacement and team expansion.

Definition 4.3 (Team Optimization). For a team $T$, the purpose of team optimization is either to recommend a group of candidate members $i$ based on their fit with the existing team $T$ when the team is expanding, or to recommend a group of new members (sub-teams) $i$ to replace departing members (sub-teams) $i^{\prime}$ when members leave the team, for example using a GNN-based approach.

Two key problems within the scope of team optimization are team member replacement and team expansion, as illustrated in Figure 10. Specifically, team member replacement was first put forward in [237]; it aims to find a good candidate to replace a team member who becomes unavailable to perform the task. To tackle this problem, the authors introduced graph kernels that take into account the interaction of both skill and structure matching requirements, and proposed a series of effective and scalable algorithms. In [242], the authors further took the synergy between skill similarity and structural similarity into consideration, instead of considering the two aspects independently. Later, [241] proposed a new graph kernel that evaluates the superiority of candidate subteams in a holistic way and can be freely adapted to the needs of the situation. An effective pruning strategy reduces the kernel computation by exploiting the similarity between the structures of the candidate teams, and the algorithm outputs more human-agreeable recommendations than previous studies. Moreover, [180] used a clustering-based GNN framework to capture team network knowledge for flexible subteam replacement. It incorporated a self-supervised positive team comparison training scheme for improved team-level representation learning, together with unsupervised node clustering that reduces the number of candidates for fast computation.

In addition, some effort has been devoted to team expansion. Zhao et al. formally defined the problem in collaborative environments and proposed a neural network-based approach that considers three important factors (team task, existing team members, and candidate team member) as well as their interactions simultaneously [476]. However, most works on team optimization treat teams as static systems and recommend a single action to optimize a short-term objective. To this end, Zhou et al. proposed a deep reinforcement learning-based framework that continuously learns and updates its team optimization strategy by incorporating both skill similarity and structural consistency [482].

Evaluation. An important metric in team optimization is the fitness score. In general, selecting members to join an expanding team is similar to a recommendation problem. When recommending candidate members to the team, Zhao et al. determined which members should join by calculating the fitness scores between the candidate members and the team [476].
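To illustrate what a fitness score between a candidate and a team might look like, the Python sketch below uses a hand-crafted score instead of the learned neural model of [476]. It blends two assumed ingredients: the share of still-missing task skills the candidate brings, and the share of current members the candidate has already collaborated with. The weight `alpha`, the candidate data, and the function names are all hypothetical.

```python
def fitness(candidate_skills, candidate_ties, team_members, team_skills,
            task_skills, alpha=0.6):
    """Hand-crafted fitness in [0, 1]; alpha weights skills against ties."""
    missing = task_skills - team_skills
    skill_gain = len(candidate_skills & missing) / len(missing) if missing else 0.0
    tie_share = (
        len(candidate_ties & team_members) / len(team_members)
        if team_members else 0.0
    )
    return alpha * skill_gain + (1 - alpha) * tie_share


# Hypothetical team, task, and candidates.
TEAM_MEMBERS = {"ann", "bob"}
TEAM_SKILLS = {"python", "ml"}
TASK_SKILLS = {"python", "ml", "nlp", "viz"}
CANDIDATES = {
    "cai": ({"nlp", "viz"}, {"ann"}),        # (skills, past collaborators)
    "dee": ({"viz"}, {"bob"}),
    "eve": ({"python"}, {"ann", "bob"}),
}

scores = {
    name: fitness(skills, ties, TEAM_MEMBERS, TEAM_SKILLS, TASK_SKILLS)
    for name, (skills, ties) in CANDIDATES.items()
}
# Rank candidates from best to worst fit, breaking ties by name.
ranking = sorted(scores, key=lambda name: (-scores[name], name))
```

Treating the problem as ranking candidates by a score is what makes standard recommendation metrics applicable to team expansion.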

◇ Takeaway.

  • Advantages of AI technology: (1) Existing methods can intelligently match the right team members based on individual skills, experience, and preferences. By analyzing large amounts of data, the most suitable candidates can be found more accurately, leading to improved team performance and efficiency. (2) Compared with traditional methods, existing methods can help ensure diversity and inclusion in teams. They can identify potential biases and tendencies and provide objective recommendations to ensure diversity among team members, thus promoting innovation and better decision-making.

  • Limitations and future directions: (1) Although AI technology can analyze large amounts of data and patterns, it still lacks human intuition and judgment when dealing with complex situations and interpersonal relationships. For example, in team member turnover, AI may not be able to fully take into account individual emotions, motivations, and interpersonal relationships, resulting in less comprehensive or humanized decision outcomes. (2) Training data for existing methods may be biased and unbalanced, which may affect the predictive accuracy and fairness of the model. For example, in team member turnover, if an AI-based model relies too heavily on historical data, it may perpetuate past biases and inequalities, resulting in unfair opportunities for new team members. Future AI technologies should focus more on interpretability and transparency so that team members can understand and trust the decision-making process of AI-based models. By developing explainable AI-based models and algorithms, team acceptance of AI technology can be increased, promoting human-machine collaboration and co-development.

Figure 10: The illustration of team optimization.

4.2.3 Person-Organization Fit

Person-organization fit (P-O fit) refers to the compatibility between employees and their organizations. P-O fit has been widely recognized as an effective indicator for proactive talent management, and it has a significant impact on outcomes such as work attitudes, turnover intentions, and job performance [225]. In the domain of organizational behavior, most studies measure P-O fit based on the similarity between the organizational profile and the employee's profile. Figure 11(a) presents the classical P-O fit modeling process. First, experts collect information with questionnaires to extract employee and organization profiles and manually design metrics. Then, statistical methods are applied to measure the congruence between an employee and an organization as the P-O fit score. However, this process is labor-intensive and subjective, which makes it difficult to apply in real-world settings. To this end, AI-driven techniques have been proposed to automatically extract the profiles and model P-O fit in a dynamic, quantitative, and objective manner. More specifically, the P-O fit problem is defined as follows:

Definition 4.3 (P-O Fit Problem)

Given a sequence of time periods, each associated with an organization network $G_{t}=(V,E_{t})$, where the nodes $V$ are employees and the links $E_{t}$ indicate their relationships (e.g., reporting relationship) in the time period $t$. Each node $v_{i}$ has a feature vector $x_{t,i}$, representing their traits and behaviors in the $t$-th time period. The target of Person-Organization Fit is to learn to model the compatibility of each node on the organizational tree with its local environment, capture its dynamic nature and patterns, and accordingly predict relevant talent outcomes $y$.

Figure 11: The overview of P-O fit modeling. (a) A classic P-O fit modeling process. (b) AI-driven P-O fit modeling process.

To solve this problem, Sun et al. proposed a new P-O fit modeling process based on AI technology [381], as shown in Figure 11(b). Specifically, they first extracted features automatically from collected in-firm employee data and generated person profiles by applying dimension reduction to these features. Then, they built the organization profile by combining the organization’s structure with the profiles of the employees and extracted a unique environment profile for each employee based on their position. Finally, they applied a deep neural network to learn a more complicated mapping from person and environment profiles to a P-O fit representation. To capture the dynamic nature of P-O fit and its consequent impact, they used an adapted Recurrent Neural Network with an attention mechanism to model the temporal information of P-O fit. Later, in [383], Sun et al. further proposed attentional feature extraction layers that can distinguish relation-level and individual-level influence differences for different nodes on the organizational tree. This largely enhanced the performance of person-organization compatibility modeling and improved interpretability. Drawing on person-organization fit theory, Artar et al. used the K-Nearest Neighbors algorithm to group employees through P-O fit representations such as the number of former employers and the number of years in the organization [24].
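For contrast with the learned mappings described above, the following Python sketch shows a static, similarity-based P-O fit baseline in the spirit of the classical congruence view: an employee's environment profile is taken to be the mean profile of their neighbours in the organizational network, and fit is the cosine similarity between the person and environment profiles. The two-dimensional profiles and the neighbour lists are invented, and this is not the model of [381] or [383].

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def po_fit_score(profiles, neighbours, employee):
    """Similarity between an employee and the mean profile of their neighbours."""
    environment = mean_vector([profiles[n] for n in neighbours[employee]])
    return cosine(profiles[employee], environment)


# Invented person profiles (e.g., two reduced behavioural dimensions).
PROFILES = {"ann": [1.0, 0.0], "bob": [1.0, 0.0], "cai": [0.0, 1.0]}
# Invented local environments, e.g., peers on the same reporting line.
NEIGHBOURS = {"ann": ["bob"], "bob": ["ann", "cai"], "cai": ["ann", "bob"]}

fit = {name: po_fit_score(PROFILES, NEIGHBOURS, name) for name in PROFILES}
```

A fixed similarity of this kind cannot capture nonlinear or time-varying effects, which is the gap the neural approaches aim to close.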

Evaluation. Generally, P-O fit modeling learns the compatibility of each node on the organizational tree with its local environment and predicts relevant talent outcomes, as in turnover prediction and performance prediction. These tasks are all classification problems; therefore, Cross-Entropy and AUC are usually used as the evaluation metrics [383].
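For reference, the two metrics can be computed as in the Python sketch below: binary cross-entropy averages the negative log-likelihood of the true labels, and AUC is computed here by its pairwise definition, i.e., the probability that a randomly chosen positive example is scored above a randomly chosen negative one (ties count one half). The labels and scores are invented for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def auc(y_true, y_score):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented labels (1 = employee left) and predicted turnover probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc_value = auc(labels, scores)
bce_value = binary_cross_entropy(labels, scores)
```

The pairwise form is quadratic in the number of examples, which is acceptable for a sketch; rank-based formulations are preferable on large datasets.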

◇ Takeaway.

  • Advantages of AI technology: (1) Traditionally, the movement of employees among departments has been task-oriented, whereas existing AI-based approaches take better account of adaptability so that employee performance can be improved over the long term. (2) Compared with traditional linear modeling, AI-based methods model the nonlinear relationships in P-O matching more accurately.

  • Limitations and future directions: (1) Existing methods mostly concentrate on extracting organizational network information, including communication and reporting chains; however, profiles contain many other indicators relevant to P-O fit, such as age, education, and years of work experience [24], and these indicators should be incorporated into future models. (2) Existing methods do not fully take into account the dynamic changes of employees and organizations. For example, an employee and an organization may be well matched in the long term while their performance is mismatched in the short term due to the impact of short-term tasks.

4.3 Organizational Incentive Analysis

Compensation and benefits (C&B) represent one of the most important branches of human resources and play an indispensable role in attracting, motivating, and retaining talents. C&B includes the process of determining how much an employee should be paid and deciding what benefits should be offered. In the past few decades, considerable efforts have been made in this research direction from the management perspective. Recently, the accumulation of massive job-related data has enabled a new paradigm for organizational incentive analysis from a data-driven view. In this part, we introduce two classic data-driven tasks in C&B, namely job title benchmarking and salary benchmarking.

4.3.1 Job Title Benchmarking

Job title benchmarking (JTB), as an important function in C&B, aims at matching job titles with similar expertise levels across various organizations (i.e., companies), which provides precise and substantial facilitation of job and salary calibration/forecasting for both talent recruitment and job seekers. Traditional JTB mainly relies on manual market surveys, which are expensive and labor-intensive. Recently, the popularity of online professional networks helps to accumulate massive career records, which provides the opportunity for a data-driven solution. Formally, JTB can be defined as follows:

Definition 4.4 (Job Title Benchmarking)

JTB is a process that matches job titles with similar expertise levels across various companies. Formally, given two job title-company pairs, i.e., $(title_i, company_i)$ and $(title_j, company_j)$, the objective is to determine whether the given pairs are on the same level.

To handle this problem, Zhang et al. proposed to construct a Job-Graph by extracting information from large-scale career trajectory data, where nodes represent job titles affiliated with specific companies and edges represent the numbers of transitions between job titles [463]. They redefined JTB as a link prediction task on the Job-Graph by assuming that benchmarked job title pairs should have a strong correlation with the link. Along this line, they proposed a collective multi-view representation learning model to represent job titles from multiple views, including the graph topology view, semantic view, job transition balance view, and job transition duration view. Subsequently, they devised a fusion strategy to generate a unified representation from the multi-view representations. Finally, they leveraged the similarity between these representations as an indicator for job title benchmarking. Subsequent studies have refined the job transition network. Specifically, Zhu and Hudelot [486] proposed an enhanced job transition network that adds bi-directional edges between job nodes that share the same tag; for example, “automotive shop manager” and “purchasing manager” share the tag “manager”. In addition, bi-directional edges between a job title and a tag, representing the “has/in” relationship, are added to the job transition network. Their study uses an open dataset from a Kaggle competition that contains a collection of work experiences.
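The Job-Graph construction can be illustrated with a minimal sketch. The trajectories and company names below are invented, and the Jaccard neighborhood-overlap score is only a naive stand-in for link prediction; it is not the multi-view representation learning model of [463].

```python
from collections import Counter

def build_job_graph(trajectories):
    """Count transitions between consecutive (title, company) nodes.

    Returns a Counter mapping (source_node, destination_node) to the
    number of observed transitions, i.e., the weighted edges of the graph.
    """
    edges = Counter()
    for path in trajectories:
        for src, dst in zip(path, path[1:]):
            if src != dst:  # ignore records that repeat the same node
                edges[(src, dst)] += 1
    return edges

def out_neighbors(edges, node):
    """Set of nodes that `node` has at least one transition to."""
    return {dst for (src, dst) in edges if src == node}

def jaccard_score(edges, a, b):
    """Overlap of outgoing neighborhoods, used here as a naive link score."""
    na, nb = out_neighbors(edges, a), out_neighbors(edges, b)
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

# Hypothetical career trajectories: each is a list of (title, company) nodes.
trajectories = [
    [("Software Engineer", "A"), ("Senior Engineer", "A"), ("Staff Engineer", "B")],
    [("SDE II", "B"), ("Staff Engineer", "B")],
    [("Senior Engineer", "A"), ("Staff Engineer", "B")],
    [("SDE II", "B"), ("Engineering Manager", "C")],
]

graph = build_job_graph(trajectories)
score = jaccard_score(graph, ("Senior Engineer", "A"), ("SDE II", "B"))
print(graph[(("Senior Engineer", "A"), ("Staff Engineer", "B"))])  # transition count
print(score)  # overlap of where the two job titles lead
```

In this toy example, "Senior Engineer" at company A and "SDE II" at company B both lead to "Staff Engineer" at company B, so a high overlap score hints that the two titles may sit on a similar level.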

Another way to address job title benchmarking is through embedding. In these studies, a job title is generally embedded together with auxiliary information such as skills and employees. Zbib et al. embedded the skills associated with each job title; using feedback from the similarity between these skill embeddings and the text encoding of the job title itself, reasonable skill embeddings for each job title can be learned and different job titles can be integrated [455]. The effect was verified by skill-based retrieval and text-based retrieval. Malandri et al. used the job content under each job title to generate job title embeddings and cluster similar jobs [272]. Besides, Zha et al. aggregated job titles by decomposing the semantics of different modules of job titles [458].

The information provided by the job transition graph and by embeddings can also be combined to perform job title benchmarking. For example, JAMES, proposed by Yamashita et al., combined a hyperbolic graph embedding generated from job transition records, a BERT embedding of the original job titles, and a syntactic embedding generated from the average similarity between original job titles and occupations in a job taxonomy such as the European Skills, Competences, Qualifications and Occupations (ESCO) taxonomy [434]. Liu and Ge jointly trained job embeddings and employee embeddings using job content. Specifically, their job context learning module uses the job embeddings and each employee’s job transition information to generate employee embeddings, where the observed context positions serve as positive examples and positions drawn through negative sampling serve as negative examples [250].

Evaluation. Generally, job title benchmarking can be evaluated through job text similarity or through downstream tasks, as follows.

  • Similarity of Job. Job title embeddings should improve the estimation of job similarity. In [250], ground-truth job similarity is obtained from Amazon Mechanical Turk (MTurk), where annotators choose among three statements: “The similarity between jobs A and B is higher than the similarity between jobs A and C”, “The similarity between jobs A and B is lower than the similarity between jobs A and C”, and “The similarity between jobs A and B is almost the same as the similarity between jobs A and C”. The average accuracy with which the model reproduces these human comparisons is adopted as the metric of model effectiveness.

  • Clustering Effectiveness. Two kinds of evaluation are used: intrinsic evaluation, such as the Mann-Whitney U-test, and extrinsic evaluation, such as classification metrics including precision, recall, F1-score, and accuracy [272].

  • Job Title Classification. Given existing job taxonomies such as ESCO, classifying job titles into a taxonomy is also used to evaluate model effectiveness. In this case, Precision@N and NDCG@N are used, where N is the number of top-ranked results produced by each model [434]. Similarly, some scholars use classification schemes derived from the dataset, so that other classification metrics such as Macro-F1 and Micro-F1 are adopted [486].

  • Link Prediction. Because job title embedding often relies on a job transition network, link prediction is one of the most common evaluation tasks on these graphs. In this case, AUC is generally used [434, 463].

  • Job Mobility Prediction. Unlike link prediction, job mobility prediction is a job-domain downstream task. Generally, MAP@10, the mean average precision over the top 10 predicted jobs, is adopted [434]. More strictly, some other research adopts the accuracy of next-job prediction [486]. Besides, commonly used metrics for job duration prediction, such as RMSE and MAE, are also employed [458].

  • Text Ranking. Job recommendation is an essential application of job title embedding. Therefore, precision at 5 or 10 is commonly used in the evaluation of tasks such as retrieval [455].
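The ranking metrics mentioned in this list can be sketched in a few lines of plain Python. The ranked list and relevance set below are invented, relevance is assumed to be binary, and average precision is normalized by min(|relevant|, k), which is one common convention; the cited papers may use slightly different definitions. MAP@K is the mean of the per-query average precision values.

```python
import math

def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked items that are relevant."""
    return sum(1 for item in ranked[:n] if item in relevant) / n

def ndcg_at_n(ranked, relevant, n):
    """NDCG@n with binary relevance and a log2 position discount."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:n])
        if item in relevant
    )
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def average_precision_at_k(ranked, relevant, k):
    """AP@k for one query; MAP@k averages this value over all queries."""
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0

# Hypothetical retrieval result for the query job title "data scientist".
ranked = ["ml engineer", "chef", "data analyst", "driver", "statistician"]
relevant = {"ml engineer", "data analyst"}

print(precision_at_n(ranked, relevant, 5))
print(ndcg_at_n(ranked, relevant, 5))
print(average_precision_at_k(ranked, relevant, 5))
```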

◇ Takeaway.

  • Advantages of AI technology: (1) AI-based job title benchmarking can effectively help organizations sort out the positions in the market and within the organization. In contrast, the traditional method is largely manual, cannot comprehensively analyze all the positions in the organization, and is prone to subjective biases, which AI-based methods help avoid. (2) Existing AI-based approaches can help improve the efficiency of external market analysis for organizations, while providing the foundation for meaningful downstream tasks of talent management, such as job pricing, job grading, and labor market competition analysis [257].

  • Limitations and future directions: (1) The job taxonomies currently in use are quite diverse, such as ESCO and the Occupational Information Network (O*NET), so future research needs models that adapt to different taxonomies and research contexts. (2) Existing methods are mostly based on historical job transition networks or on embeddings of published job postings, and do not take into account the dynamic evolution of jobs over time. In the future, more consideration should be given to life-long models. (3) As the external environment changes, many new jobs will appear. How to model these new jobs is also a direction for future development.

4.3.2 Job Salary Benchmarking

Job salary benchmarking (JSB) refers to the process by which organizations obtain and analyze labor market data to determine appropriate compensation for their existing and potential employees [48]. Traditional approaches for JSB mainly rely on the experience of domain experts and market surveys provided by third-party consulting companies or governmental organizations [199, 329]. However, fast-developing technology and industrial structure lead to changes in positions and job requirements, making it difficult to conduct salary benchmarking in a timely manner in dynamic scenarios.

In recent years, the prevalence of emerging online recruiting services, such as Indeed and Lagou, has provided the opportunity to accumulate vast amounts of job-related data from a wide range of companies, enabling a new paradigm of compensation benchmarking in a data-driven manner. Formally, the job salary benchmarking problem can be formulated as follows:

Definition 4.5 (Job Salary Benchmarking)

Suppose there are job positions $i = 1, 2, 3, \ldots, I$ and location-specific companies $j = 1, 2, 3, \ldots, J$. Each position $i$ has some features, e.g., bag-of-words, and each location-specific company $j$ can be described as a list of features, e.g., location and industry. Given a combination of position and company $(i, j)$, the objective is to predict its salary $\hat{s}_{ij}$ so that the similarity between $\hat{s}_{ij}$ and the real observation $s_{ij}$ is maximized.

To address this problem, some scholars used statistical machine learning methods that combine company characteristics to predict the average salary of industries or economic activities [279]. From the view of the job-company salary matrix, where each entry indicates the salary of a given job-company pair, the JSB problem can be regarded as a matrix completion task. Matrix factorization (MF) is a widely used method for this task. It factorizes an incomplete job-company salary matrix into two lower-rank latent matrices and uses their product to estimate the salaries of the missing entries. However, this intuitive method is too general to meet the various special needs of C&B professionals. To this end, Meng et al. proposed an expanded salary matrix that augments the original job-company salary matrix with location and time information for fine-grained salary benchmarking [288]. They then designed a matrix factorization-based model that predicts the missing salary information in the expanded salary matrix by integrating multiple confounding factors, including company similarity, job similarity, and spatial-temporal similarity. Further, Meng et al. designed a nonparametric Dirichlet-process-based latent factor model for JSB, which learns representations for companies and positions to alleviate the data deficiency problem. Experiments on two large-scale real-world datasets demonstrated the effectiveness and interpretability of the proposed model [287]. Similarly, a matrix equation method is used in [186] to estimate the unbiased salary, company competitiveness in the salary of the same job, and inflation of the same job. Following another line of thought, some scholars proposed constructing an auxiliary network of skill requirements from job postings to model the relationship between jobs and salaries. Specifically, the auxiliary network learns job representations through graph learning so that similar jobs obtain similar salaries, which makes it possible to predict the salaries of different jobs effectively [382].
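As a minimal sketch of the plain matrix factorization baseline for salary matrix completion (not the expanded or nonparametric models of [288] and [287]), the following code fits two low-rank factor matrices to the observed job-company salaries with stochastic gradient descent. The salary values, the global-mean offset, and all hyperparameters are illustrative assumptions.

```python
import random

def factorize_salary_matrix(observed, n_jobs, n_companies,
                            rank=2, lr=0.01, reg=0.01, epochs=2000, seed=0):
    """Fit s_ij ~ mean + P_i . Q_j on observed entries; return a predict function.

    `observed` maps (job index, company index) to a salary value.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_jobs)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_companies)]
    mean = sum(observed.values()) / len(observed)  # global offset
    entries = list(observed.items())

    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), s in entries:
            pred = mean + sum(P[i][d] * Q[j][d] for d in range(rank))
            err = s - pred
            for d in range(rank):
                p, q = P[i][d], Q[j][d]
                # Gradient step on squared error with L2 regularization.
                P[i][d] += lr * (err * q - reg * p)
                Q[j][d] += lr * (err * p - reg * q)

    def predict(i, j):
        return mean + sum(P[i][d] * Q[j][d] for d in range(rank))

    return predict

# Hypothetical salaries (in thousands) for 3 jobs at 2 companies.
# The entry for job 2 at company 1 is missing and will be estimated.
observed = {(0, 0): 10.0, (0, 1): 12.0, (1, 0): 20.0, (1, 1): 24.0, (2, 0): 15.0}

predict = factorize_salary_matrix(observed, n_jobs=3, n_companies=2)
for (i, j), s in sorted(observed.items()):
    print(f"job {i}, company {j}: observed {s:.1f}, fitted {predict(i, j):.2f}")
print(f"job 2, company 1: estimated {predict(2, 1):.2f}")
```

With so few observations the estimate for the missing entry is only weakly constrained, which is the data deficiency problem that side information such as company similarity and job similarity is meant to alleviate.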

Evaluation. Generally, there are two ways to evaluate job salary benchmarking: job similarity and salary prediction. For the job similarity task, the aim is to minimize salary differences among similar jobs, so MAE can be used to test the biases, and the Kendall coefficient is adopted to evaluate the similarity between the generated vectors of competitiveness (or inflation) and the ground-truth competitiveness (or inflation) [186]. Salary prediction is generally a regression problem, so MAE, RMSE, and the Pearson correlation (PR) have been used [382, 287].
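For reference, these regression and rank-agreement metrics can be computed as in the sketch below. The salary values are invented, and the Kendall coefficient is implemented in its simplest form (tau-a, where tied pairs count as neither concordant nor discordant), which may differ from the variant used in [186].

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for a in range(n):
        for b in range(a + 1, n):
            sign = (x[a] - x[b]) * (y[a] - y[b])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical observed and predicted salaries (in thousands).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print("MAE:", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("Pearson:", pearson(actual, predicted))
print("Kendall tau:", kendall_tau(actual, predicted))
```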

◇ Takeaway.

  • Advantages of AI technology: (1) Existing methods can improve the efficiency and accuracy of job salary setting because they fully integrate market information. (2) With traditional methods, it is difficult to align the salaries of all positions because of the huge labor cost and individual preferences. Existing AI-based methods provide a solution for the transparent management of C&B in the organization, eliminating as much as possible the inequities among the different positions of the organization.

  • Limitations and future directions: (1) The validation sets used by existing methods come from various countries, and more comparison of the differences across countries is needed. (2) Existing methods do not take into account how job salaries change across eras. For example, talents with LLM skills now command higher salaries, yet in the past labor market such skills were not highly paid or did not exist at all.

4.4 Summary

In conclusion, AI-related techniques for organization management cover three aspects: organizational network analysis, organizational stability analysis, and organizational incentive analysis. Specifically, organizational network analysis aims to help understand the importance of critical connections and flows in an organization, which can serve downstream talent management applications. Organizational stability analysis focuses on analyzing the composition of the organization and exploring the compatibility between employees and organizations. Finally, organizational incentive analysis concentrates on leveraging data-mining techniques to solve the job title and salary benchmarking problems in human resources. Compared with traditional approaches based on manually selected factors, these methods can greatly decrease labor cost and subjective biases. At the same time, AI-based methods can draw on more complete information about employees and organizations to produce more accurate results. However, although these methods allow for time slicing, organizations and talents now develop faster, so more fine-grained time slicing or continuous-time information extraction deserves more attention in the future. Furthermore, since the performance evaluation of organizational management is usually completed by top managers and needs to consider many factors, the evaluation cannot rely on data alone and also needs to include more expert evaluation methods. Finally, given the increased capability of LLMs, agent-related organizational behavior should also be studied.

5 Labor Market Analysis

Figure 12: The overview of labor market analysis.

Labor market analysis is crucial to strategy formulation and is an important part of intelligent talent management. Traditionally, most existing studies on the labor market rely on expert knowledge, subjective surveys, and qualitative analysis from psychological, economic, and cultural perspectives [103, 65, 385, 113, 369, 58, 202]. These methods make it challenging to uncover the complex associations among multi-source data and the hidden patterns in massive data, and their efficiency is limited by manual analysis. Moreover, some studies that rely on data collected online typically employ causal inference or statistical analysis methods. For example, Jackson et al. deployed psychometric measures with internet surveys to infer the reasons behind talent flow [190]. Hershbein et al. analyzed concentration in labor markets from vacancy and employment data [176]. Hershbein et al. deployed various statistical approaches to analyze the different skill requirements of job postings in different economic situations [175].

Recently, the prevalence of Online Professional Networks (OPNs) and online recruitment websites has facilitated the accumulation of a large number of job reviews, company reviews, digital resumes, and job postings. These sources contain a wealth of intricate and diverse information about the labor market, including talent flow, talent demand, market trends, job skills, company branding, and more. These extensive datasets provide novel perspectives and opportunities for conducting a more fine-grained analysis of the labor market at a large scale. However, traditional methods struggle to efficiently discover complex market patterns from data and to accurately predict market trends. AI and machine learning algorithms possess powerful pattern recognition, data generalization, and fitting capabilities, making them well-suited for exploring labor market data [246, 294, 357, 464, 382, 458]. Many researchers have analyzed the labor market with AI methods, mainly from four aspects: talent flow analysis, job analysis, skill analysis, and brand analysis.

We first introduce AI-driven analysis of talent flow, a common behavior observed in the labor market [305]. AI methodologies play a pivotal role in discerning talent flow inclinations and organizational competition, facilitating two downstream tasks, namely flow prediction and flow pattern analysis [466, 464, 249, 86, 85, 309, 310, 425]. Subsequently, we direct our attention to AI-driven job analysis. Fundamentally, the labor market refers to the supply of jobs and demand for talent. Talent demand forecasting is an important part of job analysis. Several studies have proposed methodologies to scrutinize demand time series, discern fluctuations, and anticipate future demand trends [261, 204, 467, 159]. Besides, topic trend analysis constitutes a vital aspect of job analysis, delving into recruitment evolution under different recruitment topics [484]. Furthermore, we introduce the works related to skill analysis, which is also a pivotal aspect of labor market analysis. As skills are inherent within job descriptions but not directly accessible, extracting and predicting the potential skills contained therein constitute significant endeavors [91, 8, 319, 418, 422, 426, 255, 399, 230]. To further adapt to the rapid evolution of the labor market, AI methods delve deeply into changes in skill trends, capture the dynamic characteristics of skill demand, and forecast future skill demand [270, 269, 105, 417, 71, 83]. Besides, skill valuation represents a vital aspect of skill analysis, which can furnish job seekers and employers with clearer insights into skills [338, 382, 378]. Lastly, we introduce brand analysis, which crafts a comprehensive company profile through the mining of employee data and reviews about the company [246, 245, 31, 376, 454, 54, 353]. This approach not only elucidates users’ perceptions but also reveals the potential expectations and social responsibility placed on the company [68, 282, 328, 388]. Moreover, leveraging established sentiment classification methods within NLP enables the effective identification of employees’ sentiments towards the company [294, 188, 366, 132, 136, 341, 296], which can help prevent employee turnover.

To help readers navigate the literature, we summarize and organize these papers in Table V, which lists their tasks, techniques, and adopted data. In addition, Figure 12 presents an overview of works related to labor market analysis.

TABLE V: The table of collected papers related to labor market analysis.
| Task | Method | Data | Reference |
|---|---|---|---|
| Talent Flow Analysis | | | |
| Flow prediction | RNN | OPNs | [430] |
| Flow prediction | Tensor Factorization | OPNs | [466] |
| Flow prediction | Latent variable model | OPNs | [464] |
| Flow prediction | GAT | Questionnaires | [249] |
| Flow pattern analysis | Learning algorithms | OPNs | [85] |
| Flow pattern analysis | Optimization algorithm | OPNs | [86] |
| Flow pattern analysis | PageRank | OPNs | [309, 310] |
| Flow pattern analysis | Clustering model | OPNs | [425] |
| Flow pattern analysis | Clustering model, PageRank | OPNs | [238] |
| Job Analysis | | | |
| Demand trend analysis | Latent Semantic Indexing | Job postings | [204] |
| Demand trend analysis | N-gram, SVM | Job postings | [261] |
| Demand trend analysis | Attentive neural network | Job postings | [467] |
| Demand trend analysis | Dynamic Graph | Job postings | [159] |
| Topic trend analysis | Latent variable model | Job postings | [484] |
| Topic trend analysis | Bi-gram model | Job postings | [277] |
| Topic trend analysis | Word2Vec, RNN | Job postings | [29] |
| Topic trend analysis | NER | Job postings | [268] |
| Skill Analysis | | | |
| Potential skill prediction | Language models, SVM | Job postings | [91] |
| Potential skill prediction | Apriori algorithm | Job postings | [8] |
| Potential skill prediction | FP-growth algorithm | Job postings | [319] |
| Potential skill prediction | KNN algorithm | Job postings | [418] |
| Potential skill prediction | Tensor factorization | Job postings | [422] |
| Potential skill prediction | Topic model | Job postings | [426] |
| Potential skill prediction | GNN | Job postings | [255] |
| Potential skill prediction | SDCA logistic regression | Job postings | [399] |
| Potential skill prediction | BERT | Job postings | [230] |
| Skill demand forecasting | Logistic Regression, Granger Causality | Job postings | [270] |
| Skill demand forecasting | BERT, Logistic Regression, Random forest, SVC | Job postings | [269] |
| Skill demand forecasting | RNN | Job postings | [105] |
| Skill demand forecasting | LSTM, TimeGAN | Job postings | [417] |
| Skill demand forecasting | Hypernetwork, RNN | Job postings | [71] |
| Skill demand forecasting | Pretraining Graph Autoencoder | Job postings | [83] |
| Skill valuation | Hill climbing | DBLP | [338] |
| Skill valuation | Neural network | Job postings | [382] |
| Skill valuation | Linear regression | Job postings | [378] |
| Brand Analysis | | | |
| Employer branding | Topic Regression | Job reviews | [246, 245] |
| Employer branding | ELM classifier | Company reviews | [31] |
| Employer branding | Markov switching model | News, Social media | [376] |
| Employer branding | Prototype representation | Corporate events | [454] |
| Employer branding | Extreme learning machine | Job reviews | [54] |
| Employer branding | Topic model | Company descriptions | [353] |
| CSR communication | Topic model | Social media | [68, 282, 328, 388] |
| Employee sentiment analysis | Topic model | Company reviews | [294] |
| Employee sentiment analysis | Topic model | Social media | [188] |
| Employee sentiment analysis | Topic model | Job reviews | [366, 132] |
| Employee sentiment analysis | TF-IDF, Bag of words, SVM | Job reviews | [136, 341] |
| Employee sentiment analysis | CNN, RNN | Job reviews | [296] |

5.1 Talent Flow Analysis

Talent flow analysis encompasses two key components: talent flow prediction and diverse flow pattern analyses. These analytical tasks rely predominantly on OPN data, which depicts talent movement across different companies in detail and enables a thorough examination of talent flow dynamics within the market. By leveraging this data, enterprises can gain insights into the patterns and trends of talent migration and formulate targeted strategies to address issues related to brain drain [466]. Furthermore, governments can use talent flow analysis to cultivate a robust ecosystem for talent circulation [249], thereby fostering an environment conducive to sustained economic growth and innovation.

Traditional methods, reliant on subjective surveys for qualitative inference, lack objectivity and precision. They are time-consuming, struggle to adapt to market changes, and provide limited predictive insights. The advent of OPNs has made it possible to achieve large-scale and timely talent flow analysis with data-driven approaches facilitated by AI technology.

Following [466], we denote talent flow as transition tensor $R^t \in \mathbb{R}^{N \times N \times M}$ for each time slice $t$, where $N$ denotes the number of companies, $M$ denotes the number of job positions, and each element $R^t_{ijk}$ is defined as the normalized number of corresponding job transitions:

$$R^t_{ijk} = \frac{Num^t_{i,j,k}}{\sum_{j=1}^{N} Num^t_{i,j,k}}, \tag{2}$$

where $Num^t_{i,j,k}$ denotes the transition number from the job position $k$ of company $i$ to company $j$ at time slice $t$. According to the definition, talent flow analysis tasks mainly contain talent flow prediction and various flow pattern analyses.
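The normalization in Equation (2) can be sketched as follows for a single time slice. The transition records are invented, companies and positions are assumed to be integer-indexed, and leaving all-zero slices at zero when a company has no outgoing transitions for a position is an assumption of this sketch rather than a choice stated in [466].

```python
def transition_tensor(transitions, n_companies, n_positions):
    """Build the normalized N x N x M transition tensor for one time slice.

    `transitions` is an iterable of (i, j, k) records, each meaning one
    person moved from position k at company i to company j.
    """
    counts = [[[0.0] * n_positions for _ in range(n_companies)]
              for _ in range(n_companies)]
    for i, j, k in transitions:
        counts[i][j][k] += 1.0

    R = [[[0.0] * n_positions for _ in range(n_companies)]
         for _ in range(n_companies)]
    for i in range(n_companies):
        for k in range(n_positions):
            # Denominator of Equation (2): all transitions leaving (i, k).
            total = sum(counts[i][j][k] for j in range(n_companies))
            if total > 0:
                for j in range(n_companies):
                    R[i][j][k] = counts[i][j][k] / total
    return R

# Hypothetical records for 3 companies and 2 job positions.
records = [(0, 1, 0), (0, 1, 0), (0, 2, 0), (1, 0, 1)]
R = transition_tensor(records, n_companies=3, n_positions=2)

print(R[0][1][0])  # share of position-0 leavers from company 0 who joined company 1
print(R[0][2][0])  # share who joined company 2
```

For each origin company and position with at least one transition, the entries over all destination companies sum to one, so each such slice can be read as an empirical destination distribution.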

5.1.1 Flow Prediction

The task of talent flow prediction mainly revolves around anticipating changes in the labor market, thereby offering guidance for talent strategies. AI-driven techniques can enhance the accuracy and flexibility of such predictions. Formally, the talent flow prediction problem can be defined as follows:

Definition 5.1 (Talent Flow Prediction)

Given a set of talent flow tensors $R^1, \ldots, R^T$ and some attributes of companies and market context, the goal of talent flow prediction is to predict the value of $R^{T+1}_{ijk}$.

To solve this problem, Zhang et al. designed a dynamic latent factor-based Evolving Tensor Factorization (ETF) model for predicting future talent flows [466]. In detail, they used $U^t_i$, $V^t_j$, and $W^t_k$ to represent the latent vectors of origin company $i$, destination company $j$, and job position $k$ at time slice $t$, and evolved them to time slice $t+1$ for predicting talent flows at $t+1$. This model also integrates several representative attributes of companies as side information for regulating the model inference. The authors also proposed a Talent Flow Embedding (TFE) model to learn the bi-directional talent attractions of each company [464]. Subsequently, they explored the competition between different companies by analyzing talent flows using data from OPNs. In detail, the objective of this latent variable model is to learn two attraction vectors $S_u$ and $T_u$ from the talent flow network $G$, where $S_u$ is the source attraction vector of company $u$ and $T_u$ is the target attraction vector of company $u$. The dot product of $S_u$ and $T_v$ indicates the talent flow from company $u$ to company $v$. The experimental results show the pairwise competitive relationships between different companies. Xu et al. enriched the sparse talent flow data by exploiting the correlations between the stock price movement and the talent flows of public companies [430]. They developed a fine-grained data-driven RNN model to capture the dynamics and evolving nature of talent flows, utilizing the rich information available in job transition networks. In addition to leveraging OPN data, Liu et al. transformed questionnaire data into graph data and achieved accurate talent flow prediction using the Graph Attention Network [249]. This model incorporates an attention mechanism to mitigate information overload and decrease the network’s reliance on temporal and spatial factors.
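The role of the two attraction vectors in TFE can be illustrated with a small sketch. The vectors below are made-up values rather than learned embeddings, and `flow_score` and `rank_destinations` are hypothetical helper names. The point is only that keeping separate source and target vectors makes the score direction-dependent, so the flow from company A to company B can differ from the flow from B to A.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def flow_score(S, T, u, v):
    """Score of talent flow from company u to company v: S_u . T_v."""
    return dot(S[u], T[v])

def rank_destinations(S, T, u):
    """Other companies sorted by decreasing flow score from company u."""
    others = [v for v in T if v != u]
    return sorted(others, key=lambda v: flow_score(S, T, u, v), reverse=True)

# Hypothetical 2-dimensional source (S) and target (T) attraction vectors.
S = {"A": [0.9, 0.1], "B": [0.2, 0.8], "C": [0.5, 0.5]}
T = {"A": [0.1, 0.3], "B": [0.7, 0.2], "C": [0.4, 0.4]}

print(flow_score(S, T, "A", "B"))    # flow from A to B
print(flow_score(S, T, "B", "A"))    # flow from B to A, generally different
print(rank_destinations(S, T, "A"))  # most likely destinations for A's talent
```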

Evaluation. Generally, the evaluation of talent flow prediction encompasses two primary perspectives: value prediction and link prediction.

  • Value Prediction: Xu et al. [430] and Zhang et al. [466] utilized widely accepted regression metrics like RMSE, MAPE, and MAE to gauge the accuracy of predicted flow values. These metrics provide insights into the disparity between predicted and actual flow values, offering a comprehensive assessment of the model’s predictive prowess.

  • Link Prediction: Liu et al. extracted features from questionnaire data to predict the occurrence of talent flow among academics and used overall classification accuracy for evaluation [249]. Additionally, in the context of talent movement, companies with greater competitive appeal tend to attract talent. Zhang et al. employed AUC to assess company attraction and NDCG to evaluate ranking effectiveness [464]. AUC measures the model’s ability to discriminate between attractive and less attractive companies, while NDCG evaluates the ranking list’s efficacy in capturing the most desirable companies.

◇ Takeaway.

  • Advantages of AI technology: (1) Leveraging extensive job transition data, AI significantly enhances the prediction of talent flow across larger populations. (2) AI-based methods offer objectivity compared to subjective survey-based approaches. (3) Through analyzing the evolving patterns of talent flow over time, AI facilitates timely forecasts of dynamic talent movement.

  • Limitations and future directions: (1) Job titles are often clustered without a standardized benchmark for evaluation, so the quality of the clustering process goes unassessed. Establishing improved benchmarks for job titles can enhance the standardization and normalization of this clustering procedure. (2) Existing research predominantly focuses on talent flow within homogeneous job categories, neglecting the complexity of predicting talent movement across diverse job types. Expanding the analysis to encompass multiple dimensions of talent flow and achieving more nuanced predictions could offer valuable insights into job mobility across various occupations. (3) Current research fails to delve into the underlying motivations driving talent flow, a challenge compounded by the opaque nature of neural networks. However, recent advances in large language models (LLMs) have shown promising explanatory capabilities, making the analysis of the reasons behind talent flow more attainable.

5.1.2 Flow Pattern Analysis

Many researchers have also explored various other flow pattern analysis tasks, such as competitiveness analysis, job-hopping behavior analysis, and talent circle detection. In [86], the authors first gathered job-related information from various social media data. Subsequently, they developed a model called JobMiner, which mainly employs graph mining techniques to mine influential companies and uncover talent flow patterns. This method provides a better understanding of company competition and talent flow in professional social networks. Cheng further developed machine learning and analytical techniques for mining OPNs data [85]. From OPNs, influential companies and their related company groups can be mined, and each company’s influence and competitiveness can be evaluated. Oentaryo et al. developed a series of data mining methodologies to analyze job-hopping behavior between different jobs and companies using publicly available OPNs data [309]. In detail, they used a weighted version of PageRank to measure the competitiveness of jobs or companies, and constructed several metrics to measure the relationship between the properties of jobs and the propensity for hopping. Li et al. quantified the dynamic attractiveness of companies to various types of talent [238]. Specifically, they utilized job title clustering to identify prevalent talent categories and then integrated the PageRank algorithm with job titles and talent flow graphs to determine companies’ attractiveness to specific talent categories over time. Then, Oentaryo et al. enhanced the data mining framework for analyzing talent flow patterns [310]. The results show that the factors influencing employee turnover can mainly be divided into four categories: employee personal factors, organizational factors, external environmental factors, and structural factors. Xu et al. developed a talent circle detection model and designed a corresponding learning method that maximizes the NDCG to detect a suitable circle structure [425]. Each talent circle includes organizations with similar talent exchange patterns. Formally, a talent circle is a subset of the neighbors of an ego node. Within one circle, nodes are closely connected and similar to each other. The circles can be denoted as $\{C_m \in \mathbb{C}\}$, where $m = 1, 2, \ldots, M$ and $C_m \subseteq V$, and $V$ represents the set of all organizations. Circles may overlap, and the organizations in an appropriate talent circle share similar flow patterns and are closely connected. The detected talent circles can be used to predict future talent exchange and to improve recommendations in talent recruitment.
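As an illustration of the weighted PageRank idea mentioned above, the following minimal sketch ranks companies on a toy talent flow graph by power iteration. It is not the implementation used in [309] or [238]: the function name, the damping factor, the uniform handling of companies without outgoing flow, and the example flow counts are all assumptions made for illustration.

```python
def weighted_pagerank(flows, damping=0.85, iters=100, tol=1e-10):
    """Rank nodes of a weighted directed graph.

    flows: dict mapping (source, target) -> number of observed job transitions.
    A transition u -> v passes rank from u to v, so companies that attract
    talent from highly ranked companies score higher.
    """
    nodes = sorted({n for edge in flows for n in edge})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (u, _v), w in flows.items():
        out_weight[u] += w
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        # Rank held by nodes with no outgoing flow is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for (u, v), w in flows.items():
            new_rank[v] += damping * rank[u] * w / out_weight[u]
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical transition counts between three companies.
example_flows = {("A", "B"): 30, ("A", "C"): 10, ("C", "B"): 20, ("B", "A"): 5}
example_ranks = weighted_pagerank(example_flows)
print(sorted(example_ranks.items(), key=lambda kv: -kv[1]))
```

In this toy graph, company B receives most of the outgoing talent of A and all of C, so it obtains the highest score.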

Evaluations. In analyzing talent flow patterns, various metrics are employed to assess performance across different tasks. Xu et al. utilized precision and recall to evaluate talent exchange prediction [425]. Li et al. compared models using clustering-validation metrics, including Silhouette score (higher is better), Intra-Dispersion (lower is better), and Inter-Dispersion (higher is better) [238]. These metrics gauge the cohesion within clusters and the separation between them. Li et al. also reported the F1-score for employee attrition and career outcome prediction to assess classification performance. Oentaryo et al. examined talent flow patterns using metrics like work experience and job age, along with higher-level metrics such as external hop fraction and job level, aggregated at the job or organization level [309, 310]. They also introduced network centrality metrics like in-degree centrality, out-degree centrality, and PageRank centrality to gauge node importance in job and organization graphs.

◇ Takeaway.

  • Advantages of AI technology: (1) Scalability: AI technology enables the analysis of vast amounts of data, such as detailed job activity data from online professional networks, at a scale that would be impossible with traditional methods like surveys. This scalability allows for a more comprehensive understanding of job-related insights. (2) Real-time Insights: Unlike traditional surveys, which may have long turnaround times and cover only a small percentage of businesses or organizations, AI technology can provide real-time insights by continuously analyzing data as it becomes available. This real-time aspect is crucial for staying abreast of fast-changing trends. (3) Network Analysis: AI technology facilitates the exploration of interconnected networks within the job market, capturing talent flows from one job to another and from one organization to another. This network analysis provides a more holistic understanding of talent flow patterns, competition among organizations for talent, and the impact of these dynamics on job creation and talent attraction. By leveraging AI for network analysis, researchers and practitioners can uncover valuable insights that were previously obscured by a lack of visibility into these complex interactions.

  • Limitations and future directions: (1) Data Privacy and Ethics: As AI technology relies heavily on data, especially from online professional networks, ensuring data privacy and ethical use becomes paramount. Future directions should focus on developing robust frameworks for data anonymization, consent management, and ethical guidelines for AI-driven analyses to protect user privacy while still extracting valuable insights. (2) Bias and Fairness: AI algorithms can inadvertently perpetuate biases present in the data they are trained on, leading to unfair outcomes. Future directions should prioritize the development of bias detection and mitigation techniques within AI systems, along with ensuring fairness and transparency in decision-making processes to reduce societal biases.

5.2 Job Analysis

Job analysis focuses mainly on the analysis of trends, such as demand trends, topic trends, and other relevant factors. These tasks mainly use job posting data, which contain information about companies’ recruitment demand for jobs, to analyze the situation of the recruitment market. Job analysis is essential to model labor market variations, enabling companies to adjust recruitment strategies effectively and empowering job seekers to plan their career paths proactively. Traditional job analysis in recruitment usually involves domain experts employing basic statistical methods. However, with the advancement of online recruitment services and data mining technologies, researchers now predict labor market trends using data-driven methods. Following [467], the job recruitment data can be denoted as a trend tensor $D^t \in \mathbb{R}^{N \times M}$ for each time slice $t$, where $N$ denotes the number of companies, $M$ denotes the number of job positions, and each element $D^t_{ij}$ is the number of job postings published at time slice $t$ by company $i$ for position $j$. Meanwhile, additional context information on companies and job positions can be denoted as $\{C_1, \ldots, C_N\}$ and $\{P_1, \ldots, P_M\}$. Based on these data, a variety of trend analysis tasks can be explored.
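The trend tensor described above can be assembled directly from raw posting records. The sketch below is a minimal illustration in plain Python; the record layout `(time_slice, company, position)`, the function name, and the use of zero-based time slices (the text indexes them from 1) are assumptions made for the example.

```python
def build_demand_tensor(postings, companies, positions, num_slices):
    """Count job postings per (time slice, company, position).

    postings: iterable of (time_slice, company, position) records,
              with time_slice in range(num_slices).
    Returns a nested list D where D[t][i][j] is the number of postings
    published at slice t by companies[i] for positions[j].
    """
    company_index = {c: i for i, c in enumerate(companies)}
    position_index = {p: j for j, p in enumerate(positions)}
    D = [[[0] * len(positions) for _ in companies] for _ in range(num_slices)]
    for t, company, position in postings:
        D[t][company_index[company]][position_index[position]] += 1
    return D

# Hypothetical records: two companies, two positions, two time slices.
example_postings = [
    (0, "CompanyA", "engineer"),
    (0, "CompanyA", "engineer"),
    (1, "CompanyB", "product manager"),
]
example_D = build_demand_tensor(
    example_postings,
    companies=["CompanyA", "CompanyB"],
    positions=["engineer", "product manager"],
    num_slices=2,
)
print(example_D)
```

Each slice `D[t]` is then an N-by-M count matrix, matching the shape of $D^t$ in the text.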

5.2.1 Talent Demand Forecasting

Forecasting talent demand is pivotal for both job seekers and employers as it offers a crucial insight into economic trends. Leveraging AI-driven techniques can significantly boost the precision and adaptability of these forecasts. Formally, the talent demand forecasting problem can be defined as follows.

Definition 5.2 (Talent Demand Forecasting)

Given a set of talent demand tensors $\{D^t_{ij} \mid t \in [1, T], i \in [1, N], j \in [1, M]\}$ and some side information about companies and job positions, the goal of talent demand forecasting is to predict the value of $D^{T+1}_{ij}$.

Demand trend analysis mainly focuses on the volume of job recruitment. Some studies used the SVM classifier and the STL (Seasonal and Trend decomposition using Loess) decomposition method to analyze the characteristics of demand time series derived from web data and official data [261]. These studies show that web data can reflect the trend of the labor market. Karakatsanis et al. suggested a data mining-based approach for identifying the most in-demand occupations in the modern job market [204]. In detail, a Latent Semantic Indexing (LSI) model was developed for online job posts with job description data. The analysis results can highlight the most in-demand job trends and identify occupational clusters. Zhang et al. proposed the Talent Demand Attention Network (TDAN), which can forecast fine-grained talent demand in the labor market [467]. Specifically, they constructed information at multiple levels of granularity (e.g., market level, company level, and job level) together with the intrinsic attributes of both companies and job positions from recruitment job post data. Then, they designed a transformer-based attentive neural network to automatically utilize this information to forecast the demand trend of each job in each company. Guo et al. introduced a Dynamic Heterogeneous Graph Enhanced Meta-learning (DH-GEM) framework to jointly predict fine-grained talent demand and supply [154]. They employed a Demand-Supply Joint Encoder-Decoder (DSJED) and a Dynamic Company-Position Heterogeneous Graph Convolutional Network (DyCP-HGCN) to capture the correlation between demand-supply sequences and company-position pairs. Additionally, they proposed a Loss-Driven Sampling based Meta-learner (LDSM) to optimize long-tail forecasting tasks with limited training data.

Evaluations. Given the challenges associated with forecasting the value of individual points within the univariate time series $D^{T}_{ij}$, Zhang et al. [467] and Guo et al. [154] opted to categorize the values into distinct trend categories, treating the prediction task as a time-series classification problem. Consequently, assessments of talent demand forecasting primarily rely on metrics such as ACC, F1, and AUC.
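To illustrate how a value forecasting task can be recast as classification, the sketch below maps the change between two consecutive demand values to a trend label. The three labels, the relative-change rule, and the 10% threshold are purely illustrative assumptions; the discretization actually used in [467] and [154] is not specified here and may differ.

```python
def trend_label(prev, curr, threshold=0.1):
    """Map the change from `prev` to `curr` to a coarse trend category.

    Assumption: a relative change above `threshold` counts as "up", one below
    -`threshold` as "down", and anything in between as "flat".
    """
    if prev == 0:
        # No baseline to compare against: any demand at all counts as growth.
        return "up" if curr > 0 else "flat"
    change = (curr - prev) / prev
    if change > threshold:
        return "up"
    if change < -threshold:
        return "down"
    return "flat"

# Hypothetical monthly posting counts for one company-position pair.
series = [100, 120, 118, 90]
labels = [trend_label(a, b) for a, b in zip(series, series[1:])]
print(labels)
```

Once each step carries a label of this kind, standard classification metrics such as accuracy and F1 apply directly.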

◇ Takeaway.

  • Advantages of AI technology: (1) AI revolutionizes talent demand analysis by leveraging large-scale data from online recruitment platforms. Unlike traditional survey-based methods, AI offers more comprehensive insights, enabling fine-grained forecasting at the level of specific positions within companies. This granular approach enhances prediction accuracy, facilitating real-time analysis of talent demand dynamics. (2) Moreover, AI techniques quantitatively model the dynamic nature of the recruitment market, crucial for understanding evolving trends. (3) Additionally, AI uncovers latent data dependencies not apparent through conventional models, identifying nuanced relationships between factors of talent demand.

  • Limitations and future directions: Existing demand trend forecasting methods can only predict the category of trend change and struggle to predict specific demand values. Future work could give more consideration to the correlation between companies and positions and to the fusion and interaction of multivariate time series, which would enable more accurate multivariate time series prediction.

5.2.2 Topic Trend Analysis

Topic trend analysis primarily relies on text mining and language modeling techniques applied to job postings. For instance, Marrara et al. designed a language modeling approach for discovering novel occupations in the labor market, which can help companies capture new recruitment trends [277]. Zhu et al. developed MTLVM, a sequential latent variable model [484]. This model can capture sequential patterns of recruitment states. Moreover, it can automatically learn the latent recruitment topics through a Bayesian generative framework. In detail, it uses $c_{e,t}$ to represent the latent recruitment state of company $e$ at time step $t$. Then the transition probability between different states is learned to analyze how the recruitment trend evolves, and the topic model is deployed to reveal the trend of different recruitment topics. Azzahra et al. categorized job vacancies into groups like Administration, Finance, IT, and Marketing using a multi-label classification method with word2vec and an RNN model [29]. Mahdavimoghaddam et al. used NER tools to explore online social topics related to jobs [268]. They assessed social content’s effectiveness in predicting future job requirements, analyzed the relationship between work-related emotions online and social demographics, and identified potential impacts of community support on users’ well-being in the job market.

Evaluations. In topic trend analysis, Zhu et al. used the Validity Metric (VM) and Coverage Metric (CM) to assess topic relevance and word coverage [484]. They also analyzed recruitment state prediction and trend forecasting accuracy using log-likelihood. Azzahra et al. cast job vacancy tagging as a multi-label task and adopted a multi-label data evaluation framework [29]. It includes partition and ranking evaluation, along with label hierarchy utilization. The metrics used are accuracy, precision, F1-score, and Hamming loss, which measure different aspects of prediction performance.

◇ Takeaway.

  • Advantages of AI technology: (1) AI enables a fine-grained understanding of recruitment market trends, capturing nuances and fluctuations that traditional methods might overlook. (2) Leveraging large-scale analysis on massive recruitment data, AI provides data-driven insights beyond relying solely on domain expert knowledge or classic statistical models. (3) AI facilitates recruitment forecasting for companies over time, offering more precise predictions for future trends and developments.

  • Limitations and future directions: (1) The current landscape of labor market trend prediction primarily centers on forecasting trends for specific job vacancies, uncovering potential job opportunities, and predicting topic trends within particular companies. However, a unified framework for anticipating topic trends across the labor market is absent. (2) Moreover, in the realm of company-specific topic trend prediction, the model’s efficacy is restricted to companies present in the training dataset, thereby lacking the capability for few-shot learning. Addressing this limitation, future endeavors should explore methodologies to extrapolate topic trends from established companies to emerging ones, enabling accurate predictions even in low-resource scenarios. This transition toward effective topic trend prediction in few-shot settings represents a promising avenue for further research and development.

Figure 13: The overview of job skill valuation.

5.3 Skill Analysis

Tasks related to skill analysis mainly concentrate on exploring the relationship between jobs and skills, such as analyzing the skills required for different jobs and estimating the value of newly emerged skills. These tasks rely on job posting data, which contain information on skills, jobs, and salaries, to analyze the situation of job skills in the recruitment market. Following [426, 382], the job postings can be denoted as $\mathcal{P} = \{(D_i, J_i, S_i, Y_i, T_i) \mid i = 1, 2, \ldots\}$, where $D_i$ denotes the job description, $J_i$ denotes the job title, $S_i$ denotes the required skill set, $Y_i$ denotes the job salary, and $T_i$ denotes the publish time.

5.3.1 Potential Skill Prediction

The skill requirements can be inferred from job postings, and analyzing these requirements can provide valuable assistance in talent selection, job description formulation, and other related tasks. The skill requirement prediction task can generally be formulated as follows:

Definition 5.3 (Potential Skill Prediction)

Given a set of job postings $\mathcal{P} = \{(J_i, S_i, (D_i, Y_i, T_i)^{*}) \mid i = 1, 2, \ldots\}$, where $*$ indicates optional information, the goal of skill requirement prediction is to measure the required level or popularity of the potential skills $S_i$ for job $J_i$ or for the labor market.

To solve this problem, Wowczko et al. used k-NN clustering methods to identify key skill requirements in online job postings [418]. Colombo et al. deployed language models and machine learning classification approaches, e.g., SVM, to calculate the skill requirements of jobs [91]. Furthermore, they also classified job skills into a standard classification system and measured the relevance of soft and hard skills, which is important for talent selection and culture cultivation. Xu et al. proposed a Skill Popularity based Topic Model (SPTM) for modeling the generation of the skill network [426]. They used the neighbors of a skill on the skill network to generate the document for this skill. Then, the documents can be used to further analyze the popularity of the skill using topic models. This kind of topic model can integrate different criteria of jobs (e.g., salary levels, company size) and the latent connections between different skills. They then effectively ranked the job skills based on this multi-faceted popularity. Wu et al. designed a Trend-Aware Tensor Factorization (TATF) framework to analyze the skill demand of jobs [422]. In detail, TATF represents the relationship between skills and jobs as a four-dimensional tensor, in which each element $e_{t,c,p,s}$ reflects the demand trend of skill $s$ in job $p$ of company $c$ at time $t$. Then, they enhanced tensor factorization with aggregation-based constraints, i.e., competition-based (among companies) and co-occurrence-based (among skills) aggregations. Furthermore, they designed a temporal constraint based on previous models to output job and skill representations that can quantify the potential skill trends of jobs. Akhriza et al. applied the Apriori algorithm for association rule mining and used recommendation techniques based on the resulting skill associations to determine the most sought-after IT skills in the industry [8]. Patacsil et al. applied the Frequent Pattern-growth (FP-growth) algorithm for association rule mining to analyze the relationship between jobs and skill requirements, which provides a new dimension in labor market research [319]. These results reveal the skill requirements of jobs, which are important for improving training strategies. Liu et al. devised multiple graphs, including J-Net, S-Net, and JS-Net, to jointly learn the semantics of job positions and skills using a three-layer graph neural network (GNN) [255]. Their approach successfully mapped job position descriptions to their requisite skills. Walek et al. introduced an automated detection method leveraging SDCA logistic regression to identify and accurately label all job requirements within job advertisements [399]. Leon et al. proposed a classification methodology aimed at uncovering correlations between job ad requirements and cross-cutting skill sets [230]. They focused on predicting the necessary skills for individual job descriptions using BERT embeddings.
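The association rule approaches mentioned above score rules between skills by support and confidence. The sketch below computes these two quantities for skill pairs only; it is a simplified illustration and not a full Apriori or FP-growth implementation, and the function name, thresholds, and toy postings are assumptions.

```python
from collections import Counter
from itertools import combinations

def pair_rules(skill_sets, min_support=0.3, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) between pairs of skills.

    support    = share of postings that contain both skills
    confidence = share of postings containing lhs that also contain rhs
    """
    n = len(skill_sets)
    item_counts = Counter(s for skills in skill_sets for s in set(skills))
    pair_counts = Counter(
        pair for skills in skill_sets for pair in combinations(sorted(set(skills)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # Infrequent pairs are pruned, as in Apriori's first passes.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical skill sets extracted from four job postings.
example_postings = [
    {"python", "sql"},
    {"python", "sql", "spark"},
    {"python"},
    {"sql", "excel"},
]
example_rules = pair_rules(example_postings, min_support=0.5, min_confidence=0.6)
print(example_rules)
```

With these toy postings, only the pair (python, sql) is frequent enough to yield rules, one in each direction.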

Evaluation.

  • Ranking Metrics: Xu et al. developed a method to measure the popularity of job skills using various criteria, facilitating the recommendation of suitable job skills based on given job descriptions [426]. They evaluated performance using log-likelihood as a measure.

  • Classification Metrics: Liu et al., in their endeavor to identify correlations between job ad requirements and skill sets, utilized classification-based metrics including Micro-F1, Area Under the ROC Curve (AUC), and Hamming Loss [255]. Additionally, Leon et al. evaluated the classification accuracy (ACC) of their proposed methodology through comprehensive analysis [230].

  • Regression Metrics: Wu et al. focused on predicting the demand for skills, employing commonly accepted regression metrics such as Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) [422]. MAE calculates the mean of the absolute errors over all samples, while MAPE averages the absolute ratio between the prediction error and the ground-truth value.

◇ Takeaway.

  • Advantages of AI technology: (1) Traditionally, analyzing the skill demand from job postings relied on time-consuming expert recognition. With the significant advancements in AI technology, particularly in NLP and recommender systems, accurate extraction of skill information and a comprehensive understanding of skill demand have been achieved. (2) Traditional methods struggle to capture the skill demand trend based on the observed statistics of skill demand. AI technology has significantly contributed to trend capturing, achieving notable performance improvements.

  • Limitations and future directions: Current efforts have primarily focused on the multi-label classification to map job descriptions to a set of skills or skill recommendations. However, they often lacked the capability to predict the demand for innovative skills brought by emerging technologies. Looking ahead, it is feasible to leverage the common skill development patterns along with transfer learning technology to predict the demand for emerging skills. This approach holds promise for enhancing the adaptability of prediction models to accommodate the rapidly evolving landscape of technological advancements.

5.3.2 Skill Demand Forecasting

Skill demand forecasting aims to anticipate the fluctuating demand for skills, thereby providing valuable insights for both employees and employers to stay ahead in the continuously evolving labor market landscape. Conventional approaches to this analysis predominantly rely on labor-intensive interview-based methods, which are susceptible to human biases. However, the emergence of online recruitment platforms has paved the way for a data-driven approach, leveraging AI technology to address previous challenges and demonstrate superior performance. Following [83], the task of skill demand forecasting can be broadly conceptualized as follows:

Definition 5.4 (Skill Demand Forecasting)

Given a set of job postings $\mathcal{P} = \{(J_i, S_i, (D_i, Y_i, T_i)^{*}) \mid i = 1, 2, \ldots\}$, where $*$ indicates optional information, the demand for the $i$-th skill $s$ can be formulated as $\mathcal{R}^{t}_{i} = \sum_{p \in \mathcal{P}^{t}} \mathbf{1}(s \in p) / |\mathcal{P}^{t}|$, where $\mathcal{P}^{t}$ is the set of job postings at timestamp $t$ and $|\mathcal{P}^{t}|$ is its size. Given the historic skill demand sequences $\{\mathcal{R}^{t} \mid t \in [1, T]\}$, the goal of skill demand forecasting is to predict the future skill demand $\mathcal{R}^{T+1}$.
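The demand definition above amounts to the share of postings at each timestamp that list a given skill. The following minimal sketch computes that series in plain Python; the record layout `(timestamp, skill_set)` and the function name are assumptions made for illustration.

```python
def skill_demand(postings, skill):
    """Return {t: share of postings at timestamp t that require `skill`}.

    postings: iterable of (timestamp, set_of_skills) records.
    """
    totals, hits = {}, {}
    for t, skills in postings:
        totals[t] = totals.get(t, 0) + 1                # |P^t|
        hits[t] = hits.get(t, 0) + (skill in skills)    # sum of 1(s in p)
    return {t: hits[t] / totals[t] for t in sorted(totals)}

# Hypothetical postings at two timestamps.
example_postings = [
    (1, {"python", "sql"}),
    (1, {"java"}),
    (2, {"python"}),
    (2, {"python", "go"}),
]
example_series = skill_demand(example_postings, "python")
print(example_series)
```

The resulting per-timestamp values form the historic sequence that a forecasting model would take as input.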

To address the issue of skill demand prediction, Mahdavimoghaddam et al. predicted future in-demand skills derived from social content [270, 269]. Initially, they employed Granger causality analysis to discern the relationship between online social content and in-demand skills [270]. Furthermore, they utilized various classification algorithms, including Logistic Regression, Random Forest, and Linear Support Vector Classifier [269], trained on BERT embeddings to forecast topic-based time series. Macedo et al. employed various deep sequence models, including LSTM, CNN combined with LSTM, and GRU, on skill-share datasets [105]. They forecasted skill demand for the subsequent 6, 12, 24, and 36 months. Wolf et al. generated a synthetic skill demand time-series dataset using a Time-Generative-Adversarial-Network (TimeGAN) and trained an LSTM model on it, surpassing the performance of models trained only on real data [417]. Chao et al. concentrated on the interplay between skill demand and supply, introducing a Cross-view Hierarchical Graph Learning Hypernetwork framework for joint skill demand-supply prediction [71]. This framework comprises a cross-view graph encoder to capture asymmetrical relationships, a hierarchical graph encoder to model high-level skill co-evolution trends, and a hyper-decoder for skill trend output based on historical demand-supply data. Chen et al. delved into fine-grained skill demand forecasting at the occupation level [83]. They pre-trained a Graph Autoencoder with job descriptions aggregated at initial timestamps, incorporating a specialized contrastive loss for sparse data and a composite of Tweedie loss and ranking loss for imbalanced demand distribution. Additionally, they proposed an efficient two-step optimization strategy for fine-tuning the Dynamic Graph Autoencoder, facilitating accurate prediction of future occupational skill demand.

Evaluation. Skill demand forecasting evaluation typically employs three primary categories of metrics: classification metrics, ranking metrics, and regression metrics.

  • Classification Metrics: Chao et al. cast the demand and supply forecasting task as a classification problem to anticipate future skill demand trends [71]. They evaluated their model using widely accepted classification metrics, such as ACC, F1, and AUC.

  • Ranking Metrics: Chen et al. predicted the skill demand at the occupational level and framed it as a dynamic graph prediction task [83]. Furthermore, they decomposed this task into a dynamic link prediction task and a dynamic edge regression task. The model’s effectiveness in dynamic link prediction was evaluated with ranking metrics, such as NDCG and MRR.

  • Regression Metrics: Macedo et al. [105], Wolf et al. [417], and Chen et al. [83] quantified the skill demand and predicted the demand value of each skill. Thus, they utilized common regression metrics, such as RMSE and MAE, to assess the accuracy of demand value forecasting.

◇ Takeaway.

  • Advantages of AI technology: (1) Thanks to the advancements in AI technology, particularly in time-series learning, significant strides have been made in time-series forecasting tasks. By leveraging time-series learning models like MLP and RNN, it is now possible to accurately capture changes in skill demand and forecast future skill demand over extended periods, spanning months or even longer. (2) Traditional statistical methods often struggle to capture the intricate interrelationships among various skills. In contrast, deep learning methods excel at integrating diverse skill features, enabling the establishment of comprehensive frameworks to comprehend demand series across multiple skills under varying circumstances.

  • Limitations and future directions: (1) The current research largely overlooks the influence of geographical factors on skill demand. Yet, distinct cities possess unique industrial structures, and the evolution of their industries varies significantly. Consequently, skill demand exhibits considerable variation across cities. It is imperative to devote greater attention to forecasting skill demand in different cities, recognizing the importance of local industrial dynamics in shaping skill requirements. (2) The current skill demand forecasting systems suffer from a deficiency in constructing labeled data, and there is a notable absence of discussion regarding skill word extraction methods in articles investigating skill demand prediction. Consequently, the extracted data is prone to biases. Future endeavors should aim to more systematically integrate the tasks of skill word extraction and skill demand prediction. This integration can mitigate data biases and reduce prediction inaccuracies stemming from biased environmental data.

5.3.3 Skill Value Estimation

In recent times, the evaluation of skills has garnered significant attention from researchers and companies alike. This task holds importance not only for companies seeking to identify and retain top talent but also for individuals aiming to proactively acquire essential skills for their desired career path. Formally, the task of estimating the value of skills can be defined as follows:

Definition 5.5 (Skills Value Estimation)

Given a set of job postings $\mathcal{P}=\{(J_{i},S_{i},(D_{i},T_{i})^{*}) \mid i=1,2,\ldots\}$, where $*$ indicates optional information, the goal of skills value estimation is to measure the value of each skill $S_{i}$.

Rahman devised a method to estimate individual worker skills from the outcomes of team-based tasks [338]. They formulated skill aggregation functions to estimate the skills of the workers involved in such tasks and solved these functions with an efficient heuristic based on hill climbing. Sun et al. proposed an enhanced neural network with a cooperative structure, the Salary-Skill Composition Network (SSCN), for separating job skills and measuring their value from massive job postings [382]. Figure 13 shows an overview of the workflow. In detail, this method contains two main modules. The first is a Context-aware Skill Valuation Network (CSVN), which dynamically models the skills, extracts the context-skill interaction, and estimates the context-aware skill value. The second is the Attentive Skill Domination Network (ASDN), which extracts an influence representation for each skill from the skill graph to model how skills dominate one another. The resulting skill values can help companies formulate talent strategies. Stephany et al. argued for the importance of complementarity in skill estimation [378]. They assigned interpretable market values to individual skills, measuring their worth through a linear regression approach. Additionally, they extended the theory of skill complementarity, constructing a skill network based on the characteristics of each skill’s closest neighbors. Finally, they computed the value of complements as the average premium of the three most adjacent skills.
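
The last step described above, valuing complements as the average premium of the three most adjacent skills, can be illustrated with a minimal sketch. The skill names, premium values, and co-occurrence counts below are hypothetical, and adjacency is approximated here by raw co-occurrence counts, which is an assumption for illustration and not necessarily the network construction used in [378].

```python
from typing import Dict, Tuple


def complement_value(skill: str,
                     premium: Dict[str, float],
                     cooccurrence: Dict[Tuple[str, str], int],
                     k: int = 3) -> float:
    """Average premium of the k skills most adjacent to `skill`.

    Adjacency is approximated by undirected co-occurrence counts
    (an assumption made for this sketch).
    """
    weights: Dict[str, int] = {}
    for (a, b), count in cooccurrence.items():
        if a == skill:
            weights[b] = weights.get(b, 0) + count
        elif b == skill:
            weights[a] = weights.get(a, 0) + count
    # Strongest neighbours first; ties are broken alphabetically.
    nearest = sorted(weights, key=lambda s: (-weights[s], s))[:k]
    if not nearest:
        return 0.0
    return sum(premium[s] for s in nearest) / len(nearest)


# Hypothetical premiums (e.g., regression coefficients) and co-occurrences.
premium = {"python": 4.0, "sql": 2.0, "excel": 1.0, "spark": 5.0, "tableau": 1.5}
cooccurrence = {
    ("python", "sql"): 30,
    ("python", "spark"): 20,
    ("python", "excel"): 5,
    ("tableau", "python"): 10,
    ("sql", "excel"): 12,
}
value = complement_value("python", premium, cooccurrence)
```

In this toy setting the three neighbours of "python" with the highest counts are "sql", "spark", and "tableau", so the complement value is the mean of their premiums.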

Evaluation. Current works on skill valuation aim to quantify skill value either as a deterministic value or as a probability distribution. Therefore, Rahman compared and contrasted different algorithms using Average Absolute Error and Normalized Relative Error [338]. Specifically, Normalized Relative Error is computed as $\frac{\sqrt{\sum_{t}e_{t}\times e_{t}}}{\sqrt{\sum_{t}q^{t}\times q^{t}}}$, where $t$ denotes a task, $q^{t}$ represents the skill value of task $t$, and $e_{t}$ stands for the predicted error. In another study by Sun et al., performance was evaluated using root mean square error (RMSE) and mean absolute error (MAE), both popular metrics for measuring differences between observations and predictions [382]. Stephany et al. introduced a new metric called skill premium to explore how variance in values can be described by proposed features such as supply, demand, and complementarity [378].
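
The error metrics named above can be written down in a few lines. The following is a minimal sketch in plain Python; the function names and the toy values are hypothetical, and the error $e_{t}$ is assumed to be the difference between the predicted and the true skill value of task $t$.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def normalized_relative_error(q_true: Sequence[float], q_pred: Sequence[float]) -> float:
    """sqrt(sum_t e_t^2) / sqrt(sum_t (q^t)^2), with e_t = prediction - truth (assumed)."""
    numerator = math.sqrt(sum((p - q) ** 2 for q, p in zip(q_true, q_pred)))
    denominator = math.sqrt(sum(q ** 2 for q in q_true))
    if denominator == 0:
        raise ValueError("true skill values must not all be zero")
    return numerator / denominator


# Hypothetical skill values for two tasks.
q_true = [3.0, 4.0]
q_pred = [3.0, 1.0]
nre = normalized_relative_error(q_true, q_pred)
```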

\diamondsuit Takeaway.

  • Advantages of AI technology: (1) AI offers efficient processing of extensive job advertisement data sourced from online recruitment platforms, thereby providing researchers with a substantial dataset for analyzing job skill worth. Additionally, AI algorithms streamline data processing and analysis, significantly enhancing efficiency and scalability compared to manual methods. (2) In contrast to conventional survey methods, AI-powered analysis delivers real-time insights into the evolving landscape of job skill value, accurately reflecting current market dynamics and trends. By harnessing AI, predictive models can be developed to anticipate future trends in job skill value, empowering individuals and organizations to proactively adapt to shifting market conditions.

  • Limitations and future directions: Various studies have approached skill valuation from different perspectives. Rahman quantified the skill value of individual workers within a teamwork context [338], Sun et al. employed salary prediction as a collaborative task to address the salary-skill value composition issue [382], and Stephany et al. aimed to explore the significance of complementarity in skill estimation [378]. Consequently, a unified and standard skill valuation system and corresponding estimation method represent valuable directions for future research.

5.4 Brand Analysis

Brands are among the most precious assets of a company; they reflect its attractiveness to talent in terms of work and innovation, as well as the corporate image held by employees and the public. It is crucial for corporations to manage brands as a strategic talent tool to keep up with the continuously changing business world. How to formulate a strategy for improving the brand is attracting increasing attention in the area of talent management. Traditionally, approaches to brand analysis have mainly depended on surveys and interviews with expert knowledge. For example, Ambler et al. interviewed respondents from several companies about the relevance of branding to HRM [17]. Arasanmi et al. used an online survey to collect data and analyzed the relationships between employer branding, job designs, and employee performance using statistical methods [23]. Fatma et al. collected data through a social survey and analyzed the impact of Corporate Social Responsibility (CSR) on corporate brand equity [125].

Figure 14: The overview of brand analysis.

Recently, with the development of the Internet and online professional social networks, a large amount of company review data and various public data related to companies, e.g., online reviews, news, and Twitter posts, can be collected. These data provide new perspectives and opportunities for more comprehensive company brand analysis. However, traditional methods struggle to analyze massive unstructured text data, whereas rapidly developing AI technology provides suitable methods for this kind of data. In particular, the topic model method [15, 194], which can cluster the latent semantic structure of a corpus in an unsupervised manner, is well suited to semantic analysis and text mining for brand analysis [294]. Figure 14 presents an overview of works related to brand analysis.

5.4.1 Company Profiling

Company profiling is an analytical task that aims to understand the fundamental characteristics of companies. AI-driven approaches provide an opportunity to profile companies from abundant and varied online employment data.

Employer Branding. Employer branding aims to understand an employer’s unique characteristics in order to identify its competitive edges. In [246], Lin et al. proposed CPCTR, a Bayesian model that combines topic modeling with matrix factorization to obtain company profiles from online job and company reviews. In detail, CPCTR groups reviews by job position and company and denotes two word lists as $\{w^{P}_{n,j,e}\}^{N}_{n=1}$ and $\{w^{C}_{m,j,e}\}^{M}_{m=1}$ to represent the positive and negative opinions for a specific job position $j$ and company $e$. It then formulates a joint optimization framework for learning the latent patterns $v_{j,e}$ of companies with different jobs, which leads to a more comprehensive interpretation of company profiling and provides a collaborative view of opinion modeling. Subsequently, in [245], they provided a Gaussian process–based extension, GPCTR, which can capture the complex correlation among heterogeneous information and improve the profiling performance. Bajpai et al. provided a hybrid algorithm, which works as an ensemble of unsupervised and machine learning approaches, for company profiling from online company review data [31]. First, this work uses CNN and Doc2Vec to extract the important opinion aspects from reviews. 
Then they combined universal dependent modifiers and sentiment dictionaries to assign polarity to each aspect of each company. If this step fails to assign a score to an aspect, an ELM model is used to predict the polarity. Finally, each company can be embedded in an $n$-dimensional representation space, where $n$ is the number of aspects. Reis delved into the correlation between employer branding and talent management, showcasing employer branding’s potential to attract and retain top-tier employees within organizations [342]. Spears et al. investigated the long-term impact of public opinion on company earnings, utilizing data extracted from news and social media alongside earnings reports [376]. They employed a Markov switching model to quantify the relationship between adverse publicity and company finances, offering insights to enhance brand impact. Yuan et al. introduced the Self-Supervised Prototype Representation Learning (SePaL) framework for dynamic corporate profiling [454]. This involved inferring initial cluster distributions of noise-resistant event prototypes and utilizing self-supervision signals for representation learning, leading to improved predictions of stock price spikes and evaluations of corporate default risk. Bose et al. compiled a vast dataset of corporate reviews and applied an ensemble approach to sentiment analysis, facilitating better insights for customers in selecting businesses [54]. Savin et al. proposed an expert-bias-free classification of startup companies, encompassing 38 topics and quantifying their relevance for each startup [353]. Their analysis of industry and topic distributions provides valuable insights for entrepreneurs.
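
The idea of embedding a company in an $n$-dimensional aspect space, as in the aspect-level pipeline described above, can be sketched as follows. This is only an illustration under simplifying assumptions: the aspect inventory and the (aspect, polarity) pairs are hypothetical and assumed to be already extracted, and a constant fallback stands in for the learned polarity predictor (the ELM model in the cited work).

```python
from collections import defaultdict
from typing import List, Sequence, Tuple

# Hypothetical aspect inventory; n is the number of aspects.
ASPECTS = ["salary", "culture", "management"]


def company_vector(opinions: Sequence[Tuple[str, float]],
                   aspects: Sequence[str] = ASPECTS,
                   fallback: float = 0.0) -> List[float]:
    """Mean polarity per aspect, giving one coordinate per aspect.

    `fallback` is a placeholder for a learned predictor that would
    score aspects with no dictionary-based polarity (an assumption).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for aspect, polarity in opinions:
        if aspect in aspects:
            sums[aspect] += polarity
            counts[aspect] += 1
    return [sums[a] / counts[a] if counts[a] else fallback for a in aspects]


# Hypothetical extracted opinions for one company.
opinions = [("salary", 1.0), ("salary", -1.0), ("culture", 1.0)]
vector = company_vector(opinions)
```

Companies represented this way can be compared with ordinary vector distances, which is what makes the aspect space convenient for profiling.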

Corporate Social Responsibility Communication. Corporate social responsibility (CSR) stands as a cornerstone in both industry practices and academic discourse. Companies are compelled to devise robust CSR communication strategies to fortify their legitimacy and reputation. Numerous studies have been dedicated to unraveling this topic. Chae et al. scrutinized CSR through Twitter posts, employing the Structural Topic Model algorithm to unveil correlations between diverse responsibility topics and their sequential trends [68]. Their findings underscore the significance of CSR and corporate reputation in bolstering brand equity. Mazza delved into the evolution of CSR communication in the post-COVID-19 era using topic modeling techniques [282]. Pilgrim conducted a comprehensive review of social media mining methodologies, identifying four key approaches—topic models, network analysis, sentiment analysis, and regression analysis—to elucidate relevant CSR topics [328]. Thakur underscored the pivotal role of CSR discussions on social media platforms, offering practical recommendations for firms and CEOs based on critical insights derived from these discussions [388].

Evaluation. Generally, company profiling encompasses two distinct evaluation methods tailored to different analysis techniques. One approach involves quantitative analysis, utilizing regression metrics as evaluation criteria. The other revolves around qualitative analysis, employing classification metrics for measurement.

  • Regression Metrics: Lin et al. utilized two commonly employed metrics, namely RMSE and MAE, as measures for Topic Regression [246, 245].

  • Classification Metrics: Bajpai et al. employed Accuracy and Macro F1-score to assess the performance of aspect-level sentiment analysis [31]. Yuan et al. utilized Precision, Recall, and F1-score, all widely adopted metrics, for both stock price spike prediction and corporate default risk evaluation [454].
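
For reference, the classification metrics listed above can be computed as in the following minimal sketch. The labels are hypothetical, and classes with no predicted or no true instances are assigned a score of zero, which is one common convention and an assumption here.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[str], y_pred: Sequence[str],
                        label: str) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for a single class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(precision_recall_f1(y_true, y_pred, l)[2] for l in labels) / len(labels)


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical sentiment labels.
y_true = ["pos", "pos", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg"]
score = macro_f1(y_true, y_pred)
```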

\diamondsuit Takeaway.

  • Advantages of AI technology: (1) Data Accessibility: AI can scrape and analyze vast amounts of data from online sources, including employee reviews and ratings on platforms like Glassdoor, Indeed, or LinkedIn. This provides access to information about companies that may not be readily available through traditional financial reports. (2) Real-time Insights: AI algorithms can process and analyze data in real-time, providing up-to-date insights into a company’s reputation, culture, and employee satisfaction. This real-time aspect is particularly valuable in today’s fast-paced business environment where conditions and perceptions can change rapidly.

  • Limitations and future directions: (1) Current research efforts predominantly focus on utilizing textual and numerical data, such as reviews and salary information, to assess employer branding. However, they often overlook the dynamic evolution of companies, leading to an inability to adapt to the fast-paced labor market changes. Looking forward, integrating considerations of the dynamic development of the labor market will facilitate the construction of a timely framework for company profiling. This approach will enable a more nuanced understanding of companies’ positioning and performance in response to the evolving demands of the labor market. (2) Existing methodologies for company profiling that leverage social media information often overlook the reliability of these data sources. Given the prevalence of potential misinformation within these platforms, there is a significant risk to the accuracy and trustworthiness of the derived insights. The field of web information mining, which has long addressed challenges related to fake news [367, 483], offers promising techniques that could enhance AI-based company profiling systems. Incorporating these advanced detection methods to filter and verify social media content represents a critical future direction for improving the robustness and reliability of company profiling.

5.4.2 Employee Sentiment Analysis

The employee sentiment analysis task focuses on the analysis of a company’s public reviews, especially those written by employees. Employee sentiment analysis holds significant importance in the operations of firms and organizations worldwide due to its impact on employee turnover and customer satisfaction. Such reviews provide feedback on the company’s talent strategy, which is important for iterating on that strategy. As a widely used text analysis model, the topic model is an appropriate method for the employee sentiment analysis task.

For example, Moniz et al. proposed an aspect-sentiment model based on the LDA approach for analyzing company reviews [294]. This approach identifies salient aspects in company reviews, from which the authors manually inferred one latent topic that appears to be related to the firm’s vision. Then, they combined the satisfaction topic information of company reviews with existing methods for earnings prediction. According to the results, employee satisfaction is important for firm earnings. Ikoro et al. proposed a lexicon-based sentiment analysis method for analyzing the public opinion of corporate brands [188]. In detail, they combined two sentiment lexicons, extracted two levels of sentiment terms, and collected over 60,000 tweets covering nine companies from Twitter. LDA was then deployed to discover the sentiment topics. Chae et al. analyzed CSR based on Twitter posts and applied the Structural Topic Model algorithm to discover the correlation between different responsibility topics and the sequential trend of topics [68]. The results also show that the CSR and the corporate reputation of a firm are important to its brand equity. In addition, some authors explored other machine learning methods for brand analysis. For example, Spears et al. investigated the impact of public opinion on companies’ earnings over time [376]. The public opinions were extracted from news and social media, and the earnings were collected from earnings reports. Then, they used a Markov switching model to quantify the relationship between the impact of bad publicity and the finances of companies, which can guide companies in building a strong brand. Gaye et al. employed TF-IDF, bag of words, and global vectors to extract three features and proposed a hybrid/voting model called Regression Vector-Stochastic Gradient Descent Classifier (RV-SGDC) for sentiment classification [136]. 
Shi et al. investigated factors influencing hotel employee satisfaction and explored different sentiments expressed in online reviews by hotel type (premium versus economy) and employment status (current versus former) [366]. Structural topic modeling (STM) and sentiment analysis were utilized to extract topics influencing employee satisfaction and examine sentiment differences across each topic. Ganga examined employee review literature and provided insights, noting emerging leading topics for satisfaction such as work environment and work-life balance, which are significant factors across various industries [132]. Rehan et al. implemented a purely supervised machine learning approach with two modules to classify employees as satisfied/unsatisfied and proper/improper, respectively [341]. Mouli et al. applied sentiment analysis techniques to a large dataset compiled from Glassdoor, exploring worker sentiments primarily with a Bidirectional Gated Recurrent Unit (BiGRU), which demonstrated superior performance across various metrics [296].
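
Several of the works above rely on TF-IDF features extracted from review text. The following minimal sketch shows one plain variant of that weighting (term frequency normalized by document length, multiplied by the unsmoothed logarithm of the inverse document frequency). The tokenized reviews are hypothetical, and the cited works may use different variants or library implementations.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """TF-IDF weights per document: (count / doc length) * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical tokenized employee reviews.
reviews = [["great", "team"], ["bad", "team"]]
vectors = tfidf(reviews)
```

A term that appears in every review, such as "team" here, receives zero weight, so the resulting features emphasize the words that distinguish one review from another.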

Evaluation. In the realm of employee sentiment analysis, two primary evaluation methods were employed. One approach centered on the quantitative analysis concerning firm earnings derived from sentiment analysis, utilizing regression metrics for measurement. The other approach focused on evaluating employee satisfaction or attitudes, utilizing classification metrics for assessment.

  • Regression Metrics. Moniz et al. employed quantitative analysis to explore the relationship between employee satisfaction and firm earnings, utilizing ordinary least squares regression and RMSE as the measurement [294].

  • Classification Metrics. Gaye et al. [136], Shi et al. [366], Rehan et al. [341] and Mouli et al. [296] assessed the performance of their models using metrics such as accuracy, precision, recall, and F1 score.

\diamondsuit Takeaway.

  • Advantages of AI technology: The advent of AI technology, particularly advancements in topic modeling, has revolutionized opinion extraction from vast amounts of review data. Unlike traditional methods that were susceptible to human bias and relied on time-consuming expert labeling, AI-based methods offer the potential to streamline the process and eliminate subjective influences.

  • Limitations and future directions: The current landscape of employee sentiment analysis relies on traditional natural language processing techniques like topic modeling and BERT. However, recent advancements in LLMs have propelled sentiment analysis forward significantly [220]. Looking ahead, integrating LLMs into employee sentiment analysis represents a more efficient approach for future endeavors.

5.5 Summary

In general, AI-related labor market analysis primarily focuses on four key aspects: talent flow analysis, job analysis, skill analysis, and brand analysis. The research data come from various sources, including OPNs, social media platforms, job postings, and job and company reviews. Talent flow analysis mainly includes the talent flow prediction task and various flow pattern analysis tasks. Job analysis mainly focuses on trends, such as new job trends, demand trends, and topic trends. Skill analysis mainly explores potential skill prediction, skill demand forecasting, and skill valuation. Moreover, brand analysis aims to model the brand and culture of a company, including its CSR communication, and to analyze the sentiment of its employees. Existing AI-driven labor market research has begun to demonstrate its advantages. Compared with the traditional questionnaire survey method, AI-driven techniques can mitigate human bias and obtain more objective conclusions. In addition, AI techniques can make extensive use of large-scale data to mine potential patterns in the labor market, capture dynamic changes therein, and accurately predict future trends. Furthermore, with the rapid development of NLP technology, AI can deeply mine the intrinsic correlations among occupations, skills, and talents from data such as job postings and obtain more accurate analysis conclusions. Despite these remarkable results, AI technology still has many limitations: fine-grained value prediction cannot be achieved on some dynamic prediction tasks, and many works can only predict the extent of trend changes. In addition, many works require label extraction and expert identification based on raw data. These processes still introduce human bias and deviations in data distribution, further affecting the accuracy and fairness of downstream tasks. 
Moreover, research on the labor market is still at an early stage; many advanced and promising AI methods, such as LLMs, can be combined with labor market analysis tasks to improve the intelligence of talent management.

6 Prospects

In the above sections, we reviewed a variety of recent efforts in AI-based talent analytics in human resource management from three different aspects: talent management, organization management, and labor market analysis. Although these efforts have helped enterprises deliver intelligence for effective decision-making and management, some urgent and vital issues remain to be resolved. In this section, we outline some potential research directions toward handling those challenges and fostering further advancements in this field.

Figure 15: The potential research directions in talent analytics.

6.1 Multimodal Talent Analytics

Information about a phenomenon or a process in talent analytics-related scenarios usually comes in different modalities. For instance, we can obtain communication and project collaboration networks in employee collaboration analysis. Indeed, mining the multimodal data in talent analytics can help us enhance the effectiveness of different applications. For example, Hemamou et al. collected the multimodal data in the job interview process, including text, audio, and video, and proposed a hierarchical attention model to achieve the best performance in predicting the hirability of the candidates [172]. Moreover, the utilization of multimodal data to train AI models, rather than solely relying on traditional interviews, offers a more comprehensive understanding of the participants’ personality test results [217]. Recently, multimodal learning has been used to achieve multimodal data representation, translation, alignment, fusion, and co-learning in various domains, such as the commercial, social, and biomedical domains [223, 37]. We can foresee that more multimodal learning methods will gradually be used extensively in talent analytics.

6.2 Talent Knowledge Management

Though AI-based approaches have achieved great success in acquiring and developing talents, relatively few works explore managing those talents’ knowledge with AI technologies, even though knowledge is the primary driving force of the economics of ideas [415]. There is an urgent need for additional AI-based technologies that focus on talent knowledge creation, sharing, utilization, and management. Such efforts are crucial for maximizing the potential of human resources and enhancing organizational productivity. Indeed, we can leverage knowledge graph-related technologies [195] to construct the talent’s knowledge base and achieve efficient knowledge management. Moreover, we can transform the scenarios in talent development, such as knowledge learning and collaboration, into different recommendation scenarios and utilize recommendation algorithms to solve these problems. Recently, Wang et al. developed a personalized online course recommendation system based on employees’ current profiles [402, 401]. However, there is still a lack of algorithms that can recommend heterogeneous knowledge. Existing algorithms take only the individual perspective and have not been analyzed from the organizational perspective, such as organizational knowledge diversity or competitiveness. Furthermore, although AI-based technologies have greatly improved the efficiency of knowledge management, the social and ethical implications (algorithmic discrimination, etc.) that they bring with them are also worthy of continued research [406].

6.3 Market-oriented Talent Analytics

AI technology has been effectively applied in labor market analytics [464]. However, those approaches mainly focus on the perspective of global market analysis and have not explored how the changing environment of the labor market affects internal talent management or organization management. In fact, combining macro and micro data in talent analytics is a vital research direction [183]. With the recent accumulation of internal and external data, there exists an exceptional opportunity to implement market-oriented talent analytics. For instance, Hang et al. leveraged job posting data to capture the skill-specific popularity of employees in external markets and thereby achieved more accurate employee turnover prediction based on the market trend [165]. Moreover, the rapid development of AI technologies provides an excellent technical foundation for this direction. We can utilize multi-task learning [471] to jointly learn both macro and micro talent analytics-related tasks. Heterogeneous graph learning [462, 165] can help us effectively model the correlation of macro and micro data.

Besides, we can use AI technologies to identify various types of talents in the market. Identifying high-potential employees has always been an important issue in enterprises. Ye et al. [447] proposed a neural network-based dynamic social profiling approach for quantitatively identifying high-potential talents. Cheng et al. [84] conducted a quasi-field experiment and found that AI outperforms humans in the identification of high-potential talents. In the labor market, accurate recruitment of digital talents is crucial for the digital transformation of enterprises. Harrigan et al. [166] found a specific relationship between enterprise talent investment and digitization by identifying digital skills-related talents within organizations. By collecting data on workers, Goos et al. [148] found that people in regular jobs had a harder time finding jobs in new factories than digital talents did. Babina et al. [30] classified different categories of AI skill-related talents based on job postings on recruitment platforms and analyzed the impact of these talents on the development of the enterprises they joined. Kim et al. [211] proposed a dynamic co-occurrence method, which dynamically calculates the AI relevance of various types of skills in the labor market, so as to identify AI talents more accurately. These studies show that AI has a great impact on human resource management. In particular, as enterprises pay more and more attention to environmental protection and social responsibility, AI has broad prospects in the identification of green talents and Environmental, Social and Governance (ESG)-related talents. The intersection of AI and human resource management still calls for large-scale research in the future.
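
The co-occurrence idea mentioned above for measuring the AI relevance of skills can be illustrated with a simple sketch. It is not the method of [211]: here, as an assumption, the relevance of a skill in a given period is the share of postings containing that skill that also contain at least one other skill from a hypothetical seed list of core AI skills.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

Posting = Tuple[int, FrozenSet[str]]  # (period, skills listed in the posting)


def ai_relevance(postings: List[Posting],
                 seed_skills: Set[str]) -> Dict[Tuple[int, str], float]:
    """Per-period share of a skill's postings that also list a seed AI skill."""
    totals = defaultdict(int)
    joint = defaultdict(int)
    for period, skills in postings:
        for skill in skills:
            totals[(period, skill)] += 1
            # Exclude the skill itself so seed skills are not trivially relevant.
            if skills & (seed_skills - {skill}):
                joint[(period, skill)] += 1
    return {key: joint[key] / totals[key] for key in totals}


# Hypothetical postings and seed list.
seeds = {"machine learning", "deep learning"}
postings = [
    (2020, frozenset({"python", "machine learning"})),
    (2020, frozenset({"python", "excel"})),
    (2021, frozenset({"python", "deep learning"})),
    (2021, frozenset({"python", "machine learning"})),
]
relevance = ai_relevance(postings, seeds)
```

Recomputing the score per period is what makes the measure dynamic in this sketch: the relevance of "python" rises between the two hypothetical years as it co-occurs more often with the seed skills.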

6.4 Organizational Culture Management

In our survey, we reviewed recent advancements in AI techniques for talent analytics in HRM from three perspectives, including talent management, organization management, and labor market analysis. The culture of an organization plays a crucial role in sustaining its effectiveness and viability [151]. Generally, the culture mainly contains three aspects: Mission, Vision, and Values (MVVs), which can help employees understand what is encouraged, discouraged, accepted, or rejected within an organization, and help the organization thrive with a shared purpose. Recent developments reveal that the availability of extensive datasets covering the entire lifecycle of talents and organizations offers opportunities to realize effective culture management. For example, the interconnection between culture and leadership is evident, with exceptional team leaders significantly shaping organizational culture [151]. Some researchers [227] discussed how ML techniques can be used to inform predictive and causal models of leadership effects. Accordingly, they further provided a step-by-step guide on designing studies that combine field experiments to establish causal relationships with maximal predictive power. Meanwhile, several studies analyzed leadership styles with data mining algorithms, demonstrating that different leadership styles significantly influence leadership outcomes [5]. Moreover, some researchers have used text mining to analyze organizational culture. For instance, Schmiedel et al. leveraged online company review data and a topic model to explore employees’ perception of corporate culture [357]. Li et al. [235] applied a topic model to obtain firm-level measures of exposure and response related to COVID-19 for many U.S. firms. 
In detail, they deployed the Correlated Topic Model (CTM) [343], which is similar to Latent Dirichlet Allocation (LDA), with 35 topics to discover the correlation between COVID-19 and the company-level measures. The results show that, despite the large negative impact of COVID-19 on their operations, firms characterized by strong corporate cultures were able to outperform their counterparts lacking this attribute. As the famous saying goes, “Culture eats strategy for breakfast.” Employing AI technologies in organizational culture management will become one of the most critical research directions in the future, as it can help managers address cultural management scientifically.

6.5 Ethical AI in Talent Analytics

Admittedly, AI technologies are increasingly employed in talent analytics, significantly enhancing management efficiency and accuracy. However, there are ongoing concerns about how to ensure that AI technologies adhere to well-defined ethical guidelines regarding fundamental values. Recently, some researchers have made efforts from two perspectives, i.e., fairness and explainability.

6.5.1 Fairness

The importance of fairness in talent analytics cannot be overstated, given its profound impact on employee wellbeing and alignment with organizational values. Although AI technologies have achieved various successes in talent analytics, there is growing concern that such approaches may bring issues of unfairness to people and organizations, as evidenced by some recent reports [228, 101, 50]. For instance, Amazon scrapped its AI-based recruitment system due to its discriminatory outcomes against women [101].

Recent studies have examined the fairness of AI technologies in talent management from different perspectives. For instance, Qin et al. [335] verified that when sensitive features such as gender and age are fed into a person-job fit model without a dedicated fairness design, the model readily learns the bias present in the original data. Intuitively, this problem can be addressed by removing the sensitive features, which is regarded as one of the pre-processing methods for imposing fairness [98]. However, a large amount of unstructured data, such as audio and video data from the interview process, already encodes sensitive attributes. A model can easily infer these attributes from the data, which may still bias AI algorithms [104]. Pena et al. examined multimodal systems that predict recruitable candidates using both image and structured data from resumes [323]. The authors first demonstrated that the deep learning model could reproduce the biases in the training data, even without the sensitive features. To address this, they integrated an adversarial regularizer to remove sensitive information from unstructured data and promote algorithmic fairness [295]. Similarly, Yan et al. leveraged both data balancing and adversarial learning to mitigate bias in multimodal personality assessment [436]. Moreover, several open-source tools, such as AIF360 [42], FairML [4], and Themis-ML [39], can facilitate systematic bias checks and embed fairness in AI algorithms [299]. In addition, AI technologies can also help to reduce human bias in different talent management scenarios. For instance, AI technologies have been applied to detect potentially problematic words in job postings that lead to bias or even legal risks, and to assist employers in writing inclusive job descriptions [275].
Consequently, ensuring the fairness of AI algorithms is becoming an increasingly significant area of research within the field of talent analytics.
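
To make the notion of a systematic bias check concrete, the following minimal sketch computes two common group-fairness measures on the output of a hypothetical screening model: the demographic parity difference and the disparate impact ratio. It is an illustration under stated assumptions, not the procedure of any study or toolkit cited above; the function names, the toy predictions, and the group labels "A" and "B" are all hypothetical.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the fraction of positive (1) predictions for each group."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(rates):
    """Gap between the highest and lowest group selection rates (0 means parity)."""
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(rates):
    """Lowest selection rate divided by the highest (1 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        # Nobody is selected in any group, so the rates are trivially equal.
        return 1.0
    return min(rates.values()) / highest


# Hypothetical screening decisions (1 = shortlisted) and a sensitive attribute.
preds = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

rates = selection_rates(preds, groups)
dp_diff = demographic_parity_difference(rates)
di_ratio = disparate_impact_ratio(rates)
print(rates, dp_diff, di_ratio)
```

A ratio well below 1 indicates that one group is shortlisted far less often than another. Practitioners sometimes compare this ratio against a threshold such as 0.8 (the "four-fifths" rule of thumb), although the appropriate criterion depends on the application and jurisdiction.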

6.5.2 Explainability

Recently, there has been increasing concern among employees and managers regarding the decisions made by black-box AI algorithms. Questions arise regarding the basis of these decisions, the factors behind algorithmic success or failure, and how to rectify any errors that occur. Therefore, research interest in increasing the transparency of AI-based automated decision-making in talent management is re-emerging [187, 470]. For instance, Qin et al. proposed to leverage attention mechanisms to explain the matching degree between the content of a job posting and a resume [334]. Zhang et al. further introduced hierarchical attention and collaborative attention mechanisms to increase the explainability of the person-job fit model at both the structured and unstructured information levels [473]. Upadhyay et al. leveraged knowledge graph and named entity recognition technologies to generate understandable textual explanations for job recommendations [391]. In [207], Kaya et al. focused on constructing an end-to-end system for explainable automatic job candidate screening from video resumes. The authors extracted audio, face, and scene features and leveraged decision trees both to predict whether candidates will be invited to the interview and to explain the decisions by using binarization with a threshold. Liem et al. further handled the job candidate screening problem from an interdisciplinary viewpoint of psychologists and machine learning scientists [243]. Moreover, Juvitayapun et al. utilized a tree-based model to calculate the importance of different features, enhancing the explainability of AI-based turnover prediction [200].
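
As an illustration of how attention weights can serve as an explanation signal, the sketch below applies scaled dot-product attention between one job requirement and several resume items, then ranks the items by weight. It is a toy example and not the architecture of any model cited above: the three-dimensional vectors are hand-made stand-ins for learned embeddings, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical embedding of a single job requirement.
job_requirement = [1.0, 0.0, 1.0]

# Hypothetical embeddings of resume items (assumed, not learned).
resume_items = {
    "Built recommendation models in Python": [0.9, 0.1, 0.8],
    "Managed a retail store": [0.0, 1.0, 0.1],
    "Published research on graph learning": [0.6, 0.0, 0.7],
}

weights = attention_weights(job_requirement, list(resume_items.values()))

# Rank resume items by how strongly the requirement attends to them.
explanation = sorted(zip(resume_items, weights), key=lambda pair: pair[1], reverse=True)
for item, weight in explanation:
    print(f"{weight:.3f}  {item}")
```

The ranked list shows which resume items contributed most to the match for this requirement. In a trained model the weights would come from learned representations rather than fixed vectors, and whether attention weights faithfully reflect model reasoning remains an open question.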

However, current approaches remain focused on AI model design and fail to consider whether employees or managers can easily comprehend the explanatory conclusions provided by the model. Indeed, visual analytics is a natural way to help people who are inexperienced in AI understand the data and the model [87, 16]. Therefore, combining visual analytics with explainable AI to build an intelligent talent management system is a valuable research direction. Additionally, leveraging the extensive interaction data generated by users can facilitate iterative model improvements from various angles, including correcting errors in automated decision-making and enhancing the efficiency of visual information presentation.

6.6 Generative AI in Talent Analytics

With technological advances, artificial intelligence has gradually shifted toward generative AI, and the language in which AI interacts with humans has changed from “machine language” to “natural language” [420]. By employing suitable templates, a wide range of tasks can be addressed through generative AI. Specifically, the knowledge emergence facilitated by large-scale parameters enables generative AI to tackle a series of complex tasks [403]. The advent of generative LLMs has sparked widespread discussion and excitement in both academia and industry. First of all, LLMs have a strong generative ability: they can understand various contexts and generate appropriate content according to the prompts. The following works illustrate this capability. Zinjad et al. propose a resume generation tool based on the natural language understanding and information extraction capabilities of LLMs, which allows users to generate tailored, personalized resumes by providing simple personal information and job information [490]. Ayoobi et al. propose a method for recognizing LLM-generated resumes on online recruitment platforms, which helps address the challenge of detecting fake and LLM-created resumes [28]. Magron et al. construct a job posting dataset for skill matching that contains more implicit skills and longer sentences, and is closer to data from real job platforms. They conduct skill-matching experiments using LLMs and show that the method performs well on real-world data [267]. Second, LLMs facilitate the innovation of traditional AI technologies. For example, in recommender systems, LLMs draw on their high-quality representations of textual features and their extensive coverage of external knowledge to achieve high-quality recommendations [420].
Wu et al. reveal the capability of LLMs to mine graph information by proposing a meta-path prompt constructor that helps LLMs understand the semantics of behavioral graphs; the resulting framework provides personalized job recommendations for job seekers [419]. Du et al. improve LLM-based job recommendation by mining the explicit and implicit features of users on an online recruitment platform and by aligning the unmatched low-quality and high-quality generated resumes via a generative adversarial network, and they verify the effectiveness of this approach [115]. Zheng et al. guide an LLM-based generative job recommendation system with a Supervised Fine-Tuning strategy; the system generates suitable job descriptions from job seekers’ resumes and provides individuals with a more personalized and comprehensive job-seeking experience [479]. Abu-Rasheed et al. developed a group chat method based on knowledge graphs to enhance conversational explainability for students interacting with chatbots in the course recommendation task [3].

Besides, in recent years, LLM-based agents have been utilized to solve various tasks such as software development, social simulation, and policy simulation [157]. Research on intelligent simulation of management-related problems using digital technologies has also been emerging, including the study of member autonomy in organizations using human-machine systems based on blockchain technology [119]; how humans can use AI assistants to work effectively with large language models [339]; and how autonomous agents can utilize the generic capabilities of the underlying model for reasoning, decision-making, and environment interaction [441]. Further, by integrating multiple LLM agents, model efficiency can be improved and simulation effects can be optimized [258]. However, there is little research on using LLM-based agents to simulate organizational behavior. We should use management theory to simulate the framework of an organization and the behavior of individuals within it; such simulations can focus on the trust behavior of inter-organizational personnel in interpersonal interactions, i.e., the willingness to put one’s self-interests at risk based on positive expectations of others [423]. LLM-based agents can be used to generate conversational data for studying the behaviors and capabilities of chat agents [232]. They can query APIs that read and write to web pages, generate content that shapes human behavior, and run system commands as autonomous agents [314]. We can also analyze LLMs’ network formation behavior to examine whether the dynamics of multiple LLMs are similar to or different from human social dynamics [315]. In addition, LLM capabilities can be used to create professional agents with controlled, specialized, interactive, and professional-level capabilities that reshape professional services through evolving expertise [88].
Empirical results show that an agent named QuantAgent is useful in financial research, as it can uncover viable financial signals and enhance the accuracy of financial forecasts [404]. In a study by Yang et al. [438], an agent-based model was developed to simulate the decision-making and interactive behavior of enterprises and employees based on the HRM characteristics of growing enterprises. The study found that firms should pay extra attention to recruitment programs, that changes in workers’ pay gaps are influenced by industry growth and their own capabilities, and that pay cap policies have a positive impact on the development of growth firms in the start-up stage. However, there is still a lack of research on LLM-based agent simulation of cooperative behaviors among employees with different competencies within an organization.

Finally, within an organization, managers use knowledge from inside and outside the organization to make procedural and non-procedural decisions, with the aim of improving the scientific rigor and accuracy of decision-making [221]. To improve efficiency, many organizational managers have introduced AI to assist in decision-making. However, due to the dominant role of humans in decision-making and the lack of autonomy in AI interaction, it is difficult for human knowledge and machine knowledge to work well together [468, 212]. The emergence of generative AI offers hope of solving this problem. Specifically, generative LLMs have the properties of being self-adaptive and self-growing, with non-periodic emergence [92, 12]. That is, generative LLMs become more “proactive” in their interactions with humans. In organizational management, generative LLMs can support the tasks and decisions of individuals, as well as the development planning and work scheduling of the entire organization by organizational managers [57]. For instance, Zheng et al. developed an LLM-based generative job recommendation system to provide individuals with a more personalized and comprehensive job-seeking experience [479]. By imposing prompt-based organizational structures on LLM agents, Guo et al. [158] found that LLMs exhibit leadership and spontaneous cooperative behaviors in their organizations. In the decision-making process, it seems that human dominance has become weaker and AI’s decision-making ability has become stronger, which makes it easier to correct the decision-making bias brought about by human intuition [384]. However, due to the algorithmic black-box characteristics of generative AI, it is difficult for humans to intervene in and predict the content it generates [109]. As a result, in the decision-making process, generative AI may pose the risk of false information and misleading decisions [368, 408].
Therefore, in talent management, how to view AI dialectically and how to use AI effectively to manage and avoid risks is a key topic worthy of future research. Moreover, the applications of existing LLMs in talent analytics still have limitations, such as the accuracy of generated information, interpretability, and interaction forms, which will affect the application of large models in business scenarios.

6.7 The Impact of Emergent Events on Talent Analytics

Emergent events like the COVID-19 pandemic have had a huge impact on businesses and organizations, and there are many management scenarios where AI methods can be applied to deal with these dilemmas effectively. In talent management, first of all, the pandemic has changed the daily tasks of employees in many companies; to help a company adapt to this fluctuation and mitigate the impact of COVID-19, AI methods can be used to predict employee performance [167]. Additionally, the pandemic has taken a toll on numerous markets and industries, causing employees to face high psychological stress, and some may even leave the organization. AI technologies can be used to simulate employees’ stress [134] and to understand the factors that influence employees’ intention to leave the organization [297, 389]. From the perspective of an organization’s leader, it is essential to understand the impact of the pandemic on customers so that an effective response can be taken. Therefore, text-based sentiment analysis of customers is a very effective approach [70, 359]. Lastly, in the labor market, AI methods can effectively use large-scale data to provide researchers with research perspectives and technical support to study the impact of COVID-19 on the labor market [67, 11].

Beyond organizational resilience, the forms of talent analytics are being transformed by COVID-19, because issues that arose during the pandemic, such as the need for social distancing, have prompted us to consider increasing human-computer interaction in the workplace [291]. In the recruitment process, since global cooperation and telecommuting have become frequent, AI technologies such as natural language processing can be used to screen job applicants from global real-time big data [379]. To reduce the risk of disease transmission in densely populated settings, chatbots can be used to communicate with new employees during onboarding, which avoids physical contact and simplifies the onboarding process [413]. Once employees are onboarded, they can also be trained with the help of a robot [397]. Xu et al. found that applying AI-enhanced VR simulators to employee training can reduce human resource costs and increase employee satisfaction, thereby enhancing the competitiveness of the organization [424]. Finally, for managers of organizations, as the temporary workforce of organizations grows in the post-pandemic era, combining temporary workforce management with AI can reduce risks and costs and improve work efficiency and quality [281].

7 Conclusions

Artificial Intelligence (AI)-driven talent analytics represents a potent frontier of innovation and opportunity in today’s competitive and fast-evolving business environment, even with the strong automation capability of LLMs for decision-making. However, as AI-based methods in talent analytics develop, many challenges need to be considered, such as fairness, explainability, data imbalance, and temporal variation. Solving these problems can make the evaluation of AI more effective and make AI-based decisions more accurate and usable. This survey provides a comprehensive exploration of the recent advancements in this domain. Specifically, we first delineated a detailed taxonomy of pertinent data, establishing a critical foundation for the utilization of AI techniques to understand talents, organizations, and management better. Subsequently, we illustrated the research efforts on AI techniques for talent analytics, parsed into three critical aspects: talent management, organization management, and labor market analysis. Finally, we summarized the open challenges and potential prospects for future research directions within the AI-driven talent analytics sphere. The primary intent of this survey is to give readers a thorough understanding of recent efforts in this emergent field, thereby fostering insights into the dynamic intersection of AI and talent analytics.

References

  • [1] S. M. Abbasi and K. W. Hollman, “Turnover: The real bottom line,” Public personnel management, vol. 29, no. 3, pp. 333–342, 2000.
  • [2] E. Abdollahnejad, M. Kalman, and B. H. Far, “A deep learning bert-based approach to person-job fit in talent recruitment,” in 2021 International Conference on Computational Science and Computational Intelligence (CSCI).   IEEE, 2021, pp. 98–104.
  • [3] H. Abu-Rasheed, M. H. Abdulsalam, C. Weber, and M. Fathi, “Supporting student decisions on learning recommendations: An llm-based chatbot with knowledge graph contextualization for conversational explainability and mentoring,” arXiv preprint arXiv:2401.08517, 2024.
  • [4] J. A. Adebayo et al., “Fairml: Toolbox for diagnosing bias in predictive modeling,” Ph.D. dissertation, Massachusetts Institute of Technology, 2016.
  • [5] W. Ahmad, M. Akhtaruzamman, U. Zahra, C. Ohri, and B. Ramakrishnan, “Investigation on the impact of leadership styles using data mining techniques,” Leadership, 2018.
  • [6] P. Ajit, “Prediction of employee turnover in organizations using machine learning algorithms,” algorithms, vol. 4, no. 5, p. C5, 2016.
  • [7] M. A. Akasheh, E. F. Malik, O. Hujran, and N. Zaki, “A decade of research on machine learning techniques for predicting employee turnover: A systematic literature review,” Expert Syst. Appl., vol. 238, no. PE, feb 2024. [Online]. Available: https://doi.org/10.1016/j.eswa.2023.121794
  • [8] T. M. Akhriza, L. D. Adistia et al., “Constructing recommendation about skills combinations frequently sought in it industries based on apriori algorithm,” in Mathematics, Informatics, Science, and Education International Conference (MISEIC 2019).   Atlantis Press, 2019, pp. 24–28.
  • [9] S. T. Al-Otaibi and M. Ykhlef, “A survey of job recommender systems,” International Journal of Physical Sciences, vol. 7, no. 29, pp. 5127–5142, 2012.
  • [10] D. Alao and A. Adeyemo, “Analyzing employee attrition using decision tree algorithms,” Computing, Information Systems, Development Informatics and Allied Research Journal, vol. 4, no. 1, pp. 17–28, 2013.
  • [11] A. A. Alaql, F. Alqurashi, and R. Mehmood, “Multi-generational labour markets: data-driven discovery of multi-perspective system parameters using machine learning,” Science Progress, vol. 106, no. 4, p. 00368504231213788, 2023.
  • [12] I. L. Alberts, L. Mercolli, T. Pyka, G. Prenosil, K. Shi, A. Rominger, and A. Afshar-Oromieh, “Large language models (llm) and chatgpt: what will the impact on nuclear medicine be?” European journal of nuclear medicine and molecular imaging, vol. 50, no. 6, pp. 1549–1552, 2023.
  • [13] H. E. Aldrich and P. H. Kim, “Small worlds, infinite possibilities? how social networks affect entrepreneurial team formation and search,” Strategic Entrepreneurship Journal, vol. 1, no. 1-2, pp. 147–165, 2007.
  • [14] V. Aleksić, M. Brems, A. Mathes, and T. Bertele, “Lexicon-driven automatic sentence generation for the skills section in a job posting,” in Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 2023, pp. 32–40.
  • [15] R. Alghamdi and K. Alfalqi, “A survey of topic modeling in text mining,” Int. J. Adv. Comput. Sci. Appl.(IJACSA), vol. 6, no. 1, 2015.
  • [16] G. Alicioglu and B. Sun, “A survey of visual analytics for explainable artificial intelligence methods,” Computers & Graphics, vol. 102, pp. 502–520, 2022.
  • [17] T. Ambler and S. Barrow, “The employer brand,” Journal of brand management, vol. 4, no. 3, pp. 185–206, 1996.
  • [18] A. Anagnostopoulos, C. Castillo, A. Fazzone, S. Leonardi, and E. Terzi, “Algorithms for hiring and outsourcing in the online labor market,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 1109–1118.
  • [19] A. Apatean, E. Szakacs, and M. Tilca, “Machine-learning based application for staff recruiting,” Acta Technica Napocensis, vol. 58, no. 4, pp. 16–21, 2017.
  • [20] S. Appalaraju, B. Jasani, B. U. Kota, Y. Xie, and R. Manmatha, “Docformer: End-to-end transformer for document understanding,” in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 993–1003.
  • [21] S. Aral, E. Brynjolfsson, and L. Wu, “Three-way complementarities: Performance pay, human resource analytics, and information technology,” Management Science, vol. 58, no. 5, pp. 913–931, 2012.
  • [22] N. Arambepola and L. Munasinghe, “What makes job satisfaction in the information technology industry?” in 2021 International Research Conference on Smart Computing and Systems Engineering (SCSE), vol. 4.   IEEE, 2021, pp. 99–105.
  • [23] C. N. Arasanmi and A. Krishna, “Employer branding: perceived organisational support and employee retention–the mediating role of organisational commitment,” Industrial and Commercial Training, 2019.
  • [24] M. Artar, Y. S. Balcioglu, and O. Erdil, “Improving the quality of hires via the use of machine learning and an expansion of the person–environment fit theory,” Management Decision, 2024.
  • [25] J. B. Arthur, “Effects of human resource systems on manufacturing performance and turnover,” Academy of Management journal, vol. 37, no. 3, pp. 670–687, 1994.
  • [26] S. Avlonitis, D. Lavi, M. Mansoury, and D. Graus, “Career path recommendations for long-term income maximization: A reinforcement learning approach,” arXiv preprint arXiv:2309.05391, 2023.
  • [27] C. Ayishathahira, C. Sreejith, and C. Raseek, “Combination of neural networks and conditional random fields for efficient resume parsing,” in 2018 International CET Conference on Control, Communication, and Computing (IC4).   IEEE, 2018, pp. 388–393.
  • [28] N. Ayoobi, S. Shahriar, and A. Mukherjee, “The looming threat of fake and llm-generated linkedin profiles: Challenges and opportunities for detection and prevention,” in Proceedings of the 34th ACM Conference on Hypertext and Social Media, 2023, pp. 1–10.
  • [29] S. H. Azzahra, I. Nurma Yulita, and A. Hidayat, “Text categorization of job vacancy using recurrent neural network method,” in 2021 International Conference on Artificial Intelligence and Big Data Analytics.   Bandung, Indonesia: IEEE, Oct. 2021.
  • [30] T. Babina, A. Fedyk, A. He, and J. Hodson, “Artificial intelligence, firm growth, and product innovation,” Journal of Financial Economics, vol. 151, p. 103745, 2024.
  • [31] R. Bajpai, D. Hazarika, K. Singh, S. Gorantla, E. Cambria, and R. Zimmerman, “Aspect-sentiment embeddings for company profiling and employee opinion mining,” arXiv preprint arXiv:1902.08342, 2019.
  • [32] M. Bakaev and T. Avdeenko, “Intelligent information system to support decision-making based on unstructured web data,” ICIC Express Letters, vol. 9, no. 4, pp. 1017–1023, 2015.
  • [33] G. P. Baker, M. C. Jensen, and K. J. Murphy, “Compensation and incentives: Practice vs. theory,” The journal of Finance, vol. 43, no. 3, pp. 593–616, 1988.
  • [34] J. Balaji, M. Sigdel, P. Hoang, M. Liu, and M. Korayem, “Airesume: Automated generation of resume work history.” in KaRS@ CIKM, 2019, pp. 19–25.
  • [35] B. Balducci and D. Marinova, “Unstructured data in marketing,” Journal of the Academy of Marketing Science, vol. 46, pp. 557–590, 2018.
  • [36] K. S. Ball, “The use of human resource information systems: a survey,” Personnel review, vol. 30, no. 6, pp. 677–693, 2001.
  • [37] T. Baltrušaitis, C. Ahuja, and L.-P. Morency, “Multimodal machine learning: A survey and taxonomy,” IEEE transactions on pattern analysis and machine intelligence, vol. 41, no. 2, pp. 423–443, 2018.
  • [38] B. E. Bannaka, D. G. Dhanasekara, M. Sheena, A. Karunasena, and N. Pemadasa, “Machine learning approach for predicting career suitability, career progression and attrition of it graduates,” in 2021 21st international conference on advances in ict for emerging regions (icter).   IEEE, 2021, pp. 42–48.
  • [39] N. Bantilan, “Themis-ml: A fairness-aware machine learning interface for end-to-end discrimination discovery and mitigation,” Journal of Technology in Human Services, vol. 36, no. 1, pp. 15–30, 2018.
  • [40] G. Barnabò, A. Fazzone, S. Leonardi, and C. Schwiegelshohn, “Algorithms for fair team formation in online labour marketplaces,” in Companion Proceedings of The 2019 World Wide Web Conference, 2019, pp. 484–490.
  • [41] A. Barrak, B. Adams, and A. Zouaq, “Toward a traceable, explainable, and fairjd/resume recommendation system,” arXiv preprint arXiv:2202.08960, 2022.
  • [42] R. K. Bellamy, K. Dey, M. Hind, S. C. Hoffman, S. Houde, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilovic et al., “Ai fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias,” arXiv preprint arXiv:1810.01943, 2018.
  • [43] H. Benbya, S. Pachidi, and S. Jarvenpaa, “Special issue editorial: Artificial intelligence in organizations: Implications for information systems research,” Journal of the Association for Information Systems, vol. 22, no. 2, p. 10, 2021.
  • [44] M. Beresewicz, H. Cherniaiev, A. Mantaj, and R. Pater, “Text analysis of job offers for mismatch of educational characteristics to labour market demands,” Quality & Quantity, vol. 58, no. 2, pp. 1799–1825, 2024.
  • [45] S. Bian, X. Chen, W. X. Zhao, K. Zhou, Y. Hou, Y. Song, T. Zhang, and J.-R. Wen, “Learning to match jobs with resumes from sparse interaction data using multi-view co-teaching network,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 65–74.
  • [46] S. Bian, W. X. Zhao, Y. Song, T. Zhang, and J.-R. Wen, “Domain adaptation for person-job fit with transferable deep global match network,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 4810–4820.
  • [47] G. M. Biancofiore, T. Di Noia, E. Di Sciascio, F. Narducci, and P. Pastore, “Guapp: Enhancing job recommendations with knowledge graphs.” in IIR, 2021.
  • [48] E. Blankmeyer, J. P. LeSage, J. Stutzman, K. J. Knox, and R. K. Pace, “Peer-group dependence in salary benchmarking: a statistical model,” Managerial and Decision Economics, vol. 32, no. 2, pp. 91–104, 2011.
  • [49] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent dirichlet allocation,” Journal of machine Learning research, vol. 3, no. Jan, pp. 993–1022, 2003.
  • [50] M. Bogen, “All the ways hiring algorithms can introduce bias,” Harvard Business Review, vol. 6, p. 2019, 2019.
  • [51] C. Boon, D. N. Den Hartog, and D. P. Lepak, “A systematic review of human resource management systems and their measurement,” Journal of management, vol. 45, no. 6, pp. 2498–2537, 2019.
  • [52] C. Borchers, D. Gala, B. Gilburt, E. Oravkin, W. Bounsi, Y. M. Asano, and H. Kirk, “Looking for a handsome carpenter! debiasing gpt-3 job advertisements,” in Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), 2022, pp. 212–224.
  • [53] F. Borisyuk, L. Zhang, and K. Kenthapadi, “Lijar: A system for job application redistribution towards efficient career marketplace,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1397–1406.
  • [54] A. Bose and N. Khatoon, “Sentiment analysis of It Companies- Features and Feelings Profiling, as well as Data Mining for Employee Input,” Dec. 2022.
  • [55] R. Boselli, M. Cesarini, S. Marrara, F. Mercorio, M. Mezzanzanica, G. Pasi, and M. Viviani, “Wolmis: A labor market intelligence system for classifying web job vacancies,” Journal of intelligent information systems, vol. 51, pp. 477–502, 2018.
  • [56] E. Boštjančič and Z. Slana, “The role of talent management comparing medium-sized and large companies–major challenges in attracting and retaining talented employees,” Frontiers in psychology, vol. 9, p. 396740, 2018.
  • [57] S. G. Bouschery, V. Blazevic, and F. T. Piller, “Augmenting human innovation teams with artificial intelligence: Exploring transformer-based language models,” Journal of Product Innovation Management, vol. 40, no. 2, pp. 139–153, 2023.
  • [58] C. Brancatelli, A. C. Marguerie, and S. Brodmann, “Job creation and demand for skills in kosovo: What can we learn from job portal data?” World Bank Policy Research Working Paper, no. 9266, 2020.
  • [59] R. Broderick and J. W. Boudreau, “Human resource management, information technology, and the competitive edge,” Academy of Management Perspectives, vol. 6, no. 2, pp. 7–17, 1992.
  • [60] A. Bujold, I. Roberge-Maltais, X. Parent-Rocheleau, J. Boasen, S. Sénécal, and P.-M. Léger, “Responsible artificial intelligence in human resources management: a review of the empirical literature,” AI and Ethics, pp. 1–16, 2023.
  • [61] N. Bunker, “January 2022 jolts report: Workers quitting jobs at elevated rates,” https://www.hiringlab.org/2022/03/09/january-2022-jolts-report/, 2022.
  • [62] H. Cao, V. Yang, V. Chen, Y. J. Lee, L. Stone, N. J. Diarrassouba, M. E. Whiting, and M. S. Bernstein, “My team will go on: Differentiating high and low viability teams through team interaction,” Proceedings of the ACM on Human-Computer Interaction, vol. 4, no. CSCW3, pp. 1–27, 2021.
  • [63] P. Cappelli and J. Keller, “Talent management: Conceptual approaches and practical challenges,” Annu. Rev. Organ. Psychol. Organ. Behav., vol. 1, no. 1, pp. 305–331, 2014.
  • [64] A. Cardoso, F. Mourão, and L. Rocha, “The matching scarcity problem: When recommenders do not connect the edges in recruitment services,” Expert Systems with Applications, vol. 175, p. 114764, 2021.
  • [65] S. C. Carr, K. Inkson, and K. Thorn, “From global careers to talent flow: Reinterpreting ‘brain drain’,” Journal of world business, vol. 40, no. 4, pp. 386–398, 2005.
  • [66] W. F. Cascio and H. Aguinis, “Research in industrial and organizational psychology from 1963 to 2007: Changes, choices, and trends.” Journal of Applied Psychology, vol. 93, no. 5, p. 1062, 2008.
  • [67] M. G. Celbiş, P.-h. Wong, K. Kourtit, and P. Nijkamp, “Impacts of the covid-19 outbreak on older-age cohorts in european labor markets: A machine learning exploration of vulnerable groups,” Regional Science Policy & Practice, vol. 15, no. 3, pp. 559–585, 2023.
  • [68] B. K. Chae and E. O. Park, “Corporate social responsibility (csr): A survey of topics and trends using twitter data and topic modeling,” Sustainability, vol. 10, no. 7, p. 2231, 2018.
  • [69] D. Chandola, A. Garg, A. Maurya, and A. Kushwaha, “Online resume parsing system using text analytics,” Journal of Multi-Disciplinary Engineering Technologies, vol. 9, 2015.
  • [70] Y.-C. Chang, C.-H. Ku, and D.-D. Le Nguyen, “Predicting aspect-based sentiment using deep learning and information visualization: The impact of covid-19 on the airline industry,” Information & Management, vol. 59, no. 2, p. 103587, 2022.
  • [71] W. Chao, Z. Qiu, L. Wu, Z. Guo, Z. Zheng, H. Zhu, and H. Liu, “A Cross-View Hierarchical Graph Learning Hypernetwork for Skill Demand-Supply Joint Prediction,” Jan. 2024.
  • [72] M. Chaudhary, L. Gaur, and A. Chakrabarti, “Comparative analysis of entropy weight method and c5 classifier for predicting employee churn,” in 2022 3rd International Conference on Intelligent Engineering and Management (ICIEM).   IEEE, 2022, pp. 232–236.
  • [73] H. Chen, L. Du, Y. Lu, Q. Fu, X. Chen, S. Han, Y. Kang, G. Lu, and Z. Li, “Professional network matters: Connections empower person-job fit,” in Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024, pp. 96–105.
  • [74] J. Chen, L. Gao, and Z. Tang, “Information extraction from resume documents in pdf format,” Electronic Imaging, vol. 28, pp. 1–8, 2016.
  • [75] J. Chen, C. Zhang, and Z. Niu, “A two-step resume information extraction algorithm,” Mathematical Problems in Engineering, vol. 2018, 2018.
  • [76] J. Chen, D. Zhu, X. Shen, X. Li, Z. Liu, P. Zhang, R. Krishnamoorthi, V. Chandra, Y. Xiong, and M. Elhoseiny, “Minigpt-v2: large language model as a unified interface for vision-language multi-task learning,” arXiv preprint arXiv:2310.09478, 2023.
  • [77] K. Chen, M. Niu, and Q. Chen, “A hierarchical reasoning graph neural network for the automatic scoring of answer transcriptions in video job interviews,” arXiv preprint arXiv:2012.11960, 2020.
  • [78] L. Chen, G. Feng, C. W. Leong, B. Lehman, M. Martin-Raugh, H. Kell, C. M. Lee, and S.-Y. Yoon, “Automated scoring of interview videos using doc2vec multimodal feature extraction paradigm,” in Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016, pp. 161–168.
  • [79] L. Chen, K. Zechner, and X. Xi, “Improved pronunciation features for construct-driven assessment of non-native spontaneous speech,” in Proceedings of human language technologies: The 2009 annual conference of the North American chapter of the Association for Computational Linguistics, 2009, pp. 442–449.
  • [80] L. Chen, R. Zhao, C. W. Leong, B. Lehman, G. Feng, and M. E. Hoque, “Automated video interview judgment on a large-sized corpus collected online,” in 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII).   IEEE, 2017, pp. 504–509.
  • [81] M. Chen, C. Wang, C. Qin, T. Xu, J. Ma, E. Chen, and H. Xiong, “A trend-aware investment target recommendation system with heterogeneous graph,” in 2021 International Joint Conference on Neural Networks (IJCNN).   IEEE, 2021, pp. 1–8.
  • [82] X. Chen, Y. Liu, L. Zhang, and K. Kenthapadi, “How linkedin economic graph bonds information and product: applications in linkedin salary,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 120–129.
  • [83] X. Chen, C. Qin, Z. Wang, Y. Cheng, C. Wang, H. Zhu, and H. Xiong, “Pre-dygae: Pre-training enhanced dynamic graph autoencoder for occupational skill demand forecasting.” in IJCAI, 2024.
  • [84] Y. Cheng, X. Tang, X. Zhang, and H. Xiong, “Effectiveness of ai in strategic decision making: An empirical study on identifying high-potential talents,” in Forty-Second International Conference on Information Systems 2021 Proceedings, 2021.
  • [85] Y. Cheng, “Mining and understanding professional social networks: Challenges and solutions,” Ph.D. dissertation, Northwestern University, 2015.
  • [86] Y. Cheng, Y. Xie, Z. Chen, A. Agrawal, A. Choudhary, and S. Guo, “Jobminer: A real-time system for mining job-related patterns from social media,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 2013, pp. 1450–1453.
  • [87] J. Choo and S. Liu, “Visual analytics for explainable deep learning,” IEEE computer graphics and applications, vol. 38, no. 4, pp. 84–92, 2018.
  • [88] Z. Chu, Y. Wang, F. Zhu, L. Yu, L. Li, and J. Gu, “Professional agents–evolving large language models into autonomous experts with human-level competencies,” arXiv preprint arXiv:2402.03628, 2024.
  • [89] Z. Chuang, W. Ming, L. C. Guang, X. Bo, and L. Zhi-qing, “Resume parser: Semi-structured chinese document analysis,” in 2009 WRI world congress on computer science and information engineering, vol. 5.   IEEE, 2009, pp. 12–16.
  • [90] W. Cohen and L. Jensen, “A structured wrapper induction system for extracting information from semi-structured documents,” in Proceedings of the Workshop on Adaptive Text Extraction and Mining (IJCAI’01).   Citeseer, 2001.
  • [91] E. Colombo, F. Mercorio, and M. Mezzanzanica, “Applying machine learning tools on web vacancies for labour market and skill analysis,” Terminator or the Jetsons? The Economics and Policy Implications of Artificial Intelligence, 2018.
  • [92] G. Cooper, “Examining science education in chatgpt: An exploratory study of generative artificial intelligence,” Journal of Science Education and Technology, vol. 32, no. 3, pp. 444–452, 2023.
  • [93] D. C. Corrales, J. C. Corrales, and A. Ledezma, “How to address the data quality issues in regression models: A guided process for data cleaning,” Symmetry, vol. 10, no. 4, p. 99, 2018.
  • [94] P. Dahlbom, N. Siikanen, P. Sajasalo, and M. Jarvenpää, “Big data and hr analytics in the digital era,” Baltic Journal of Management, vol. 15, no. 1, pp. 120–138, 2020.
  • [95] H. Dai, Z. Liu, W. Liao, X. Huang, Y. Cao, Z. Wu, L. Zhao, S. Xu, W. Liu, N. Liu et al., “Auggpt: Leveraging chatgpt for text data augmentation,” arXiv preprint arXiv:2302.13007, 2023.
  • [96] L. Dai, Y. Yin, C. Qin, T. Xu, X. He, E. Chen, and H. Xiong, “Enterprise cooperation and competition analysis with a sign-oriented preference network,” in Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 774–782.
  • [97] Y. Dai, “Studies on content analysis and ordering of courses from a knowledge-based perspective,” 2021.
  • [98] B. d’Alessandro, C. O’Neil, and T. LaGatta, “Conscientious classification: A data scientist’s guide to discrimination-aware classification,” Big data, vol. 5, no. 2, pp. 120–134, 2017.
  • [99] D. R. Dalton, W. D. Todor, M. J. Spendolini, G. J. Fielding, and L. W. Porter, “Organization structure and performance: A critical review,” Academy of management review, vol. 5, no. 1, pp. 49–64, 1980.
  • [100] A. Dashti, S. Samet, and H. Fani, “Effective neural team formation via negative samples,” in Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 3908–3912.
  • [101] J. Dastin, “Amazon scraps secret ai recruiting tool that showed bias against women,” https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G, 2018.
  • [102] S. Datta, P. Mallick, S. Patil, I. Bhattacharya, and G. Palshikar, “Generating an optimal interview question plan using a knowledge graph and integer linear programming,” in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 1996–2005.
  • [103] T. H. Davenport, J. Harris, and J. Shapiro, “Competing on talent analytics,” Harvard business review, vol. 88, no. 10, pp. 52–58, 2010.
  • [104] M. De-Arteaga, A. Romanov, H. Wallach, J. Chayes, C. Borgs, A. Chouldechova, S. Geyik, K. Kenthapadi, and A. T. Kalai, “Bias in bios: A case study of semantic representation bias in a high-stakes setting,” in proceedings of the Conference on Fairness, Accountability, and Transparency, 2019, pp. 120–128.
  • [105] M. M. G. de Macedo, W. Clarke, E. Lucherini, T. Baldwin, D. Q. Neto, R. de Paula, and S. Das, “Practical Skills Demand Forecasting via Representation Learning of Temporal Dynamics,” in Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, Jul. 2022, pp. 285–294.
  • [106] A. De Mauro, M. Greco, M. Grimaldi, and P. Ritala, “Human resources for big data professions: A systematic classification of job roles and required skill sets,” Information Processing & Management, vol. 54, no. 5, pp. 807–817, 2018.
  • [107] C. de Ruijt and S. Bhulai, “Job recommender systems: A review,” arXiv preprint arXiv:2111.13576, 2021.
  • [108] J.-J. Decorte, J. Van Hautte, J. Deleu, C. Develder, and T. Demeester, “Career path prediction using resume representation learning and skill-based matching,” arXiv preprint arXiv:2310.15636, 2023.
  • [109] J. Deng and Y. Lin, “The benefits and challenges of chatgpt: An overview,” Frontiers in Computing and Intelligent Systems, vol. 2, no. 2, pp. 81–83, 2022.
  • [110] S. Deppe, “Ai-based recommendation system for industrial training ki-basierte empfehlungssysteme für die industrielle ausbildung.”
  • [111] G. DeSanctis, “Human resource information systems: A current assessment,” MIS quarterly, pp. 15–27, 1986.
  • [112] G. D. Devi and S. Kamalakkannan, “Prediction of job satisfaction from the employee using ensemble method,” in 2022 International Conference on Advanced Computing Technologies and Applications (ICACTA).   IEEE, 2022, pp. 1–8.
  • [113] R. Dix-Carneiro and B. K. Kovak, “Trade liberalization and the skill premium: A local labor markets approach,” American Economic Review, vol. 105, no. 5, pp. 551–57, 2015.
  • [114] Z. Dong, X. Huang, G. Yuan, H. Zhu, and H. Xiong, “Butterfly-core community search over labeled graphs,” arXiv preprint arXiv:2105.08628, 2021.
  • [115] Y. Du, D. Luo, R. Yan, X. Wang, H. Liu, H. Zhu, Y. Song, and J. Zhang, “Enhancing job recommendation through llm-based generative adversarial networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 8, 2024, pp. 8363–8371.
  • [116] R. Duncan, “What is the right organization structure? decision tree analysis provides the answer,” Organizational dynamics, vol. 7, no. 3, pp. 59–80, 1979.
  • [117] A. D. Ekawati, “Predictive analytics in employee churn: A systematic literature review,” Journal of Management Information and Decision Sciences, vol. 22, no. 4, pp. 387–397, 2019.
  • [118] A. Elbadrawy and G. Karypis, “Domain-aware grade prediction and top-n course recommendation,” in Proceedings of the 10th ACM conference on recommender systems, 2016, pp. 183–190.
  • [119] E. W. Ellinger, R. W. Gregory, T. Mini, T. Widjaja, and O. Henfridsson, “Skin in the game: The transformational potential of decentralized autonomous organizations,” MIS Quarterly, 2023.
  • [120] A. Elragal, “Erp and big data: The inept couple,” Procedia Technology, vol. 16, pp. 242–249, 2014.
  • [121] E. Faliagka, L. Iliadis, I. Karydis, M. Rigou, S. Sioutas, A. Tsakalidis, and G. Tzimas, “On-line consistent ranking on e-recruitment: seeking the truth behind a well-formed cv,” Artificial Intelligence Review, vol. 42, no. 3, pp. 515–528, 2014.
  • [122] E. Faliagka, A. Tsakalidis, and G. Tzimas, “An integrated e-recruitment system for automated personality mining and applicant ranking,” Internet research, 2012.
  • [123] C. Fang, C. Qin, Q. Zhang, K. Yao, J. Zhang, H. Zhu, F. Zhuang, and H. Xiong, “Recruitpro: A pretrained language model with skill-aware prompt learning for intelligent recruitment,” in Proceedings of the 29th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD-2023), 2023.
  • [124] H. Fani, R. Barzegar, A. Dashti, and M. Saeedi, “A streaming approach to neural team formation training,” in European Conference on Information Retrieval.   Springer, 2024, pp. 325–340.
  • [125] M. Fatma, Z. Rahman, and I. Khan, “Building company reputation and brand equity through csr: the mediating role of trust,” International Journal of Bank Marketing, 2015.
  • [126] T. W. Ferratt, R. Agarwal, C. V. Brown, and J. E. Moore, “It human resource management configurations and it turnover: Theoretical synthesis and empirical analysis,” Information systems research, vol. 16, no. 3, pp. 237–255, 2005.
  • [127] P. Frow and A. Payne, “Customer Relationship Management: A Strategic Perspective,” Journal of business market management, vol. 3, no. 1, pp. 7–27, Feb. 2009.
  • [128] B. Fu, H. Liu, H. Zhao, Y. Zhu, Y. Song, T. Zhang, and Z. Wu, “Market-aware dynamic person-job fit with hierarchical reinforcement learning,” in International Conference on Database Systems for Advanced Applications.   Springer, 2022, pp. 697–705.
  • [129] B. Fu, H. Liu, Y. Zhu, Y. Song, T. Zhang, and Z. Wu, “Beyond matching: Modeling two-sided multi-behavioral sequences for dynamic person-job fit,” in International Conference on Database Systems for Advanced Applications.   Springer, 2021, pp. 359–375.
  • [130] L. Gadár and J. Abonyi, “Explainable prediction of node labels in multilayer networks: a case study of turnover prediction in organizations,” Scientific Reports, vol. 14, no. 1, p. 9036, 2024.
  • [131] S. Gaikwad and N. Bogiri, “Effective and efficient xml duplicate detection using levenshtein distance algorithm.”
  • [132] A. Ganga, “Employees Satisfaction in Different Industries: An Exploratory Review of the Literature,” International Journal of Economics and Management Systems, vol. 07, Jul. 2022.
  • [133] L. S. Ganthi, Y. Nallapaneni, D. Perumalsamy, and K. Mahalingam, “Employee attrition prediction using machine learning algorithms,” in Proceedings of International Conference on Data Science and Applications: ICDSA 2021, Volume 1.   Springer, 2022, pp. 577–596.
  • [134] A. Garlapati, D. R. Krishna, K. Garlapati, and G. Narayanan, “Predicting employees under stress for pre-emptive remediation using machine learning algorithm,” in 2020 International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT).   IEEE, 2020, pp. 315–319.
  • [135] D. Gaucher, J. Friesen, and A. C. Kay, “Evidence that gendered wording in job advertisements exists and sustains gender inequality.” Journal of personality and social psychology, vol. 101, no. 1, p. 109, 2011.
  • [136] B. Gaye, D. Zhang, and A. Wulamu, “Sentiment classification for employees reviews using regression vector-stochastic gradient descent classifier (RV-SGDC),” PeerJ Computer Science, vol. 7, p. e712, Sep. 2021.
  • [137] J. Gelens, J. Hofmans, N. Dries, and R. Pepermans, “Talent management and organisational justice: Employee reactions to high potential identification,” Human Resource Management Journal, vol. 24, no. 2, pp. 159–175, 2014.
  • [138] B. Gerhart, P. M. Wright, G. C. McMahan, and S. A. Snell, “Measurement error in research on human resources and firm performance: how much error is there and how does it influence effect size estimates?” Personnel psychology, vol. 53, no. 4, pp. 803–834, 2000.
  • [139] S. C. Geyik, V. Dialani, M. Meng, and R. Smith, “In-session personalization for talent search,” in Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018, pp. 2107–2115.
  • [140] S. C. Geyik, Q. Guo, B. Hu, C. Ozcaglar, K. Thakkar, X. Wu, and K. Kenthapadi, “Talent search and recommendation systems at linkedin: Practical challenges and lessons learned,” in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018, pp. 1353–1354.
  • [141] P. Ghosh and V. Sadaphal, “Jobrecogpt–explainable job recommendations using llms,” arXiv preprint arXiv:2309.11805, 2023.
  • [142] L. M. Giermindl, F. Strich, O. Christ, U. Leicht-Deobald, and A. Redzepi, “The dark sides of people analytics: reviewing the perils for organisations and employees,” European Journal of Information Systems, vol. 31, no. 3, pp. 410–435, 2022.
  • [143] F. Gilardi, M. Alizadeh, and M. Kubli, “Chatgpt outperforms crowd workers for text-annotation tasks,” Proceedings of the National Academy of Sciences, vol. 120, no. 30, p. e2305016120, 2023.
  • [144] S. Gim and E. T. Im, “A study on predicting employee attrition using machine learning,” in IEEE/ACIS International Conference on Big Data, Cloud Computing, and Data Science Engineering.   Springer, 2022, pp. 55–69.
  • [145] A. Giri, A. Ravikumar, S. Mote, and R. Bharadwaj, “Vritthi-a theoretical framework for it recruitment based on machine learning techniques applied over twitter, linkedin, spoj and github profiles,” in 2016 International Conference on Data Mining and Advanced Computing (SAPIENCE).   IEEE, 2016, pp. 1–7.
  • [146] M. Gollapalli, A.-U. Rahman, A. Osama, A. Alfaify, M. Yassin, and A. Alabdullah, “Data mining and visualization to understand employee attrition and work performance,” in 2023 3rd International Conference on Computing and Information Technology (ICCIT).   IEEE, 2023, pp. 149–154.
  • [147] Z. Gong, Y. Song, T. Zhang, J.-R. Wen, D. Zhao, and R. Yan, “Your career path matters in person-job fit,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 8, 2024, pp. 8427–8435.
  • [148] M. Goos, E. Rademakers, and R. Roettger, “Routine-biased technical change: Individual-level evidence from a plant closure,” Research Policy, vol. 50, no. 7, p. 104002, 2021.
  • [149] A. Group, “Let’s talk: Focused conversation topics to supercharge recruiting success,” 2019.
  • [150] A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp. 855–864.
  • [151] B. Groysberg, J. Lee, J. Price, and J. Y.-J. Cheng, “The leader’s guide to corporate culture,” https://hbr.org/2018/01/the-leaders-guide-to-corporate-culture, 2018.
  • [152] Z. Guan, J.-Q. Yang, Y. Yang, H. Zhu, W. Li, and H. Xiong, “Jobformer: Skill-aware job recommendation with semantic-enhanced transformer,” arXiv preprint arXiv:2404.04313, 2024.
  • [153] A. Guarino, D. Malandrino, F. Marzullo, A. Torre, and R. Zaccagnino, “Adaptive talent journey: Optimization of talents’ growth path within a company via deep q-learning,” Expert Systems with Applications, vol. 209, p. 118302, 2022.
  • [154] P. Guo, K. Xiao, Z. Ye, H. Zhu, and W. Zhu, “Intelligent career planning via stochastic subsampling reinforcement learning,” Scientific Reports, vol. 12, no. 1, pp. 1–16, 2022.
  • [155] P. Guo, K. Xiao, H. Zhu, and Q. Meng, “Preference-constrained career path optimization: An exploration space-aware stochastic model,” in 2023 IEEE International Conference on Data Mining (ICDM).   IEEE, 2023, pp. 120–129.
  • [156] S. Guo, F. Alamudun, and T. Hammond, “Résumatcher: A personalized résumé-job matching system,” Expert Systems with Applications, vol. 60, pp. 169–182, 2016.
  • [157] T. Guo, X. Chen, Y. Wang, R. Chang, S. Pei, N. V. Chawla, O. Wiest, and X. Zhang, “Large language model based multi-agents: A survey of progress and challenges,” arXiv preprint arXiv:2402.01680, 2024.
  • [158] X. Guo, K. Huang, J. Liu, W. Fan, N. Vélez, Q. Wu, H. Wang, T. L. Griffiths, and M. Wang, “Embodied llm agents learn to cooperate in organized teams,” arXiv preprint arXiv:2403.12482, 2024.
  • [159] Z. Guo, H. Liu, L. Zhang, Q. Zhang, H. Zhu, and H. Xiong, “Talent Demand-Supply Joint Prediction with Dynamic Heterogeneous Graph Enhanced Meta-Learning,” in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.   Washington DC USA: ACM, Aug. 2022. [Online]. Available: https://dl.acm.org/doi/10.1145/3534678.3539139
  • [160] P. Gupta, S. F. Fernandes, and M. Jain, “Automation in recruitment: a new frontier,” Journal of Information Technology Teaching Cases, vol. 8, no. 2, pp. 118–125, 2018.
  • [161] V. Ha-Thuc, Y. Xu, S. P. Kanduri, X. Wu, V. Dialani, Y. Yan, A. Gupta, and S. Sinha, “Search by ideal candidates: Next generation of talent search at linkedin,” in Proceedings of the 25th International Conference Companion on World Wide Web, 2016, pp. 195–198.
  • [162] V. Ha-Thuc, Y. Yan, X. Wu, V. Dialani, A. Gupta, and S. Sinha, “From query-by-keyword to query-by-example: Linkedin talent search approach,” in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017, pp. 1737–1745.
  • [163] R. Hamidi Rad, H. Fani, E. Bagheri, M. Kargar, D. Srivastava, and J. Szlichta, “A variational neural architecture for skill-based team formation,” ACM Transactions on Information Systems, vol. 42, no. 1, pp. 1–28, 2023.
  • [164] R. Hamidi Rad, H. Fani, M. Kargar, J. Szlichta, and E. Bagheri, “Learning to form skill-based teams of experts,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 2049–2052.
  • [165] J. Hang, Z. Dong, H. Zhao, X. Song, P. Wang, and H. Zhu, “Outside in: Market-aware heterogeneous graph neural network for employee turnover prediction,” in Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, 2022, pp. 353–362.
  • [166] J. Harrigan, A. Reshef, and F. Toubal, “The march of the techies: Job polarization within and between firms,” Research Policy, vol. 50, no. 7, p. 104008, 2021.
  • [167] M. R. Hasan, R. K. Ray, and F. R. Chowdhury, “Employee performance prediction: An integrated approach of business analytics and machine learning,” Journal of Business and Management Studies, vol. 6, no. 1, pp. 215–219, 2024.
  • [168] M. He, D. Shen, T. Wang, H. Zhao, Z. Zhang, and R. He, “Self-attentional multi-field features representation and interaction learning for person-job fit,” IEEE Transactions on Computational Social Systems, 2021.
  • [169] M. He, T. Wang, Y. Zhu, Y. Chen, F. Yao, and N. Wang, “Finn: Feature interaction neural network for person-job fit,” in 2021 7th International Conference on Big Data and Information Analytics (BigDIA).   IEEE, 2021, pp. 123–130.
  • [170] M. He, X. Zhan, D. Shen, Y. Zhu, H. Zhao, and R. He, “What about your next job? predicting professional career trajectory using neural networks,” in Proceedings of the 2021 4th International Conference on Machine Learning and Machine Intelligence, 2021, pp. 184–189.
  • [171] L. Hemamou, G. Felhi, J.-C. Martin, and C. Clavel, “Slices of attention in asynchronous video job interviews,” in 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII).   IEEE, 2019, pp. 1–7.
  • [172] L. Hemamou, G. Felhi, V. Vandenbussche, J.-C. Martin, and C. Clavel, “Hirenet: A hierarchical attention model for the automatic analysis of asynchronous video job interviews,” in Proceedings of the AAAI conference on artificial intelligence, vol. 33, no. 01, 2019, pp. 573–581.
  • [173] L. Hemamou, A. Guillon, J.-C. Martin, and C. Clavel, “Don’t judge me by my face: An indirect adversarial approach to remove sensitive information from multimodal neural representation in asynchronous job video interviews,” in 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII).   IEEE, 2021, pp. 1–8.
  • [174] ——, “Multimodal hierarchical attention neural network: Looking for candidates behaviour which impact recruiter’s decision,” IEEE Transactions on Affective Computing, vol. 14, no. 2, pp. 969–985, 2021.
  • [175] B. Hershbein and L. B. Kahn, “Do recessions accelerate routine-biased technological change? evidence from vacancy postings,” American Economic Review, vol. 108, no. 7, pp. 1737–72, 2018.
  • [176] B. Hershbein, C. Macaluso, and C. Yeh, “Concentration in us local labor markets: evidence from vacancy and employment data,” Working paper, Tech. Rep., 2018.
  • [177] W. Hong and Y. Wang, “Evaluation of highway construction foreman’s competency based on support vector machine,” in Journal of Physics: Conference Series, vol. 1168, no. 3.   IOP Publishing, 2019, p. 032106.
  • [178] Y. Hou, X. Pan, W. X. Zhao, S. Bian, Y. Song, T. Zhang, and J.-R. Wen, “Leveraging search history for improving person-job fit,” in International Conference on Database Systems for Advanced Applications.   Springer, 2022, pp. 38–54.
  • [179] L. Hough and S. Dilchert, “Personality: Its measurement and validity for employee selection.” 2010.
  • [180] C. Hu, Q. Zhou, and H. Tong, “Genius: A novel solution for subteam replacement with clustering-based graph neural network,” arXiv preprint arXiv:2211.04100, 2022.
  • [181] X. Hu, Y. Cheng, Z. Zheng, Y. Wang, X. Chi, and H. Zhu, “Boss: A bilateral occupational-suitability-aware recommender system for online recruitment,” in Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023, pp. 4146–4155.
  • [182] ——, “Boss: A bilateral occupational-suitability-aware recommender system for online recruitment,” in Proceedings of the 29th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD-2023), 2023.
  • [183] P. Huang and Z. Zhang, “Participation in open knowledge communities and job-hopping,” MIS Quarterly, vol. 40, no. 3, pp. 785–806, 2016.
  • [184] Y. Huang, “Study of college human resources data mining based on the som algorithm,” in 2009 Asia-Pacific Conference on Information Processing, vol. 1.   IEEE, 2009, pp. 324–327.
  • [185] Y. Huang, T. Lv, L. Cui, Y. Lu, and F. Wei, “Layoutlmv3: Pre-training for document ai with unified text and image masking,” in Proceedings of the 30th ACM International Conference on Multimedia, 2022, pp. 4083–4091.
  • [186] C.-C. Hung and E.-P. Lim, “On aggregating salaries of occupations from job post and review data,” IEEE Access, vol. 9, pp. 43 422–43 433, 2021.
  • [187] A. L. Hunkenschroer and C. Luetge, “Ethics of ai-enabled recruiting and selection: A review and research agenda,” Journal of Business Ethics, pp. 1–31, 2022.
  • [188] V. Ikoro, M. Sharmina, K. Malik, and R. Batista-Navarro, “Analyzing sentiments expressed on twitter by uk energy company consumers,” in 2018 Fifth international conference on social networks analysis, management and security (SNAMS).   IEEE, 2018, pp. 95–98.
  • [189] M. Ilwani, G. Nassreddine, and J. Younis, “Machine learning application on employee promotion,” Mesopotamian Journal of Computer Science, vol. 2023, pp. 106–120, 2023.
  • [190] D. J. Jackson, S. C. Carr, M. Edwards, K. Thorn, N. Allfree, J. Hooks, and K. Inkson, “Exploring the dynamics of new zealand’s talent flow.” New Zealand Journal of Psychology, vol. 34, no. 2, 2005.
  • [191] S. E. Jackson and R. S. Schuler, “Understanding human resource management in the context of organizations and their environments,” Annual review of psychology, vol. 46, no. 1, pp. 237–264, 1995.
  • [192] M. A. Jafor, M. A. H. Wadud, K. Nur, and M. M. Rahman, “Employee promotion prediction using improved adaboost machine learning approach,” AIUB Journal of Science and Engineering (AJSE), vol. 22, no. 3, pp. 258–266, 2023.
  • [193] N. Jain, A. Tomar, and P. K. Jana, “A novel scheme for employee churn problem using multi-attribute decision making approach and machine learning,” Journal of Intelligent Information Systems, vol. 56, pp. 279–302, 2021.
  • [194] H. Jelodar, Y. Wang, C. Yuan, X. Feng, X. Jiang, Y. Li, and L. Zhao, “Latent dirichlet allocation (lda) and topic modeling: models, applications, a survey,” Multimedia Tools and Applications, vol. 78, no. 11, pp. 15 169–15 211, 2019.
  • [195] S. Ji, S. Pan, E. Cambria, P. Marttinen, and S. Y. Philip, “A survey on knowledge graphs: Representation, acquisition, and applications,” IEEE Transactions on Neural Networks and Learning Systems, 2021.
  • [196] F. Jiang, C. Qin, J. Zhang, K. Yao, X. Chen, D. Shen, C. Zhu, H. Zhu, and H. Xiong, “Towards efficient resume understanding: A multi-granularity multi-modal pre-training approach,” 2024.
  • [197] J. Jiang, S. Ye, W. Wang, J. Xu, and X. Luo, “Learning effective representations for person-job fit by feature fusion,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 2549–2556.
  • [198] Y. Jin, J. Xie, W. Guo, C. Luo, D. Wu, and R. Wang, “Lstm-crf neural network with gated self attention for chinese ner,” IEEE Access, vol. 7, pp. 136 694–136 703, 2019.
  • [199] C. B. Johnson, M. L. Riggs, and R. G. Downey, “Fun with numbers: Alternative models for predicting salary levels,” Research in Higher Education, vol. 27, no. 4, pp. 349–362, 1987.
  • [200] T. Juvitayapun, “Employee turnover prediction: The impact of employee event features on interpretable machine learning methods,” in 2021 13th international conference on knowledge and smart technology (kst).   IEEE, 2021, pp. 181–185.
  • [201] V. Kabalina and A. Osipova, “Identifying and assessing talent potential for future needs of a company,” Journal of Management Development, vol. 41, no. 3, pp. 147–162, 2022.
  • [202] T. Kang, H. Park, and S. Shin, “Searching, examining, and exploiting in-demand technical (see it) skills using web data mining.” in SEKE, 2020, pp. 424–428.
  • [203] N. Karagampitiya and A. Kirupananda, “Voluntary turnover prediction system for tourism industry with special reference to hotel industry in sri lanka,” in World Conference on Information Systems for Business Management.   Springer, 2023, pp. 447–455.
  • [204] I. Karakatsanis, W. AlKhader, F. MacCrory, A. Alibasic, M. A. Omar, Z. Aung, and W. L. Woon, “Data mining approach to monitoring the requirements of the job market: A case study,” Information Systems, vol. 65, pp. 1–6, 2017.
  • [205] M. Kargar, A. An, and M. Zihayat, “Efficient bi-objective team formation in social networks,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases.   Springer, 2012, pp. 483–498.
  • [206] J. Kaur and A. A. Fink, “Trends and practices in talent analytics,” Society for Human Resource Management (SHRM)-Society for Industrial-Organizational Psychology (SIOP) Science of HR White Paper Series, vol. 20, 2017.
  • [207] H. Kaya, F. Gurpinar, and A. Ali Salah, “Multi-modal score fusion and decision trees for explainable automatic job candidate screening from video cvs,” in Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2017, pp. 1–9.
  • [208] M. Kaya and T. Bogers, “An exploration of sentence-pair classification for algorithmic recruiting,” in Proceedings of the 17th ACM Conference on Recommender Systems, 2023, pp. 1175–1179.
  • [209] ——, “Understanding recruiters’ information seeking behavior in talent search,” in Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, 2023, pp. 14–23.
  • [210] K. Kenthapadi, A. Chudhary, and S. Ambler, “Linkedin salary: A system for secure collection and presentation of structured compensation insights to job seekers,” in 2017 IEEE Symposium on Privacy-Aware Computing (PAC).   IEEE, 2017, pp. 13–24.
  • [211] J. Kim, A. Rai, and Y.-K. Lin, “Ai labor markets: Toward a dynamic skills-based approach to measurement,” in International Conference on Information Systems 2023, 2023.
  • [212] J.-C. Kim and K. Chung, “Knowledge-based hybrid decision model using neural network for nutrition management,” Information Technology and Management, vol. 21, no. 1, pp. 29–39, 2020.
  • [213] J. Klünder, J. Horstmann, and O. Karras, “Identifying the mood of a software development team by analyzing text-based communication in chats with machine learning,” in International Conference on Human-Centred Software Engineering.   Springer, 2020, pp. 133–151.
  • [214] K. Konar, S. Das, and S. Das, “Employee attrition prediction for imbalanced data using genetic algorithm-based parameter optimization of xgb classifier,” in 2023 International Conference on Computer, Electrical & Communication Engineering (ICCECE).   IEEE, 2023, pp. 1–6.
  • [215] Q. Kong, Y. Cai, and Q. Zhu, “The case study for the basic information service of job post resource based on web mining,” in 2012 International Conference on Computer Science and Service System.   IEEE, 2012, pp. 498–501.
  • [216] S. K. Kopparapu, “Automatic extraction of usable information from unstructured resumes to aid search,” in 2010 IEEE International Conference on Progress in Informatics and Computing, vol. 1.   IEEE, 2010, pp. 99–103.
  • [217] A. Koutsoumpis, S. Ghassemi, J. K. Oostrom, D. Holtrop, W. Van Breda, T. Zhang, and R. E. de Vries, “Beyond traditional interviews: Psychometric analysis of asynchronous video interviews for personality and interview performance evaluation using machine learning,” Computers in Human Behavior, vol. 154, p. 108128, 2024.
  • [218] S. Krishnan, D. Haas, M. J. Franklin, and E. Wu, “Towards reliable interactive data cleaning: A user survey and recommendations,” in Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2016, pp. 1–5.
  • [219] H. Kropsu‐Vehkapera, H. Haapasalo, J. Harkonen, and R. Silvola, “Product data management practices in high‐tech companies,” Industrial Management & Data Systems, vol. 109, no. 6, pp. 758–774, Jun. 2009.
  • [220] J. O. Krugmann and J. Hartmann, “Sentiment Analysis in the Age of Generative AI,” Customer Needs and Solutions, vol. 11, no. 1, p. 3, Dec. 2024.
  • [221] W. L. Kuechler and V. Vaishnavi, “So, talk to me: The effect of explicit goals on the comprehension of business process narratives,” MIS Quarterly, pp. 961–979, 2006.
  • [222] D. Kumar, T. Grosz, N. Rekabsaz, E. Greif, and M. Schedl, “Fairness of recommender systems in the recruitment domain: an analysis from technical and legal perspectives,” Frontiers in Big Data, vol. 6, 2023.
  • [223] D. Lahat, T. Adali, and C. Jutten, “Multimodal data fusion: an overview of methods, challenges, and prospects,” Proceedings of the IEEE, vol. 103, no. 9, pp. 1449–1477, 2015.
  • [224] T. Lappas, K. Liu, and E. Terzi, “Finding a team of experts in social networks,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, 2009, pp. 467–476.
  • [225] K. J. Lauver and A. Kristof-Brown, “Distinguishing between employees’ perceptions of person–job and person–organization fit,” Journal of vocational behavior, vol. 59, no. 3, pp. 454–470, 2001.
  • [226] D. Lavi, V. Medentsiy, and D. Graus, “consultantbert: Fine-tuned siamese sentence-bert for matching jobs and job seekers,” arXiv preprint arXiv:2109.06501, 2021.
  • [227] A. Lee, I. Inceoglu, O. Hauser, and M. Greene, “Determining causal relationships in leadership research using machine learning: The powerful synergy of experiments and data science,” The Leadership Quarterly, p. 101426, 2020.
  • [228] M. K. Lee, “Understanding perception of algorithmic decisions: Fairness, trust, and emotion in response to algorithmic management,” Big Data & Society, vol. 5, no. 1, p. 2053951718756684, 2018.
  • [229] U. Leicht-Deobald, T. Busch, C. Schank, A. Weibel, S. Schafheitle, I. Wildhaber, and G. Kasper, “The challenges of algorithm-based hr decision-making for personal integrity,” in Business and the Ethical Implications of Technology.   Springer, 2022, pp. 71–86.
  • [230] F. Leon, M. Gavrilescu, S.-A. Floria, and A. A. Minea, “Hierarchical classification of transversal skills in job advertisements based on sentence embeddings,” Information, vol. 15, no. 3, p. 151, 2024.
  • [231] R. E. Lewis and R. J. Heckman, “Talent management: A critical review,” Human resource management review, vol. 16, no. 2, pp. 139–154, 2006.
  • [232] G. Li, H. Hammoud, H. Itani, D. Khizbullin, and B. Ghanem, “Camel: Communicative agents for ‘mind’ exploration of large language model society,” Advances in Neural Information Processing Systems, vol. 36, 2024.
  • [233] H. Li, Y. Ge, H. Zhu, H. Xiong, and H. Zhao, “Prospecting the career development of talents: A survival analysis perspective,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 917–925.
  • [234] J. Li, Y. Long, T. Wang, D. Shen, and Z. Zhang, “A data-driven method for competency evaluation of personnel,” in Proceedings of the 3rd International Conference on Data Science and Information Technology, 2020, pp. 93–104.
  • [235] K. Li, X. Liu, F. Mai, and T. Zhang, “The role of corporate culture in bad times: Evidence from the covid-19 pandemic,” Journal of Financial and Quantitative Analysis, vol. 56, no. 7, pp. 2545–2583, 2021.
  • [236] L. Li, H. Jing, H. Tong, J. Yang, Q. He, and B.-C. Chen, “Nemo: Next career move prediction with contextual embedding,” in Proceedings of the 26th International Conference on World Wide Web Companion, 2017, pp. 505–513.
  • [237] L. Li, H. Tong, N. Cao, K. Ehrlich, Y.-R. Lin, and N. Buchler, “Replacing the irreplaceable: Fast algorithms for team member recommendation,” in Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 636–646.
  • [238] L. Li, T. Lappas, and R. Liu, “Measuring employer attractiveness in diverse talent markets,” Decision Support Systems, vol. 177, Feb. 2024. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S0167923623001549
  • [239] M. Li, X. Chen, W. Liao, Y. Song, T. Zhang, D. Zhao, and R. Yan, “Ezinterviewer: To improve job interview performance with mock interview generator,” in Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023, pp. 1102–1110.
  • [240] X. Li, H. Shu, Y. Zhai, and Z. Lin, “A method for resume information extraction using bert-bilstm-crf,” in 2021 IEEE 21st International Conference on Communication Technology (ICCT).   IEEE, 2021, pp. 1437–1442.
  • [241] Z. Li, X. Pi, M. Wu, and H. Tong, “Reform: Fast and adaptive solution for subteam replacement,” in 2021 IEEE International Conference on Big Data (Big Data).   IEEE, 2021, pp. 350–358.
  • [242] L. Li, H. Tong, N. Cao, K. Ehrlich, Y.-R. Lin, and N. Buchler, “Enhancing team composition in professional networks: Problem definitions and fast solutions,” IEEE transactions on knowledge and data engineering, vol. 29, no. 3, pp. 613–626, 2016.
  • [243] C. Liem, M. Langer, A. Demetriou, A. M. Hiemstra, A. Sukma Wicaksana, M. P. Born, and C. J. König, “Psychology meets machine learning: Interdisciplinary perspectives on algorithmic job candidate screening,” Explainable and interpretable models in computer vision and machine learning, pp. 197–253, 2018.
  • [244] C.-Y. Lin, “Rouge: A package for automatic evaluation of summaries,” in Text summarization branches out, 2004, pp. 74–81.
  • [245] H. Lin, H. Zhu, J. Wu, Y. Zuo, C. Zhu, and H. Xiong, “Enhancing employer brand evaluation with collaborative topic regression models,” ACM Transactions on Information Systems (TOIS), vol. 38, no. 4, pp. 1–33, 2020.
  • [246] H. Lin, H. Zhu, Y. Zuo, C. Zhu, J. Wu, and H. Xiong, “Collaborative company profiling: Insights from an employee’s perspective,” in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
  • [247] Y. Lin, F. Lin, W. Zeng, J. Xiahou, L. Li, P. Wu, Y. Liu, and C. Miao, “Hierarchical reinforcement learning with dynamic recurrent mechanism for course recommendation,” Knowledge-Based Systems, vol. 244, p. 108546, 2022.
  • [248] LinkedIn, “About linkedin,” https://about.linkedin.com, 2022.
  • [249] B. Liu, Z. Jia, L. Zhao, and W. Kong, “Talent Flow Model Based on Graph Attention Network,” in 2023 4th International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE).   Nanjing, China: IEEE, Aug. 2023, pp. 115–121. [Online]. Available: https://ieeexplore.ieee.org/document/10303067/
  • [250] H. Liu and Y. Ge, “Job and employee embeddings: A joint deep learning approach,” IEEE Transactions on Knowledge and Data Engineering, 2022.
  • [251] H. Liu and H. Motoda, Feature extraction, construction and selection: A data mining perspective.   Springer Science & Business Media, 1998, vol. 453.
  • [252] H. Liu, S. Dai, and H. Jiang, “Application of rough set and support vector machine in competency assessment,” in 2009 Fourth International Conference on Bio-Inspired Computing.   IEEE, 2009, pp. 1–4.
  • [253] J. Liu, J. Huang, T. Wang, L. Xing, and R. He, “A data-driven analysis of employee development based on working expertise,” IEEE Transactions on Computational Social Systems, vol. 8, no. 2, pp. 410–422, 2021.
  • [254] L. Liu, J. Liu, W. Zhang, Z. Chi, W. Shi, and Y. Huang, “Hiring now: A skill-aware multi-attention model for job posting generation,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 3096–3104.
  • [255] L. Liu, W. Zhang, J. Liu, W. Shi, and Y. Huang, “Learning multi-graph neural network for data-driven job skill prediction,” in 2021 International Joint Conference on Neural Networks (IJCNN).   IEEE, 2021, pp. 1–8.
  • [256] Q. Liu, T. Luo, R. Tang, and S. Bressan, “An efficient and truthful pricing mechanism for team formation in crowdsourcing markets,” in 2015 IEEE International Conference on Communications (ICC).   IEEE, 2015, pp. 567–572.
  • [257] Y. Liu, G. Pant, and O. R. Sheng, “Predicting labor market competition: Leveraging interfirm network and employee skills,” Information Systems Research, vol. 31, no. 4, pp. 1443–1466, 2020.
  • [258] Z. Liu, Y. Zhang, P. Li, Y. Liu, and D. Yang, “Dynamic llm-agent network: An llm-agent collaboration framework with agent team optimization,” arXiv preprint arXiv:2310.02170, 2023.
  • [259] Y. Long, J. Liu, M. Fang, T. Wang, and W. Jiang, “Prediction of employee promotion based on personal basic features and post features,” in Proceedings of the International Conference on Data Processing and Applications, 2018, pp. 5–10.
  • [260] A. Lorincz, D. Graus, D. Lavi, and J. L. M. Pereira, “Transfer learning for multilingual vacancy text generation,” in Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), 2022, pp. 207–222.
  • [261] P. G. Lovaglio, M. Mezzanzanica, and E. Colombo, “Comparing time series characteristics of official and web job vacancy data,” Quality & Quantity, vol. 54, no. 1, pp. 85–98, 2020.
  • [262] J. Luo, Y. Liu, and M. Hou, “A novel chinese resume named entity recognition model based on lexical enhancement,” in Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022, pp. 341–346.
  • [263] Y. Luo, H. Zhang, Y. Wen, and X. Zhang, “Resumegan: An optimized deep representation learning framework for talent-job fit via adversarial learning,” in Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 1101–1110.
  • [264] J. Lv, C. Chen, and Z. Liang, “Automated scoring of asynchronous interview videos based on multi-modal window-consistency fusion,” IEEE Transactions on Affective Computing, 2023.
  • [265] H. Ma, Y. Zeng, S. Yang, C. Qin, X. Zhang, and L. Zhang, “A novel computerized adaptive testing framework with decoupled learning selector,” Complex & Intelligent Systems, vol. 9, no. 5, pp. 5555–5566, 2023.
  • [266] L. Ma, “Employee turnover prediction based on machine learning model,” in 2022 5th Asia Conference on Machine Learning and Computing (ACMLC).   IEEE, 2022, pp. 22–27.
  • [267] A. Magron, A. Dai, M. Zhang, S. Montariol, and A. Bosselut, “Jobskape: A framework for generating synthetic job postings to enhance skill matching,” arXiv preprint arXiv:2402.03242, 2024.
  • [268] J. Mahdavimoghaddam, “The Utility of Social Media in Understanding the Future of Work,” in Companion Publication of the 2022 Conference on Computer Supported Cooperative Work and Social Computing.   Virtual Event Taiwan: ACM, Nov. 2022. [Online]. Available: https://dl.acm.org/doi/10.1145/3500868.3561397
  • [269] J. Mahdavimoghaddam, A. Bahuguna, and E. Bagheri, “Exploring the utility of social content for understanding future in-demand skills,” Proceedings of the ACM on Human-Computer Interaction, vol. 6, no. CSCW2, pp. 1–35, 2022.
  • [270] J. Mahdavimoghaddam, N. Krishnaswamy, and E. Bagheri, “On the congruence between online social content and future it skill demand,” Proceedings of the ACM on Human-Computer Interaction, vol. 5, no. CSCW2, pp. 1–27, 2021.
  • [271] A. A. Mahmoud, T. A. Shawabkeh, W. A. Salameh, and I. Al Amro, “Performance predicting in hiring process and performance appraisals using machine learning,” in 2019 10th International Conference on Information and Communication Systems (ICICS).   IEEE, 2019, pp. 110–115.
  • [272] L. Malandri, F. Mercorio, M. Mezzanzanica, and N. Nobani, “Meet-lm: A method for embeddings evaluation for taxonomic data in the labour market,” Computers in Industry, vol. 124, p. 103341, 2021.
  • [273] J. Malinowski, T. Keim, O. Wendt, and T. Weitzel, “Matching people and jobs: A bilateral recommendation approach,” in Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS’06), vol. 6.   IEEE, 2006, pp. 137c–137c.
  • [274] O. Manad, M. Bentounsi, and P. Darmon, “Enhancing talent search by integrating and querying big hr data,” in 2018 IEEE International Conference on Big Data (Big Data).   IEEE, 2018, pp. 4095–4100.
  • [275] G. Mann and C. O’Neil, “Hiring algorithms are not neutral,” https://hbr.org/2016/12/hiring-algorithms-are-not-neutral, 2016.
  • [276] K. Mao, Z. Dou, F. Mo, J. Hou, H. Chen, and H. Qian, “Large language models know your contextual search intent: A prompting framework for conversational search,” arXiv preprint arXiv:2303.06573, 2023.
  • [277] S. Marrara, G. Pasi, M. Viviani, M. Cesarini, F. Mercorio, M. Mezzanzanica, and M. Pappagallo, “A language modelling approach for discovering novel labour market occupations from the web,” in Proceedings of the International Conference on Web Intelligence, 2017, pp. 1026–1034.
  • [278] Y. Mashayekhi, B. Kang, J. Lijffijt, and T. De Bie, “Recon: Reducing congestion in job recommendation using optimal transport,” in Proceedings of the 17th ACM Conference on Recommender Systems, 2023, pp. 696–701.
  • [279] Y. T. Matbouli and S. M. Alghamdi, “Statistical machine learning regression models for salary prediction featuring economy wide activities and occupations,” Information, vol. 13, no. 10, p. 495, 2022.
  • [280] N. Mathys and H. LaVan, “A survey of the human resource information systems (hris) of major companies,” Human Resource Planning, vol. 5, no. 2, pp. 83–90, 1982.
  • [281] M. M. Matonya, “Innovation, artificial intelligence in contingent work-force management,” International Journal of Engineering and Management Sciences, vol. 5, no. 1, pp. 571–590, 2020.
  • [282] R. Mazza, E. Zavarrone, M. Olivieri, and D. Corsaro, “A text mining approach for CSR communication: an explorative analysis of energy firms on Twitter in the post-pandemic era,” Italian Journal of Marketing, vol. 2022, no. 3, pp. 317–340, Sep. 2022.
  • [283] K. McDonald and L. Hite, “Career development: A human resource development perspective,” Routledge, 2015.
  • [284] N. McJames, A. Parnell, and A. O’Shea, “Factors affecting teacher job satisfaction and retention: A causal inference machine learning approach using data from talis 2018,” 2022.
  • [285] E. Medina, “Job satisfaction and employee turnover intention: what does organizational culture have to do with it?” Ph.D. dissertation, Columbia University, 2012.
  • [286] M. A. Menacer, F. B. Hamda, G. Mighri, S. B. Hamidene, and M. Cariou, “An interpretable person-job fitting approach based on classification and ranking,” in Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021), 2021, pp. 130–138.
  • [287] Q. Meng, K. Xiao, D. Shen, H. Zhu, and H. Xiong, “Fine-grained job salary benchmarking with a nonparametric dirichlet process–based latent factor model,” INFORMS Journal on Computing, 2022.
  • [288] Q. Meng, H. Zhu, K. Xiao, and H. Xiong, “Intelligent salary benchmarking for talent recruitment: A holistic matrix factorization approach,” in 2018 IEEE International Conference on Data Mining (ICDM).   IEEE, 2018, pp. 337–346.
  • [289] Q. Meng, H. Zhu, K. Xiao, L. Zhang, and H. Xiong, “A hierarchical career-path-aware neural network for job mobility prediction,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 14–24.
  • [290] V. M. Menon and H. Rahulnath, “A novel approach to evaluate and rank candidates in a recruitment process by estimating emotional intelligence through social media data,” in 2016 International Conference on Next Generation Intelligent Systems (ICNGIS).   IEEE, 2016, pp. 1–6.
  • [291] A. Mer and A. S. Virdi, “Navigating the paradigm shift in hrm practices through the lens of artificial intelligence: A post-pandemic perspective,” The Adoption and Effect of Artificial Intelligence on Human Resources Management, Part A, pp. 123–154, 2023.
  • [292] M. R. Mohamad, F. H. Nasaruddin, S. Hamid, S. Bukhari, and M. T. Ijab, “Predicting employees’ turnover in it industry using classification method with feature selection,” in 2021 International conference on computer science and engineering (IC2SE), vol. 1.   IEEE, 2021, pp. 1–7.
  • [293] S. Mohammad, “Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 english words,” in Proceedings of the 56th annual meeting of the association for computational linguistics (volume 1: Long papers), 2018, pp. 174–184.
  • [294] A. Moniz and F. d. Jong, “Sentiment analysis and the impact of employee satisfaction on firm earnings,” in European conference on information retrieval.   Springer, 2014, pp. 519–527.
  • [295] A. Morales, J. Fierrez, R. Vera-Rodriguez, and R. Tolosana, “Sensitivenets: Learning agnostic representations with application to face images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 6, pp. 2158–2164, 2020.
  • [296] N. Mouli, P. Das, M. B. Muquith, A. Biswas, M. D. Kabir Niloy, M. Faisal Ahmed, and D. Z. Karim, “Unveiling Employee Job Satisfaction: Harnessing Deep Learning for Sentiment Analysis,” in 2023 26th International Conference on Computer and Information Technology (ICCIT).   Cox’s Bazar, Bangladesh: IEEE, Dec. 2023, pp. 1–6.
  • [297] F. Mozaffari, M. Rahimi, H. Yazdani, and B. Sohrabi, “Employee attrition prediction in a pharmaceutical company using both machine learning approach and qualitative data,” Benchmarking: An International Journal, vol. 30, no. 10, pp. 4140–4173, 2023.
  • [298] J. Mu, X. Li, and N. Goodman, “Learning to compress prompts with gist tokens,” Advances in Neural Information Processing Systems, vol. 36, 2024.
  • [299] D. F. Mujtaba and N. R. Mahapatra, “Ethical considerations in ai-based recruitment,” in 2019 IEEE International Symposium on Technology and Society (ISTAS).   IEEE, 2019, pp. 1–7.
  • [300] A. N. Mukherjee, S. Bhattacharyya, and R. Bera, “Role of information technology in human resource management of sme: A study on the use of applicant tracking system,” IBMRD’s Journal of Management & Research, pp. 1–22, 2014.
  • [301] R. Muthyala, S. Wood, Y. Jin, Y. Qin, H. Gao, and A. Rai, “Data-driven job search engine using skills and company attribute filters,” in 2017 IEEE International Conference on Data Mining Workshops (ICDMW).   IEEE, 2017, pp. 199–206.
  • [302] V. Nagadevara, V. Srinivasan, and R. Valk, “Establishing a link between employee turnover and withdrawal behaviours: Application of data mining techniques,” 2008.
  • [303] I. Naim, M. I. Tanveer, D. Gildea, and M. E. Hoque, “Automated prediction and analysis of job interview performance: The role of what you say and how you say it,” in 2015 11th IEEE international conference and workshops on automatic face and gesture recognition (FG), vol. 1.   IEEE, 2015, pp. 1–6.
  • [304] I. Naim, M. I. Tanveer, D. Gildea, and M. E. Hoque, “Automated analysis and prediction of job interview performance,” IEEE Transactions on Affective Computing, vol. 9, no. 2, pp. 191–204, 2016.
  • [305] H. R. Nalbantian and R. A. Guzzo, Workforce Diversity, Internal Labor Market Approach to.   John Wiley & Sons, Ltd, 2015, pp. 1–5.
  • [306] J. Nemec, H. Davoudi, L. Golab, M. Kargar, Y. Lytvyn, P. Mierzejewski, J. Szlichta, and M. Zihayat, “Rw-team: Robust team formation using random walk,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 4759–4763.
  • [307] L. S. Nguyen, D. Frauendorfer, M. S. Mast, and D. Gatica-Perez, “Hire me: Computational inference of hirability in employment interviews based on nonverbal behavior,” IEEE transactions on multimedia, vol. 16, no. 4, pp. 1018–1031, 2014.
  • [308] D. Ntioudis, P. Masa, A. Karakostas, G. Meditskos, S. Vrochidis, and I. Kompatsiaris, “Ontology-based personalized job recommendation framework for migrants and refugees,” Big Data and Cognitive Computing, vol. 6, no. 4, p. 120, 2022.
  • [309] R. J. Oentaryo, X. J. S. Ashok, E.-P. Lim, and P. K. Prasetyo, “On analyzing job hop behavior and talent flow networks,” in 2017 IEEE International Conference on Data Mining Workshops (ICDMW).   IEEE, 2017, pp. 207–214.
  • [310] R. J. Oentaryo, E.-P. Lim, X. J. S. Ashok, P. K. Prasetyo, K. H. Ong, and Z. Q. Lau, “Talent flow analytics in online professional network,” Data Science and Engineering, vol. 3, no. 3, pp. 199–220, 2018.
  • [311] C. Ozcaglar, S. Geyik, B. Schmitz, P. Sharma, A. Shelkovnykov, Y. Ma, and E. Buchanan, “Entity personalized talent search models with tree interaction features,” in The World Wide Web Conference, 2019, pp. 3116–3122.
  • [312] S. Pal, K. Khan, A. K. Singh, S. Ghosh, T. Nayak, G. Palshikar, and I. Bhattacharya, “Weakly supervised context-based interview question generation,” in Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), 2022, pp. 43–53.
  • [313] G. K. Palshikar, K. Sahu, and R. Srivastava, “Ensembles of interesting subgroups for discovering high potential employees,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining.   Springer, 2016, pp. 208–220.
  • [314] A. Pan, E. Jones, M. Jagadeesan, and J. Steinhardt, “Feedback loops with language models drive in-context reward hacking,” arXiv preprint arXiv:2402.06627, 2024.
  • [315] M. Papachristou and Y. Yuan, “Network formation and dynamics among multi-llms,” arXiv preprint arXiv:2402.10659, 2024.
  • [316] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “Bleu: a method for automatic evaluation of machine translation,” in Proceedings of the 40th annual meeting of the Association for Computational Linguistics, 2002, pp. 311–318.
  • [317] A. Parameswaran, P. Venetis, and H. Garcia-Molina, “Recommendation systems with complex constraints: A course recommendation perspective,” ACM Transactions on Information Systems (TOIS), vol. 29, no. 4, pp. 1–33, 2011.
  • [318] J. Park, I. B. Wood, E. Jing, A. Nematzadeh, S. Ghosh, M. D. Conover, and Y.-Y. Ahn, “Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters,” Nature communications, vol. 10, no. 1, pp. 1–10, 2019.
  • [319] F. F. Patacsil and M. Acosta, “Analyzing the relationship between information technology jobs advertised on-line and skills requirements using association rules,” Bulletin of Electrical Engineering and Informatics, vol. 10, no. 5, pp. 2771–2779, 2021.
  • [320] S. Patki, V. Sankhe, M. Jawwad, and N. Mulla, “Personalised employee training,” in 2021 International Conference on Communication information and Computing Technology (ICCICT).   IEEE, 2021, pp. 1–6.
  • [321] S. Pawar, R. Srivastava, and G. K. Palshikar, “Automatic gazette creation for named entity recognition and application to resume processing,” in Proceedings of the 5th ACM COMPUTE Conference: Intelligent & scalable system technologies, 2012, pp. 1–7.
  • [322] V. Peltokorpi, D. G. Allen, and F. Froese, “Organizational embeddedness, turnover intentions, and voluntary turnover: The moderating effects of employee demographic characteristics and value orientations,” Journal of organizational behavior, vol. 36, no. 2, pp. 292–312, 2015.
  • [323] A. Pena, I. Serna, A. Morales, and J. Fierrez, “Bias in multimodal ai: Testbed for fair automatic recruitment,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 28–29.
  • [324] J. W. Pennebaker, M. E. Francis, and R. J. Booth, “Linguistic inquiry and word count: Liwc 2001,” Mahway: Lawrence Erlbaum Associates, vol. 71, no. 2001, p. 2001, 2001.
  • [325] R. Pepermans, D. Vloeberghs, and B. Perkisas, “High potential identification policies: An empirical study among belgian companies,” Journal of Management Development, 2003.
  • [326] B. Perozzi, R. Al-Rfou, and S. Skiena, “Deepwalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 701–710.
  • [327] L. Pham Van, S. Vu Ngoc, and V. Nguyen Van, “Study of information extraction in resume.”   Conference, 2018.
  • [328] K. Pilgrim, J. Koss, and S. Bohnet-Joschko, CSR Communication on Twitter - A Scoping Review on Social Media Mining and Analytic Methods, Jan. 2023.
  • [329] C. O. Porter, D. E. Cordon, and A. E. Barber, “The dynamics of salary negotiations: Effects on applicants’ justice perceptions and recruitment decisions,” International Journal of Conflict Management, 2004.
  • [330] D. S. Pugh, “Organization theory: selected classic readings,” 2007.
  • [331] D. S. Pugh, “Organization theory: Selected readings,” Penguin (Non-Classics), 1971.
  • [332] C. Qin, K. Yao, H. Zhu, T. Xu, D. Shen, E. Chen, and H. Xiong, “Towards automatic job description generation with capability-aware neural networks,” IEEE Transactions on Knowledge and Data Engineering, 2022.
  • [333] C. Qin, H. Zhu, D. Shen, Y. Sun, K. Yao, P. Wang, and H. Xiong, “Automatic skill-oriented question generation and recommendation for intelligent job interviews,” ACM Transactions on Information Systems, vol. 42, no. 1, pp. 1–32, 2023.
  • [334] C. Qin, H. Zhu, T. Xu, C. Zhu, L. Jiang, E. Chen, and H. Xiong, “Enhancing person-job fit for talent recruitment: An ability-aware neural network approach,” in The 41st international ACM SIGIR conference on research & development in information retrieval, 2018, pp. 25–34.
  • [335] C. Qin, H. Zhu, T. Xu, C. Zhu, C. Ma, E. Chen, and H. Xiong, “An enhanced neural network approach to person-job fit in talent recruitment,” ACM Transactions on Information Systems (TOIS), vol. 38, no. 2, pp. 1–33, 2020.
  • [336] C. Qin, H. Zhu, C. Zhu, T. Xu, F. Zhuang, C. Ma, J. Zhang, and H. Xiong, “Duerquiz: A personalized question recommender system for intelligent job interview,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 2165–2173.
  • [337] A. Qutub, A. Al-Mehmadi, M. Al-Hssan, R. Aljohani, and H. S. Alghamdi, “Prediction of employee attrition using machine learning and ensemble methods,” Int. J. Mach. Learn. Comput, vol. 11, no. 2, pp. 110–114, 2021.
  • [338] H. Rahman, S. Thirumuruganathan, S. B. Roy, S. Amer-Yahia, and G. Das, “Worker skill estimation in team-based tasks,” Proceedings of the VLDB Endowment, vol. 8, no. 11, pp. 1142–1153, Jul. 2015.
  • [339] M. A. K. Raiaan, M. S. H. Mukta, K. Fatema, N. M. Fahad, S. Sakib, M. M. J. Mim, J. Ahmad, M. E. Ali, and S. Azam, “A review on large language models: Architectures, applications, taxonomies, open issues and challenges,” IEEE Access, 2024.
  • [340] R. Ramanath, H. Inan, G. Polatkan, B. Hu, Q. Guo, C. Ozcaglar, X. Wu, K. Kenthapadi, and S. C. Geyik, “Towards deep and representation learning for talent search at linkedin,” in Proceedings of the 27th ACM international conference on information and knowledge management, 2018, pp. 2253–2261.
  • [341] M. S. Rehan, F. Rustam, S. Ullah, S. Hussain, A. Mehmood, and G. S. Choi, “Employees reviews classification and evaluation (ERCE) model using supervised machine learning approaches,” Journal of Ambient Intelligence and Humanized Computing, vol. 13, no. 6, pp. 3119–3136, Jun. 2022.
  • [342] I. Reis, M. J. Sousa, and A. Dionísio, “Employer Branding as a Talent Management Tool: A Systematic Literature Revision,” Sustainability, vol. 13, no. 19, p. 10698, Jan. 2021.
  • [343] M. E. Roberts, B. M. Stewart, and E. M. Airoldi, “A model of text for experimentation in the social sciences,” Journal of the American Statistical Association, vol. 111, no. 515, pp. 988–1003, 2016.
  • [344] D. S. Rodrigo and G. S. Ratnayake, “Employee turnover prediction system: With special reference to apparel industry in sri lanka,” in 2021 6th International Conference for Convergence in Technology (I2CT).   IEEE, 2021, pp. 1–9.
  • [345] E. Rosenbaum, “Ibm artificial intelligence can predict with 95% accuracy which workers are about to quit their jobs,” https://www.cnbc.com/2019/04/03/ibm-ai-can-predict-with-95-percent-accuracy-which-employees-will-quit.html, 2019.
  • [346] K. Saha, A. Yousuf, L. Hickman, P. Gupta, L. Tay, and M. De Choudhury, “A social media study on demographic differences in perceived job satisfaction,” Proceedings of the ACM on Human-Computer Interaction, vol. 5, no. CSCW1, pp. 1–29, 2021.
  • [347] K. Şahinbaş, “Employee promotion prediction by using machine learning algorithms for imbalanced dataset,” in 2022 2nd International Conference on Computing and Machine Intelligence (ICMI).   IEEE, 2022, pp. 1–5.
  • [348] Y. Saito and K. Sugiyama, “Job recommendation based on multiple behaviors and explicit preferences,” in 2022 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT).   IEEE, 2022, pp. 1–8.
  • [349] ——, “Multi-behavior job recommendation with dynamic availability,” in Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 2023, pp. 264–271.
  • [350] L. Saleh and S. Abu-Soud, “Predicting jordanian job satisfaction using artificial neural network and decision tree,” in 2021 11th International Conference on Advanced Computer Information Technologies (ACIT).   IEEE, 2021, pp. 735–738.
  • [351] K. Sandhya and D. P. Kumar, “Employee retention by motivation,” Indian Journal of science and technology, vol. 4, no. 12, pp. 1778–1782, 2011.
  • [352] M. Sap, M. C. Prasettio, A. Holtzman, H. Rashkin, and Y. Choi, “Connotation frames of power and agency in modern films,” in Proceedings of the 2017 conference on empirical methods in natural language processing, 2017, pp. 2329–2334.
  • [353] I. Savin, K. Chukavina, and A. Pushkarev, “Topic-based classification and identification of global trends for startup companies,” Small Business Economics, vol. 60, no. 2, pp. 659–689, Feb. 2023.
  • [354] P. R. SB, M. Agnihotri, and D. B. Jayagopi, “Automatic follow-up question generation for asynchronous interviews,” in Proceedings of the Workshop on Intelligent Information Processing and Natural Language Generation, 2020, pp. 10–20.
  • [355] R. Schellingerhout, V. Medentsiy, and M. Marx, “Explainable career path predictions using neural models,” Recommender Systems for Human Resources, vol. 3218, p. 7, 2022.
  • [356] T. Schmader, J. Whitehead, and V. H. Wysocki, “A linguistic comparison of letters of recommendation for male and female chemistry and biochemistry job applicants,” Sex roles, vol. 57, no. 7, pp. 509–514, 2007.
  • [357] T. Schmiedel, O. Müller, and J. vom Brocke, “Topic modeling as a strategy of inquiry in organizational research: A tutorial with an application example on organizational culture,” Organizational Research Methods, vol. 22, no. 4, pp. 941–968, 2019.
  • [358] S. Shafie, P. O. Soek, and W. K. Khai, “Prediction of employee promotion using hybrid sampling method with machine learning architecture,” Malaysian Journal of Computing (MJoC), vol. 8, no. 1, pp. 1264–1286, 2023.
  • [359] A. M. Shah, X. Yan, A. Qayyum, R. A. Naqvi, and S. J. Shah, “Mining topic and sentiment dynamics in physician rating websites during the early wave of the covid-19 pandemic: Machine learning approach,” International Journal of Medical Informatics, vol. 149, p. 104434, 2021.
  • [360] T. Shao, C. Song, J. Zheng, F. Cai, H. Chen et al., “Exploring internal and external interactions for semi-structured multivariate attributes in job-resume matching,” International Journal of Intelligent Systems, vol. 2023, 2023.
  • [361] A. Sharma and J. Bhatnagar, “Talent analytics: A strategic tool for talent management outcomes,” Indian Journal of Industrial Relations, vol. 52, no. 3, pp. 515–527, 2017.
  • [362] E. Shehab, M. Sharp, L. Supramaniam, and T. Spedding, “Enterprise resource planning: An integrative review,” Business Process Management Journal, vol. 10, no. 4, pp. 359–386, Aug. 2004.
  • [363] D. Shen, C. Qin, H. Zhu, T. Xu, E. Chen, and H. Xiong, “Joint representation learning with relation-enhanced topic models for intelligent job interview assessment,” ACM Transactions on Information Systems (TOIS), vol. 40, no. 1, pp. 1–36, 2021.
  • [364] D. Shen, H. Zhu, C. Zhu, T. Xu, C. Ma, and H. Xiong, “A joint learning approach to intelligent job interview assessment,” in IJCAI, vol. 18, 2018, pp. 3542–3548.
  • [365] B. Shi, S. Li, J. Yang, M. E. Kazdagli, and Q. He, “Learning to ask screening questions for job postings,” in Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020, pp. 549–558.
  • [366] X. C. Shi and Z. Chen, “Listening to your employees: analyzing opinions from online reviews of hotel companies,” International Journal of Contemporary Hospitality Management, vol. 33, no. 6, pp. 2091–2116, Jan. 2021.
  • [367] K. Shu, A. Sliva, S. Wang, J. Tang, and H. Liu, “Fake news detection on social media: A data mining perspective,” ACM SIGKDD explorations newsletter, vol. 19, no. 1, pp. 22–36, 2017.
  • [368] A. Shulner-Tal, T. Kuflik, and D. Kliger, “Fairness, explainability and in-between: understanding the impact of different explanation methods on non-expert users’ perceptions of fairness toward an algorithmic system,” Ethics and Information Technology, vol. 24, no. 1, p. 2, 2022.
  • [369] J. K. M. Sia, “University choice: Implications for marketing and positioning,” Education, vol. 3, no. 1, pp. 7–14, 2013.
  • [370] C. Signore, B. Della Piana, and F. Di Vincenzo, “Digital job searching and recruitment platforms: A semi-systematic literature review,” in International Conference in Methodologies and intelligent Systems for Technology Enhanced Learning.   Springer, 2023, pp. 313–322.
  • [371] R. Silzer and A. H. Church, “Identifying and assessing high-potential talent,” Strategy-driven talent management: A leadership imperative, vol. 28, pp. 213–280, 2010.
  • [372] A. Singhania, A. Unnam, and V. Aggarwal, “Grading video interviews with fairness considerations,” arXiv preprint arXiv:2007.05461, 2020.
  • [373] D. S. Sisodia, S. Vishwakarma, and A. Pujahari, “Evaluation of machine learning models for employee churn prediction,” in 2017 international conference on inventive computing and informatics (icici).   IEEE, 2017, pp. 1016–1020.
  • [374] B. Sivathanu and R. Pillai, “Technology and talent analytics for talent management–a game changer for organizational performance,” International Journal of Organizational Analysis, 2019.
  • [375] P. Skondras, P. Zervas, and G. Tzimas, “Generating synthetic resume data with large language models for enhanced job description classification,” Future Internet, vol. 15, no. 11, p. 363, 2023.
  • [376] R. Spears, “The impact of public opinion on large global companies’ market valuations: A markov switching model approach,” Journal of Finance and Economics, vol. 9, no. 3, pp. 115–141, 2021.
  • [377] R. Srivastava, G. K. Palshikar, S. Chaurasia, and A. Dixit, “What’s next? a recommendation system for industrial training,” Data Science and Engineering, vol. 3, no. 3, pp. 232–247, 2018.
  • [378] F. Stephany and O. Teutloff, “What is the price of a skill? The value of complementarity,” Research Policy, vol. 53, no. 1, p. 104898, Jan. 2024.
  • [379] K. D. Strang and Z. Sun, “Erp staff versus ai recruitment with employment real-time big data,” Discover Artificial Intelligence, vol. 2, no. 1, p. 21, 2022.
  • [380] B. Subha, I. A. K. Shaikh, P. J. Patil, R. Sethumadhavan, M. Preetha, and H. Patil, “Predictive analysis of employee turnover in it using a hybrid crf-bilstm and cnn model,” in 2023 International Conference on Sustainable Communication Networks and Application (ICSCNA).   IEEE, 2023, pp. 914–919.
  • [381] Y. Sun, F. Zhuang, H. Zhu, X. Song, Q. He, and H. Xiong, “The impact of person-organization fit on talent management: A structure-aware convolutional neural network approach,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 1625–1633.
  • [382] Y. Sun, F. Zhuang, H. Zhu, Q. Zhang, Q. He, and H. Xiong, “Market-oriented job skill valuation with cooperative composition neural network,” Nature communications, vol. 12, no. 1, pp. 1–12, 2021.
  • [383] Y. Sun, F. Zhuang, H. Zhu, X. Song, Q. He, and H. Xiong, “Modeling the impact of person-organization fit on talent management with structure-aware attentive neural networks,” IEEE Transactions on Knowledge & Data Engineering, 2021.
  • [384] A. Susarla, R. Gopal, J. B. Thatcher, and S. Sarker, “The janus effect of generative ai: Charting the path for responsible conduct of scholarly activities in information systems,” Information Systems Research, vol. 34, no. 2, pp. 399–408, 2023.
  • [385] I. Tarique and R. S. Schuler, “Global talent management: Literature review, integrative framework, and suggestions for further research,” Journal of world business, vol. 45, no. 2, pp. 122–133, 2010.
  • [386] M. Teng, H. Zhu, C. Liu, and H. Xiong, “Exploiting network fusion for organizational turnover prediction,” ACM Transactions on Management Information Systems (TMIS), vol. 12, no. 2, pp. 1–18, 2021.
  • [387] M. Teng, H. Zhu, C. Liu, C. Zhu, and H. Xiong, “Exploiting the contagious effect for employee turnover prediction,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, 2019, pp. 1166–1173.
  • [388] S. Thakur, A. K. Kar, and N. Sharma, “Impact of Corporate Social Responsibility Orientation of CEOs on Online Reputation-Insights from Text Mining,” in IoT, Big Data and AI for Improving Quality of Everyday Life: Present and Future Challenges: IOT, Data Science and Artificial Intelligence Technologies.   Cham: Springer International Publishing, 2023, pp. 117–138.
  • [389] S. M. Tharani and S. V. Raj, “Predicting employee turnover intention in it&ites industry using machine learning algorithms,” in 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC).   IEEE, 2020, pp. 508–513.
  • [390] M. Theobald, H. Bast, D. Majumdar, R. Schenkel, and G. Weikum, “Topx: efficient and versatile top-k query processing for semistructured data,” The VLDB Journal, vol. 17, pp. 81–115, 2008.
  • [391] C. Upadhyay, H. Abu-Rasheed, C. Weber, and M. Fathi, “Explainable job-posting recommendations using knowledge graphs and named entity recognition,” in 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC).   IEEE, 2021, pp. 3291–3296.
  • [392] P. Van Esch, J. S. Black, and J. Ferolie, “Marketing ai recruitment: The next phase in job application and selection,” Computers in Human Behavior, vol. 90, pp. 215–222, 2019.
  • [393] P. Vardarlier, “Digital transformation of human resource management: digital applications and strategic tools in hrm,” pp. 239–264, 2020.
  • [394] T. Veale, “Round up the usual suspects: Knowledge-based metaphor generation,” in Proceedings of the Fourth Workshop on Metaphor in NLP, 2016, pp. 34–41.
  • [395] R. Verrap, E. Nirjhar, A. Nenkova, and T. Chaspari, “‘Am i answering my job interview questions right?’: A nlp approach to predict degree of explanation in job interview responses,” in Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI), 2022, pp. 122–129.
  • [396] S. Vijayarani, M. J. Ilamathi, M. Nithya et al., “Preprocessing techniques for text mining-an overview,” International Journal of Computer Science & Communication Networks, vol. 5, no. 1, pp. 7–16, 2015.
  • [397] D. Vrontis, M. Christofi, V. Pereira, S. Tarba, A. Makrides, and E. Trichina, “Artificial intelligence, robotics, advanced technologies and human resource management: a systematic review,” The international journal of human resource management, vol. 33, no. 6, pp. 1237–1266, 2022.
  • [398] A. Wakchaure, R. Eaglin, and B. Motlagh, “A technique for the quantitative measure of data cleanliness,” in 2008 IEEE Conference on Cybernetics and Intelligent Systems.   IEEE, 2008, pp. 1258–1263.
  • [399] B. Walek and O. Pektor, “Data mining of job requirements in online job advertisements using machine learning and sdca logistic regression,” Mathematics, vol. 9, no. 19, p. 2475, 2021.
  • [400] C. Wang, H. Zhu, Q. Hao, K. Xiao, and H. Xiong, “Variable interval time sequence modeling for career trajectory prediction: Deep collaborative perspective,” in Proceedings of the Web Conference 2021, 2021, pp. 612–623.
  • [401] C. Wang, H. Zhu, P. Wang, C. Zhu, X. Zhang, E. Chen, and H. Xiong, “Personalized and explainable employee training course recommendations: A bayesian variational approach,” ACM Transactions on Information Systems (TOIS), vol. 40, no. 4, pp. 1–32, 2021.
  • [402] C. Wang, H. Zhu, C. Zhu, X. Zhang, E. Chen, and H. Xiong, “Personalized employee training course recommendation with career development awareness,” in Proceedings of the Web Conference 2020, 2020, pp. 1648–1659.
  • [403] L. Wang, C. Ma, X. Feng, Z. Zhang, H. Yang, J. Zhang, Z. Chen, J. Tang, X. Chen, Y. Lin et al., “A survey on large language model based autonomous agents,” Frontiers of Computer Science, vol. 18, no. 6, pp. 1–26, 2024.
  • [404] S. Wang, H. Yuan, L. M. Ni, and J. Guo, “Quantagent: Seeking holy grail in trading by self-improving large language model,” arXiv preprint arXiv:2402.03755, 2024.
  • [405] X. Wang, Z. Zhao, and W. Ng, “Ustf: a unified system of team formation,” IEEE Transactions on Big Data, vol. 2, no. 1, pp. 70–84, 2016.
  • [406] X. Wang, X. Zhang, Y. Cheng, F. Tian, K. Chen, and P. O. de Pablos, “Artificial intelligence-enabled knowledge management,” The Routledge Companion to Knowledge Management, pp. 153–168, 2022.
  • [407] Y. Wang, Y. Allouache, and C. Joubert, “Analysing cv corpus for finding suitable candidates using knowledge graph and bert,” in DBKDA 2021, The Thirteenth International Conference on Advances in Databases, Knowledge, and Data Applications, 2021.
  • [408] Y. Wang, Y. Pan, M. Yan, Z. Su, and T. H. Luan, “A survey on chatgpt: Ai-generated contents, challenges, and solutions,” IEEE Open Journal of the Computer Society, 2023.
  • [409] Z. Wang, W. Wei, C. Xu, J. Xu, and X.-L. Mao, “Person-job fit estimation from candidate profile and related recruitment history with co-attention neural networks,” Neurocomputing, vol. 501, pp. 14–24, 2022.
  • [410] J. Wei and K. Zou, “Eda: Easy data augmentation techniques for boosting performance on text classification tasks,” arXiv preprint arXiv:1901.11196, 2019.
  • [411] M. Wei, Y. He, and Q. Zhang, “Robust layout-aware ie for visually rich documents with pre-trained language models,” in Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020, pp. 2367–2376.
  • [412] X. Wei, X. Cui, N. Cheng, X. Wang, X. Zhang, S. Huang, P. Xie, J. Xu, Y. Chen, M. Zhang et al., “Zero-shot information extraction via chatting with chatgpt,” arXiv preprint arXiv:2302.10205, 2023.
  • [413] S. Westberg, “Applying a chatbot for assistance in the onboarding process: A process of requirements elicitation and prototype creation,” 2019.
  • [414] H. Wi, S. Oh, J. Mun, and M. Jung, “A team formation model based on knowledge and collaboration,” Expert Systems with Applications, vol. 36, no. 5, pp. 9121–9134, 2009.
  • [415] K. M. Wiig, “Knowledge management: an introduction and perspective,” Journal of knowledge Management, 1997.
  • [416] A. B. Wild Ali, “Prediction of employee turn over using random forest classifier with intensive optimized pca algorithm,” Wireless Personal Communications, vol. 119, no. 4, pp. 3365–3382, 2021.
  • [417] V. Wolf, F. Neubürger, and R. Lanwehr, “Generating synthetic data for better prediction modeling in skill demand forecasting,” in 2023 IEEE World Conference on Applied Intelligence and Computing (AIC).   IEEE, 2023, pp. 313–318.
  • [418] I. A. Wowczko, “Skills and vacancy analysis with data mining techniques,” in Informatics, vol. 2, no. 4.   Multidisciplinary Digital Publishing Institute, 2015, pp. 31–49.
  • [419] L. Wu, Z. Qiu, Z. Zheng, H. Zhu, and E. Chen, “Exploring large language model for graph data understanding in online job recommendations,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 8, 2024, pp. 9178–9186.
  • [420] L. Wu, Z. Zheng, Z. Qiu, H. Wang, H. Gu, T. Shen, C. Qin, C. Zhu, H. Zhu, Q. Liu et al., “A survey on large language models for recommendation,” arXiv preprint arXiv:2305.19860, 2023.
  • [421] W.-W. Wu, Y.-T. Lee, and G.-H. Tzeng, “Simplifying the manager competency model by using the rough set approach,” in International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing.   Springer, 2005, pp. 484–494.
  • [422] X. Wu, T. Xu, H. Zhu, L. Zhang, E. Chen, and H. Xiong, “Trend-aware tensor factorization for job skill demand analysis,” in IJCAI, 2019, pp. 3891–3897.
  • [423] C. Xie, C. Chen, F. Jia, Z. Ye, K. Shu, A. Bibi, Z. Hu, P. Torr, B. Ghanem, and G. Li, “Can large language model agents simulate human trust behaviors?” arXiv preprint arXiv:2402.04559, 2024.
  • [424] D. Xu and X. Xiao, “Influence of the development of vr technology on enterprise human resource management in the era of artificial intelligence,” IEEE Access, 2020.
  • [425] H. Xu, Z. Yu, J. Yang, H. Xiong, and H. Zhu, “Talent circle detection in job transition networks,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 655–664.
  • [426] T. Xu, H. Zhu, C. Zhu, P. Li, and H. Xiong, “Measuring the popularity of job skills in recruitment market: A multi-criteria approach,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018.
  • [427] Y. Xu, Y. Xu, T. Lv, L. Cui, F. Wei, G. Wang, Y. Lu, D. Florencio, C. Zhang, W. Che et al., “Layoutlmv2: Multi-modal pre-training for visually-rich document understanding,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021, pp. 2579–2591.
  • [428] Y. Xu, M. Li, L. Cui, S. Huang, F. Wei, and M. Zhou, “Layoutlm: Pre-training of text and layout for document image understanding,” in Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, 2020, pp. 1192–1200.
  • [429] Y. Xu, T. Lv, L. Cui, G. Wang, Y. Lu, D. Florencio, C. Zhang, and F. Wei, “Layoutxlm: Multimodal pre-training for multilingual visually-rich document understanding,” arXiv preprint arXiv:2104.08836, 2021.
  • [430] H. Xu, Z. Yu, J. Yang, H. Xiong, and H. Zhu, “Dynamic talent flow analysis with deep sequence prediction modeling,” IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 10, pp. 1926–1939, 2018.
  • [431] L. Xue, N. Constant, A. Roberts, M. Kale, R. Al-Rfou, A. Siddhant, A. Barua, and C. Raffel, “mt5: A massively multilingual pre-trained text-to-text transformer,” arXiv preprint arXiv:2010.11934, 2020.
  • [432] N. B. Yahia, J. Hlel, and R. Colomo-Palacios, “From big data to deep data to support people analytics for employee attrition prediction,” IEEE Access, vol. 9, pp. 60 447–60 458, 2021.
  • [433] M. Yamashita, Y. Li, T. Tran, Y. Zhang, and D. Lee, “Looking further into the future: Career pathway prediction,” WSDM Computational Jobs Marketplace, 2022.
  • [434] M. Yamashita, J. T. Shen, T. Tran, H. Ekhtiari, and D. Lee, “James: Normalizing job titles with multi-aspect graph embeddings and reasoning,” in 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA).   IEEE, 2023, pp. 1–10.
  • [435] R. Yan, R. Le, Y. Song, T. Zhang, X. Zhang, and D. Zhao, “Interview choice reveals your preference on the market: To improve job-resume matching through profiling memories,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 914–922.
  • [436] S. Yan, D. Huang, and M. Soleymani, “Mitigating biases in multimodal personality assessment,” in Proceedings of the 2020 International Conference on Multimodal Interaction, 2020, pp. 361–369.
  • [437] C. Yang, Y. Hou, Y. Song, T. Zhang, J.-R. Wen, and W. X. Zhao, “Modeling two-way selection preference for person-job fit,” in Proceedings of the 16th ACM Conference on Recommender Systems, 2022, pp. 102–112.
  • [438] J. Yang, J. Dong, Q. Song, Y. S. Otmakhova, and Z. He, “The impacts of payment policy on performance of human resource market system: Agent-based modeling and simulation of growth-oriented firms,” Systems, vol. 11, no. 6, p. 298, 2023.
  • [439] Y. Yang, C. Zhang, X. Song, Z. Dong, H. Zhu, and W. Li, “Contextualized knowledge graph embedding for explainable talent training course recommendation,” ACM Transactions on Information Systems, 2023.
  • [440] Z. Yang, S. Yan, A. Lad, X. Liu, and W. Guo, “Cascaded deep neural ranking models in linkedin people search,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 4312–4320.
  • [441] Z. Yang, A. Liu, Z. Liu, K. Liu, F. Xiong, Y. Wang, Z. Yang, Q. Hu, X. Chen, Z. Zhang et al., “Towards unified alignment between agents, humans, and environment,” arXiv preprint arXiv:2402.07744, 2024.
  • [442] K. Yao, C. Qin, H. Zhu, C. Ma, J. Zhang, Y. Du, and H. Xiong, “An interactive neural network approach to keyphrase extraction in talent recruitment,” in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 2383–2393.
  • [443] K. Yao, J. Zhang, C. Qin, X. Song, P. Wang, H. Zhu, and H. Xiong, “Resuformer: Semantic structure understanding for resumes via multi-modal pre-training,” in 2023 IEEE 39th International Conference on Data Engineering (ICDE).   IEEE, 2023.
  • [444] K. Yao, J. Zhang, C. Qin, P. Wang, H. Zhu, and H. Xiong, “Knowledge enhanced person-job fit for talent recruitment,” in 2022 IEEE 38th International Conference on Data Engineering (ICDE).   IEEE, 2022, pp. 3467–3480.
  • [445] S. Ye, H. Hwang, S. Yang, H. Yun, Y. Kim, and M. Seo, “In-context instruction learning,” arXiv e-prints, pp. arXiv–2302, 2023.
  • [446] Y. Ye, Z. Dong, H. Zhu, T. Xu, X. Song, R. Yu, and H. Xiong, “Mane: Organizational network embedding with multiplex attentive neural networks,” IEEE Transactions on Knowledge and Data Engineering, 2022.
  • [447] Y. Ye, H. Zhu, T. Xu, F. Zhuang, R. Yu, and H. Xiong, “Identifying high potential talent: A neural network based dynamic social profiling approach,” in 2019 IEEE International Conference on Data Mining (ICDM).   IEEE, 2019, pp. 718–727.
  • [448] D. Yin, X. Zhang, and H. Zhao, “Understanding and predicting innovative potential of scholars based on deep learning method,” in PACIS 2022 Proceedings, 2022.
  • [449] K. Yu, G. Guan, and M. Zhou, “Resume information extraction with cascaded hybrid model,” in Proceedings of the 43rd annual meeting of the Association for Computational Linguistics (ACL’05), 2005, pp. 499–506.
  • [450] X. Yu, J. Zhang, and Z. Yu, “Confit: Improving resume-job matching using data augmentation and contrastive learning,” arXiv preprint arXiv:2401.16349, 2024.
  • [451] X. Yu, C. Qin, D. Shen, H. Ma, L. Zhang, X. Zhang, H. Zhu, and H. Xiong, “Rdgt: Enhancing group cognitive diagnosis with relation-guided dual-side graph transformer,” IEEE Transactions on Knowledge and Data Engineering, 2024.
  • [452] J. Yuan, “Research on employee turnover prediction based on machine learning algorithms,” in 2021 4th international conference on artificial intelligence and big data (icaibd).   IEEE, 2021, pp. 114–120.
  • [453] J. Yuan, Q.-M. Zhang, J. Gao, L. Zhang, X.-S. Wan, X.-J. Yu, and T. Zhou, “Promotion and resignation in employee networks,” Physica A: Statistical Mechanics and its Applications, vol. 444, pp. 442–447, 2016.
  • [454] Z. Yuan, H. Liu, R. Hu, D. Zhang, and H. Xiong, “Self-Supervised Prototype Representation Learning for Event-Based Corporate Profiling,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 5, pp. 4644–4652, May 2021.
  • [455] R. Zbib, L. A. Lacasa, F. Retyk, R. Poves, J. Aizpuru, H. Fabregat, V. Simkus, and E. García-Casademont, “Learning job titles similarity from noisy skill labels,” arXiv preprint arXiv:2207.00494, 2022.
  • [456] K. Zechner, D. Higgins, X. Xi, and D. M. Williamson, “Automatic scoring of non-native spontaneous speech in tests of spoken english,” Speech communication, vol. 51, no. 10, pp. 883–895, 2009.
  • [457] J. Zenger and J. Folkman, “Companies are bad at identifying high-potential employees,” Harvard Business Review, pp. 2–5, 2017.
  • [458] R. Zha, C. Qin, L. Zhang, D. Shen, T. Xu, H. Zhu, and E. Chen, “Career mobility analysis with uncertainty-aware graph autoencoders: A job title transition perspective,” IEEE Transactions on Computational Social Systems, 2023.
  • [459] R. Zha, Y. Sun, C. Qin, L. Zhang, T. Xu, H. Zhu, and E. Chen, “Towards unified representation learning for career mobility analysis with trajectory hypergraph,” ACM Transactions on Information Systems, 2024.
  • [460] C. Zhang and H. Wang, “Resumevis: A visual analytics system to discover semantic information in semi-structured resume data,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 1, pp. 1–25, 2018.
  • [461] ——, “Resumevis: A visual analytics system to discover semantic information in semi-structured resume data,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 1, pp. 1–25, 2018.
  • [462] C. Zhang, D. Song, C. Huang, A. Swami, and N. V. Chawla, “Heterogeneous graph neural network,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 793–803.
  • [463] D. Zhang, J. Liu, H. Zhu, Y. Liu, L. Wang, P. Wang, and H. Xiong, “Job2vec: Job title benchmarking with collective multi-view representation learning,” in Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 2763–2771.
  • [464] L. Zhang, T. Xu, H. Zhu, C. Qin, Q. Meng, H. Xiong, and E. Chen, “Large-scale talent flow embedding for company competitive analysis,” in Proceedings of The Web Conference 2020, 2020, pp. 2354–2364.
  • [465] L. Zhang, D. Zhou, H. Zhu, T. Xu, R. Zha, E. Chen, and H. Xiong, “Attentive heterogeneous graph embedding for job mobility prediction,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 2192–2201.
  • [466] L. Zhang, H. Zhu, T. Xu, C. Zhu, C. Qin, H. Xiong, and E. Chen, “Large-scale talent flow forecast with dynamic latent factor model?” in The World Wide Web Conference, 2019, pp. 2312–2322.
  • [467] Q. Zhang, H. Zhu, Y. Sun, H. Liu, F. Zhuang, and H. Xiong, “Talent demand forecasting with attentive neural sequential model,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 3906–3916.
  • [468] S. Zhang and M. Sridharan, “A survey of knowledge-based sequential decision-making under uncertainty,” AI Magazine, vol. 43, no. 2, pp. 249–266, 2022.
  • [469] X. Zhang, X. Wei, C. X. Ou, E. Caron, H. Zhu, and H. Xiong, “From human-ai confrontation to human-ai symbiosis in society 5.0: Transformation challenges and mechanisms,” IT Professional, vol. 24, no. 3, pp. 43–51, 2022.
  • [470] X. Zhang, Y. Zhao, X. Tang, H. Zhu, and H. Xiong, “Developing fairness rules for talent intelligence management system,” in Proceedings of the 53rd Hawaii International Conference on System Sciences, 2020.
  • [471] Y. Zhang and Q. Yang, “A survey on multi-task learning,” IEEE Transactions on Knowledge and Data Engineering, 2021.
  • [472] Y. Zhang, B. Liu, and J. Qian, “Fedpjf: federated contrastive learning for privacy-preserving person-job fit,” Applied Intelligence, vol. 53, no. 22, pp. 27 060–27 071, 2023.
  • [473] Y. Zhang, B. Liu, J. Qian, J. Qin, X. Zhang, and X. Jiang, “An explainable person-job fit model incorporating structured information,” in 2021 IEEE International Conference on Big Data (Big Data).   IEEE, 2021, pp. 3571–3579.
  • [474] Y. Zhang, C. Qin, D. Shen, H. Ma, L. Zhang, X. Zhang, and H. Zhu, “Relicd: A reliable cognitive diagnosis framework with confidence awareness,” in 2023 IEEE International Conference on Data Mining (ICDM).   IEEE, 2023, pp. 858–867.
  • [475] J. Zhao, J. Wang, M. Sigdel, B. Zhang, P. Hoang, M. Liu, and M. Korayem, “Embedding-based recommender system for job to candidate matching on scale,” arXiv preprint arXiv:2107.00221, 2021.
  • [476] L. Zhao, Y. Yao, G. Guo, H. Tong, F. Xu, and J. Lu, “Team expansion in collaborative environments,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining.   Springer, 2018, pp. 713–725.
  • [477] Y. Zhao, M. K. Hryniewicki, F. Cheng, B. Fu, and X. Zhu, “Employee turnover prediction with machine learning: A reliable approach,” in Proceedings of SAI intelligent systems conference.   Springer, 2018, pp. 737–758.
  • [478] Z. Zheng, X. Hu, S. Gao, H. Zhu, and H. Xiong, “Mirror: A multi-view reciprocal recommender system for online recruitment,” in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024.
  • [479] Z. Zheng, Z. Qiu, X. Hu, L. Wu, H. Zhu, and H. Xiong, “Generative job recommendations with large language model,” arXiv preprint arXiv:2307.02157, 2023.
  • [480] Z. Zheng, Y. Sun, X. Song, H. Zhu, and H. Xiong, “Generative learning plan recommendation for employees: A work performance-aware reinforcement learning approach,” in Proceedings of the 17th ACM Conference on Recommender Systems, 2023.
  • [481] J. Zhenhong, P. Lingxi, and S. Lei, “Person-job fit model based on sentence-level representation and theme-word graph,” in 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), vol. 5.   IEEE, 2021, pp. 1902–1909.
  • [482] Q. Zhou, L. Li, and H. Tong, “Towards real time team optimization,” in 2019 IEEE International Conference on Big Data (Big Data).   IEEE, 2019, pp. 1008–1017.
  • [483] X. Zhou and R. Zafarani, “A survey of fake news: Fundamental theories, detection methods, and opportunities,” ACM Computing Surveys (CSUR), vol. 53, no. 5, pp. 1–40, 2020.
  • [484] C. Zhu, H. Zhu, H. Xiong, P. Ding, and F. Xie, “Recruitment market trend analysis with sequential latent variable models,” in Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, pp. 383–392.
  • [485] C. Zhu, H. Zhu, H. Xiong, C. Ma, F. Xie, P. Ding, and P. Li, “Person-job fit: Adapting the right talent for the right job with joint representation learning,” ACM Transactions on Management Information Systems (TMIS), vol. 9, no. 3, pp. 1–17, 2018.
  • [486] J. Zhu and C. Hudelot, “Towards job-transition-tag graph for a better job title representation learning,” arXiv preprint arXiv:2206.02782, 2022.
  • [487] Y. Zhu, H. Yuan, S. Wang, J. Liu, W. Liu, C. Deng, Z. Dou, and J.-R. Wen, “Large language models for information retrieval: A survey,” arXiv preprint arXiv:2308.07107, 2023.
  • [488] N. Ziems, W. Yu, Z. Zhang, and M. Jiang, “Large language models are built-in autoregressive search engines,” arXiv preprint arXiv:2305.09612, 2023.
  • [489] M. Zihayat, A. An, L. Golab, M. Kargar, and J. Szlichta, “Authority-based team discovery in social networks,” arXiv preprint arXiv:1611.02992, 2016.
  • [490] S. B. Zinjad, A. Bhattacharjee, A. Bhilegaonkar, and H. Liu, “Resumeflow: An llm-facilitated pipeline for personalized resume generation and refinement,” arXiv preprint arXiv:2402.06221, 2024.
  • [491] F. Zong, D. San, and W. Cui, “Research on course recommendation system based on artificial intelligence,” in Emerging Trends in Intelligent and Interactive Systems and Applications: Proceedings of the 5th International Conference on Intelligent, Interactive Systems and Applications (IISA2020).   Springer, 2021, pp. 667–671.