A Data-Based Approach to Discovering Multi-Topic Influential Leaders

PLoS One. 2016 Jul 14;11(7):e0158855. doi: 10.1371/journal.pone.0158855. eCollection 2016.

Abstract

Recently, increasing numbers of users have adopted microblogging services as their main information source. However, most of them find themselves drowning in the millions of posts produced by other users every day. To cope with this, identifying a set of the most influential people is paramount. Moreover, finding a set of related influential users to expand the coverage of one particular topic is required in real-world scenarios. Most of the existing algorithms in this area focus on topology-related methods such as PageRank. These methods mine link structures to find the expected influential rank of users. However, because they ignore the interaction data, these methods turn out to be less effective in social networks. In reality, a variety of topics exist within the information diffusing through the network. Because they have different interests, users play different roles in the diffusion of information related to different topics. As a result, distinguishing influential leaders according to different topics is also worthy of research. In this paper, we propose a multi-topic influence diffusion model (MTID) based on traces acquired from historic information. We decompose the influential scores of users into two parts: the direct influence determined by information propagation along the link structure and the indirect influence that extends beyond the restrictions of direct follower relationships. To model the network from a multi-topical viewpoint, we introduce topic pools, each of which represents a particular topic information source. Then, we extract the topic distributions from the traces of tweets, determining the influence propagation probability and content generation probability. In the network, we adopt multiple ground nodes representing topic pools to connect every user through bidirectional links. Based on this multi-topical view of the network, we further introduce the topic-dependent rank (TD-Rank) algorithm to identify the multi-topic influential users. Our algorithm not only overcomes the shortcomings of PageRank but also produces a measure of topic-related rank. Extensive experiments on a Weibo dataset show that our model is both effective and robust.
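
The ground-node construction described above can be illustrated with a small random-walk sketch. This is not the TD-Rank algorithm from the paper, whose exact update rules the abstract does not give. It is a minimal single-topic illustration, assuming that one topic pool is modeled as a ground node linked to every user in both directions, that follower links carry a per-topic propagation weight, and that the links from the pool to the users are weighted by a content generation probability. The function name `topic_rank_sketch` and its parameters are hypothetical.

```python
def topic_rank_sketch(users, follow_weights, gen_prob, iters=200):
    """Rank users for ONE topic with a random walk through a ground node.

    Illustrative assumptions (not taken from the paper):
      users          -- list of user ids
      follow_weights -- dict {(follower, leader): weight}; score flows from
                        a follower to the leader they follow
      gen_prob       -- dict {user: weight of the link from the topic pool
                        to that user}; must have a positive sum
    """
    ground = "__topic_pool__"  # hypothetical id for the topic-pool node
    nodes = list(users) + [ground]

    # Every user links to the ground node (one half of the bidirectional link).
    out = {u: {ground: 1.0} for u in users}
    for (follower, leader), weight in follow_weights.items():
        out[follower][leader] = out[follower].get(leader, 0.0) + weight
    # The ground node links back to every user (the other half).
    out[ground] = {u: gen_prob[u] for u in users}

    # Power iteration: repeatedly push each node's score along its out-links.
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for src, targets in out.items():
            total = sum(targets.values())
            for dst, weight in targets.items():
                nxt[dst] += score[src] * weight / total
        score = nxt

    # Drop the ground node and renormalize so user scores sum to 1.
    user_total = sum(score[u] for u in users)
    return {u: score[u] / user_total for u in users}


# Toy graph: b and c both follow a; the pool feeds all users equally.
ranks = topic_rank_sketch(
    users=["a", "b", "c"],
    follow_weights={("b", "a"): 1.0, ("c", "a"): 1.0},
    gen_prob={"a": 1 / 3, "b": 1 / 3, "c": 1 / 3},
)
```

In this toy graph, user `a` ends up with roughly half of the total score and `b` and `c` with roughly a quarter each. A multi-topic variant would, under the same assumptions, run one such walk per topic pool with topic-specific weights.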

MeSH terms

  • Blogging / statistics & numerical data
  • Humans
  • Leadership*
  • Models, Statistical
  • Social Support

Grants and funding

The work was jointly supported by the National Natural Science Foundation of China under grant No. 61472302, 61272280, U1404620, and 41271447; the Open Projects Program of National Laboratory of Pattern Recognition (201600031); the Program for New Century Excellent Talents in University under grant No. NCET-12-0919; the Fundamental Research Funds for the Central Universities under grant No. K5051203020, K5051303018, JB150313, JB150317, and BDY081422; the Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase); the Natural Science Foundation of Shaanxi Province under grant No. 2010JM8027; the Creative Project of the Science and Technology State of Xi’an under grant No. CXY1441(1); and the State Key Laboratory of Geo-information Engineering under grant No. SKLGIE2014-M-4-4.