Detecting Non‐personal and Spam Users on Geo‐tagged Twitter Network

With the rapid growth and popularity of mobile devices and location‐aware technologies, online social networks such as Twitter have become an important data source for scientists conducting geo‐social network research. Non‐personal accounts, spam users and junk tweets, however, pose severe problems for the extraction of meaningful information and for the validation of research findings based on tweets or Twitter users. The detection of such users is therefore a critical and fundamental step in Twitter‐related geographic research. In this study, we develop a methodological framework to: (1) extract user characteristics based on geographic, graph‐based and content‐based features of tweets; (2) construct a training dataset by manually inspecting and labeling a large sample of Twitter users; and (3) derive reliable rules and knowledge for detecting non‐personal users with supervised classification methods. The extracted geographic characteristics of a user include maximum speed, mean speed, the number of distinct counties that the user has visited, and others. Content‐based characteristics of a user include the number of tweets per month, the percentage of tweets with URLs or hashtags, and the percentage of tweets with emotions, detected with sentiment analysis. The extracted rules are theoretically interesting and practically useful. Specifically, the results show that geographic features, such as the average speed and the frequency of county changes, can serve as important indicators of non‐personal users. Among the non‐spatial characteristics, the percentage of tweets with a high human factor index, the percentage of tweets with URLs, and the percentage of tweets with mentioned/replied users are the top three features for detecting non‐personal users.
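To make the feature-extraction step more concrete, the following is a minimal sketch of how a few of the per-user characteristics named in the abstract (maximum speed, mean speed, number of counties, county changes, and the share of tweets containing URLs, hashtags or mentions) might be computed. It is not the authors' implementation. The tweet schema (`time`, `lat`, `lon`, `county`, `text`), the function names, the regular expressions, and the choice to define mean speed as the average of per-segment speeds between consecutive tweets are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import re
from datetime import datetime

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Simple illustrative patterns (assumptions, not taken from the article).
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def user_features(tweets):
    """Compute per-user features from a list of tweet dicts.

    Each tweet is assumed (hypothetical schema) to have the keys
    'time' (datetime), 'lat', 'lon', 'county' and 'text'.
    """
    tweets = sorted(tweets, key=lambda t: t["time"])
    speeds = []          # km/h between consecutive geo-tagged tweets
    county_changes = 0   # how often consecutive tweets fall in different counties
    for prev, cur in zip(tweets, tweets[1:]):
        hours = (cur["time"] - prev["time"]).total_seconds() / 3600.0
        if hours > 0:  # skip simultaneous tweets to avoid division by zero
            dist = haversine_km(prev["lat"], prev["lon"], cur["lat"], cur["lon"])
            speeds.append(dist / hours)
        if cur["county"] != prev["county"]:
            county_changes += 1

    n = len(tweets)

    def share(pattern):
        # Fraction of tweets whose text matches the pattern.
        return sum(1 for t in tweets if pattern.search(t["text"])) / n if n else 0.0

    return {
        "max_speed_kmh": max(speeds, default=0.0),
        "mean_speed_kmh": sum(speeds) / len(speeds) if speeds else 0.0,
        "n_counties": len({t["county"] for t in tweets}),
        "county_changes": county_changes,
        "pct_url": share(URL_RE),
        "pct_hashtag": share(HASHTAG_RE),
        "pct_mention": share(MENTION_RE),
    }


# Usage with two made-up tweets, one hour and one degree of longitude apart.
example = [
    {"time": datetime(2014, 6, 1, 12, 0), "lat": 0.0, "lon": 0.0,
     "county": "A", "text": "hello http://example.com"},
    {"time": datetime(2014, 6, 1, 13, 0), "lat": 0.0, "lon": 1.0,
     "county": "B", "text": "#sunny day"},
]
features = user_features(example)
```

A feature dictionary of this kind, computed for each manually labeled user, could then serve as one row of the training data for whichever supervised classifier is chosen. An implausibly high maximum speed, for instance, is the sort of signal the abstract suggests can separate non‐personal accounts from individual users.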

Document Type: Research Article

Publication date: June 1, 2014
