READ ME File For 'Social Interactions in Online Eating Disorder Communities: A Network Perspective'

Dataset DOI: 10.5258/SOTON/D0551

ReadMe Author: Tao Wang, University of Southampton

This dataset supports the publication:
Tao Wang, Markus Brede, Antonella Ianni, Emmanouil Mentzakis
Social Interactions in Online Eating Disorder Communities: A Network Perspective
PLOS ONE

This dataset contains:

-----------------------------------------------------------------------------

A. "network.graphml" contains a directed, weighted network representing interactions among 6,169 users, connected by 11,056 directed links, in exchanging information on eating disorders (ED). Each node denotes a user, and a link runs from the node representing user i to the node representing user j if i mentions or replies to j in a tweet. Each link is weighted by the count of mentions and replies. User IDs are stored as the node attribute "name", and the clusters of users (i.e., pro-ED or pro-recovery groups) are stored as the node attribute "cluster", with 0 denoting the pro-ED group and 1 denoting the pro-recovery group. Note that the user IDs have been anonymised and range from 0 to 6,168.

-----------------------------------------------------------------------------

B. "user-features.csv" contains all attributes measured in the paper, including social activities, language use and strength in promoting a pro-ED or pro-recovery tendency. The fields are as follows:

1. uid: anonymised user ID from 0 to 6,168, matching the IDs in network.graphml.
2. cluster: user group assigned by clustering methods based on users' posting interests; 0 denotes the pro-ED group and 1 denotes the pro-recovery group.
3. friend_count: the number of followings, i.e., people who are followed by a user.
4. status_count: the number of tweets.
5. follower_count: the number of followers, i.e., people who follow a user.
6. friends_day: average number of followings per day.
7. statuses_day: average number of tweets per day.
8. followers_day: average number of followers per day.
9. retweet_pro: ratio of retweets in historical posts.
10. mention_pro: ratio of posts with mentions (mentions in retweets are excluded).
11. reply_pro: ratio of posts with replies.
12. retweet_div: entropy of retweeting others.
13. reply_div: entropy of replying to others.
14. mention_div: entropy of mentioning others.
15. I: first-person singular pronoun use.
16. We: first-person plural pronoun use.
17. Negate: negation use.
18. Swear: abusive language.
19. Social: social concerns.
20. Posemo: positive emotions.
21. Negemo: negative emotions.
22. Body: concerns about body image.
23. Health: concerns about health.
24. Ingest: concerns about ingestion.
25. prostr: strength in promoting a pro-ED (if a user belongs to the pro-ED group) or pro-recovery (if a user belongs to the pro-recovery group) tendency.

Note that only users who posted more than 50 words in tweets (excluding retweets) are processed with LIWC, for more trustworthy results. Thus, the number of users in "user-features.csv" is smaller than the number of users in the whole sample.

For more details of data processing, please refer to the paper "Social Interactions in Online Eating Disorder Communities: A Network Perspective" and its supporting information.

Date of data collection: March 2016.

Information about geographic location of data collection: University of Southampton, U.K.

Date that the file was created: June 2018
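
The two files described above can be read together using only the Python standard library. The following is a minimal sketch, not the authors' processing code. It assumes that "network.graphml" is standard GraphML using the default GraphML namespace, that "user-features.csv" has a header row with the field names listed in section B, and that the "uid" values match the node attribute "name" as text. The function names (read_graphml, cluster_sizes, read_features, users_without_features) are hypothetical and chosen for illustration only.

```python
import csv
import xml.etree.ElementTree as ET
from collections import Counter

# Default GraphML namespace (assumed to be the one used by network.graphml).
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def _data_attrs(element, key_names):
    """Collect the <data> children of a node or edge as {attribute name: text}."""
    return {
        key_names.get(d.get("key"), d.get("key")): d.text
        for d in element.findall("g:data", NS)
    }


def read_graphml(source):
    """Parse GraphML from a path or binary file object.

    Returns (nodes, edges): nodes maps node id -> attribute dict, and
    edges is a list of (source id, target id, attribute dict) tuples.
    """
    root = ET.parse(source).getroot()
    # GraphML declares attributes in <key> elements; map key id -> attr.name.
    key_names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}
    graph = root.find("g:graph", NS)
    nodes = {n.get("id"): _data_attrs(n, key_names) for n in graph.findall("g:node", NS)}
    edges = [
        (e.get("source"), e.get("target"), _data_attrs(e, key_names))
        for e in graph.findall("g:edge", NS)
    ]
    return nodes, edges


def cluster_sizes(nodes):
    """Count users per cluster (0 = pro-ED, 1 = pro-recovery)."""
    # int(float(...)) accepts both "0" and "0.0" style values.
    return Counter(int(float(attrs["cluster"])) for attrs in nodes.values())


def read_features(csv_file):
    """Read user-features.csv from an open text file into {uid: row dict}."""
    return {row["uid"]: row for row in csv.DictReader(csv_file)}


def users_without_features(nodes, features):
    """List user IDs present in the network but absent from the feature table."""
    missing = (a["name"] for a in nodes.values() if a["name"] not in features)
    return sorted(missing, key=int)
```

With the dataset files in the working directory, one might call read_graphml("network.graphml") and pass a file opened with open("user-features.csv", newline="") to read_features. Because only users with more than 50 words in tweets were processed with LIWC, users_without_features is expected to return a non-empty list.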