Me, looking at how my friends and I are connected. Only 284 people, but 4,052 lines

We tend to call the collective of our friends a network. A social network. We tend to go networking – meeting people. What does my social network mean? I investigated a bit.

For some years it seemed debatable whether social networking sites were private or public space. One might have argued “I am among friends and only friends (those I accepted and who have accepted me), so whatever I do is a private matter.” Now there is at least one case in Romania in which a court ruled that one’s social networking profile is public space and that all laws apply to it as such.

This issue made me wonder how private or public my (or anybody’s) profile is, depending on the privacy settings applied. So I checked it out and tried to find some other interesting stuff too.

Results in short
I have 283 friends (in a technical sense). My friends have anywhere from a few dozen friends to close to the facebook limit (5000). My friends’ friend lists add up to 131,338 individual profiles, so much for the privacy provided by the setting called “friends of friends”. If everyone in this circle (my friends and I) who are friends with each other shook hands, that would mean 4,052 handshakes. The potential audience of a post may easily dwarf that of a regional newspaper and is comparable to the reach of nationwide media products, in spite of the post’s seemingly restrictive privacy setting.

The idea

Let’s use facebook. Let’s check how many friends I have on facebook: easy. Let’s check who my friends are: still easy, some scrolling is needed. Now let’s check the friend list of each of my friends: hard. And let’s build connections.

The crawler

So facebook is a nice guy as long as one uses it as intended: spending time looking at goofy posts and, most importantly, interacting with the ads. It has an API intended for apps, which in turn need a user’s permission to get access. But what if I wanted to extract the map of my social network? I could have done it manually (in theory), but I decided to write a crawler that emulates my gestures (scrolls and clicks) and extracts what I need: the list of all profiles on someone’s friend list. I did this with a homebrewed tampermonkey userscript that adds two things to the main page loaded in my browser: a listener for messages sent with window.postMessage, which processes them, and an iframe (a localhost script), which posts them. Through these two I could tunnel the javascript/DOM commands needed, saving me many, many hours of clicking, scrolling and copy-pasting.

The script took about a day (give or take) to run through all my friends. The browser (chrome) tended to use more than 1 GiB of RAM and then crash when a popular person’s friends (4600+) all got listed, in spite of the DOM optimizations I deployed, such as hiding unneeded elements to ease rendering. The crawler listed my friends, then went through all of them, listing their friends.
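The message tunnel described above can be sketched roughly as follows. This is not my original script, only a minimal illustration of the idea: the controller origin, the command names (`scroll`, `collect`) and the selector passed in by the controller are all assumptions made up for this example.

```javascript
// Hypothetical origin of the localhost helper page loaded in the iframe.
const CONTROLLER_ORIGIN = "http://localhost:8080";

// Execute one command against the page. Command names are illustrative only.
function runCommand(doc, win, command) {
  switch (command.type) {
    case "scroll":
      // Jump to the bottom so the page loads the next chunk of the list.
      win.scrollTo(0, doc.body.scrollHeight);
      return { type: "scrolled" };
    case "collect": {
      // Read the link targets matched by a selector supplied by the controller.
      const links = Array.from(doc.querySelectorAll(command.selector));
      return { type: "collected", hrefs: links.map((a) => a.href) };
    }
    default:
      return { type: "error", reason: "unknown command" };
  }
}

// Build a "message" event handler that ignores every sender except the controller
// and posts the result of each command back to it.
function makeHandler(doc, win) {
  return (event) => {
    if (event.origin !== CONTROLLER_ORIGIN) return;
    const reply = runCommand(doc, win, event.data);
    event.source.postMessage(reply, event.origin);
  };
}

// Wire it up only when running inside a browser page (e.g. as a userscript).
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.addEventListener("message", makeHandler(document, window));
}
```

Passing `doc` and `win` in as parameters keeps the command logic separate from the browser, so it can be tried out with stand-in objects before letting it loose on a real page.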

The collected data

The crawler generated plain text files listing all the profile IDs (semantic URL slugs or integer IDs) found on a profile’s friend list. The containing folder grew to about 4 MiB. The crawler accessed nothing I couldn’t have accessed using my mouse, and did not go beyond my friends’ friend lists (it accessed only profiles that are mine or my friends’).
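As an illustration of the two kinds of IDs, here is a small sketch of how a profile link could be reduced to an ID and how one of the text files could be read back. The one-ID-per-line file layout and both function names are assumptions for this example; the post does not document the actual format.

```javascript
// Reduce a profile link to its ID: the integer from "profile.php?id=..."
// or the first path segment (the semantic slug) otherwise.
function profileIdFromUrl(href) {
  const url = new URL(href);
  if (url.pathname === "/profile.php") {
    return url.searchParams.get("id");
  }
  return url.pathname.split("/").filter(Boolean)[0] ?? null;
}

// Parse the content of one friend-list file, assuming one ID per line.
// Blank lines are skipped and repeated IDs are kept only once.
function parseFriendList(text) {
  const ids = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "");
  return [...new Set(ids)];
}
```

Tracking parameters in the query string are ignored for slugs, since only the path is looked at.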

List of the crawled files (2015-02-03)

Conclusions – technical

Using the database acquired this way I listed all the profiles and found that 253,638 profiles are linked, of which 131,338 are unique identifiers, because my friends and I have friends in common. Treating profiles as nodes and their connections as edges, I counted 249,626 edges, so there are many people out there who have very few connections to my selected sample. Also note that some people set their friend list to private or show only mutual friends. Counting only the direct relations among me and my friends, there are 4,052 edges.
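The counting above can be sketched like this. It is a simplified illustration, not the exact procedure I ran: it assumes the crawled data is already in a `Map` from each crawled profile ID to the array of IDs on its friend list, and the function name `summarize` is made up.

```javascript
// friendLists: Map from a crawled profile ID to the array of IDs on its friend list.
function summarize(friendLists) {
  let listed = 0;            // every entry on every list, repeats included
  const unique = new Set();  // distinct IDs that appear on any list
  const edges = new Set();   // undirected connections, each stored once

  for (const [owner, friends] of friendLists) {
    for (const friend of friends) {
      listed += 1;
      unique.add(friend);
      // Sort the two endpoints so that A-B and B-A collapse into one edge.
      // A newline is a safe separator because the IDs come from line-based files.
      const key = owner < friend ? `${owner}\n${friend}` : `${friend}\n${owner}`;
      edges.add(key);
    }
  }

  // Edges whose both endpoints were crawled, i.e. me and my friends only.
  let innerEdges = 0;
  for (const key of edges) {
    const [a, b] = key.split("\n");
    if (friendLists.has(a) && friendLists.has(b)) innerEdges += 1;
  }

  return { listed, unique: unique.size, edges: edges.size, innerEdges };
}

// Tiny made-up sample: "me" has three friends, two of whom know each other.
const sample = new Map([
  ["me", ["a", "b", "c"]],
  ["a", ["me", "b", "x"]],
  ["b", ["me", "a", "x", "y"]],
  ["c", ["me"]],
]);
console.log(summarize(sample));
```

A connection between two crawled profiles shows up on both of their lists, which is why the number of edges comes out lower than the number of listed entries.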

One might not call me popular in terms of facebook. I have only 283 friends, which places me at the bottom of the slope.

My popularity(?)

The data also revealed that I won’t be able to visualize the network consisting of me, my friends and the friends of my friends: there is not enough room to fit 130k nodes and their lines into a single meaningful picture. So I had to settle for the small “me and my friends” variant, a 6000×6000 picture resembling a huge ball of string.

This is me (center) connected to my friends (circumference, equally distributed). The white lemon shape is formed by the names (the captions of the nodes)

284 of us, all connections marked
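The layout from the first picture (me in the center, friends spread evenly on the circumference) boils down to a bit of trigonometry. A minimal sketch, assuming a square canvas; the function name, the `"me"` key and the margin value are made up for this example and do not come from the tool I actually used:

```javascript
// Place "me" in the middle of a square canvas and the friends evenly on a circle.
function circularLayout(friendIds, size = 6000, margin = 300) {
  const center = size / 2;
  const radius = center - margin; // leave room for the name captions
  const positions = new Map();
  positions.set("me", { x: center, y: center });

  friendIds.forEach((id, index) => {
    // Equal angular steps around the full circle.
    const angle = (2 * Math.PI * index) / friendIds.length;
    positions.set(id, {
      x: center + radius * Math.cos(angle),
      y: center + radius * Math.sin(angle),
    });
  });
  return positions;
}
```

Drawing a line between the positions of every connected pair then gives the ball of string.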

I also found that there are about 1500 people named Zsuzsa (or Zsuzsi or Zsuzsanna), about 1000 named Robert (or Robi), and 617 named Szabolcs, based on their profile slugs. Using a fairly short list of women’s names (this logic is dictated by the way a Hungarian name is usually composed) and the very dirty slugs, I concluded that of the 130k users at least 41k “look like” women.
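The name matching can be sketched as below. The slug shapes in the sample and the very short name list are assumptions for illustration; my real list was longer, and the function names are made up. Profiles with integer IDs or nicknames simply do not match, which is why the result is only an “at least” figure.

```javascript
// Illustrative, deliberately short list of women's given names (lowercase, no accents).
const WOMENS_NAMES = new Set(["zsuzsa", "zsuzsi", "zsuzsanna", "anna", "eva", "katalin"]);

// Split a slug into lowercase alphabetic tokens; dots, digits and dashes are separators.
function slugTokens(slug) {
  return slug.toLowerCase().split(/[^a-z]+/).filter(Boolean);
}

// Count the slugs that contain at least one of the given name variants as a token.
function countMatching(slugs, variants) {
  const wanted = new Set(variants);
  return slugs.filter((slug) => slugTokens(slug).some((t) => wanted.has(t))).length;
}

// A slug "looks like" a woman's profile if any token is on the name list.
function looksLikeWoman(slug) {
  return slugTokens(slug).some((t) => WOMENS_NAMES.has(t));
}

// Made-up sample slugs, including one integer ID that cannot match anything.
const slugs = ["kovacs.zsuzsa.12", "zsuzsi.nagy", "szabolcs.toth", "100001234567890", "Robert.Kiss.5"];
console.log(countMatching(slugs, ["robert", "robi"]), slugs.filter(looksLikeWoman).length);
```

Matching whole tokens rather than substrings avoids counting a name that merely appears inside a longer word.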

Conclusions – ethical and legal

When I post something set as “visible to only my friends”, it reaches at most 283 people, which is manageable (though it is up to facebook whether it shows my content to others or not). However, when I set my post to be “visible for friends and friends of friends”, the potential audience may easily dwarf that of a regional newspaper and is comparable to the reach of nationwide media products (see Hungarian radios in 2013). And judging by the diagram above, I have only a few friends compared to others.

So yes, the data shows that whatever happens on facebook, it most likely happens in public space.
