I first heard the term “Dark Social” on a client call about two months ago. I googled it and balked at the definition:
“Dark social describes any web traffic that’s not attributed to a known source”
The term was allegedly coined by this article: http://www.theatlantic.com/technology/archive/2012/10/dark-social-we-have-the-whole-history-of-the-web-wrong/263523/. It does call out important points about the whole theory, albeit in a very flowery context. Why did they choose to call it dark social? I guess that’s my biggest issue with it, since it’s really just “dark traffic”. To say that social makes up a more significant portion of this bucket than any other identified source is a giant leap.
Source Reporting
Take your standard sources breakdown, which is often our main starting point for segmenting web analytics data. You might be using the built-in classifications from your tool, or getting more specific as part of some custom analysis process. Either way, these all rely on the same data points to determine the source breakdown.
We can see our Direct bucket as 40% of total traffic. This “direct” identification comes from the lack of any other context about the traffic source: no referring URL and no query string parameters. Every other source is identified from those same two pieces of data.
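To make that dependence concrete, here is a minimal sketch of how a tool might assign a source bucket. It is not any particular vendor's logic: the function name `classify_source`, the short host lists, and the `utm_medium` / `utm_source` parameter names are all assumptions chosen for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical lookup tables; a real tool would ship far longer lists.
SEARCH_HOSTS = {"www.google.com", "www.bing.com", "duckduckgo.com"}
SOCIAL_HOSTS = {"www.facebook.com", "t.co", "www.linkedin.com"}


def classify_source(referrer, landing_url):
    """Assign a visit to a source bucket from its referrer and landing URL."""
    params = parse_qs(urlparse(landing_url).query)
    # Campaign parameters win when present (the utm_* naming is an assumption).
    if "utm_medium" in params:
        return params["utm_medium"][0].lower()
    if "utm_source" in params:
        return "campaign"
    if referrer:
        host = urlparse(referrer).netloc.lower()
        if host in SEARCH_HOSTS:
            return "search"
        if host in SOCIAL_HOSTS:
            return "social"
        return "referral"
    # No referrer and no campaign parameters: nothing left to go on.
    return "direct"
```

Note that a link shared on a social network lands in “direct” here as soon as the referrer is stripped and the URL carries no campaign parameters, exactly like a bookmark would.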
Direct – or not.
Before we leap to classifying this large chunk of traffic as “Social” maybe we should think about what else it could be:
- Any of the other identifiable sources lacking the normal ways of identification (gasp!)
  - Blocked referrers due to user-privacy settings
  - Dropped referrers due to bad redirects
  - Improperly trafficked paid traffic lacking campaign tracking parameters in URL
  - Improperly trafficked social and/or other channel traffic without campaign tracking parameters in URL
- True “direct” traffic
  - Bookmarks
  - Return traffic
  - All other non-attributable non-social
- True “dark social” traffic
  - Word-of-mouth
  - IM Sharing
  - All other non-attributable social
When we look at the laundry list of reasons a traffic source can’t be identified, it’s important to make sure that you as the analyst, or your company, have a handle on the overall campaign strategy for attributing traffic. If any one of these is less than optimal, your “dark social” assignment is just putting a name to “poor execution of marketing”. I think you can expect at least some percentage of all unidentifiable traffic to belong in the existing source buckets, likely in close proportion to the identifiable ones.
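As a rough illustration of that last point, the sketch below moves a share of “direct” back into the identified buckets in proportion to their size. Everything in it is made up for the example: the visit counts, the function name `redistribute_direct`, and above all the `misattributed_share` of 0.5, which is a guess you would have to justify from your own tracking audit.

```python
def redistribute_direct(visits, misattributed_share):
    """Move a share of 'direct' into the identified buckets, pro rata."""
    identified = {k: v for k, v in visits.items() if k != "direct"}
    identified_total = sum(identified.values())
    if identified_total == 0:
        raise ValueError("no identified traffic to use as a baseline")
    # Visits assumed to be mislabelled as direct (the share is an assumption).
    moved = visits["direct"] * misattributed_share
    adjusted = {
        source: count + moved * count / identified_total
        for source, count in identified.items()
    }
    adjusted["direct"] = visits["direct"] - moved
    return adjusted


# Hypothetical counts, with direct at 40% of 1,000 visits.
visits = {"direct": 400, "search": 300, "social": 120, "referral": 100, "email": 80}
adjusted = redistribute_direct(visits, misattributed_share=0.5)
```

With these invented numbers, 200 visits leave “direct” and social grows from 120 to 160, while search absorbs the largest piece simply because it was the largest identified bucket to begin with.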
How should you use it?
As mentioned, before you attempt to classify anything as “dark social” you need to make sure that things aren’t falling into “direct” for other reasons.
- Paid traffic without proper tracking
- Bad redirects dropping referrers
- Social campaigns without proper campaign tracking parameters
One place where I do like how dark social is applied (from http://www.theatlantic.com/technology/archive/2012/10/dark-social-we-have-the-whole-history-of-the-web-wrong/263523/):
The first was people who were going to a homepage (theatlantic.com) or a subject landing page (theatlantic.com/politics). The second were people going to any other page, that is to say, all of our articles. These people, they figured, were following some sort of link because no one actually types “http://www.theatlantic.com/technology/archive/2012/10/atlast-the-gargantuan-telescope-designed-to-find-life-on-other-planets/263409/.” They started counting these people as what they call direct social.
However, all things considered, I think this really helps you attribute true direct rather than this concept of dark social. I’d be more comfortable calling visits that enter on those top-level pages from unidentifiable sources “true direct”, and visits that enter on the deeper-level pages “dark social”. I’d almost rather we bring back “word of mouth” for this bucket…
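The split described above can be sketched as a simple check on the landing page path. This is only one way to draw the line: the function name `split_direct` and the `max_top_level_depth` cutoff of one path segment are assumptions, and it should only be applied to visits that already landed in the unidentifiable bucket.

```python
from urllib.parse import urlparse


def split_direct(landing_url, max_top_level_depth=1):
    """Label an unattributed visit by how deep its landing page sits."""
    # Expects a full URL with a scheme, e.g. "http://...".
    path = urlparse(landing_url).path
    segments = [s for s in path.split("/") if s]
    # Homepage or a section landing page: plausibly typed or bookmarked.
    if len(segments) <= max_top_level_depth:
        return "true direct"
    # Deep article URL: someone almost certainly followed a shared link.
    return "dark social"
```

Using the pages named in the quote, `http://www.theatlantic.com/` and `http://www.theatlantic.com/politics` would count as “true direct”, while the long telescope article URL would count as “dark social”.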