Thursday, January 28, 2010

Shameless promotion

Quoted in an article about our Open Data Bridge efforts.

Wednesday, January 13, 2010

Why Demos

The other day, I was speaking to an ad agency about the use of third-party data in online advertising. I spent a fair bit of time talking about my focus on building out [x+1]'s demographic data set. Toward the end of the talk, someone asked a very interesting question: "If you have buyer propensity or behavioral data, why do you need demographics?"

Hmmm. Why do you need demographics in a world of in-market and intender data? Let me talk a little bit about why demos are useful in online ad targeting, and more specifically, for media targeting.

First, demographics act as useful proxies for life stages and interests. An individual’s life stage and interests are powerful drivers of purchase intent. In fact, demos serve as inputs to the models used to create intender/interest segments (but not in-market status). They are foundational.

Second, demographics are an efficient type of data. An ad network can use the same data across a wide variety of product categories. So, get the data once, use it many times. This reduces the amount of integration work your engineering team needs to do and speeds time to market for product-specific targeting.

Third, demographic data is available for large portions of the internet audience. Each of the large online data providers claims to cover roughly 30% of the audience. By contrast, counts for intender data are fairly small. How many people at a given time are in-market for airline flights to Mexico? As a percentage of users seen during an ad campaign, the number is certainly in the low single digits.

Fourth, demographic data will be commoditized. I am not suggesting that it will become cheap. I mean it in the classic sense of a commodity: one source is as good as another and comparable to a standard. This is not the case today. Some providers are more accurate than others, but over time, I would think that there will be little to distinguish one data provider from another. This means that, unlike the intender and in-market data, we'll be able to "stitch" together multiple demographic providers to create a file that provides demographics for a fairly wide set of users. Each provider has a unique (but overlapping) set of users, so we are going to want to combine datasets. Demographic data is relatively easy to combine across providers. By contrast, each provider of intender and in-market data defines its own segments, meaning that we are going to need to treat each data source separately. For a longer discussion of creating an aggregate demographics database, see my article here.
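To make the "stitching" idea concrete, here is a minimal Python sketch. The provider names, user IDs, and attribute names (age_band, income_band) are made up for illustration, and the rule that the first-listed provider wins a disagreement is an assumption, not a description of how [x+1] actually combines data.

```python
def stitch_demographics(providers):
    """Combine per-provider demographic files into one file.

    `providers` is a list of (name, records) pairs in order of preference
    (assumed rule: earlier providers win when two providers disagree).
    `records` maps user_id -> {attribute: value}.
    """
    combined = {}
    for _name, records in providers:
        for user_id, attrs in records.items():
            merged = combined.setdefault(user_id, {})
            for attr, value in attrs.items():
                # Keep the value from the most-preferred provider and only
                # fill in attributes not yet known for this user.
                merged.setdefault(attr, value)
    return combined


# Hypothetical sample data: two providers with overlapping users.
provider_a = {
    "u1": {"age_band": "25-34", "income_band": "50-75k"},
    "u2": {"age_band": "35-44"},
}
provider_b = {
    "u2": {"age_band": "45-54", "income_band": "75-100k"},
    "u3": {"age_band": "18-24"},
}

stitched = stitch_demographics([("A", provider_a), ("B", provider_b)])
```

A key-by-key merge like this only makes sense because the attributes mean the same thing from one provider to the next. Intender and in-market segments, which each provider defines differently, could not be combined this way.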

Powerful predictors of likely relevance, broadly useful, for many users, simple, and standardized. All good. So, what’s the catch?

I can see three challenges on the demographic side of data. First, the cost of using demographic data has to be low enough for ad networks and agencies to apply the data to all of their ad decisioning. Online data is not yet commoditized (in the classical sense), but I believe it will eventually become so.

Second, most companies don't yet know the number of unique users each data provider can reach. At [x+1] we use enough third-party data to have a pretty strong idea of what works for a given campaign, but most folks don't have enough experience to understand the reach they can get from each data provider. The value of each provider's data is additive to the extent that it covers unique users. If the providers are not supplying data on unique users, then the path to commoditization begins. The providers would be supplying the same product. By definition, the data would be a commodity.
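The "additive" point can be checked with simple set arithmetic. The sketch below is only an illustration: the provider names and user IDs are invented, and it assumes you can line up each provider's users against a common user ID.

```python
def incremental_reach(providers):
    """Report how many new users each provider adds, in the order given.

    `providers` is a list of (name, set_of_user_ids) pairs.
    Returns a list of (name, new_users, cumulative_users) tuples.
    """
    seen = set()
    report = []
    for name, users in providers:
        new_users = users - seen  # users no earlier provider covered
        seen |= users
        report.append((name, len(new_users), len(seen)))
    return report


# Hypothetical sample data: three providers with overlapping audiences.
providers = [
    ("A", {"u1", "u2", "u3", "u4"}),
    ("B", {"u3", "u4", "u5"}),
    ("C", {"u1", "u5"}),
]

report = incremental_reach(providers)
for name, new_users, cumulative in report:
    print(f"{name}: {new_users} new users, {cumulative} reached in total")
```

In this made-up example provider C adds no one the first two providers did not already cover, which is the situation where providers are effectively supplying the same product.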

Lastly, the data providers vary in accuracy. Online, it is difficult to assess accuracy. You need to find a source of “truth,” and advertisers are often reluctant to share their verified customer files with ad networks. Some ad networks rely on straight lift to assess the value of a data set; they don’t worry about accuracy. The problem with this approach is that it tends to be brittle. Data sources with only some level of accuracy are useful for a little while, as long as they are being used to target the users they can accurately associate with a given data element. Over time, their predictive power degrades. I am a big believer in taking the time and care to find data sources that accurately represent the users’ age, income, whatever. As the accuracy of your data improves, you can be more confident in the longevity of your targeting strategy.
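If you do get access to a source of truth, the accuracy check itself is straightforward. Here is a minimal sketch, assuming a verified customer file and a provider file keyed by the same user ID; the records and the age_band attribute are hypothetical.

```python
def attribute_accuracy(provider_records, truth_records, attribute):
    """Compare a provider's values for one attribute against a truth file.

    Both arguments map user_id -> {attribute: value}. Only users present
    in both files with the attribute populated are compared.
    Returns (accuracy, users_compared); accuracy is None if no overlap.
    """
    matches = 0
    compared = 0
    for user_id, truth in truth_records.items():
        claimed = provider_records.get(user_id, {})
        if attribute in truth and attribute in claimed:
            compared += 1
            if claimed[attribute] == truth[attribute]:
                matches += 1
    if compared == 0:
        return None, 0
    return matches / compared, compared


# Hypothetical sample data: a verified customer file and one provider file.
truth_file = {
    "u1": {"age_band": "25-34"},
    "u2": {"age_band": "35-44"},
    "u3": {"age_band": "45-54"},
    "u4": {"age_band": "18-24"},
}
provider_file = {
    "u1": {"age_band": "25-34"},
    "u2": {"age_band": "35-44"},
    "u3": {"age_band": "25-34"},  # disagrees with the truth file
    "u5": {"age_band": "55-64"},  # not in the truth file, so not compared
}

accuracy, users_compared = attribute_accuracy(provider_file, truth_file, "age_band")
```

Reporting the number of users compared alongside the accuracy matters, because a high match rate on a tiny overlap says very little about the provider as a whole.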

One last point: should the data providers worry about commoditization of demographic data? If I were them, I would not be losing any sleep over it. In this case, I think commoditization would be good for the data providers. They would get less money per user on any given transaction, but they would truly make it up in volume, and because their product has zero marginal cost, the added volume is a good thing. In the offline world, that dynamic has played out to the benefit of Acxiom, Equifax, Experian, InfoUSA, etc.

And for those interested, I have been giving talks to agencies and advertisers on the online third-party data landscape. I would be happy to talk to your teams about what kinds of third-party data are coming online, why they should care, and how/when we expect to be able to use the data. There are very interesting capabilities being developed. Please contact me at krona@xplusone.com if you would like to know more.