Tuesday, December 11, 2007

Kenny's 1 rule of segmentation

I was reading the segmentation post on Analytic Engine and it reminded me that I wanted to post my standard segmentation rant. First, a story. If you don't know anything about market segmentation, check out the Wikipedia article.

When I first got to A fOrmer empoLoyer (think a large web portal), I found that we were big users of Prizm clustering, by Claritas. For those of you who have not seen the Prizm product, it is cool. The folks at Claritas have taken the whole US population and run a cluster analysis on, well, us. They have classified each household into 66 unique segments. The segments group households that have similar lifestyle and socio-economic traits. They are a really neat way to market to a pre-defined demographic and get some basic insight into who your customers are.

One problem, though: A fOrmer empoLoyer's targeting efforts were not based on simple demographics. We built fairly sophisticated models to predict who would accept a given offer. In our case, Prizm clusters were rarely predictive, over and above the other variables we had available. People at the company had tried to use Prizm (and some of the other Claritas products) for all kinds of marketing-y things and they just did not provide value.

So, some bright person at A fOrmer empoLoyer said, "We need to develop our own segmentation scheme. We'll provide a 'Universal Segmentation' (that is really what it was called) that can be used for any marketing or targeting activity." Strike two. The Universal Segmentation was a worse solution than Prizm and never made it out of the lab. Actually, A fOrmer empoLoyer took several runs at building a custom segmentation scheme using cluster analysis. None of them were found to be useful. As an aside, I mentioned the Universal Segmentation project to an expert in segmentation and he laughed and laughed. It was a bonding moment.

So why did A fOrmer empoLoyer have so much difficulty with a tool that is in use by marketers everywhere? In both cases, the segments were not built with A fOrmer empoLoyer's needs in mind. They tried to do too much.

To finish the story. A fOrmer empoLoyer had been sending out a mass email to the whole customer base as a way of stimulating engagement and generating page views. The program was, uh, less than effective. I convinced the Customer Engagement folks to let us build a custom segmentation scheme for their email newsletter program that categorized each customer on the basis of the content they visited. In the spirit of transparency, Omniture did most of the work; they had the data. The thought was that the content someone viewed was a good proxy for their interests.

From the segmentation, we found that A fOrmer empoLoyer had fewer than 20 unique customer segments, and that a handful of those segments described most of our customers. The Member Engagement staff started to create newsletters for each of the large segments. We had just started to use the segmentation scheme before A fOrmer empoLoyer stopped marketing, but the first campaign had much higher open rates than the mass mailing approach.
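The content-as-proxy idea is simple enough to sketch. This is not what Omniture actually built; it's a minimal illustration, with made-up customer IDs and content categories, of turning a raw page-view log into per-customer category shares that a clustering algorithm could then segment on:

```python
from collections import Counter, defaultdict

# Hypothetical page-view log: (customer_id, content_category).
# In the real project the data came from Omniture; these rows are invented.
views = [
    ("c1", "sports"), ("c1", "sports"), ("c1", "finance"),
    ("c2", "news"),   ("c2", "news"),   ("c2", "news"),
    ("c3", "sports"), ("c3", "finance"),
]

def content_profile(view_log):
    """Turn raw page views into per-customer category shares --
    the 'content viewed as a proxy for interests' feature set."""
    counts = defaultdict(Counter)
    for customer, category in view_log:
        counts[customer][category] += 1
    profiles = {}
    for customer, cats in counts.items():
        total = sum(cats.values())
        profiles[customer] = {c: n / total for c, n in cats.items()}
    return profiles

profiles = content_profile(views)
# c1 spends two thirds of visits on sports -> a "sports" segment candidate
print(profiles["c1"])
```

Note that the feature set is nothing but content shares, which is exactly why the resulting segments map directly onto newsletter topics.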

The lesson here is that when you initiate a segmentation project, you need to be really thoughtful about what you are trying to accomplish. Don't use every variable that you have available and just start cranking the k-means. Your segments will not be interpretable, and thus (you never see thus any more) won't be actionable. You need focus.
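To make the "focus" point concrete, here is a toy k-means run on just two deliberately chosen variables (the sports and finance content shares from the hypothetical example above; the data and the tiny from-scratch k-means are illustrations, not anyone's production code). With so few, clearly meaningful variables, the centroids read off directly as named segments:

```python
import numpy as np

# Hypothetical customers described by ONLY two actionable variables:
# share of visits to sports content and share to finance content.
X = np.array([
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.10],   # heavy sports readers
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],   # heavy finance readers
])

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's-algorithm k-means, small enough to read in one sitting."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

labels, centers = kmeans(X, k=2)
# Each centroid is interpretable on sight: one sits near (high sports,
# low finance), the other near (low sports, high finance).
print(labels, centers, sep="\n")
```

Run the same exercise with fifty demographic, behavioral, and attitudinal variables thrown in and the centroids become uninterpretable mush, which is the whole rant in one sentence.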

Instead, think about what you are trying to accomplish (e.g., being able to classify your customers into demographic groups) and build datasets that only include variables that are actionable or would give you insight about your customers. Build a segmentation scheme for one purpose. And when the guys at Claritas say, "We have been doing this for 20 years. We have this nut cracked. Use our segmentation scheme," ask yourself: why do they have several segmentation products?

I could say some other stuff about creating segments, but if you focus on what you are trying to accomplish and build your dataset accordingly, you should have good results.
