Friday, December 4, 2009

Why Demos

I wrote a piece for Ad Exchanger on why internet marketers want demographic data. I'll post the text in the next couple of days.

Friday, October 23, 2009

Where is the US household file?

Conducting an offline direct marketing campaign is relatively easy. You can call any one of a number of data providers to get a US household file (that is, demographics on 115 MM US households), run a test campaign, figure out the profile of who responded to the campaign, and you are off to the races. The data exists. You just have to crank up your favorite LOGIT tool and you are in business. In the online space, not so much.


In the online world, there is no single vendor that has the US online user file. Why? It is hard to identify users online in a way that protects privacy and is meaningful for marketers. Though they are all getting better, no single online method of tagging users for demographics has over 30% of the online audience. So, in order to know basic demographics, you need to combine data from multiple data sources.

Say you have signed up multiple data providers. Are you ready to go? No. You now need to make some trade-offs on accuracy and comparability. What do I mean by that? Well, all of the data sources have varying levels of accuracy. Is IP based better than cookie based? Do you have a file that you can use to know truth? Also, the data sources may report data at the user level (though anonymous) or the household or the zip4 or the zip. Seems like you should be as close to the user level as you can be, but what about the accuracy issue? Is it better to have accurate data at the Z4 level or less accurate data at the user level? By the way, it is going to depend on the data type and category. All in all, very complicated stuff.

What is an online marketer to do? You have two choices on this. Test and learn, knowing that you are going to need to invest time and resources in getting a good set of data providers in place. Or, self-servingly, work with someone who has already climbed up the learning curve :).

Monday, October 12, 2009

Got some press

My new company put together a press release. Check it out here. Too funny.

Wednesday, September 30, 2009

How to start a new job

I was talking to a colleague the other day who was about to start a new job. She had been at her previous company for about 10 years and wanted some pointers on how to make a successful transition. We talked for a while about making a strong start and I gave her three pieces of advice. First, spend the first month listening. Take everyone you can out to lunch (your internal clients and partners, your staff, vendors) and ask them how you can help them be more effective. Get invited to staff meetings and listen to the problems people are wrestling with. Just gather perspectives and try to listen very carefully. In every role, there will be opportunities that you can leverage to get a strong start. Try to find those opportunities. Low hanging fruit and all that.

Second, take what you have learned and come up with a plan to harvest the fruit. Talk to your manager about what you are going to accomplish and get her agreement on the plan. You want fast wins here, so try to avoid committing to a project that is going to take two years to complete. Also, be very wary of projects that have been lying around, incomplete. There may be a reason why the "raw data feed" or some such project has made no progress in 6 months. You can take those projects on, but try to push them back until after you have a track record.

Third, and I hate to be this tactical: Don't walk in and say things like "At we filtered our web log data using SAS" or "At we had the data warehouse take care of this problem". These may be true and relevant observations, but new co-workers react funnily to the comparisons. In some ways, talking about your former employer is like name dropping a celebrity that you once hung out with, and it is irritating for the same reason. Once is interesting, but it gets old fast. You can say "at a previous employer we did x", but don't use the company's name. It sounds trivial, but take my word for it, people get tired of you saying "We used Lotus Notes at McKinsey."

And on that note, I get a chance to follow my own advice. I am leaving IXI and moving to [X+1] as the Vice President for Analytics and Data Strategy. I only work for companies that have an X in their name. Look for the blog to be more active, as part of my role will be evangelism in the space.

Tuesday, June 9, 2009

Brute force vs. smarts

My dissertation was, in part, about how to encourage people, when solving problems, to think about the information they already have available to them and to not just gather information for its own sake. I found that you can save a lot of money if you charge just a token amount for each new piece of information. When you charge for information, people think more deeply about the information in their possession and stop asking for information that they don't really need to solve the problem. I had a real world brush with this phenomenon the other day.


I got a call from someone who had an enormous database (multi-petabytes) and was looking for some advice on how to "scale" the database. I almost choked. How much more scale do you need? They were saving every piece of information they gathered from their customers and were afraid to throw anything away. "We don't know what we are going to need in the future" was the refrain.

In my mind, the organization was not thinking hard about the data they had and how to use it efficiently. Rather they let cheap storage lead them down an easy path. The brute force path. The engineering path. I can tell you from experience that, assuming money, the technical folks will solve the problem of increasing storage. The organization still thought of their challenge as one of engineering. "How do we save even more data?" But the engineers can't fix the underlying problem; the organization was not being thoughtful about the data they were saving. At AOL, we did an analysis and found that for predictive modeling, we relied on a small set of data (less than 100 variables) and only used 10-15 for any given use. We had several thousand variables available to us, but most either were correlated with other variables and could be deleted with no loss of usability, supported a business we were no longer in, or were being saved for no reason that we could determine, other than that it was easy.
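
To make the "correlated with other variables" point concrete, here is a minimal sketch of one way to spot redundant variables. It is not the process we used; the variable names, the simulated data, and the 0.95 cutoff are all assumptions made up for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(vx * vy)

def prune_correlated(columns, threshold=0.95):
    """Keep a variable only if it is not highly correlated with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Simulated data: tenure_years is just tenure_months rescaled, so it adds nothing.
random.seed(0)
tenure = [random.uniform(0, 120) for _ in range(500)]
columns = {
    "tenure_months": tenure,
    "tenure_years": [t / 12 for t in tenure],
    "sessions": [random.gauss(20, 5) for _ in range(500)],  # unrelated to tenure
}
kept = prune_correlated(columns)
print(kept)
```

The order of the columns matters in this sketch: the first variable of a correlated pair is the one that survives, so put the variables you trust most first.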

I would suggest to companies looking to "scale their databases" to first do an inventory of the data they are saving and develop a simple process to determine if the data is worth saving. In our case at AOL, each modeler assigned a letter grade to each variable and we "voted" on the data to give away. And, after this process, at no time did we say "I wish we had kept variable x." Reality was we still had too much data.

Monday, June 8, 2009

Thinking about BI recently

It occurred to me that using a BI tool is a hard way to gain insight. You are limited by your own imagination. I like hypothesis driven analysis, but I think you can do a much better job in providing insight if you understand a bit of econometrics (Logit and OLS). You can simply run a stepwise regression on the variable you are trying to understand (say Life Time Value of a customer) and see what pops out (say the interaction of age and education). Once you see what variables pop, you can then use BI tools to illustrate the point. Critical piece, make sure you run all of the interactions. That is where the cool stuff lies.
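
A real stepwise regression needs a statistics package, so the sketch below swaps in something simpler: it builds every pairwise interaction and ranks all candidate terms by their absolute correlation with the outcome. That is a screening step, not stepwise regression. The data is simulated, the inputs are assumed to be mean-centered, and the variable names are made up.

```python
import itertools
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

random.seed(1)
n = 1000
# Simulated, mean-centered inputs (assumption: real data would be centered first).
data = {
    "age": [random.uniform(-1, 1) for _ in range(n)],
    "education": [random.uniform(-1, 1) for _ in range(n)],
    "income": [random.uniform(-1, 1) for _ in range(n)],
}
# In this toy world, lifetime value is driven only by the age x education interaction.
ltv = [3 * a * e + random.gauss(0, 0.3) for a, e in zip(data["age"], data["education"])]

# Candidate terms: every main effect plus every pairwise interaction.
candidates = dict(data)
for a, b in itertools.combinations(data, 2):
    candidates[f"{a}*{b}"] = [x * y for x, y in zip(data[a], data[b])]

ranked = sorted(candidates, key=lambda name: abs(pearson(candidates[name], ltv)), reverse=True)
for name in ranked:
    print(name, round(pearson(candidates[name], ltv), 2))
```

If you only looked at the main effects here, nothing would pop. The interaction term is where the signal is, which is the reason to run all of the interactions.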

Data Simplification

When I was trained, I was told that you should never take continuous level data and make it categorical. One of the guiding principles of regression analysis is that variance is good; never reduce it by simplifying your data and creating categories. Maybe an example is in order:


Say you have a variable like temperature. This data is continuous; it is not bounded (except in extreme cases) and the temperature of something can take a wide variety of values. In regression, there would be no reason to define ranges of temperature (0-10, 11-20, 21-30, etc). The computer does the work, and if you created the ranges you might reduce the explanatory power of the data (or, if the data was used as a dependent variable, make it harder for other variables to predict its value). So, in research, categorizing data is a no-no. So said Professor Feldman.

Funny thing was that I had a staff member (David) who kept telling me that while the theory was right, you can usually create categories without much loss of predictive power. And in certain applications, working with categories is much easier than working with continuous data (ad serving is one such application, but that is another post).

The other day, I went through the exercise. I took continuous level data and made it categorical. David was right. Prof was wrong. At least in the world of Digital Marketing. The data retained its power and it is easier for consumers of the data to use it. Having said that, you still need a fair number of categories (over 10) to retain the power. Even still, I thought this was a fact worth knowing.
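
If you want to try the exercise yourself, here is a rough sketch on simulated data (not the data I used). It compares the R-squared of a straight-line fit on the continuous variable with the R-squared you get from predicting each row with the mean of its category. The variable, the noise level, and the bin counts are all assumptions.

```python
import random

def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total

def linear_predictions(xs, ys):
    """Fit y = a + b*x by least squares and return the fitted values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return [my + slope * (x - mx) for x in xs]

def binned_predictions(xs, ys, n_bins, low=0.0, high=100.0):
    """Cut x into equal-width categories and predict each row with its category mean."""
    width = (high - low) / n_bins
    bins = [min(int((x - low) / width), n_bins - 1) for x in xs]
    totals, counts = [0.0] * n_bins, [0] * n_bins
    for b, y in zip(bins, ys):
        totals[b] += y
        counts[b] += 1
    means = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    return [means[b] for b in bins]

# Simulated data: a continuous predictor on a 0-100 scale with a noisy linear effect.
random.seed(42)
x = [random.uniform(0, 100) for _ in range(2000)]
y = [2 * v + random.gauss(0, 20) for v in x]

r2_continuous = r_squared(y, linear_predictions(x, y))
r2_10_bins = r_squared(y, binned_predictions(x, y, 10))
r2_3_bins = r_squared(y, binned_predictions(x, y, 3))
print(round(r2_continuous, 3), round(r2_10_bins, 3), round(r2_3_bins, 3))
```

In this toy setup, ten categories give up very little explanatory power while three give up noticeably more. Your own data may behave differently, so it is worth running the comparison before you commit to a category scheme.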

Monday, October 13, 2008

On data quality

I believe that the single easiest and most impactful thing you can do to improve your analytics is to check your data quality. Sometimes, however, ensuring quality data can have a direct impact on improving your business results. Case in point, I have a client whose business depends on the accuracy of personal information given to them by their customers. However, they did no address verification on the data; they took whatever information was provided without checking its accuracy. We just did a check on the quality of their data. Turns out over 20% of the people do not give accurate personal information. This is an easy fix: Put in an address verification system to make sure that at least the addresses they get are valid. If someone is going to make up an address, at least force them to give you a valid address. This can only make the problem better.

Best Electoral Prediction Site

One of the things that drives me nuts during campaign season is the reporting of national polls. There is a little thing called the Electoral College, CNN. Ever heard of it? You need to look at the state level polling. Problem is, the state level polls are often conflicting. The margins of error can be large or the results not reliable due to bias in the polling methodology. What is a political junkie to do? Go to Five Thirty Eight. This is the best site I have seen on predicting the outcome of the election. In fact, if you had asked me how to predict the election, I would have suggested something like this. Note that they don't say who is going to win, but give the probability of a win by either candidate.

Thursday, August 28, 2008

Hammer on Analytics

I used to be a sound engineer and one of my clients was MC Hammer. In fact, at one show in LA, I told him that he was spending too much on his entourage and he was going to go broke. Well, here we are, 20 years later, and we are both commenting on analytics.

Thursday, June 12, 2008

Made the papers

Here is a very brief writeup of my business intelligence talk at the AICPA. For those who attended, thanks for the warm welcome. I enjoyed getting in front of the CPA crowd.

Tuesday, May 27, 2008

How to block ads

I got tired of looking at all the internet advertising and just installed some ad-blocking software. Link to the article that inspired me, here. The plugins are for Firefox. Updating the hosts file was simple. I just searched for the file name "hosts" and then added the big list of advertisers to both files that were found. No more ads, but Yahoo looks weird.

Thursday, May 15, 2008

Add more data? No, just more understanding

A couple of months ago, Anand Rajaraman, a professor at Stanford who teaches a class on data mining, wrote a blog post talking about a class assignment; students have to do a non-trivial data mining exercise. A bunch of his students decided to go after the Netflix prize, a contest, run by Netflix, to see if anyone could improve their movie suggestion algorithm by greater than 10%. I love these kinds of contests. One team in Anand's class added data from the Internet Movie Database to the Netflix supplied dataset. Another team did not add data; instead, they spent time on optimizing the recommendation algorithm. Turns out that the team that added data did much better on movie recommendations. So good, in fact, that they made the leaderboard. So what should we take away from this?

Anand suggests "adding more, independent data usually beats out designing ever-better algorithms to analyze an existing data set." He is right, but glosses over a critical word "independent." That is, the data being added has to not be correlated with the existing data. It needs to represent something new.

My take: The team that added the data were smart and operationalized descriptions of the movies better than the Netflix data. They found a dataset that added a missing theoretical construct, good descriptions of movies, and that made the difference. Just adding a bunch of data is not the takeaway here. (At my previous employer, we had over 8000 variables at the household level (we did a lot of transformation) that we could use to predict if someone was going to take an offer. In a typical model, we used less than 20 variables. Of the 20 models we had in production, we used less than 200 of the 8000 variables.)

So what is the secret sauce to model improvement? Adding data that operationalized a construct previously in the error term. In English: The team thought about what factors (what I was calling theoretical constructs) could possibly be used in a recommendation system and went to find data that could be a proxy for those factors. You should be willing to add (read: pay for) more data, but only if it measures something where you don't have an effective proxy.

Wednesday, May 14, 2008

Winds of Change video

Have you ever seen the Kodak "Winds of Change" video? It has nothing to do with analytics, but I love the way that they confront the brand perception of Kodak as no longer relevant and show that they understand the issue and are working to become relevant again. I heard the CMO speak and he said that he almost got fired over this video. It was an internal piece that got out. Turned out that it was a viral hit and did wonders for the re-branding.

Monday, May 12, 2008

Tips for implementing a BI project

I am speaking at AICPA in Las Vegas on Business Intelligence. My talk is supposed to be a "lessons learned" kind of case study on using BI. I developed 11 tips for rolling out a BI solution. Some of these may look very familiar:

Tip 1:
When deciding what to show in your BI tools, use a balanced scorecard approach.
Balanced scorecards provide a nice framework for thinking about developing useful metrics.

Tip 2:
Select the right hardware.
We needed a “Data Appliance” like Netezza. Feel free to overbuy. Your future self will thank you.

Tip 3:
Take your time building requirements.
Figure out who is going to use the data and for what. What are the needs going to be a year from now? Three years from now?

Tip 4:
Conduct a data quality audit.
Check for missing data, unusual variability, and unusual stability.

Tip 5:
Make your data warehouse responsible for fixing data quality problems.
Don’t try to build in work-arounds. You will have bought yourself a bunch of maintenance headaches. Let the guys who are supposed to give you clean data do their jobs.

Tip 6:
Provide some context for each metric.
Show trends, goals, and/or min-max for each metric. This will allow the exec to decide if some metric is worth further attention.

Tip 7:
Enable drill down on your charts (but don’t overdo it).
When an exec sees something “anomalous” they are going to want to see if they can figure out what is going on. Computer users are trained to click on things they are curious about. Leverage this behavior.

Tip 8:
Avoid being “flashy” and cool.
Keep your charts simple and redundant. Allow your audience to learn how to consume your BI quickly, not be impressed with your team's programming skills.

Tip 9:
Conduct 1-on-1 sessions with senior execs to ensure that they find the BI tools useful and informative.
Adoption of these things is much harder than technical implementation. Do anything you can do to drive adoption.

Tip 10:
Choose a software package for ease of integration.
Time spent integrating is not worth the loss of strategic use of the data. Remember, the time you take to get things working right has a cost to the business.
All of the major BI vendors have very similar functionality, and the differences are not likely to have any impact on business decision making.

Tip 11:
Be ruthlessly simple about what metrics you show. Complexity is your enemy.
Strive for few, but very meaningful metrics. Too often, you are going to want to create complex reports. Fight the urge. They will never be looked at. In this context, I will always sacrifice completeness for simplicity.
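
The audit in Tip 4 can be roughed out in a few lines. This is only a sketch under assumptions: the function name, the thresholds, and the sample columns are invented for illustration, and "unusual variability" is approximated with a median-based outlier check rather than anything more formal.

```python
import statistics
from collections import Counter

def audit_column(values, max_missing=0.05, max_mode_share=0.95, max_mad_ratio=10.0):
    """Flag the three audit items for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    missing_rate = 1 - len(present) / len(values)
    report = {
        "missing_rate": missing_rate,
        "too_many_missing": missing_rate > max_missing,
        "unusually_stable": False,
        "unusually_variable": False,
    }
    if not present:
        return report
    # Unusual stability: one value accounts for nearly every row.
    mode_share = Counter(present).most_common(1)[0][1] / len(present)
    report["unusually_stable"] = mode_share > max_mode_share
    # Unusual variability: some value sits far from the median, measured in
    # units of the median absolute deviation (MAD).
    median = statistics.median(present)
    mad = statistics.median([abs(v - median) for v in present])
    if mad > 0:
        report["unusually_variable"] = any(
            abs(v - median) / mad > max_mad_ratio for v in present
        )
    return report

# Made-up columns: one with a gap and a wild value, one that almost never changes.
income = [52, 48, 50, 51, None, 49, 50, 47, 53, 5000]
opted_in = [1] * 99 + [0]

income_report = audit_column(income)
opted_in_report = audit_column(opted_in)
print(income_report)
print(opted_in_report)
```

None of the flags mean the data is wrong. They tell you where to go ask questions of the people who own the feed.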

Tuesday, April 15, 2008

At Ad-Tech

I don't think I ever posted that I took a new job. I am the Senior Vice President of Internet Products for IXI Corp. IXI is a financial data consortium that collects personal and business asset data from financial service providers, cleans it up, and provides it back to member firms for use in marketing, resourcing, and strategic decision making.

And on that note, I am in San Fran boning up on the latest in Web-based advertising. There are a number of uses of IXI's credit data for ad targeting and fraud prevention.

Wednesday, March 5, 2008

Free data!

I have a longer post about selecting data for modeling, but for now, just know that the UN has put its statistical data online. Perhaps the nicest feature is that the site will search across all of their published datasets. I like adding macro level data into modeling and customer insight projects and the UN is a good source.

Tuesday, March 4, 2008

We're number 1!

Interestingly, a search for Business Analytics Blog, on Google, brings up Da Facto in the number 1 spot. It is a little niche-y, but still.

What is the secret sauce in direct marketing?

I am a big fan of what I call tribal wisdom. Kevin Hillstrom put together a post of database marketing tribal wisdom. The item that most resonates for me? Segmentation and treating segments differently for marketing purposes. This was one of the first things I took from my McKinsey experience. Much of what he talks about are just specific instances of differential treatment of segments. Nice post.

More on data quality

I was speaking with someone about ways to assess data quality for predictive modeling. I have written on tactically how to ensure quality data, but here is a little framework you can use when thinking about data quality. Your data needs to be accurate, granular, and complete. Accuracy of data: Does the data accurately capture the attribute (e.g., income) that it was intended to? Granularity of data: In every case, the more granular the data, the better the predictive modeling can be. Also, you can usually roll up granular data (from individuals to households, households to zip+4, etc.) to a higher level if you need to (for analytic or appending purposes). Completeness of data: Any given dataset is going to have missing data. Missing data is a funny thing. Of the three (accuracy, granularity, and completeness), the last is the one that you can most influence. Obviously, the less missing data, the better. But before choosing an appropriate remediation method you need to understand why the data is missing. If the problem is that the datafeeds are broken, you are going to need to get the feeds fixed. If the data does not exist, but you have enough coverage to do some predictive modeling, you can predict the values of missing data. Or you may just need to fill missing values with the mean or median values.
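
The simplest of those remedies, filling with the median, looks something like the sketch below. The function name and the income figures are made up, and keeping a "was missing" flag next to the filled values is my own assumption about good practice, not a required step.

```python
import statistics

def impute_median(values):
    """Fill None entries with the median of the observed values.

    Returns the filled list plus a parallel list of flags, so a model
    can still tell which rows were originally missing.
    """
    present = [v for v in values if v is not None]
    fill = statistics.median(present)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing

incomes = [40, None, 55, 70, None, 1000]  # made-up household incomes, in thousands
filled, was_missing = impute_median(incomes)
print(filled)
print(was_missing)
```

With a skewed variable like income, the median is usually the safer fill. In this toy list the mean of the observed values is 291.25, which is pulled up by the single large value, while the median is 62.5.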

Monday, March 3, 2008

Linking analytics and psychology

Wired has an article on the 1 Million dollar prize Netflix is offering to the person (or team) that can improve their recommendation algorithm by 10%. Most of the competitors rely on fairly advanced math, but one guy is implementing behavioral economic principles and is competing against the big boys.

Wednesday, February 27, 2008

Dan's book is making the news

Here is a link to a NYT article about Predictably Irrational. Fuqua is doing something very smart. While Dan is on his book tour, they are scheduling Alumni and recruiting events around his schedule. I went to one of the talks. If you have a chance to hear him talk about his research, you should go. The research on deception is fascinating...

Monday, February 25, 2008

Visualize!

Here is a chart created by the New York Times staff showing box office receipts over time for every movie release in 2007. It is a neat chart showing how "bursty" blockbusters are and how Oscar contenders have a longer tail. Neat, but you have to work for that insight. Some kind of filtering, or basing the horizontal axis not on the calendar date but on weeks of distribution, would have made the point clearer. People go gaga for this stuff, but the nice presentation obscures the insight that you want to convey.

Friday, February 22, 2008

How should you manage your relationships with recruiters?

As the old joke says, carefully.

I get calls from recruiters at a fairly constant rate and have 3 principles that I use when talking with them. Note that most of the folks I speak with are retained search recruiters; they are paid, in advance, to fill a position. Companies tend to use retained search firms for more senior jobs and the search firm has an exclusive arrangement with the firm. I occasionally get a call from a contingency search firm. These firms are paid when they fill a job and are not exclusive. Most of what I am going to say is applicable to retained search folks who have a more relationship based business. Contingency folks are more transactional, so relationship building may not be as critical. Even still, making friends is always worthwhile. On to the principles.

First, always take the call. I speak with every recruiter that calls, even if they are recruiting for a position that I am not appropriate for. Someone gave me the advice that you should cultivate a recruiter network. It was good advice. In order to build that network, you actually need to speak to them. So, even if I am not right for the position, I chat with the recruiter. Often, especially for analytic jobs, they don't know the space, so I help them understand the job rec (truly!) or refer them to someone else. I always try to make the calls a positive experience for both of us. Even if we just chat about raising kids.

Second, if you are not interested in a job really try to pass on a referral. I almost always pass on a name. This means that I need to spend a couple of minutes looking through my contacts and see who might be appropriate for their job. One of my former direct reports is my go-to guy for referrals. If I am not interested in a job, he gets the referral. This helps him build his network and helps me deepen my relationship with the recruiter.

Third, be honest in your assessment of your interest level. If you are not the right person for the job or the job is too small, tell the recruiter. Don't try to get the interview for the practice. You'll mess up your relationship with the recruiter. Having said that, I have let a recruiter talk me into interviewing at a company, even though the scope was too small. The company agreed and then built a job around my skills. I wound up not taking the job, but I was up front about my concerns and they decided to proceed with the process, anyway.

I consider my recruiter network a real asset. Every job offer I received (I had 3) was through a recruiter. A good recruiter network will make your job search much easier.

Thursday, February 21, 2008

At least this time she did not hit me

Different kind of post. I noticed that all of the big bloggers are name droppers. Here is my most recent brush with greatness.

I ran into Barbara Minto a couple of weeks ago. She wrote the Pyramid Principle that I recommend on the right side of the page. I met her once before, at a McKinsey Alumni event. She introduced herself by hitting me fairly hard in the arm and calling me a jerk. She thought I looked like Ricky Gervais from the British version of The Office; since he plays the role of a jerk, she thought it would be funny to inflict some pain on him/me. As I said to her, before you hit someone, you might want to make sure they are who you think they are. We got it all straightened out and had a laugh. Fast forward 2 years. I saw her again at a McKinsey event and re-introduced myself, related the story, etc. When I got to the looking like Ricky part of the story she said "You do look like him." At least she is consistent.

Wednesday, February 20, 2008

Moving bubble charts

While I am doing analysis, I don't worry too much about visualization. I am very hypothesis driven and data visualization is a great complement to a more exploratory approach. Having said that, I do give a lot of thought to how I present the data to others. I don't think I have ever used animation, but after watching this video, I may need to expand my horizons. You can play with the Trendalyzer shown in the video here.
Report Portal has a moving bubble chart type that you can use with your own data. Very neat. Thanks to Jonathan Salkoff for turning me on to Trendalyzer.

Predictably Irrational

My friend, Dan, just wrote a book called Predictably Irrational. Dan is a Behavioral Economist and is the person I know who is most likely to win a Nobel Prize. Fascinating guy. How interesting? Check out this interview of him talking about his new book. Here is the link to the book. Predictably Irrational

Friday, February 15, 2008

Skip level meetings

For part 2 of the Dafacto meeting series I present: Skip level meetings.

I love having regular skip level meetings. For those who have never heard of a skip level, the term refers to the direct reports of your direct reports. I try to hold a 1-1 meeting with each of my skips every 4-6 weeks. In my last organization, I had regular skip level meetings with about 9 people (all of my US folks. I did not hold regular skips with my overseas staff, though I met with almost all of them when I went to visit). Including the weekly 1-1’s with my direct reports, I typically had 7 hours of meetings with various staff members. Obviously, this was an enormous investment in time. Was it worthwhile? Absolutely. Otherwise you are dependent on your directs for information regarding things like staff morale, organizational issues, project progress, etc. I know staff found them valuable.

What did we talk about? Though the skip owned the meeting agenda (notice a theme), the skips were really focused on two things. First, staff development. I wanted to learn the staffs’ career and personal goals. We would use the time to talk about whether they were making progress against those goals and what I could do to help. The time was very focused on their careers. In fact, after my first meeting, each person had to put together a development plan so we would have a structured conversation about their development.

Second, we talked about their projects. We talked about what was going well and what was not. People knew that I was fair, so if a project wasn’t going well, they would often tell me in those meetings. There were a couple of times where one of my directs was not living up to his commitment to his staff and the folks on his team needed a channel to be able to voice their concerns. Also, I would get to learn what was going well and use that information to give folks special recognition (I gave bottles of wine and gift cards) and visibility both in the team and to more senior executives.

One last thing. As a manager, it is very easy to move or cancel these meetings. After all, these folks are on your staff and are going to understand that stuff comes up. Right? As a leader, it is a terrible idea. If I had no choice, I would sometimes postpone, knowing that this was sending a bad message to the staff member. I would make sure that they knew I did not want to cancel the meeting, and reschedule as soon as practical. The worst thing is blowing off the meetings. You wind up alienating the staff instead of helping them.

Wednesday, February 13, 2008

Staff Meetings

I am amazed at how many managers don't have staff meetings. Do they think they are not necessary? I think regular staff meetings are a critical management practice for, well, good management and leadership. And don't get me started on weekly meetings with direct reports or skip meetings. First things first. Staff meetings.

When I first started managing multiple groups, I found that I was repeating the same news over and over again. Also, some of the groups were working on complementary projects or were interdependent and I was increasingly acting as the communication bridge between groups. So, I started to have staff meetings. I have always taken the same general approach to my staff meetings. First some general practices.

First, who owns the agenda and runs the meeting? Not me. Never me. Typically, one of my direct reports who I am starting to think about promoting. I want to give them the experience of running the meetings. They are going to need to do it themselves soon enough. Also, I don't see any reason to control the agenda. If I want to talk about something, I'll ask to have it on the agenda or just bring it up in the meeting.

Second, who attends? This depends on the organization and the needs. If you are often sharing confidential information, then a small, senior staff meeting is the way to go. If you want to use the meeting for sharing information across groups, then invite the senior folks and maybe their direct reports. I have seen people invite their directs on odd weeks and include the skip level folks on even weeks. I tend to go with inviting the larger group and make it clear that what is discussed does not go outside the family.

Third, how long? Between an hour and an hour and a half.

Typical agenda?

1. Company Updates. I use this time to talk about any big company or departmental news that is relevant. Typically, this time was spent explaining why the company, my boss, or I was doing something that did not seem to make sense to the staff. Some senior folks are pretty command and control and like to keep a pretty tight rein on the discussions. I would rather use the time to share information.

2. Ken Rona updates. I give the team a sense of what I am working on. I do this so the staff can act as effective agents on my behalf and bring up any items that would materially affect my work. In this way, the people on the team can proactively participate in helping me solve my problems. Also, I really liked discussing my work in front of the team. Not only were folks helpful in pushing my thinking, it was good for morale. People like having the transparency.

3. Direct Reports update. My directs share their project lists. The agenda keeper is responsible for putting together an updated project list for the group for every meeting. Mostly I focused these discussions on time lines. Are we going to meet this commitment we made? I think having to affirm the commitments in public, every week, keeps people focused.

4. Information sharing. We share interesting team outputs. I am a big fan of sharing information across silos. I often find that someone would do an analysis and share with the team, only to find that either there was a better way to do the analysis or that we could reuse the analysis for another internal client.

Oh, another tip. Have someone bring food. We did not use catering. We rotated this responsibility and reimbursed the cost of the food. It would have been easier (and more expensive) to have it catered, but I liked that people could bring their own style to the catering.

Tuesday, February 12, 2008

The Analytic Value Chain - Do the analysis

Finally! Let's do the analysis! Actually, I want to spend more time on what not to do.

I really did not expect to write much about how to select the analysis required to solve your problem (what!). I have assumed that you know the appropriate analysis to conduct to answer your original research question. In retrospect, maybe that isn't a great assumption, but here was my thinking: My posts are designed for managers of analytic teams and the folks that work on those teams who are still developing their managerial skills. Those people (I thought) should know the right analyses for a given situation.

Increasingly, I am questioning this assumption. I have found that analysts who are well trained in advanced analytic techniques and remember their training are not the rule; those who understand their business and can creatively apply their training to a new business problem are rare. Maybe 30 percent of the statisticians I have worked with wholly qualify under my criteria (mostly at A fOrmer empLoyer. Hiring a director is what inspired me to write this thread. I will do a post on how to identify a high potential statistician). Most analysts are technicians and they have a hard time suggesting analyses for problems they have not seen before. This is not an indictment of statisticians; applying statistical tools to business problems is just hard. How hard? Let me illustrate.

Conducting an Ordinary Least Squares regression when you should be using a logistic regression is a common mistake. By using the simpler OLS analysis, you can get totally wrong conclusions, leading to incorrect decisions. I am talking about answers that may not even be directionally correct. So, it is important that you use Logit, even though it is more complicated, when the situation warrants (when the thing you want to predict is a yes or no). I won't hire someone who does not know when to use logit.

This mistake is so basic that no one using regression should make it. And it happens all the time. I know a company (none that I have ever worked for) that was using OLS when they needed to use Logit in a production environment, affecting hundreds of jobs for their clients a month. Did the statisticians who built the system know better? I don't know. I do know that the company has advanced statisticians working for them who know better and are advocating for change. This is why you need the talented experts. They stop you from doing really dumb things.
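
To make the OLS-versus-logit problem concrete, here is a minimal sketch in plain Python. The data is a made-up toy set (one predictor, one yes/no outcome), and the function names, step count, and learning rate are my own illustrative choices, not anyone's production method. It fits a straight line by OLS and a logistic curve by simple gradient ascent, then compares what each one predicts at the edges of the data.

```python
import math

# Hypothetical toy data: x is a single predictor, y is a yes/no outcome (1/0).
xs = [-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, steps=5000, lr=0.05):
    """Logistic regression via gradient ascent on the average log-likelihood.

    steps and lr are illustrative values that happen to work on this toy data.
    """
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += lr * sum(errors) / n
        b1 += lr * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1


a, b = fit_ols(xs, ys)
b0, b1 = fit_logit(xs, ys)
for x in (-4, 2, 8):
    print(x, "OLS:", round(a + b * x, 3), "logit:", round(sigmoid(b0 + b1 * x), 3))
```

On this toy data the OLS "probability" comes out below zero at x = -4 and above one at x = 8, while the logit predictions always stay between zero and one. A real project would use a statistics package rather than hand-rolled fitting; the point here is only to show the failure mode.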

As much as I would like to, I can't map out the correct analysis for any given business situation. Having said that, there are some common situations that I have come across over the last couple of years that you should watch out for.

Regression:
If you are using regression, you need to pay attention to the frequency distribution of the dependent variable. Unless the dependent variable is continuous, has a relatively large range, and is normally distributed, Ordinary Least Squares regression is not going to give you the right answer. You may need a more sophisticated analytic technique. Some rules of thumb: If you are using OLS on a binary variable (think yes or no) you are going to need a more sophisticated technique. Also, watch out if the dependent variable has a natural floor or ceiling. Income is a good example. Very few people make less than zero dollars. So, zero is the floor. If you have a floor, then you may need to go with a Tobit. Depends on the distro of your dependent variable. If the dependent is normally distributed, then you are probably ok. If not, Tobit...
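
These rules of thumb can be written down as a rough triage function. This is only a sketch: the function name, the returned labels, and the 5 percent pile-up threshold are assumptions I made up for illustration, and it is no substitute for actually looking at the frequency distribution.

```python
def suggest_model(values, floor=None, pile_up_share=0.05):
    """Rough triage of a dependent variable, following the rules of thumb above.

    pile_up_share is an arbitrary illustrative threshold, not a standard.
    """
    # Two (or fewer) distinct values means a yes/no style outcome.
    if len(set(values)) <= 2:
        return "logit"
    # A natural floor with many observations sitting on it suggests censoring.
    if floor is not None:
        share_at_floor = sum(1 for v in values if v <= floor) / len(values)
        if share_at_floor >= pile_up_share:
            return "tobit"
    return "ols (check the distribution first)"


print(suggest_model([0, 1, 1, 0, 1]))                       # yes/no outcome
print(suggest_model([0, 0, 0, 12, 30, 45, 8, 0], floor=0))  # piled up at zero
print(suggest_model([52, 48, 50, 51, 49, 47, 53]))          # continuous, no floor
```

Pass `floor=0` for something like income or spend, where zero is the natural minimum.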

Impact of Seasonality/Time:
Most folks come up with some arithmetic technique to model seasonality. I hate this. You can never unpack the drivers of your dependent variable. Instead, you should use some kind of time series technique; e.g., ARIMA. ARIMA will let you figure out the real drivers of behavior over time, as well as take time into account.

Segmentation:
Check out the Kenny's one rule post on segmentation. The short version: For segmentation, don't try to do too much with one segmentation scheme. I think of segmentation like regression. You build a segmentation scheme for specific purposes.

Designing Experiments:
In a direct marketing context, at least when done correctly, testing is continuous. I have had a number of occasions where the tests were not readable due to a business owner needing more volume and killing control groups or specific test cells when doing complicated designs.
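
To see why shrinking a control group makes a test unreadable, here is a small sketch using a standard two-proportion z-test. The response counts and cell sizes are invented for illustration. Both scenarios have the same response rates (2.4 percent for test, 2.0 percent for control); only the size of the control cell changes.

```python
import math


def two_proportion_z(resp_test, n_test, resp_ctrl, n_ctrl):
    """z statistic for the difference between two response rates (pooled SE)."""
    p_test = resp_test / n_test
    p_ctrl = resp_ctrl / n_ctrl
    pooled = (resp_test + resp_ctrl) / (n_test + n_ctrl)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_ctrl))
    return (p_test - p_ctrl) / se


# Hypothetical campaign: 100,000 pieces split evenly between test and control.
z_full = two_proportion_z(1200, 50_000, 1000, 50_000)

# Same rates, but the control group has been cut to 1,000 to free up volume.
z_cut = two_proportion_z(2376, 99_000, 20, 1_000)

print("even split z:", round(z_full, 2))
print("tiny control z:", round(z_cut, 2))
```

With the even split, the z statistic clears the usual 1.96 cutoff comfortably. With the control cut to 1,000, the same lift is lost in the noise, and you cannot tell whether the campaign worked.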


Winding down
The big statistics software vendors have not yet developed bulletproof tools to help inexperienced analysts do the appropriate analyses. My advice is to hire an analytics expert and teach them your business. Don't try (too hard) to find someone who is an expert in both analytics and your industry. Industry knowledge is much easier to teach than analytic expertise. Doing the analysis is hard. And it can take a while (it once took someone on my team three months to build a data set and do the analysis for a difficult business problem. But he nailed it and it changed how our business partners thought about the drivers of their business).