Friday, August 31, 2007

Marc's Blog

One of my friends recommended Marc Andreessen's blog. Marc was one of the founders of Netscape and Loudcloud and seems to have a very good touch for picking ideas for (and executing on) start-ups. I have been reading his practical advice on founding start-ups. The posts are a good read and I recommend them. The first post talks about why not to do a start-up. I thought it was instructive, but some of his reasons why start-ups are difficult apply even in fairly well-established companies. Hiring is always a pain, though in different ways than he describes.

He also mentions the long hours. For anyone who wants to create something great, the hours are always long. If you really care about what you are doing, then you are going to put discretionary time into the work.

I would not argue with his comment about it being easy for the culture to go sideways, but as a company gets bigger, the culture is going to take a turn for the worse. I have worked at two large companies that had grown rapidly, and both of them had "old-timers" who spoke loudly about the degradation of the corporate culture over time. Creating a good culture is a constant battle, and as a company gets larger, it gets fought (day to day) not from the top, but in the middle. Having said that, a bad CEO can ruin the culture faster than you can drive a termite's minibike around a pea. I guess what I am saying is that a good corporate culture can go south at any time. Creating stuff is hard. Even in big companies.

Wednesday, August 29, 2007

The Analytic Value Chain

I am currently hiring a Director for my statistics team and it is a hard position to fill (if you think you might be qualified, please send me a resume). A perfect candidate has to have skills all across the data analysis value chain. I imagine you are thinking "consultant speak"; maybe so, but I think of analyses as a product that needs to run through an analytic "factory." You can manufacture impactful analyses, just like GM manufactures cars. And I think of the value chain as having 9 steps (a toy sketch of the whole chain follows the list below).

1. Define the problem
2. Determine data requirements
3. Locate relevant data
4. Extract, Transform, and Load the data
5. QA data
6. Understand relationships in data
7. Do the analysis
8. Create presentation materials
9. Share results with business
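
To make the factory metaphor concrete, here is a minimal sketch in Python. It is purely illustrative; every function below is a made-up placeholder, not code we actually run. It shows the nine steps as stations on an assembly line, each consuming the previous station's output:

```python
# Illustrative only: the nine value-chain steps as stations in a pipeline.
# Each station takes the work-in-progress analysis and returns an updated one.

def pipeline(analysis, stations):
    """Run the analysis 'product' through each station of the factory in order."""
    for station in stations:
        analysis = station(analysis)
    return analysis

# Placeholder implementations of the nine steps above.
def define_problem(a):        a["problem"] = "stated in business terms";     return a
def determine_data_needs(a):  a["data_needs"] = ["customers", "orders"];     return a
def locate_data(a):           a["sources"] = ["warehouse tables"];           return a
def etl(a):                   a["dataset"] = "extracted/transformed/loaded"; return a
def qa_data(a):               a["qa_passed"] = True;                         return a
def explore_relationships(a): a["findings"] = ["key correlations"];          return a
def analyze(a):               a["results"] = "model output";                 return a
def build_presentation(a):    a["deck"] = "one-pager for the business";      return a
def share_with_business(a):   print("Delivered:", a["deck"]);                return a

result = pipeline({}, [define_problem, determine_data_needs, locate_data, etl,
                       qa_data, explore_relationships, analyze,
                       build_presentation, share_with_business])
```

The point of writing it down this way is that every hand-off is explicit, and the hand-offs are exactly where a factory either flows or jams.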

It is hard to find someone who has the business skills to understand the problems faced by my internal clients, the technical skills to manage the analysis, the process re-engineering skills to improve the process (it is a factory, remember?), and the communication skills to present the results back to the business.

In the next couple of weeks, I'll discuss each of the steps in the value chain. Hopefully this set of posts will help folks as they set up their own analytic shops.

Tuesday, August 28, 2007

Reflections on setting up a blog and Google

Recently, someone I was speaking with referred to Google as a one-trick pony, and that observation resonated with me. At the time, I remember thinking "Of course, they only make money on ads. Why are they wasting money on creating all of these other products that don't serve ads?"

In retrospect, I don't think that is right. A couple of observations. First, I am someone who creates content; in effect, I am a publisher who, if I am doing my job right, is increasing the number of useful pages the web has to offer. There really are not that many folks who produce high-quality content (I think the jury is still out on me, but I am in the game, swinging) and, in general, Google has a good relationship with these folks. Second, the growth of Internet advertising is driven both by having willing advertisers (who create demand) and by having more content (to drive supply). Google is working both sides of the equation by making it easy for advertisers to buy placements and for savvy content creators to webify their content. Let me tell you one way that Google is making content creation (and increasing supply) easy. At least for bloggers.

I prefer integrated products. I have spent too much time in my life trying to get best of breed products to work (see this post). Google has set Blogger up as a more or less integrated blog publishing platform. Actually, they allow you to select best of breed applications, but they make it easy to use their products. Even still, the point holds.

I routinely use 4 Google products to publish this blog. I create my posts in Google Docs, publish directly to the blog, have embedded AdSense advertising, and track site usage using Google Analytics. So what does this have to do with increasing supply? Google has made it easy not only to publish and track performance, they have made it easy to serve AdSense ads. I could use other vendors for word processing, ad serving, or web analytics, but why would I? I am pretty sure that Google has made it so that all of their products play nicely with each other (and so far, they do), so why take the chance?

All this is to say that even though Google may seem like a one-trick pony, they are really a two-trick pony. And even though the first trick gets a lot of attention, the second trick, creating new distribution channels for their ads, is also a really good trick.

Tuesday, August 21, 2007

Best of Breed or One Vendor?

Analytics folks need a good set of tools to do their work. I won't get into a SAS vs. SPSS discussion (different tools for different situations. Ok, I got into it), but I did want to talk really briefly about choosing an analytic "stack." By stack, I mean the entire set of tools that enable high-quality analytic work: the tools you need for data extraction, transformation, and loading, data QA, analysis, reporting, data mining, etc. In my organization the stack includes Netezza and Sybase as our database platforms, SAS for our data ETL, QA, and access, SAS for our ad hoc analysis/data mining, and Business Objects for standard reporting.

You don't have to look too closely to see a common theme: SAS is the dominant vendor. Do we use a lot of SAS because they make the best products for each part of our stack? No, but their products are good and they are well integrated. By concentrating our purchases on one vendor, we reduce integration costs; we are assured that the products are going to work well together, or at least that we can hold someone accountable if the disparate products don't play nicely. In fact, SAS wrote a Netezza database connector for us when we needed it.

The other option is to take a best of breed approach and buy your products from multiple vendors. The advantage of going best of breed is that you can really tailor your stack to your needs. For example, you may need a lot of control over the charts created by your analysis tools, and only one vendor allows you to have that control. (I once worked for a company whose tick marks on charts had to point away from the chart. You can actually select this option in Excel, but most vendors don't allow such fine control. Having that level of control over the look of the output was considered a "requirement" for our business intelligence platform.)

So which approach should you take? My advice is to fight the logic of best of breed. I know it is difficult to pick products that are not the best fit for your situation. Fight it. In my experience, most of these products are more similar than different. Think about word processors. Most of us use Word, but would not shy away from trying Open Office (actually, I am writing this post using Google Documents). Word processors all have the same basic functionality. Word may be very full featured, but most of us just use it as a text editor with a touch of formatting. Similarly, the stats packages are more similar than different. Those little features or very specialized functions that seemed so important when you were making your purchase decision won't make a lick of difference in the quality of your team's analytics. What does make a difference to your team's effectiveness is their ability to deploy and exchange data between applications. There are times when you need to select a "best of breed" application: when you need to use proprietary analytic techniques that are offered by a single vendor. But in general, you are better off giving up the perfect analytical app for one that is well integrated into an analytic suite. My advice is to stick with one vendor for your analytic stack, or at least minimize the number of times you change vendors as data moves up your stack.

Sunday, August 12, 2007

Helpful tips for off-shoring

I may do a longer post about my experiences with my team in India, but for now, here are some practical things you should keep in mind, and can do, if you are thinking about off-shoring reporting and, to a lesser extent, analytics.

  1. Put process in place. By this I mean that you need to have a documented process (and process maps) for each report that is being run. Documentation includes things like data sources, business owners for the reports, how to run a report, what to do in cases of failure (with each type of failure enumerated), etc. This should be a living document that gets updated as reports find new ways to fail (we keep ours on a wiki), and the process of updating should itself be part of the process documentation. The book Business Process Management: Practical Guidelines to Successful Implementations is a good reference. By putting process in place, you make the responsibilities of both the US and off-shore sides very clear. This clarity is critical. In fact, we won't start producing a report in India until the US side documents the existing report. Trying to do off-shore reporting on an ad hoc basis will be very difficult.
  2. Be very careful about checking skills. We have not had a hard time finding qualified candidates on paper. We have had a very hard time finding qualified candidates in real life. We have had a number of instances where candidates grossly misrepresented their skills, worse than anything I have seen in the US. Each candidate now has to pass through multiple screening tests of their skills. The tests are both oral (given during the phone screens and interviews) and written (given on site for candidates who have made it to an in-person interview). The tests are not hard, but you can't fake knowing what a proc freq is during an interview. We tried giving a pretty comprehensive written test to prospective candidates after the phone screen, to be completed before they came in for an in-person interview, but we found significant cheating. Lesson learned.
  3. Check the references. Nuff said.
  4. Don't hire job hoppers. In the India market, anyway, there is some job hopping going on. It is not uncommon to find candidates who have taken several jobs and moved on. Don't think that they are going to treat you any differently. We invest a significant amount in hiring and training and we need to make that training pay off. Also, I want people to become part of the culture. We won't even look at hoppers' resumes.
  5. Make sure that the US side is invested in the success of the off-shoring effort. In my case, we track the utilization and report quality of the team at our weekly staff meetings. I am the person on the US side who is ultimately responsible for the India team's success, and sharing metrics with the rest of the leadership team ensures that both my staff and I stay focused. Also, if I find that someone is not using their India resource effectively, I will take the resource away.
  6. Use the off-shored staff for project-based work where they can be fairly self-sufficient. The original vision for the India staff was to be the equivalent of a US analyst. That was an unrealistic expectation. The US staff work with their clients every day and are much more able to solve problems both proactively and on an ad hoc basis. The time difference makes it much more difficult for the off-shore staff to find the people they need to speak with, and they are, by the nature of the distance, more removed from the day-to-day needs of the business. In our case, production reporting was a perfect thing to move. Reports are produced on a regular basis, allowing the analyst time to get familiar with the infrastructure needed to run the reports and to learn what the results mean. Processes can be documented and, in the case of staff turnover, be easily transitioned to someone else. Ad hoc reporting stays in the US. Maybe over time it will move to India, but for now, we are staying put.
  7. Travel! Both you and your senior off-shore staff need to travel to each other's locations, at a bare minimum, a couple of times a year for a week. Also, think about bringing your more junior folks over once a year for 5-6 weeks. That will give them an opportunity to meet their US counterparts and build the relationships they need to do their jobs effectively. I went to India recently and found the experience invaluable. The trip gave me a first-hand appreciation of how difficult managing the time zone differences is. I also got to play cricket. Make sure you get your shots and carry a small pharmacy with you. I got a very small cut that turned into a bad infection in about 8 hours. Thankfully, I had Cipro and Bacitracin with me. I treated the cut myself and it turned out fine, but it was touch and go for a while. Just to reiterate, the trip was invaluable.
  8. Meet the staff regularly and use video. I have a weekly pull up with my manager in India and a monthly round table with the whole team. I also hold bi-weekly "office hours" where folks in the US can stop by and give the manager and me feedback on how things are going. I just got tired of all the complaining about things not working right and of people not addressing their challenges in a forthright way. By having these forums, people have no excuses. Another thing: I found that when I was in India, it was impossible for the off-shored staff to understand what was being said on speaker phone. The phones cut in and out. And no one said anything. We now try to use video for meetings whenever possible.
  9. Ask other folks what has worked for them. I got good counsel from a number of sources. Some of what I have listed above overlaps with that advice, but I agree with it.

I would not say that we have off-shoring nailed, but I think we are making good progress. Our next step is to off-shore advanced analytic work, but we have just started. Once we get our feet wet, I'll put up a lessons-learned post for advanced analytics. Are there other things people have learned that should get added to the list?

Tuesday, August 7, 2007

Data Quality on the Cheap

I had a funny moment about 3 months ago. I was chatting with the VP of Advertising for a large retail chain (he was also responsible for Direct Marketing) and we got to talking about data quality. I asked him "How do you check your data quality?" and he replied "I don't think we do. Should I?" Yes. Yes, you should. If no one is checking regularly, your data quality is bad and your resulting decisions are going to be…well, you know.

Data tables are like cars. They need regular attention to ensure that they are performing well. If no one is checking, then your tables are out of tune. And for those of you who get your data from an outside vendor, don’t think that they are doing regular data QA. In my experience, my vendors ensure that the tables are produced, but are not tracking to see if the values in a given variable are reasonable. So, how should you check your data quality? We did some very simple things to give ourselves a pretty complete picture of the state of our data.

The first thing we did was build a historical database of some basic statistics for each variable in our bi-weekly production table. We tracked mean, median, mode, 25th percentile, 75th percentile, standard deviation, skew, and kurtosis. We also tracked the number of 0 values and the number of missing values.
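
Our implementation was SAS, but the profiling step is simple enough to sketch in Python/pandas for illustration. This is a minimal sketch, not our production code; the function name, column selection, and period label are all made up:

```python
import pandas as pd

def profile_table(df, period):
    """One row of health stats per numeric variable for this period's load."""
    rows = []
    for col in df.select_dtypes(include="number").columns:
        s = df[col]
        rows.append({
            "period":    period,
            "variable":  col,
            "mean":      s.mean(),
            "median":    s.median(),
            "mode":      s.mode().iloc[0] if not s.mode().empty else None,
            "p25":       s.quantile(0.25),
            "p75":       s.quantile(0.75),
            "std":       s.std(),
            "skew":      s.skew(),
            "kurtosis":  s.kurt(),
            "n_zero":    int((s == 0).sum()),
            "n_missing": int(s.isna().sum()),
        })
    return pd.DataFrame(rows)

# Each bi-weekly load appends its profile to a running history, e.g.:
# history = pd.concat([history, profile_table(current_load, "2007-08-15")])
```

The missing-value screen described next falls straight out of the n_zero and n_missing columns.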

We found, straight away, that roughly five percent of our variables had large numbers of 0's or missing values. We went back to our data provider for an explanation of why data seemed to be missing in these variables. Over the course of the next 2 weeks, they either found a problem with the data feed or with the logic used to create the variable, or gave us a satisfactory reason why the data had so many holes.

Our next step was to look at variables that were not stable over time. Our dataset included all US households, and the variables associated with a household don't vary much in the aggregate. We focused on calculating the means, medians, and standard deviations over time for each variable (the other metrics, skew, kurtosis, modes, etc., did not add any value over the basics). I think we went back 12 weeks (or 6 periods).

I was frankly shocked at how easy it was to find "suspect" variables! If you plot the values over time, suspects just jump out at you. We had some variables (I want to say 10% of the total number) whose means varied by greater than 10%, period over period. There were too many variables to chase down all at once, so we identified the 20 or so variables that were the worst offenders; their means varied by more than 50%, period over period. We went through the same process with the vendor as we did with the missing variables: fix the variable or justify why it varied so much. Over the next 8 weeks, we steadily reduced the amount of acceptable variation, going back and speaking with the vendor, variable by variable. This was a very valuable exercise. Our current variance threshold now hovers around 2%. (The flagging logic amounts to a few lines of code, sketched below.)
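
Again, our version was in SAS; here is an equivalent, purely illustrative sketch in pandas that works off the history table from the earlier snippet. The function name is hypothetical, and the thresholds in the comments mirror how we tightened over time:

```python
import pandas as pd

def flag_unstable(history, threshold=0.10):
    """Return (variable, period, pct_change) where the mean jumps period over period."""
    suspects = []
    for var, grp in history.sort_values("period").groupby("variable"):
        pct = grp["mean"].pct_change().abs()
        for period, change in zip(grp["period"], pct):
            if pd.notna(change) and change > threshold:
                suspects.append((var, period, change))
    return suspects

# Start loose and tighten as the vendor fixes variables:
# flag_unstable(history, threshold=0.50)  # the 20 or so worst offenders
# flag_unstable(history, threshold=0.10)  # the next wave
# flag_unstable(history, threshold=0.02)  # roughly where our tolerance sits today
```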

In the last part of the project, we made some process changes. First, I had a conversation with the vendor and offered them the SAS code we were using for data QA. They accepted immediately; they wanted to do the QA themselves, before we found a problem. We keep checking, but now the vendor can get ahead of the game and provide even better service. We review our data QA checks bi-weekly at my staff meeting. Typically, the person responsible now says "nothing to report." In addition, we have created good SAS code to automate the process and have just moved the QA process to India (I guess I should do a "Lessons learned in off-shoring" piece). All in all, our ongoing QA process is relatively painless.

Was this the most bulletproof data QA process we could have put in? No. We are relying on changes in distribution to catch bad data. Some variables may be of poor quality, and because they have not varied much, it is possible that we may never catch them. I don't think this is likely. Each variable is used in some project or another on a pretty regular basis, and once a variable makes it into a project, its quality is checked extensively. We have not found a new suspect variable this way yet, but you never know. I can say, pretty authoritatively, that our data quality has gotten much better.

Monday, August 6, 2007

Getting a pet project off the ground - Part 3

Getting pet projects off the ground - Part 1
Getting pet projects off the ground - Part 2

Conduct a successful pilot

I can imagine what you are thinking. "Conduct a successful pilot? What kind of advice is that? How am I supposed to know if a pilot is going to be successful? That is why I am running a pilot!" Obviously, you can't ensure that your pilot is successful, but you can (and should) do everything in your power to make sure that the execution of the pilot is high quality. In my case, all I did was facilitate and try to ensure that, as the business ran the pilot, I could help them make good decisions. I went to the planning meetings and kept abreast of the decisions the business made (but really just advised; they did all the work).

I also paid a lot of attention to the results. In my company's case, the pilot wave had a good result. I put together a one-pager with the results and am now using it to show other business units the impact that site testing could have.

Be patient

The last step is not really a step; it is more of an approach. I would counsel patience in all parts of the process. Rushing these things turns folks off. You really need to build support and get buy-in. It took probably 9 months from my first discussion with the vendor to our first test. Let folks see the value and want to make the pet project their own. On that note, don't hold onto the project too tightly. Let other folks take ownership. In my case, the first business unit, the one that piloted site testing, is off and running. I am evangelizing site testing with other parts of the business.

Getting a pet project off the ground - Part 2

Link to Step 1.
Step 2
I don't control the web properties for any of my company's web sites, so if I wanted to introduce the organization to the benefits of site testing, I would need partners. So, the next step was to start evangelizing site testing through the organization (BTW, the vendor was Optimost). This step has multiple sub-stages: create a sound bite, chat the project up, and educate senior business leaders and others.

Sound Bite
I think a common mistake junior folks make when they want their organization to try a new technology, software, whatever, is to lead discussions with the technology. "Look at how cool this is!"

The problem is that a business leader is not going to care about the technology. They care about the problem. You need to make the problem come to life. So, I gave a lot of thought to being able to quickly explain the business problem I was trying to solve using as little jargon as possible...

You are going to use this sound bite in front of senior execs, so it is worth getting it tight and getting it right. Once I had someone interested in the problem, I could get them interested in the solution. In my case, my problem statement was: "We make site design decisions based on opinion, not fact. In order to know the facts, we need to be able to rapidly test our site designs. We can use a 'site testing' vendor to test billions of site design combinations in a matter of weeks. This will let us build a web page that optimizes for the things we care about, like generating page views, increasing the number of unique visitors, or contribution value." We can argue metrics later; at this point, I was just trying to get some folks interested.

Chat it up!
So I had my sound bite; my next step was to start using it incessantly: in meetings with my manager, her manager, my colleagues, you get the idea. I wanted as many people as possible to have heard the sound bite. This is really about laying the groundwork, getting people familiar with your problem and agreeing that it is a problem that needs to be solved. The reality is that big companies have any number of big problems needing to be solved. I was trying to get agreement that site design was an important one, one worthy of solving.

Over the course of the next month, I had a couple of meetings with Very Senior folks (regarding other projects) and worked my problem statement into the discussions (I really was shameless). Both of them agreed that our site design process did not take business needs into account, and that the organization really had no way of gathering the facts we needed to optimize our page designs. I suggested to both of them that I bring in a vendor and invite each of their senior staffs to learn about the benefits of site testing.

This was an easy sell. Most Very Senior folks want their staff to be more innovative and are happy to give a little push. If you try to go from the bottom of the organization up, well, I have tried that method and have had little success. I am sure it is possible, but in my very big company, people are busy doing their regular jobs and need that push to take on additional responsibilities. I had actually tried to get the organization interested in site testing about 6 months earlier. One of my colleagues and I invited a number of junior staff to an information session and nothing came of it. While they appreciated the session, no one felt empowered to kick off a pilot.

One last piece here: you had better have become an expert in step 1. One of the business leaders had used site testing in a previous organization and knew his stuff, so I needed to be able to have a pretty detailed conversation with him in order for him to be confident that I was the right guy to push this project forward. Almost home!

Educate senior business leaders and others

As mentioned above, I asked the Very Senior folks who should be invited to the educational sessions, and they both suggested inviting all of their direct reports. I then put together an email that invited the directs to a meeting. The email explained site testing and offered to hold a second session for their direct reports. This is a critical point. We actually had 2 kinds of meetings: one for folks who could reasonably sponsor a pilot and one for the folks who would be responsible for pulling it off. The types of discussions are different in those meetings; I wanted the leadership to be excited by the potential, while I wanted their staff to be interested in the execution. I actually had at least 3 execution-level meetings for various groups in the organization, but one Very Senior meeting.

It worked out well, though truthfully, I was trying to generate as much support as I could. If one of the more junior folks had expressed interest in implementing site testing, I am sure we could have worked something out. Once again, I don't know if my approach would work in every situation, but I was trying to plant 100 flowers and watch to see which ones would bloom. The nice thing about having the Very Senior folks engaged was that their influence could help break logjams.

Next step: Conduct a successful pilot

Sunday, August 5, 2007

Google Analytics

I was wondering how to track visitors, etc., on this site. I just assumed that tracking functionality would be built into the back end. It turns out you can use any web tracking package and insert its script snippet directly into the page's header. Very cool. I really like that Google imposes as few constraints as possible. I went with Google Analytics just because they gave me a choice.

Getting a pet project off the ground - Part 1

Part 1

My staff tells me that one of my core skills is getting large organizations to try complicated things. What I think they really mean is that I am good at getting the organization to try things senior folks don't fully understand. For the more junior folks, the folks who are closest to the technologies and the line, who understand the "thing," it is very frustrating to try to get an organization to try something new and complicated. It does not have to be frustrating. Most recently, I got my employer, a multi-billion dollar web portal company, to start experimenting with multivariate site testing. I have a pretty standard plan of attack for these kinds of things and followed the same strategy for getting site testing moving in the organization: know more, build support, conduct a successful pilot, and be very patient.

Know more
I like taking on pet projects. Even if they don't go anywhere, I learn a lot. In the case of site testing, I spent about 4 weeks becoming the company expert. I started out by Googling like crazy, identifying vendors who offer site testing, reading their white papers, etc. I then called the vendors and set up informational discussions. My advice is "don't be shy." Most vendors are happy to take these calls. They love in-bound sales leads and they know most don't go anywhere. I did 2 calls with my first vendor: an initial discussion of the technology and then a follow-up where we focused on implementation. I then spoke with 3 other vendors and explicitly asked them how their products differed from my first vendor's. At the end of the 4 weeks, I had gotten a great education in the technology and what differentiated each company. I even put together a little one-pager for myself to make sure I could articulate those differences.

Next part, we’ll talk about building support and evangelizing. Should be up in a couple of days.