Community rating or individual rating for health exchanges

I was chatting with someone about health exchanges and they mentioned building risk models at the individual level. Certainly that’s a good thing. However, according to the Act, 2013 will have adjusted community ratings. Ratings will be based on a few factors at the individual level. The obvious variables are present: smoking and such. But the other variables will include things like age, family size and geography. This is for non-grandfathered (i.e. new) plans and will be in effect for companies with fewer than 100 employees.

The price ratios between the different risk categories cannot exceed certain limits. Of course, just because the law says this does not mean that costs will not be incurred outside those ratios. This will lead, most likely, to higher rates in other plans such as employer plans. Perhaps this will push more employers to self-insure, which might not be a bad thing except that it creates an employment model that favors large companies and could put the squeeze on small companies. Perhaps the landscape will shift to one of large employers overall, the exact opposite of the markets the Act is supposed to help: individuals and small companies.
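
To make the ratio limits concrete, here is a minimal sketch of a rate-band check. The limits used (3:1 for age, 1.5:1 for tobacco) are my understanding of the Act's rules and should be treated as assumptions, and the premiums are made-up numbers, not real rates.

```python
# Hypothetical rate-band check. Limits and premiums are illustrative assumptions.
MAX_RATIO = {"age": 3.0, "tobacco": 1.5}

def band_ratio(rates):
    """Ratio of the highest to the lowest premium within one rating factor."""
    return max(rates) / min(rates)

def violations(rates_by_factor, limits=MAX_RATIO):
    """Return the factors whose high/low premium ratio exceeds the allowed limit."""
    return {
        factor: round(band_ratio(rates), 2)
        for factor, rates in rates_by_factor.items()
        if band_ratio(rates) > limits[factor]
    }

proposed = {
    "age": [200.0, 350.0, 700.0],   # 700 / 200 = 3.5, over the assumed 3:1 limit
    "tobacco": [300.0, 420.0],      # 420 / 300 = 1.4, within the assumed 1.5:1 limit
}
print(violations(proposed))  # {'age': 3.5}
```

Whatever cost sits above the capped ratio has to be recovered somewhere else, which is the point of the paragraph above.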

Opportunities for BigData and Healthcare: Need a little change management here

What are the BigData opportunities in healthcare? Today, BigData techniques are already employed by startups because BigData technology can be used very cost-effectively to perform analytics, which gives startups an edge on the cost and capabilities front.

But what are the opportunities in healthcare for established companies? I’ll offer the thought that they can be broken into two main categories. The categories reflect the fact that there are data assets in place today that will remain in place for quite a while. It’s very difficult to move an entire infrastructure to a new technology base overnight. It is true that if some semblance of modern architecture (messaging, interfaces for data access) is in place today, the movement can be much faster because the underlying implementation can be changed without changing downstream applications.

The two categories are:

  • Move targeted, structured analytical workflows to BigData.
  • Enable new analytical capabilities that were previously not viable.

The first category speaks to the area of BigData that can make a substantial ROI appear fairly quickly. There are many well-understood workflows today inside healthcare Payers, for example, that simply run too slowly, are not robust or are unable to handle the volume. Purchasing another large, hardware-based appliance is not the answer. But scaling out to cloud scale (yes, using a public cloud for a Payer is considered leading edge, but it is easy to do with the proper security in place) allows a Payer to use BigData technology cheaply. Targeted workflows that are well understood but underperforming can be moved over to BigData technology. The benefits are substantial ROI for infrastructure and cost avoidance for future updates. The positive ROI that comes from these projects indicates that the transition pays for itself. It can actually occur quite quickly.

The second opportunity is around new analytical capabilities. Today, Payers and others simply cannot perform certain types of analytics easily because of limitations in their information management environments. These areas offer, assuming the business issue being addressed suggests it, substantial cost savings opportunities on the care side. New ways of disease management, outcomes research and network performance management can make substantial returns in under 2 years (it takes a year to cycle through provider network contracts and ensure the new analytics has a chance to change the business process). It’s these new capabilities that are most exciting.

The largest impediment to these areas of opportunity will be change management. Changing the way analytics are performed is difficult. Today, SAS is used more for data management than statistical analysis and is the de facto standard for the analytical environment. SAS offers grid and other types of larger data processing solutions. To use BigData, plans will have to embrace immature technology and hire the talent needed to deploy it. But the cost curve could be substantially below that of scaling current environments, again paying for itself fairly quickly. Management and groups used to a certain analytical methodology (e.g. cost allocations) will have to become comfortable seeing that methodology implemented differently. Payers may seek to outsource BigData analytics tools and technologies, but the real benefit will be obtained by retaining talent in-house over the long run even if some part of the work is outsourced. Because analytics is a core competency and Payers need to, in my opinion, retain some core versus just becoming a virtual shell, BigData needs to be an in-house capability.

ProPublica: So why can’t the government analyze the data? And what about commercial insurance plans? What questions should we ask the data?

There was a recent set of articles quoting ProPublica’s data analysis of Medicare Part D data. ProPublica acquired the data through the Freedom of Information Act and integrated the data with drug information and provider data. As a side note, there has also been the recent publishing of CMS pricing data using the Socrata dataset publishing model (an API factory). (Side note: you can plug into the data navigator at CMS.)

You can view various trends and aggregations of data to compare a provider against others and navigate the data to gain insight into the script behavior of Medicare Part D providers. If you review the methodology used to create the data, you’ll realize that there are many caveats and just reading through some of the analysis, you realize that a simple evaluation of the data is insufficient to identify actionable responses to insights. You have to dig deep to see if a trend is really significant or an artifact of incomplete analysis. Yes, there is danger in not understanding the data enough.

But the ProPublica analysis is a great example of analysis by external groups. It is an example of simple provider profiling that helps detect variations in standards of care as well as outright fraud. The medical community continues to improve standards of care but it is a challenging problem with few incentives and governance structures.

The question we may ask ourselves is, “Why does the government not perform more analysis?”

The short answer is that they do. The government performs extensive analysis in a variety of ways. What the ProPublica publications really show us is that there is a lot more analysis that could be performed and that could be useful in managing healthcare. Some of it is quite complex and, cost-wise, we should not expect the government to perform all the various types of analysis that one could imagine, whether through expectations of infinite funding, which does not exist, or by law as set by Congress. Everyone admits that there is more to do there, and I am sure we all have an opinion about top priorities that conflicts with the opinions of others.

And the government does act as publisher of data. The Chronic Condition Warehouse (CCW) is a good example of data publication. The CMS also has plans to do more in the information management space that should make the sharing easier. I am concerned about pricing though. Based on a very small sampling, it appears that extracts are still quite expensive and cumbersome to obtain, on the order of $80K for several extracts covering just 2 years. This needs to flatline to $0 per extract since we already pay for CMS and our funds should be wisely used to enable this service from the start. Both anonymized and identified datasets are available. Comprehensive, anonymized datasets should be available for free.

This publication of the data, including the pricing data, is a great example of “democratizing” data. Many companies use this language to describe the ability to access datasets in a way that any analyst (with sufficient HIPAA safeguards) can gain insight and do their jobs better through information driven guidance. We can see from these examples that just publishing the raw data is not enough. You must have the information management technology to join it together with other datasets. This is what makes analysis so expensive and is the main barrier to data democratization.

So why can’t commercial health plans publish their data? There is really no benefit to them for publishing. One could argue that individual state subsidies such as non-profit status, which amount to a state entitlement that the residents pay for, should give states the ability to force data publishing, but there is really no benefit for commercial health plans to publish data. Commercial plans do analyze Provider data and create Pay for Performance (P4P) programs used to manage their networks. P4P often ranks Providers and provides incentives to provide more “value.”

Of course, the free agency theory applies here and P4P can really only ever be marginally helpful. Sometimes marginally helpful is good, of course, so I am not dismissing it. However, the same issues around the ProPublica analysis apply to health plans’ data.

  • First, the information technology of many plans is fairly immature despite the billions these plans handle. This is because they focus on claims processing versus analytics.
  • Second, they have the same data integration issues that everyone else has, and it’s hard work to get it right after 30 years of extremely bad IT implementations and a lack of management talent.

Things are changing now, but I predict that even with better “data democratization” information management technology there is still not enough coverage of analytics to be transformational. It is possible that if the government really wants to get serious about managing costs of healthcare and gaining insights from data to help drive transformational cost changes, it really needs to have all the plans publish their data together.

Again, you run into competitiveness issues fairly quickly since the “network” for a plan and the prices they pay are a big part of a plan’s competitive advantage.

But I like to blue-sky it here on my blog.

As a truly blue-sky thought, if the U.S. is really, albeit slowly, moving towards single payer (the government is already the largest payer anyway) then as a compromise to hold off true single payer, perhaps the government can force publishing of claim data for anyone to analyze (following HIPAA of course). This could stave off the march towards a single-payer model and introduce consistency in the networks. This would shift the competitive focus the plans have and force them to compete in other healthcare areas that need more focus, such as sales & marketing, member/patient education outreach, etc.

Of course, there is another blue-sky thought: that Providers will start their own plans (a micro-patchwork of 1,000s of plans), publish their own data according to government standards and democratize the data to help the health of our citizenry. There are already examples of this model: the ACO model provided by the new parts of the Patient Protection Act, as well as Medicaid programs where MCOs have sprung up attached to hospital systems to serve the Medicaid population profitably.

As a final note, what surprised me the most about Part D prescriptions is that 3% of the script writers wrote more than 1/2 of all prescriptions. This could mean that these Providers are concentrated around those that need the most help. Perhaps some government focus on these “super-scripters” could help manage their costs down.
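
As a rough illustration of how a concentration figure like this is computed, here is a small sketch. The prescriber counts are invented for the example and are not the ProPublica data; `top_share` is a hypothetical helper name.

```python
def top_share(counts, top_fraction):
    """Share of all prescriptions written by the top `top_fraction` of prescribers."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))  # number of prescribers in the top group
    return sum(ranked[:k]) / sum(ranked)

# Made-up counts: 3 very heavy prescribers and 97 light ones.
counts = [4000, 3500, 3000] + [100] * 97
share = top_share(counts, 0.03)
print(f"Top 3% of prescribers wrote {share:.1%} of scripts")
```

The same calculation, run per specialty or per region, would be one way to check whether the "super-scripters" are simply the providers serving the sickest populations.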

There are some other thought provoking bits of information as well just in the topline numbers. Based on the report, the ratio of providers to beneficiaries is 1 out of 15. This seems like a really high concentration of providers to beneficiaries in the sense that each physician who wrote Part D scripts saw 15 beneficiaries. In the Michael Porter world where specialists focus more on their specialty and become highly efficient at it (better outcomes, lower costs), I would think that a higher ratio would reflect focus and perhaps the opportunity for innovation. Perhaps not.

Also, what’s astounding is that the total cost was roughly $77 billion. This is for prescriptions including the costs of the visits. This helps prop up the pharmaceutical industry. Many of the top drugs are still branded drugs versus generics. But regardless of the industry it helps, that’s a lot of money. Drugs are wonderful healthcare productivity boosters (they truly make a difference in people’s quality and duration of life) but we need to continually attack these cost bumps to shrink them.

It would also be instructive to bounce the Medicare Part D data against provider quality scores, say at an aggregate level.

We could then answer the ultimate question, which centers on value. That is, for the dollars we invest in drugs under Part D, are we getting good outcomes? Is the Effort-Return equation producing the best numbers for us? That’s the question we really need to answer.

Healthcare, customer marketing, iPhone and education? Can we personalize all of these?

I was listening to a TEDTalk recently about education. One of the ideas in the talk was that children are not receiving the education they need. Many of the education programs, especially at the federal level, try to force a common structure, standardized tests, on students and this has the effect that teachers teach to the test. By teaching to the test, the curriculum normalizes to a focus on the test content. Essentially this has led to the same curriculum for all students. After all, they all need to take the test.

The logic reminded me of healthcare conversations. Today, medicine, by and large, is applied across wide swaths of people. But a funny thing happened along the way to medication heaven: some medicines worked better in some individuals than others. Over time, it was found that the more complex the medicine, the more that it worked well in some and not at all in others and perhaps even hurt the patient.

This created an opportunity for tests, tests that could determine when a medicine would work well in a patient. These tests suggested that medication had to be personalized to an individual based on their specific chemistry and, more importantly, their specific genetic structure. While the differences in human beings’ genetic structure are rather small, they are significant for the medicines we create.

The field of personalized medicine was born. While the concept of personalizing medicine to a specific patient is age-old, personalized medicine today really implies the use of newer technology or processes to customize treatment for a patient. It recognizes that many complex human diseases and issues require a deeper understanding of what will work on a patient and that each patient is different.

The same holds in the area of customer marketing, where the most important trend in the past 20 years has been taking marketing efforts down to the level of the individual. Knowing as much as you can about one person in order to better communicate with them is all based on the idea that if you can personalize the messages based on the actual marketing target (the shopper for example) then that message will be heard and be much more effective at changing behavior, which in this case is to purchase a product.

The iPhone is another example. The iPhone is really about mass customization: the ability to create a platform that can be further customized by “apps.” The apps are the customization that tailors the phone to the needs of each individual. The iPhone’s success is testament to this core idea.

But in the education area, according to the TEDTalk, it’s all about standardization. What is needed, according to the talk, is personalized education. As healthcare, customer marketing and many other areas of human endeavor have found, when something is personalized, it often performs better or is more relevant or is more beneficial.

What is stopping personalized education? Is it the bureaucracy? The teachers? The principals? The federal laws?

Surely, all of these probably play a role. But perhaps the larger issue, and one that underpins healthcare, customer marketing and the iPhone, is that there has to be a platform of productivity behind the customization. Education has none, but one is on the horizon.

Healthcare has a platform of science, with a fairly rich (and growing) backbone of chemistry and genetics. Healthcare has fundamental tools that can be manipulated like building blocks to work through the discovery process. It can run many experiments to optimize itself to the problem.

Customer marketing, often driven by digital marketing, can run thousands of experiments (placement, color, message, etc.) and has fairly solid technical commonality around message delivery and presentation. The technology is fairly mature and continually maturing.

The iPhone is an obvious platform that others can innovate on directly.

What about education? Can it rely on a few building blocks? Is there a mechanism by which the productivity of teachers or of the education process can dramatically accelerate? Does it have to be the same chalkboard and lecture format?

There are positive signs of a “platform” on the horizon. In my opinion, the highly disruptive power of digital based education (video lectures, tests, etc.) could be the platform that is needed. By siphoning off core learning activities of the mechanics in some fields, for example, addition and subtraction, calculus and many other areas, the education process’s productivity could rise. Teachers need to focus on the hardest educational processes and the toughest subject areas and customize the dialogue with the student based on what the student needs. But all of this customization, this personalization, takes significant amounts of time. Where will that time come from? It’s not going to come from teaching a standard curriculum using traditional techniques. We know that this does not work.

What is clear is that teacher engagement makes all the difference. And teacher productivity is key. Yes, teachers need better oversight, just as principals need better oversight, and the school bureaucracy and school boards are neither healthy nor sensible. But the largest impact we can have and the most important impact is to give teachers a productivity boost. And the best way to do that is to let the easy stuff be handled through other teaching techniques and focus teachers on the students at a more personalized level.

We see this in colleges today. Despite the incredibly poor management at colleges and universities today (it really is quite disappointing), they do have one thing right although for the wrong reasons. There is a shift to computer based training for standard material. This frees up the professor’s time. Of course, most universities should just be shut down for being so poorly managed but that’s another story. Clay Christensen talks about disruption in the education market at the senior educational level. It could wipe out many inefficient and ineffective universities today (hurray!) and actually improve education for all of society instead of the few that can pay increasingly large amounts of money.

We need some of that disruption at the elementary and high-school levels as well in order to allow personalized education to be used at the level where it can be influential. We need that today.

Graph databases, metadata management and social networks

I was speaking to a friend the other day and he mentioned he was working on some metadata analysis. He had built an MS Access database to import the metadata. He found the going quite tricky, as the analysis he was performing is called “data lineage” and he was having difficulty. He also wanted to analyze mappings between fields in the database as well as mappings between lists of values (a list of values is like the set of values you see in a dropdown box on a user interface). All of this seemed like social networking to me.

The way to think about it is that (and I could use John Seely Brown’s Social Life of Information book to back me up here) the data lineage problem is just like a social network. You want to track something from its start to the next hop. The “friend” in this case is the place where the data is transported to another system. Hence a “friend” of a piece of metadata must be another metadata item in another system or database table. Data lineage is nothing more than social networking. To me, data lineage would probably generate much simpler networks, but I would guess that there are a lot of grey areas in figuring out all the places that data is moved to or converted to along the way. That’s probably what makes it a much harder problem.
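
To show what "lineage as friend-of-a-friend" looks like in practice, here is a minimal sketch using a plain Python dictionary instead of a graph database. The field names and hops are hypothetical; the traversal is an ordinary breadth-first search.

```python
from collections import deque

# Hypothetical lineage edges: each field maps to the fields it is copied or transformed into.
LINEAGE = {
    "claims_src.dx_code": ["staging.diagnosis"],
    "staging.diagnosis": ["warehouse.dim_diagnosis.code", "mart.report_dx"],
    "warehouse.dim_diagnosis.code": ["mart.report_dx"],
    "mart.report_dx": [],
}

def downstream(graph, start):
    """Breadth-first walk returning every field reachable from `start`, in visit order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:      # skip fields already reached by another path
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

print(downstream(LINEAGE, "claims_src.dx_code"))
```

The "friends" of a field are its direct targets; the full lineage is the friends of friends, and so on, until no new fields turn up.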

Naturally, not knowing whether it was possible or not I mentioned how graph databases could capture most of this data fairly easily and you could run very sophisticated queries. I had not really deeply thought about it but I had been reading up on graphs and probability & statistics, etc. So it seemed reasonable to me.

Of course, just doing a simple import of metadata into MS Access is fairly straightforward. You define some tables that capture a “table” concept and it has a bunch of relationships to “fields.” This can be modeled in an RDBMS using foreign keys and such. But as you normalize out the other concepts, such as categories of tables, or try to describe different types of tables, such as views or other RDBMS-like structures, the MS Access approach starts getting a bit daunting.

But my friend wanted to deeply analyze the data and have something that could scale to much harder metadata problems. So I dipped into a Neo4j manual and read some blogs. I then ran across a lot of blogs that described classification through taxonomies and ontologies and other types of very abstract ways of describing data. This became complicated very quickly and I realized that I wanted to try and do something small but not necessarily simple. I would need a graph model that was highly compact and could change as requirements changed (my friend said metadata requirements change all the time). And I would sacrifice the ease of a dedicated but highly rigid model for one that was general. I was essentially shifting complexity from the model itself to the processing layer that would sit above the model. But that’s fine if it gave me something with exceptional room to grow.

So after reading the manual and the blogs and thinking about it for another hour, I realized that I could do most of what he wanted using a few very simple concepts:

  • A DataItem is a description of a data element or a value in a list of values. A DataItem could be part of multiple categories. We will call these categories DataItemSets.
  • A DataItemSet is a collection of DataItems. The sets could have a taxonomy (categories of categories) so that a set could be part of another set. I could not imagine sets of sets of sets, but it seemed that a friend could be a friend of a friend so a set could be a parent of a set.
  • DataItemRelationship will connect a set of “From” DataItems to a set of “To” DataItems. The From and To could be 1 to 1 but we wanted to keep it general. These are the edges of “LIKES” or “KNOWS” in the social network.
  • DataItemRelationshipSet will be the taxonomy for the relationships just like a DataItemSet. Unlike many social networks, you may need to classify a relationship with more information than just “LIKES.” Facebook gives you “likes” but a “like” is fairly general, you do not know how strong that like is for any given pair of nodes. So by having a taxonomy for the relationships, we can have categories of categories or whatever you want to more fully describe the relationship.

That’s it. Just 4 main graph node concepts. We will also need to label our nodes with the concept that it represents and to ensure that it has the right set of properties. So a small amount of “infrastructure” is needed to do this labeling and match a label to a set of properties that should be available on that node. For example, a DataItem that represents metadata will have different properties than a DataItem that represents a value in a list of values.
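
Here is one way the four concepts and the labeling "infrastructure" could be sketched in plain Python before committing to a graph database. The label names and required properties (`MetadataField`, `ListValue`, and so on) are assumptions I made up for illustration, not part of any product or of my friend's requirements.

```python
from dataclasses import dataclass, field

# Assumed mapping of node label -> property names that must be present on the node.
REQUIRED_PROPS = {
    "MetadataField": {"name", "data_type"},
    "ListValue": {"code", "display"},
}

@dataclass
class DataItem:
    label: str
    props: dict

    def __post_init__(self):
        # The small "infrastructure": check the node carries the properties its label requires.
        missing = REQUIRED_PROPS[self.label] - set(self.props)
        if missing:
            raise ValueError(f"{self.label} is missing {sorted(missing)}")

@dataclass
class DataItemSet:
    name: str
    items: list = field(default_factory=list)
    parent: "DataItemSet | None" = None  # a set can be part of another set

@dataclass
class DataItemRelationshipSet:
    name: str
    parent: "DataItemRelationshipSet | None" = None  # taxonomy of relationship types

@dataclass
class DataItemRelationship:
    from_items: list              # one or more "From" DataItems
    to_items: list                # one or more "To" DataItems
    kind: DataItemRelationshipSet

src = DataItem("MetadataField", {"name": "dx_code", "data_type": "string"})
dst = DataItem("MetadataField", {"name": "diagnosis", "data_type": "string"})
copied = DataItemRelationshipSet("COPIED_TO")
edge = DataItemRelationship([src], [dst], copied)
```

The same four shapes map naturally onto graph nodes with labels and properties, which is why the model stays compact when requirements change: new categories are new set nodes, not new tables.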

I thought that with these simple concepts we could construct everything that was needed. Since metadata is just data and list of values are just data, it seemed to me that the graph just conceptually holds data and we can treat both the same in the graph albeit with different node properties.

I’ll give it a whirl and report back by trying a very small experiment to see if this design is totally impractical to implement or if it really shines. I’ll also try to hook it up to Cytoscape for visualization. However, it’s clear that, just like with MS Access, if you want a solution quickly, just go buy a Global ID-type product.

Healthcare risk: long-term versus short-term

A long time ago, when the first financial crisis hit around the savings and loan (S&L) industry in the 1980s, I remember that there was a thread of conversation around using long term interest rates to make short-term bets. The idea was that by borrowing long term at a lower rate, you could seek to play with assets on a short-term basis and make money. That’s not a new concept; it’s just that this behavior promoted playing at the boundaries of the risk envelope.

An opinion article today in the Washington Post pointed out the same thing happening today in healthcare.

The article described how the Oregon Medicaid lottery system gave new insights because it was essentially run as a large-scale randomized controlled trial (RCT). They found that there was an improvement in health when Medicaid benefits were used. It also pointed out that the benefits were a bit different than expected. There was no change in acute or chronic issues like the big ones: diabetes, hypertension, etc. Instead, they found a decrease in mental health issues (around 30%). There were other benefits as well.

Based on my own industry experience, I know that it’s hard to keep Medicaid members enrolled. Medicaid serves the lower income group, and that group typically exhibits higher rates of coming in and out of managed care due to moving, jobs, ability to get to healthcare facilities (access) and other factors. Hence, the benefits you obtain from the Medicaid benefit are probably more long term (which is part of what the study found). So even in a group that is highly transitory, long term benefits were observed.

With commercial experience, the groups are less transitory to some degree. But people still do switch plans and change jobs.

With either commercial or Medicaid, or really any system, when access to something short-term provides long-term benefits, how do the people who put in the effort (in this case insurance companies who shoulder the risk) manage to match that against the long-term benefits? Of course, actuaries play with this equation all the time.

But imagine this essential tension playing out at the macro-scale. You have systematic, exceptionally large and complex resources spent today that have a long-term pay-off. How do you run a business that way?

You can of course charge more today or you can hope that on average, even in face of all the churn, that the churn gives you a net neutral. But this type of risk calculus can lead to a policy of minimizing investments today perhaps to the point of depressing outcomes in the future.
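
A back-of-the-envelope sketch shows how churn changes this calculus. All numbers are invented, and the model is deliberately naive: it assumes the insurer only recovers savings from members it retains and ignores healthier members arriving from other plans (the "net neutral" hope above).

```python
def expected_net(cost_today, annual_saving, retention, years):
    """Expected net payoff of spending `cost_today` on a member now.

    Savings in year t are only captured if the member is still enrolled,
    which happens with probability retention ** t.
    """
    expected_savings = sum(annual_saving * retention ** t for t in range(1, years + 1))
    return expected_savings - cost_today

# Illustrative numbers only: $1,000 spent today, $400 saved per year for 5 years.
stable = expected_net(1000, 400, retention=0.9, years=5)   # low churn
churny = expected_net(1000, 400, retention=0.6, years=5)   # high churn
print(round(stable, 2), round(churny, 2))
```

With these made-up inputs the same investment is worthwhile at 90% retention and a loss at 60%, which is exactly the pressure toward minimizing investment today.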

It seems that incentives are out of whack and until the incentives change and come into alignment, there may not be a lot of progress.

The past 10 years of data warehousing has been all wrong…

This is an “ideas” post. One where I am trying to work out some ideas that have been bouncing around my head since this morning.

Essentially, the past 10 years of data warehousing have been wrong. Wrong in the sense that the area of data warehousing has not adapted to newer technologies that would solve fundamental data warehousing issues and increase the opportunity for success. Data warehousing projects come in all shapes and sizes. The smaller they are and the more focused the problem they are trying to solve, typically the more successful they are. This is because data warehousing has many non-technical issues that cause it to fail, including the failure of the business to listen & communicate, the failure of IT to listen & communicate, requirements that change as fast as the business changes (and that’s pretty fast for many areas such as sales and marketing), as well as sponsorship, sustainable funding and short-term commitment mentalities. Many of these factors are mitigated by smaller projects.

Hence, the application of “lean” and “agile” methodologies to data warehousing. These approaches are really a learning by doing model where you do a small amount of work, receive feedback, do a small amount of work, receive feedback, do a small amount of work... and so forth. Many tiny cycles with feedback help promote alignment between the data warehouse (and its current iteration) and what the business wants or thinks it wants. These approaches have helped, but at the trade-off that it’s difficult to implement very large scale projects across different location models where developers are spread out around the world. So it’s helped, but large, complex projects must still be conducted and it’s clear that coordinating a large team is just really hard.

Data warehousing technology has not substantially helped solve these problems. Today, larger databases that run very fast are available, but they are built using the old approach, e.g. data models, ETL, etc. So those components just run faster. That helps of course because there is less time spent trying to optimize everything and therefore more time spent on other tasks, such as working with the business. But the current use of technology is not really solving lifecycle issues; it actually makes them worse. You have data modeling teams, ETL teams, architect teams and analyst teams, all of which have to piece together their components and have something large work. It is like building a rocket ship without large government funding.

BigData has stepped in and has made other tools available. But they are often applied and targeted at a very specific workflow, a specific type of analysis, that can be programmed into what are generally fairly immature tools. So BigData is helping because it helps loosen up an architect’s thinking around how to put together solutions as well as employ non-traditional technologies.

So what would help? Let’s consider a world where compute power for each user is effectively infinite. We are not saying it’s free, just that it’s relatively easy to get enough compute power to solve specific types of problems. Let’s also assume that the non-technical issues will not change; they are an invariant in this scenario. And let’s assume we want to use some elements of technology to address non-technical issues.

In this scenario, we really need a solution that has a few parts to it.

  • We need better tools & technologies that allow us to deliver solutions at a rapid pace, with significantly more updates than even today’s technologies allow. Let’s assume that the word “update” means both that the data updates frequently and that the structure of the data changes frequently.
  • We need to be able to use one environment so that people creating the solutions do not have to change the toolset or make diverse toolsets work together. This is one of the reasons why SAS is so popular: you can stay in one toolset.
  • We also need technologies that allow a lifecycle process to work with small teams who combine their solution components together more easily and whenever they are ready, rather than when large team milestones say that components have to be integrated.
  • We need to support processes that span the globe with people who contribute both technical and domain knowledge. We want to support decoupling the teams.

Let’s imagine a solution then. Let’s assume that every piece of data coming into our technology solution is tagged. Say one value is part of a healthcare claim and represents a diagnosis code. You tag it as being a diagnosis code, as being part of a claim, as being a number, etc. You can describe that relationship. Let’s tag all the data this way. Essentially, you are expanding the context of the data. Now let’s assume that we can establish these tags, and hence relationships, between all the data elements, and let’s also assume that we have a tool that can change these relationships dynamically so that we can create new relationships (pathways) between the data. Of course, ETL conceptually does not go away, but let’s assume that ETL becomes more of a process operating at different scales: the data element level, the relationships level, the aggregate of data tag level, etc.
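
To make the tagging idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names TaggedValue and TagStore, the tag strings, and the sample values are all made up, and a real system would need persistence, scale and governance that this ignores.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedValue:
    """One data element plus the tags that expand its context."""
    value: str
    tags: set = field(default_factory=set)


class TagStore:
    """Hypothetical in-memory store of tagged values and tag relationships."""

    def __init__(self):
        self.items = []
        self.relations = set()  # (source_tag, relationship, target_tag)

    def add(self, value, *tags):
        """Store a value together with every tag that describes it."""
        item = TaggedValue(value, set(tags))
        self.items.append(item)
        return item

    def relate(self, source_tag, relationship, target_tag):
        """Create a relationship (pathway) between two tags."""
        self.relations.add((source_tag, relationship, target_tag))

    def unrelate(self, source_tag, relationship, target_tag):
        """Remove a relationship, so pathways can change dynamically."""
        self.relations.discard((source_tag, relationship, target_tag))

    def find(self, *tags):
        """Return the items that carry all of the requested tags."""
        wanted = set(tags)
        return [item for item in self.items if wanted <= item.tags]


store = TagStore()
# Placeholder values only; "12345" is not meant to be a real diagnosis code.
store.add("12345", "diagnosis_code", "claim", "number")
store.add("1200.50", "billed_amount", "claim", "number")

store.relate("diagnosis_code", "belongs_to", "claim")
# A new pathway added later, without reloading any data.
store.relate("diagnosis_code", "belongs_to", "clinical_indicator")

claim_values = [item.value for item in store.find("claim")]
dx_values = [item.value for item in store.find("diagnosis_code", "claim")]
```

The point of the sketch is that the values never move; only the tags and the relationships between tags change, which is where the flexibility comes from.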

Now, because we have infinite computing resources, we can start assembling the data the way we would like. If we are a technologist, perhaps we assemble it in the way that is helpful for putting together a production report. If we are an analyst, we might assemble it in a way that helps us determine if an outcome measurement improved based on an intervention (which has its own set of tags). When we assemble, we actually describe how data is grouped together to form a hierarchy of concepts. A DX code is a field that belongs to a claim, or a field that belongs to clinical indicators. Indicators are related to procedures through a probabilistic relationship based on past-seen relationships or programmed relationships.
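
Here is a rough Python sketch of assembling the same elements two ways, plus one possible reading of a “probabilistic relationship based on past-seen relationships.” The element layout, the function names and all of the values are hypothetical, and the simple frequency estimate is my assumption about how such a relationship could be derived, not a prescribed method.

```python
from collections import Counter, defaultdict

# Hypothetical tagged elements; every name and value is a placeholder.
elements = [
    {"value": "DX-A", "field": "dx_code", "claim": "claim-1", "intervention": "outreach"},
    {"value": "DX-B", "field": "dx_code", "claim": "claim-2", "intervention": "outreach"},
    {"value": "PX-A", "field": "procedure", "claim": "claim-1"},
]


def assemble(elements, by):
    """Group element values under one tag, forming a two-level hierarchy."""
    groups = defaultdict(list)
    for element in elements:
        if by in element:  # elements without this tag are left out of the view
            groups[element[by]].append(element["value"])
    return dict(groups)


def procedure_given_indicator(observations):
    """Estimate P(procedure | indicator) as a relative frequency of past pairs."""
    pair_counts = Counter(observations)
    indicator_counts = Counter(indicator for indicator, _ in observations)
    return {
        (indicator, procedure): count / indicator_counts[indicator]
        for (indicator, procedure), count in pair_counts.items()
    }


# The technologist's view: everything grouped by claim for a production report.
report_view = assemble(elements, by="claim")
# The analyst's view: the same elements grouped by intervention.
analyst_view = assemble(elements, by="intervention")

# Past-seen (indicator, procedure) pairs, again made up for illustration.
history = [("IND-1", "PX-A"), ("IND-1", "PX-A"), ("IND-1", "PX-B"), ("IND-2", "PX-B")]
likelihood = procedure_given_indicator(history)
```

Neither view is the “real” model. Both are just assemblies over the same tagged elements, and a programmed relationship could simply overwrite an entry in the likelihood table.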

Given that we can assemble and reassemble, let’s also imagine that at any time we can copy all of the data and all the tags. We can go to a master area and just say, “I would like to copy it so I can fiddle with my tags, and if everyone says they like my tags, I may contribute them back to the master.” And let’s assume that if the master dataset is updated with recent data, I can just merge those data into my working set. Essentially, we have checked out the entire dataset, tracked our changes to it, updated it with changes from other people, and can check our changes back in for others to use, very much like a data change management solution. As the tags evolve, other people can assemble and reassemble the data in new ways.
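
The checkout, merge and check-in cycle can be sketched in a few lines of Python. This is only a toy under stated assumptions: the TaggedDataset class and its method names are hypothetical, the rows are placeholders, and it does not attempt conflict resolution, history or review, which a real data change management tool would need.

```python
import copy


class TaggedDataset:
    """Hypothetical dataset: row id -> value, plus row id -> set of tags."""

    def __init__(self, rows=None, tags=None):
        self.rows = dict(rows or {})
        self.tags = {row_id: set(t) for row_id, t in (tags or {}).items()}

    def checkout(self):
        """Copy all of the data and all of the tags into a working set."""
        return copy.deepcopy(self)

    def merge_data_from(self, master):
        """Pull rows that are new in master; local tags are left untouched."""
        for row_id, value in master.rows.items():
            if row_id not in self.rows:
                self.rows[row_id] = value
                self.tags[row_id] = set(master.tags.get(row_id, set()))

    def contribute_tags_to(self, master):
        """Check tags back in by adding them to rows the master already has."""
        for row_id, tags in self.tags.items():
            if row_id in master.rows:
                master.tags.setdefault(row_id, set()).update(tags)


# Placeholder rows and tags, made up for illustration.
master = TaggedDataset({"r1": "DX-A"}, {"r1": {"diagnosis_code"}})

work = master.checkout()
work.tags["r1"].add("chronic_condition")  # fiddle with my own tags

# Meanwhile the master receives recent data.
master.rows["r2"] = "PX-A"
master.tags["r2"] = {"procedure_code"}

work.merge_data_from(master)     # update my working set with the new data
work.contribute_tags_to(master)  # everyone liked my tags, so check them in
```

The shape is deliberately the same as a source control workflow: a private copy, a merge from the shared line, and a contribution back.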

So one solution to help fix data warehousing is to employ BigData technology, but in a way that allows each individual to assemble and analyze the data the way they want to. And when that individual creates something useful, they can share it with others so they can use it. NoSQL databases conceptually give us this capability, especially when the data is represented by something as simple as a key+value. Source code control systems like “git” (a large scale, distributed management system) give us a model to shoot for, but at the data warehouse level, and the current crop of ETL programs inform us of the types of changes that need to be made to the data to improve its quality for use.

Many of the ingredients exist today; we just need the innovation to happen.

Social media, BigData and Effort-Return

The classic question we ask about marketing, or really any form of outreach, is: given the effort I expend, what is my return? This Effort-Return question is at the heart of ROI, value proposition and the general, down-to-earth question of “Was it worth it?”

That’s the essential question my clients have always asked me, and I think that’s a big question that is being formed around the entire area of social media and BigData. It’s clear that social media is here to stay. The idea that “people like us” create our own content is very powerful. It gives us voice, it gives us a communication platform and it gives us, essentially, power. The power of the consumer voice is amplified.

Instead of 10 people hearing us when we are upset at something, we can have 1,000,000 hear us. That’s power. And the cost of “hearing” the content, of finding it, is dropping dramatically. It is still not free to access content; you still have enormous search costs (we really need contextual search: searching those resources most relevant to me instead of searching the world). But search costs are dropping, and navigation costs are dropping. Every year, those 1,000,000 can listen to and filter another 1,000,000 messages.

BigData has come to our rescue…in a way. It gives more tools to the programmers who want to shape that search data, who want to help us listen and reach out. There’s a lot of hype out there and the technology is moving very fast, so fast that new projects, new frameworks and new approaches are popping out every day.

But is it worth the effort? If so, just how much is it worth? That’s still a key question. The amount of innovation in this area is tremendous and it’s not unlike the innovation I see occurring in the healthcare space. Everyone is trying something, which means that everything is being tried somewhere.

That’s good. But the question is already pretty clear: while we can now communicate in many new channels that were unavailable 5 years ago, communicate more easily and more frequently, and find things that interest us, does it really pay off? Do companies benefit by investing now and trying to get ahead, or do they just try to keep pace and meet some, but not all, customer expectations? Will entire companies fall because of social media and BigData?

Those are hard questions to answer, but I think we can look at other patterns out there and see that even with all the hype today, it will be worth it. Companies should invest. But many should not overinvest. When the internet was first widely used, it was thought that it would change the world. Well, it did. It just took 2 decades to do that instead of the few years that futurists predicted. But with a higher density of connectivity today, innovations can roll out faster.

But I think that’s where the answer is. If you are in the business of social, then being on that wave and pushing the edge is good. If your business is BigData tools and technologies, then yes, you need to invest and recognize that it will be worth it in the long run if you survive. But many brands (companies) can just strive to keep pace and do okay. There are exceptions of course. Product quality and features still dominate purchase decisions. Yes, people are moved by viral videos and bad customer service or bad products, but companies with long-standing brands whose products are core can afford to spend to meet expectations versus having to overinvest. They will do just fine keeping pace and continuing to focus on the product as well as marketing. For example, does Coca-Cola expect to double market share because they are better at social media than others? Will they grow significantly because of it? It’s not clear, but for some segment of products, the spending pattern does not have to be extraordinary. It just needs to keep pace and be reasonable.

This gets us back to the question of social media, BigData and Effort-Return. Effort-Return is important to calculate because brands should not overinvest. They need to manage their investments. Are social media and BigData worth the investment? Absolutely; it’s really a question of degree.

Lean startups

The May issue of HBR has an article on the lean startup model. Essentially, you need to prototype something, find clients quickly, get feedback and iterate again. The idea of “lean” is that you forgo deep planning and marketing that may not make sense, since most plans change rapidly anyway.

There is a lot of truth in that. I’ve helped startups (even in my early college and grad-school days) and certainly it makes sense to try something and keep iterating. I found that to be true with writing, software, management consulting ideas and a variety of other areas.

However, it’s not universally true. You really do need deep thinking in some cases and some industries, or in situations where just getting to the first prototype will consume significant capital. Hence, the idea is a good one, but it should be judiciously applied. That’s not to say that getting continuous feedback is ever bad; it’s just that you sometimes need more than a prototype.

There is an old management principle around innovation. There is a “learning while doing” model that says you cannot know everything, or even 1/10th of what you need to know, so it’s better to get started and learn as you go. That’s the basic concept behind the lean startup (see here for more info on learning-by-doing, which is a concept from the Toyota system).

The concept is bouncing around the technical crowds as well. This article makes the case that you need to “learn fast” versus “fail fast, fail often,” which is in the spirit of the lean startup. In fact, now there are lean canvases that you can put together. While there are a lot of good ideas here, I think the only rule about employing them is “pick your rules carefully.”