Healthcare, customer marketing, iPhone and education? Can we personalize all of these?

I was listening to a TEDTalk recently about education. One of the ideas in the talk was that children are not receiving the education they need. Many of the education programs, especially at the federal level, try to force a common structure, standardized tests, on students and this has the effect that teachers teach to the test. By teaching to the test, the curriculum normalizes to a focus on the test content. Essentially this has led to the same curriculum for all students. After all, they all need to take the test.

The logic reminded me of healthcare conversations. Today, medicine, by and large, is applied across wide swaths of people. But a funny thing happened along the way to medication heaven–some medicines worked better in some individuals than others. Over time, it was found that the more complex the medicine, the more likely it was to work well in some patients, not at all in others, and perhaps even to hurt the patient.

This created an opportunity for tests–tests that could determine when a medicine would work well in a patient. These tests suggested that medication had to be personalized to an individual based on their specific chemistry and, more importantly, their specific genetic structure. While the differences in human beings’ genetic structure are rather small, they are significant for the medicines we create.

The field of personalized medicine was born. While the concept of personalizing medicine to a specific patient is age-old, personalized medicine today really implies the use of newer technology or processes to customize treatment for a patient. It recognizes that many complex human diseases and issues require a deeper understanding of what will work on a patient and that each patient is different.

Look at the area of customer marketing, where the most important trend in the past 20 years has been taking marketing efforts down to the level of the individual. Knowing as much as you can about one person in order to better communicate with them is based on the idea that if you can personalize the message for the actual marketing target (the shopper, for example), then that message will be heard and be much more effective at changing behavior–which in this case is to purchase a product.

The iPhone is another example. The iPhone is really about mass customization–the ability to create a platform that can be further customized by “apps.” The apps are the customization that tailors the phone to the needs of each individual. The iPhone’s success is a testament to this core idea.

But in the education area, according to the TEDTalk, it’s all about standardization. What is needed, according to the talk, is personalized education. As healthcare, customer marketing and many other areas of human endeavor have found, when something is personalized, it often performs better, is more relevant or is more beneficial.

What is stopping personalized education? Is it the bureaucracy? The teachers? The principals? The federal laws?

Surely, all of these play a role. But perhaps the larger issue, and one that underpins healthcare, customer marketing and the iPhone, is that there has to be a platform of productivity behind the customization. Education has none, but one is on the horizon.

Healthcare has a platform of science, with a fairly rich (and growing) backbone of chemistry and genetics. Healthcare has fundamental tools that can be manipulated like building blocks to work through the discovery process. It can run many experiments to optimize itself to the problem.

Customer marketing, often driven by digital marketing, can run thousands of experiments (placement, color, message, etc.) and has fairly solid technical commonality around message delivery and presentation–the technology is fairly mature and continually maturing.

The iPhone is an obvious platform that others can innovate on directly.

What about education? Can it rely on a few building blocks? Is there a mechanism by which the productivity of teachers or of the education process can dramatically accelerate? Does it have to be the same chalkboard and lecture format?

There are positive signs of a “platform” on the horizon. In my opinion, the highly disruptive power of digitally based education (video lectures, tests, etc.) could be the platform that is needed. By siphoning off the core learning of mechanics in some fields (for example, addition and subtraction, calculation and many other areas), the education process’s productivity could rise. Teachers need to focus on the hardest educational processes and the toughest subject areas and customize the dialogue with the student based on what the student needs. But all of this customization, this personalization, takes significant amounts of time. Where will that time come from? It’s not going to come from teaching a standard curriculum using traditional techniques. We know that this does not work.

What is clear is that teacher engagement makes all the difference. And teacher productivity is key. Yes, teachers need better oversight, just as principals need better oversight, and the school bureaucracy and school boards are neither healthy nor sensible. But the largest and most important impact we can have is to give teachers a productivity boost. And the best way to do that is to let the easy stuff be handled through other teaching techniques and focus teachers on the students at a more personalized level.

We see this in colleges today. Despite the incredibly poor management at colleges and universities today (it really is quite disappointing), they do have one thing right, although for the wrong reasons. There is a shift to computer-based training for standard material. This frees up the professor’s time. Of course, most universities should just be shut down for being so poorly managed but that’s another story. Clay Christensen talks about disruption in the education market at the senior educational level. It could wipe out many inefficient and ineffective universities today (hurray!) and actually improve education for all of society instead of the few that can pay increasingly large amounts of money.

We need some of that disruption at the elementary and high-school levels as well in order to allow personalized education to be used at the level where it can be influential. We need that today.

Graph databases, metadata management and social networks

I was speaking to a friend the other day and he mentioned he was working on some metadata analysis. He had built an MS Access database to import the metadata. He found the going quite tricky, as the analysis he was performing is called “data lineage” and it was proving difficult. He also wanted to analyze mappings between fields in the database as well as mappings between lists of values (a list of values is like the set of values you see in a dropdown box on a user interface). All of this seemed like social networking to me.

The way to think about it is that (and I could use John Seely Brown’s Social Life of Information book to back me up here) the data lineage problem is just like a social network. You want to track something from its start to the next hop. The “friend” in this case is the place where the data is transported to another system. Hence a “friend” of a piece of metadata must be another metadata item in another system or database table. Data lineage is nothing more than social networking. To me, data lineage would probably generate much simpler networks, but I would guess that there are a lot of grey areas in figuring out all the places that data is moved to or converted to along the way–that’s probably what makes it a much harder problem.
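As a minimal sketch of the analogy, assume lineage is stored as a plain adjacency map from each metadata item to the items it is copied into. The system and field names are hypothetical, and this is ordinary Python rather than a graph database. Tracing where a field ends up is then the same breadth-first walk you would use to find friends of friends:

```python
from collections import deque

# Hypothetical lineage edges: each metadata item maps to the items it is copied to.
LINEAGE = {
    "crm.customer.email": ["staging.cust.email_addr"],
    "staging.cust.email_addr": [
        "warehouse.dim_customer.email",
        "mart.campaign.contact_email",
    ],
    "warehouse.dim_customer.email": [],
    "mart.campaign.contact_email": [],
}


def downstream(start, edges):
    """Return every item reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt not in seen:  # guard against cycles and repeated hops
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(downstream("crm.customer.email", LINEAGE))
```

The grey areas mentioned above show up here as missing or uncertain edges; the traversal itself stays simple.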

Naturally, not knowing whether it was possible or not, I mentioned how graph databases could capture most of this data fairly easily and how you could run very sophisticated queries. I had not really thought deeply about it, but I had been reading up on graphs and probability & statistics, etc. So it seemed reasonable to me.

Of course, just doing a simple import of metadata into MS Access is fairly straightforward. You define some tables that capture a “table” concept and it has a bunch of relationships to “fields.” This can be modeled in an RDBMS using foreign keys and such. But as you normalize out the other concepts, such as categories of tables, or try to describe different types of tables, such as views or other RDBMS-ish structures, the MS Access approach starts getting a bit daunting.
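To make the relational starting point concrete, here is a small sketch using Python's built-in sqlite3 module in place of MS Access. The table and column names (meta_table, meta_field, claim, dx_code) are illustrative assumptions, not my friend's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE meta_table (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE meta_field (
        id        INTEGER PRIMARY KEY,
        table_id  INTEGER NOT NULL REFERENCES meta_table(id),
        name      TEXT NOT NULL,
        data_type TEXT NOT NULL
    );
    """
)

# One described table with two described fields.
conn.execute("INSERT INTO meta_table (id, name) VALUES (1, 'claim')")
conn.executemany(
    "INSERT INTO meta_field (table_id, name, data_type) VALUES (?, ?, ?)",
    [(1, "claim_id", "INTEGER"), (1, "dx_code", "TEXT")],
)

rows = conn.execute(
    "SELECT t.name, f.name FROM meta_table t "
    "JOIN meta_field f ON f.table_id = t.id ORDER BY f.id"
).fetchall()
print(rows)
```

Every further concept (table categories, views, field-to-field mappings) means another table and another join, which is where the approach starts to get heavy.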

But my friend wanted to deeply analyze the data and have something that could scale to much harder metadata problems. So I dipped into a Neo4j manual and read some blogs. I then ran across a lot of blogs that described classification through taxonomies and ontologies and other types of very abstract ways of describing data. This became complicated very quickly and I realized that I wanted to try and do something small but not necessarily simple. I would need a graph model that was highly compact and could change as requirements changed (my friend said metadata requirements change all the time). And I would sacrifice the ease of a dedicated but highly rigid model for one that was general. I was essentially shifting complexity from the model itself to the processing layer that would sit above the model. But that’s fine if it gave me something with exceptional room to grow.

So after reading the manual and the blogs and thinking about it for another hour, I realized that I could do most of what he wanted using a few very simple concepts:

  • A DataItem is a description of a data element or a value in a list of values. A DataItem could be part of multiple categories. We will call these categories DataItemSets.
  • A DataItemSet is a collection of DataItems. The sets could have a taxonomy (categories of categories) so that a set could be part of another set. I could not imagine sets of sets of sets, but it seemed that a friend could be a friend of a friend so a set could be a parent of a set.
  • DataItemRelationship will connect a set of “From” DataItems to a set of “To” DataItems. The From and To could be 1 to 1 but we wanted to keep it general. These are the edges of “LIKES” or “KNOWS” in the social network.
  • DataItemRelationshipSet will be the taxonomy for the relationships, just like a DataItemSet. Unlike many social networks, you may need to classify a relationship with more information than just “LIKES.” Facebook gives you “likes” but a “like” is fairly general; you do not know how strong that like is for any given pair of nodes. So by having a taxonomy for the relationships, we can have categories of categories or whatever you want to more fully describe the relationship.

That’s it. Just 4 main graph node concepts. We will also need to label each node with the concept that it represents and ensure that it has the right set of properties. So a small amount of “infrastructure” is needed to do this labeling and match a label to the set of properties that should be available on that node. For example, a DataItem that represents metadata will have different properties than a DataItem that represents a value in a list of values.
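The four concepts and the labeling “infrastructure” can be sketched in plain Python before committing to a graph database. This is only an illustration of the shape of the model. The labels MetadataItem and ListValue and their required properties are assumptions made for the example, and none of this is Neo4j API:

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a node label to the properties that node must carry.
REQUIRED_PROPERTIES = {
    "MetadataItem": {"name", "data_type"},
    "ListValue": {"code", "description"},
}


@dataclass
class DataItem:
    label: str
    properties: dict

    def __post_init__(self):
        # The "infrastructure": check the properties expected for this label.
        missing = REQUIRED_PROPERTIES[self.label] - set(self.properties)
        if missing:
            raise ValueError(f"{self.label} is missing properties: {sorted(missing)}")


@dataclass
class DataItemSet:
    name: str
    items: list = field(default_factory=list)
    parent: "DataItemSet | None" = None  # a set can belong to another set


@dataclass
class DataItemRelationshipSet:
    name: str
    parent: "DataItemRelationshipSet | None" = None  # taxonomy of relationship types


@dataclass
class DataItemRelationship:
    from_items: list  # kept general: many "From" items ...
    to_items: list    # ... to many "To" items
    category: DataItemRelationshipSet


email = DataItem("MetadataItem", {"name": "email", "data_type": "TEXT"})
email_addr = DataItem("MetadataItem", {"name": "email_addr", "data_type": "TEXT"})
mappings = DataItemRelationshipSet("mappings")
copied = DataItemRelationshipSet("copied_unchanged", parent=mappings)
edge = DataItemRelationship([email], [email_addr], copied)
print(edge.category.name, "is a kind of", edge.category.parent.name)
```

In a graph database each of these would become a labeled node, with the parent and category references turning into edges.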

I thought that with these simple concepts we could construct everything that was needed. Since metadata is just data and lists of values are just data, it seemed to me that the graph just conceptually holds data and we can treat both the same in the graph, albeit with different node properties.

I’ll give it a whirl and report back by trying a very small experiment to see if this design is totally impractical to implement or if it really shines. I’ll also try to hook it up to Cytoscape for visualization. However, it’s clear that, just as with MS Access, if you want a solution quickly, just go buy a Global ID-type product.

Healthcare risk: long-term versus short-term

A long time ago, when the first financial crisis hit around the savings and loan (S&L) industry in the 1980s, I remember that there was a thread of conversation around using long term interest rates to make short-term bets. The idea was that by borrowing long term at a lower rate, you could seek to play with assets on a short-term basis and make money. That’s not a new concept; it’s just that this behavior promoted playing at the boundaries of the risk envelope.

An opinion article today in the Washington Post pointed out the same thing happening today in healthcare.

The article described how the Oregon Medicaid lottery system gave new insights because it was essentially run as a large-scale randomized controlled trial (RCT). They found that there was an improvement in health when Medicaid benefits were used. It also pointed out that the benefits were a bit different than expected. There was no change in acute or chronic issues like the big ones: diabetes, hypertension, etc. Instead, they found a decrease (around 30%) in mental health issues. There were other benefits as well.

Based on my own industry experience, I know that it’s hard to keep Medicaid members enrolled. Medicaid serves the lower income group, and that group typically exhibits higher rates of coming in and out of managed care due to moving, jobs, ability to get to healthcare facilities (access) and other factors. Hence, the benefits you obtain from the Medicaid benefit are probably more long term (which is part of what the study found). So even in a group that is highly transitory, long term benefits were observed.

With commercial experience, the groups are less transitory to some degree. But people still do switch plans and change jobs.

Whether with commercial or Medicaid, or really any system, when short-term access to something provides long term benefits, how do the people who put in the effort (in this case insurance companies who shoulder the risk) manage to match that against the long-term benefits? Of course, actuaries play with this equation all the time.

But imagine this essential tension playing out at the macro-scale. You have systematic and exceptionally large and complex resources spent today that have a long-term pay-off. How do you run a business that way?

You can of course charge more today, or you can hope that on average, even in the face of all the churn, the churn nets out to neutral. But this type of risk calculus can lead to a policy of minimizing investments today, perhaps to the point of depressing outcomes in the future.

It seems that incentives are out of whack and until the incentives change and come into alignment, there may not be a lot of progress.
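A rough sketch of that calculus, with entirely made-up numbers: assume an insurer spends a fixed amount on a member today, the health benefit shows up as a constant yearly saving, and the member is still enrolled in year t with probability retention ** t. Discounting, and members arriving from other plans, are deliberately ignored here:

```python
def captured_benefit(annual_benefit, retention, years):
    """Benefit the original insurer expects to capture from one member.

    The member is assumed to still be enrolled in year t with
    probability retention ** t.
    """
    return sum(annual_benefit * retention ** t for t in range(1, years + 1))


cost_today = 1000.0     # hypothetical up-front spend per member
annual_benefit = 300.0  # hypothetical yearly saving once the member is healthier

for retention in (0.9, 0.6):
    net = captured_benefit(annual_benefit, retention, years=5) - cost_today
    print(f"retention={retention:.0%} net={net:+.2f}")
```

Under these assumptions the same investment pays for itself with a stable membership and loses money with a transitory one, which is the incentive to minimize spending today.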

The past 10 years of data warehousing has been all wrong…

This is an “ideas” post. One where I am trying to work out some ideas that have been bouncing around my head since this morning.

Essentially, the past 10 years of data warehousing have been wrong. Wrong in the sense that the area of data warehousing has not adapted to newer technologies that would solve fundamental data warehousing issues and increase the opportunity for success. Data warehousing projects come in all shapes and sizes. The smaller they are and the more focused the problem they are trying to solve, typically the more successful they are. This is because data warehousing has many non-technical issues that cause it to fail, including: the failure of the business to listen & communicate, the failure of IT to listen & communicate, requirements that change as fast as the business changes (and that’s pretty fast for many areas such as sales and marketing), as well as sponsorship, sustainable funding and short-term commitment mentalities. Many of these factors are mitigated by smaller projects.

Hence, the application of “lean” and “agile” methodologies to data warehousing. These approaches are really a learning-by-doing model where you do a small amount of work, receive feedback, do a small amount of work, receive feedback, do a small amount of work... and so forth. Many tiny cycles with feedback help promote alignment between the data warehouse (and its current iteration) and what the business wants or thinks it wants. These approaches have helped, but at the trade-off that it’s difficult to implement very large scale projects across different location models where developers are spread out around the world. So it’s helped, but large, complex projects must still be conducted and it’s clear that coordinating a large team is just really hard.

Data warehousing technology has not substantially helped solve these problems. Today, larger databases that run very fast are available, but they are built using the old approach, e.g. data models, ETL, etc. So those components just run faster. That helps of course, because there is less time spent trying to optimize everything and therefore more time spent on other tasks, such as working with the business. But the current use of technology is not really solving lifecycle issues; it actually makes them worse. You have data modeling teams, ETL teams, architect teams, analyst teams–all of which have to piece together their components and have something large work. It is like building a rocket ship without large government funding.

BigData has stepped in and has made available other tools. But they are often applied and targeted at a very specific workflow–a specific type of analysis–that can be programmed into what are generally fairly immature tools. So BigData is helping because it helps loosen up an architect’s thinking around how to put together solutions as well as employ non-traditional technologies.

So what would help? Let’s consider a world where compute power for each user is effectively infinite. We are not saying it’s free, just that it’s relatively easy to get enough compute power to solve specific types of problems. Let’s also assume that the non-technical issues will not change; they are invariant in this scenario. And let’s assume we want to use some elements of technology to address non-technical issues.

In this scenario, we really need a solution that has a few parts to it.

  • We need better tools & technologies that allow us to deliver solutions at a rapid pace, with significantly more updates than even today’s technologies allow. Let’s assume that the word “update” means both that the data updates frequently and that the structure of the data changes frequently.
  • We need to be able to use one environment so that people creating the solutions do not have to change the toolset and make diverse toolsets work together. This is one of the reasons why SAS is so popular–you can stay in one toolset.
  • We also need technologies that allow a lifecycle process to work with small teams who combine their solution components together more easily and whenever they are ready–versus when large, team milestones say that components have to be integrated.
  • We need to support processes that span the globe with people who contribute both technical and domain knowledge. We want to support decoupling the teams.

Let’s imagine a solution then. Let’s assume that every piece of data coming into our technology solution is tagged. Say one value is part of a healthcare claim and represents a diagnosis code. You tag it as being a diagnosis code, as being part of a claim, as being a number, etc. You can describe that relationship. Let’s tag all the data this way. Essentially, you are expanding the context of the data. Now let’s assume that we can establish these tags, and hence relationships, between all the data elements, and let’s also assume that we have a tool that can change these relationships dynamically so that we can create new relationships (pathways) between the data. Of course, ETL conceptually does not go away, but let’s assume that ETL becomes more of a process operating at different scales: the data element level, the relationships level, the aggregate of data tag level, etc.
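A minimal sketch of the tagging idea, assuming an in-memory store and hypothetical names (TaggedStore, the value ids and the tag names are all illustrative). Each value carries a set of tags, tags can be added at any time, and “assembling” is just asking for the values that share a set of tags:

```python
from collections import defaultdict


class TaggedStore:
    """Holds values plus the tags that give each value its context."""

    def __init__(self):
        self.values = {}                       # value id -> raw value
        self.tags_by_value = defaultdict(set)  # value id -> tags on that value
        self.values_by_tag = defaultdict(set)  # tag -> ids of values carrying it

    def add(self, value_id, value, tags):
        self.values[value_id] = value
        for tag in tags:
            self.tag(value_id, tag)

    def tag(self, value_id, tag):
        """Attach one more tag to an existing value (relationships can change)."""
        self.tags_by_value[value_id].add(tag)
        self.values_by_tag[tag].add(value_id)

    def assemble(self, first_tag, *more_tags):
        """Return the values that carry every one of the given tags."""
        ids = set(self.values_by_tag.get(first_tag, set()))
        for tag in more_tags:
            ids &= self.values_by_tag.get(tag, set())
        return {value_id: self.values[value_id] for value_id in sorted(ids)}


store = TaggedStore()
store.add("v1", "250.00", {"claim", "diagnosis_code"})
store.add("v2", "99213", {"claim", "procedure_code"})
store.add("v3", "250.00", {"clinical_indicator", "diagnosis_code"})

print(store.assemble("claim", "diagnosis_code"))

# A new relationship is just a new tag; nothing has to be remodeled.
store.tag("v2", "outpatient")
print(store.assemble("outpatient"))
```

The point of the design is that structure lives in the tags rather than in a fixed schema, so a change in requirements becomes a change in tags.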

Now, because we have infinite computing resources, we can start assembling the data the way we would like. If we are a technologist, perhaps we assemble it in a way that is helpful for putting together a production report. If we are an analyst, we might assemble it in a way that helps us determine if an outcome measurement improved based on an intervention (which has its own set of tags). When we assemble, we actually describe how data is grouped together to form a hierarchy of concepts. A DX code is a field that belongs to a claim, or a field that belongs to clinical indicators. Indicators are related to procedures through a probabilistic relationship based on past-seen relationships or programmed relationships.

Given that we can assemble and reassemble, let’s also imagine that at any time we can copy all of the data and all the tags. We can go to a master area and just say, I would like to copy it so I can fiddle with my tags and if everyone says they like my tags, I may contribute them back to the master. And let’s assume that if the master dataset is updated with recent data, I can just merge those data into my working set. Essentially, we have checked out the entire dataset, tracked our changes to it, updated it with changes from other people, and can check our changes back in for others to use–very much like a data change management solution. As the tags evolve, other people can assemble and reassemble the data in new ways.

So one solution to help fix data warehousing is to employ BigData technology, but in a way that allows each individual to assemble and analyze the data the way they want to. And when that individual creates something useful, they can share it with others so they can use it. NoSQL databases conceptually give us this capability, especially when the data is represented by something as simple as a key+value. Source code control systems like “git” (a large scale, distributed management system) give us a model to shoot for, but at the data warehouse level, and the current crop of ETL programs inform us of the types of changes that need to be made to the data to improve its quality for use.

Many of the ingredients exist today; we just need the innovation to happen.
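The check-out, change and merge cycle can be sketched as a three-way merge over tag sets. This is only an illustration under simplifying assumptions: the dataset is a dictionary from a value id to its set of tags, the function names are made up, and conflicts are avoided by treating tags as sets, so both sides' additions and removals are kept:

```python
import copy


def check_out(master):
    """Take a private working copy of the master tag map (value id -> set of tags)."""
    return copy.deepcopy(master)


def merge(base, mine, theirs):
    """Three-way merge: keep each side's additions and removals relative to base."""
    merged = {}
    for value_id in set(base) | set(mine) | set(theirs):
        b = base.get(value_id, set())
        m = mine.get(value_id, set())
        t = theirs.get(value_id, set())
        added = (m - b) | (t - b)    # tags either side introduced
        removed = (b - m) | (b - t)  # tags either side dropped
        merged[value_id] = (b | added) - removed
    return merged


base = {"v1": {"claim", "number"}}

mine = check_out(base)
mine["v1"].add("diagnosis_code")   # my experiment with a new tag

theirs = check_out(base)
theirs["v1"].discard("number")     # someone else cleaned up a tag
theirs["v2"] = {"claim"}           # and the master received new data

merged = merge(base, mine, theirs)
print(merged)
```

A real system would also need history, authorship and a policy for genuinely conflicting changes, which is where a git-like model earns its keep.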

Social media, BigData and Effort-Return

The classic question we ask about marketing, or really any form of outreach, is: given the effort I expend, what is my return? This Effort-Return question is at the heart of ROI, value proposition and the general, down-to-earth question of “Was it worth it?”

That’s the essential question my clients have always asked me and I think that’s a big question that is being formed around the entire area of social media and BigData. It’s clear that social media is here to stay. The idea that “people like us” create our own content is very powerful. It gives us voice, it gives us a communication platform and it gives us, essentially, power. The power of the consumer voice is amplified.

Instead of 10 people hearing us when we are upset at something, we can have 1,000,000 hear us. That’s power. And the cost of “hearing” the content, of finding it, is dropping dramatically. It is still not free to access content; you still have enormous search costs (we really need contextual search–searching those resources most relevant to me instead of searching the world). But search costs are dropping, navigation costs are dropping. Every year, those 1,000,000 can listen and filter another 1,000,000 messages.

BigData has come to our rescue…in a way. It gives more tools to the programmers who want to shape that search data, who want to help us listen and reach out. There’s a lot of hype out there and the technology is moving very fast, so fast that new projects, new frameworks and new approaches are popping out every day.

But is it worth the effort? If so, just how much is it worth it? That’s still a key question. The amount of innovation in this area is tremendous and it’s not unlike the innovation I see occurring in the healthcare space. Everyone is trying something, which means that everything is being tried somewhere.

That’s good. But while it is already pretty clear that we can now communicate in many new channels unavailable 5 years ago, communicate more easily and more frequently, and find things that interest us, does it really pay off? Do companies benefit by investing now and trying to get ahead, or do they just try to keep pace and meet some, but not all, customer expectations? Will entire companies fall because of social media and BigData?

Those are hard questions to answer, but I think we can look at other patterns out there and see that even with all the hype today, it will be worth it. Companies should invest. But many should not over invest. When the internet was first widely used, it was thought that it would change the world. Well, it did. It just took 2 decades to do that instead of the few years that futurists predicted. But with a higher density of connectivity today, innovations can roll out faster.

But I think that’s where the answer is. If you are in the business of social, then being on that wave and pushing the edge is good. If your business is BigData tools and technologies, then yes, you need to invest and recognize that it will be worth it in the long run if you survive. But many brands (companies) can just strive to keep pace and do okay. There are exceptions of course. Product quality and features still dominate purchase decisions. Yes, people are moved by viral videos and bad customer service or bad products, but companies with long-standing brands whose products are core can afford to spend to meet expectations versus having to over invest. They will do just fine keeping pace and continuing to focus on the product as well as marketing. For example, does Coca-Cola expect to double market share because they are better at social media than others? Will they grow significantly because of it? It’s not clear, but for some segment of products, the spending pattern does not have to be extraordinary. It just needs to keep pace and be reasonable.

This gets us back to the question of social media, BigData and Effort-Return. Effort-Return is important to calculate because brands should not over invest. They need to manage their investments. Are social media and BigData worth the investment? Absolutely; it’s really a question of degree.

Lean startups

The May issue of HBR has an article on the lean startup model. Essentially, you need to prototype something, find clients quickly, get feedback and iterate again. The idea of “lean” is that you forgo deep planning and marketing that may not make sense since most plans change rapidly anyway.

There is a lot of truth in that. I’ve helped startups (even in my early college and grad-school days) and certainly it makes sense to try something and keep iterating. I found that to be true with writing, software, management consulting ideas and a variety of areas.

However, it’s not universally true. You really do need deep thinking in some cases and some industries or in situations where just getting to the first prototype will consume significant capital. Hence, the idea is a good one, but should be judiciously applied. That’s not to say that getting continuous feedback is ever bad, it’s just that you need more than a prototype.

There is an old management principle around innovation. There is a “learning while doing” model that says you cannot know everything or even 1/10th of what you need to know, so it’s better to get started and learn as you go. That’s the basic concept behind the lean startup (see here for more info on learning-by-doing, which is a concept from the Toyota system).

The concept is bouncing around the technical crowds as well. This article makes the case that you need to “learn fast” versus “fail fast, fail often,” which is in the spirit of the lean startup. In fact, now there are lean canvases that you can put together. While there are a lot of good ideas here, I think the only rule about employing them is “pick your rules carefully.”

Hospital profits and trust in the medical community

The Washington Post had an article that covered how hospital profits are increased when there are complications with surgery. I do not think that the health care system causes complications intentionally, but for me this links back to arguments made by Lawrence Lessig.

His argument in his book “Republic, Lost” is that the presence of money (profits) in the wrong location (complications related to surgery) causes us to think differently about the relationship between those that provide care and patients (givers and receivers). He believes that the mere presence of money causes us to change our trust relationship with the other party.

There does seem to be some evidence of physician-led abuses in the care community. But abusers are more than just providers. An entire ecosystem of characters is at work trying to get a slice of what is an overwhelmingly large slice of money in the US economy.

It is a large pie and so we should expect to have some abuses by all parties involved–including patients! The issue is really about how the presence of money, and of stories that discuss it like the above, distort our trust relationship. According to Lessig, it is this distortion that is decreasing our trust relationship.

Using Lessig’s argument, it is not that we think Providers are needlessly causing complications to obtain more profit, but the presence of money for the wrong incentive causes us to think twice. This is the essence of his argument of “dependency corruption.”

To remove these incentives and restore trust–is the solution an integrated capitated model like Kaiser Permanente? Where Provider & Payer are one and the same and hence, there is a motivation to reduce costs and improve outcomes because those who save dollars get to “cash the check?”

If you believe that this incentive model is the only one that could restore trust, what is the eventual outcome? Could it be that the entire healthcare insurance market will fragment into thousands of small plans perhaps like Canada where there is a central payer and then thousands of healthcare plans to fill in the cracks?

Or is the only way to restore trust to go to a national payer system so that a majority of the healthcare delivered would have an integrated incentive?

Are there any in-between models that work?

It’s not clear what will happen but it does seem that trust, as abstract as it sounds, could lead to major structural shifts in the industry just as trust in today’s government seems to be greatly diminished (citizens think that the government is captive to special interest groups and lobbyists who finance their campaigns).

Regardless, I think that smart information management technologies can support that type of fragmentation and still be efficient so we should not let technology limit the best model for healthcare delivery. After all, the new health care law (patient protection act) is attempting to create a nationwide individual market (almost overnight) and the plans must meet minimum standards. They will also include little gap-fillers to fill in the gaps for those that want delta coverage.

We’ll see.

Twitter and the source of real-time news – is it the new emergency broadcast system?

The Washington Post’s Outlook section mentioned that Twitter has become the first source of news ahead of the major networks and news organizations. While this is certainly true for many public events, such as the Boston Marathon bombing events, it’s probably more likely that Twitter is the first source of news for certain news area segments. For public events and events where smartphones can operate, such as in public spaces or urban settings, Twitter can be the first to report issues because the witnesses or even the participants can self-report. For other types of news, such as corporate internal news, certain types of business news and government news, finding stories and issues can take time and deeper digging–activities that are not so heavily aligned with Twitter’s instant communication model.

Of course, with the power to be real-time comes the responsibility to not use it for fraudulent purposes. Twitter puts significant “power” into the hands of the individual and power can be used for reporting problems and issues but it can also be used to amplify untruths or fraudulent information. All communication channels have this balance to some degree. With Twitter, the amplification effect and balance must be managed much more closely.

Although not everyone has a smartphone or receives Twitter alerts, many companies and government groups do monitor Twitter. In essence, it is now a new, socially driven emergency broadcast system.

Large-scale healthcare sensor networks and BigData

Lately, there have been announcements that could make large-scale, healthcare-focused sensor networks much more of a reality. A healthcare monitoring network could drive substantial improvements in care and reductions in cost. Today, if you are in a hospital, you are plugged into a sensor network that is relatively stationary and highly controlled (for obvious reasons). But there are many more healthcare, consumer-level networks that could be created. Here’s a mention of the world’s smallest blood monitoring implant and other heart rate monitoring capabilities based on visual monitoring techniques:

Putting together a big data solution here means a solution that can scale out. Batch technologies are not the answer here, so frameworks like Hadoop are not directly the primary component. Other analytical frameworks like Storm, Dempsy, Apache S4, Esper, OpenMDAO, Stormkeeper or the Eclipse M2M framework are needed.

In this case, BigData is about scaling out solutions for sensor networks and piecing together analytical processing nodes to create a workflow that accomplishes the analysis.
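To illustrate what “piecing together analytical processing nodes” means, here is a toy, single-process sketch using Python generators as stand-ins for the nodes. It does not use the API of Storm or any of the other frameworks named above, and the readings, window size and threshold are invented for the example:

```python
from collections import deque


def readings(samples):
    """Source node: yield (patient_id, heart_rate) readings one at a time."""
    for sample in samples:
        yield sample


def rolling_mean(stream, window=3):
    """Analytical node: emit the mean of each patient's last `window` readings."""
    history = {}
    for patient_id, rate in stream:
        recent = history.setdefault(patient_id, deque(maxlen=window))
        recent.append(rate)
        yield patient_id, sum(recent) / len(recent)


def alerts(stream, threshold=120.0):
    """Sink node: pass through only the averages above the threshold."""
    for patient_id, mean_rate in stream:
        if mean_rate > threshold:
            yield patient_id, mean_rate


# Hypothetical readings arriving interleaved from two patients.
samples = [("p1", 80), ("p2", 130), ("p1", 85), ("p2", 140), ("p2", 150)]

# The workflow is just the nodes chained together; each reading flows
# through as it arrives rather than waiting for a batch to complete.
flagged = list(alerts(rolling_mean(readings(samples))))
print(flagged)
```

A streaming framework adds what this sketch leaves out: distributing the nodes across machines, partitioning by patient, and recovering from failures.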

But healthcare sensor networks are not without their challenges. Here are some links that describe the issues in more detail and the research going on in this area